← All posts
5 min read

Group By and Aggregate Without Writing SQL

Total revenue by category. Average order value by region. Count of support tickets by priority level. These are the questions you ask every day, and they all have the same shape: take a column, group the rows by its values, and calculate something for each group. In SQL, that's GROUP BY. In a spreadsheet, it's a pivot table.

In ExploreMyData, you have three ways to do this - each suited to different situations. This post walks through all three so you can pick the right one.

Approach 1: Pivot operation (the visual GROUP BY)

The Pivot operation is the most direct way to aggregate. It works like a spreadsheet pivot table, but it's generating SQL under the hood.

Example: total revenue by product category

You have a table with columns: order_id, category, revenue, region. You want total revenue per category.

  1. Open the Pivot operation from the toolbar.
  2. Set value column to "revenue".
  3. Set pivot column (the grouping column) to "category".
  4. Set aggregate function to SUM.
  5. Apply.

The result collapses your detail rows into one row per category with the summed revenue:

categorysum_revenue
Electronics142,500
Clothing87,300
Home & Garden63,200

Under the hood, this is SELECT category, SUM(revenue) FROM ... GROUP BY category. You can see the exact SQL in the pipeline card. For more on Pivot specifically, see the pivot tables post.

Pivot operation configuration

  • Value column: revenue
  • Pivot column (group by): category
  • Aggregate function: SUM

Generated SQL: SELECT category, SUM(revenue) AS sum_revenue FROM data GROUP BY category ORDER BY sum_revenue DESC

The tradeoff: Pivot collapses your rows. You go from 10,000 order rows to 3 category rows. That's often what you want for a summary. But if you need the aggregate value alongside each detail row - say, each order's revenue plus the category total - you need a different approach.

Approach 2: Window functions (aggregate without collapsing)

Window functions compute an aggregate but attach it to every row instead of collapsing groups. This is SUM(revenue) OVER (PARTITION BY category) in SQL - and it's available as a visual operation.

Example: each order with its category total

  1. Open the Window Function operation.
  2. Set function type to "running_total" (which uses SUM).
  3. Set the column to "revenue".
  4. Set partition by to "category".
  5. Name the output column "category_total".

Now every row has its original data plus a new column showing the category total:

order_idcategoryrevenuecategory_total
1001Electronics250142,500
1002Clothing7587,300
1003Electronics430142,500

This is powerful because you can then add a calculated column for percentage of category: revenue / category_total * 100. You keep the detail rows and add context. The window functions guide covers all six function types in depth.

Approach 3: SQL operation (for complex cases)

Sometimes you need GROUP BY with conditions - like "total revenue by category, but only for categories with more than 10 orders." That's a HAVING clause, and the visual operations don't directly support it. This is where the SQL operation comes in.

  1. Open the SQL operation from the toolbar.
  2. Write your query against the current table (referenced as the table name shown in the pipeline).
  3. Apply.

For the HAVING example:

SELECT category, SUM(revenue) as total, COUNT(*) as orders FROM {table} GROUP BY category HAVING COUNT(*) > 10

Other cases where SQL is the better choice:

  • Multiple aggregates in one query (SUM, AVG, COUNT, MIN, MAX together)
  • GROUP BY on an expression (like grouping by month extracted from a date)
  • Combining GROUP BY with JOINs across tables

You can also use the AI SQL feature - type something like "total revenue and average order value by category, only categories with more than 10 orders" and the app generates the SQL for you. See the SQL guide for more on writing queries directly.

categorytotal_revenueorders
Electronics142,50047
Clothing87,30031
Home & Garden63,20022
Sports41,80018

Result of GROUP BY category HAVING COUNT(*) > 10 - only categories with more than 10 orders appear.

When to use which approach

Here's the decision framework:

  • Pivot - you want a summary table. One row per group, one aggregate value. This is your default choice for "total X by Y" questions. Quick, visual, no SQL needed.
  • Window function - you want the aggregate alongside every detail row. Use this for percentage-of-total calculations, comparisons to group averages, or any time you need the original rows preserved.
  • SQL operation - you need HAVING clauses, multiple aggregates, grouped expressions, or anything else the visual operations don't cover. It's the escape hatch, and it's always there.

These three approaches aren't mutually exclusive. You can pivot first to get a summary, then use the result as input to another pipeline step. Or add a window function for context, then filter on it. They're all just pipeline steps that chain together.

A practical example: sales report

Say you're building a report from raw order data. You might combine all three approaches:

  1. Filter - remove cancelled orders
  2. Window function - add category_total to each row
  3. Calculated column - compute pct_of_category
  4. Pivot - summarize by region for the final output

Each step builds on the last. The pipeline sidebar shows every step, the SQL for each one, and you can edit or reorder any of them. That's the power of composing simple operations into complex analysis. The multi-step pipeline post walks through a full example.

Start aggregating your data →

Try it yourself

No sign-up, no upload, no tracking.

Open ExploreMyData