Calculating Percentages and Ratios Across Columns
You have a sales report. Each row is an order with a product category and revenue amount. Your manager wants to know: what percentage of total revenue does each category represent?
Sounds simple until you try to do it. The percentage for any single row depends on the sum of all rows. You can't compute it with just the data in that row. In a spreadsheet, you'd put a SUM formula somewhere, then divide each cell by that fixed reference. It works, but it's fragile and ugly.
In SQL, this is what window functions are for. In ExploreMyData, you don't have to write the SQL yourself.
Simple ratio between two columns
Let's start with the straightforward case. Your data already has both numbers in the same row. Say you
have completed_orders and
total_orders per region.
You want a completion rate.
Open Math from the Transform group. Select
completed_orders as the first column,
total_orders as the second,
and divide. Name the result
completion_rate.
That gives you a decimal like 0.847. To make it a percentage, add another Math step: multiply completion_rate by 100. Or use Add Column to do it in one shot:
ROUND(completed_orders * 100.0 / total_orders, 1) AS completion_pct
Note the 100.0 instead of
100. If both columns are integers,
integer division would truncate the result. Multiplying by a decimal forces floating-point math.
A small detail that trips people up constantly.
| region | completed_orders | total_orders | completion_pct |
|---|---|---|---|
| Northeast | 847 | 1000 | 84.7 |
| Southeast | 612 | 700 | 87.4 |
| Midwest | 430 | 550 | 78.2 |
| West | 910 | 980 | 92.9 |
Expression: ROUND(completed_orders * 100.0 / total_orders, 1) AS completion_pct
Percentage of a grand total (this is the tricky one)
Back to the original problem. You have individual order rows with a revenue column. You want each row to show what percentage of total revenue it represents. The denominator isn't in any single row - it's the sum across all rows.
This is where window functions come in. A window function computes a value across a set of rows
but returns a result for each individual row. In this case, you want
SUM(revenue) OVER() -
the total revenue, repeated on every row.
In ExploreMyData, open Window from the Transform group. Select
revenue as the column, choose
SUM as the function, and leave the partition and order fields
empty. Name the result total_revenue.
The generated SQL:
SUM(revenue) OVER() AS total_revenue
Now every row has the same grand total value. Next, add a column for the percentage:
ROUND(revenue * 100.0 / total_revenue, 2) AS revenue_pct
Two pipeline steps. Each order now shows its share of total revenue.
| order_id | category | revenue | total_revenue | revenue_pct |
|---|---|---|---|---|
| 101 | Electronics | 12500 | 45000 | 27.78 |
| 102 | Clothing | 8200 | 45000 | 18.22 |
| 103 | Electronics | 9800 | 45000 | 21.78 |
| 104 | Sports | 14500 | 45000 | 32.22 |
Step 1: SUM(revenue) OVER() AS total_revenue - same grand total on every row. Step 2: ROUND(revenue * 100.0 / total_revenue, 2) AS revenue_pct.
Percentage within a group
The question changes slightly. Instead of "what percentage of all revenue is this order?", you want "what percentage of its category's revenue is this order?"
Same Window operation, but this time set the partition by field to
category. This tells DuckDB to
compute a separate sum for each category.
SUM(revenue) OVER(PARTITION BY category) AS category_total
Now a row in "Electronics" gets the Electronics total, and a row in "Clothing" gets the Clothing total. Divide as before:
ROUND(revenue * 100.0 / category_total, 2) AS pct_of_category
This is genuinely hard to do in a spreadsheet. You'd need SUMIF or a pivot table. Here it's two steps.
Handling division by zero
If any group has zero total revenue (or your denominator column has zeros), you'll get an error or infinity. Protect against this with a conditional expression in Add Column:
CASE WHEN total_revenue = 0 THEN 0 ELSE ROUND(revenue * 100.0 / total_revenue, 2) END AS revenue_pct
This returns 0 instead of blowing up. Not glamorous, but necessary for real data where edge cases always show up.
Verifying the percentages add up
A good sanity check: your percentage column should sum to 100 (or close to it, given rounding). Run a quick Group By on the percentage column with SUM to confirm. If you partitioned by category, group by category and sum the percentages - each group should total 100.
If it doesn't, check for NULL revenue values. NULLs are excluded from SUM but still produce rows, which means the percentages of the non-NULL rows won't add up to 100. Fill missing revenue values with 0 first if this matters for your analysis.