← All posts
5 min read

Calculating Percentages and Ratios Across Columns

You have a sales report. Each row is an order with a product category and revenue amount. Your manager wants to know: what percentage of total revenue does each category represent?

Sounds simple until you try to do it. The percentage for any single row depends on the sum of all rows. You can't compute it with just the data in that row. In a spreadsheet, you'd put a SUM formula somewhere, then divide each cell by that fixed reference. It works, but it's fragile and ugly.

In SQL, this is what window functions are for. In ExploreMyData, you don't have to write the SQL yourself.

Simple ratio between two columns

Let's start with the straightforward case. Your data already has both numbers in the same row. Say you have completed_orders and total_orders per region. You want a completion rate.

Open Math from the Transform group. Select completed_orders as the first column, total_orders as the second, and divide. Name the result completion_rate.

That gives you a decimal like 0.847. To make it a percentage, add another Math step: multiply completion_rate by 100. Or use Add Column to do it in one shot:

ROUND(completed_orders * 100.0 / total_orders, 1) AS completion_pct

Note the 100.0 instead of 100. If both columns are integers, integer division would truncate the result. Multiplying by a decimal forces floating-point math. A small detail that trips people up constantly.

region completed_orders total_orders completion_pct
Northeast847100084.7
Southeast61270087.4
Midwest43055078.2
West91098092.9

Expression: ROUND(completed_orders * 100.0 / total_orders, 1) AS completion_pct

Percentage of a grand total (this is the tricky one)

Back to the original problem. You have individual order rows with a revenue column. You want each row to show what percentage of total revenue it represents. The denominator isn't in any single row - it's the sum across all rows.

This is where window functions come in. A window function computes a value across a set of rows but returns a result for each individual row. In this case, you want SUM(revenue) OVER() - the total revenue, repeated on every row.

In ExploreMyData, open Window from the Transform group. Select revenue as the column, choose SUM as the function, and leave the partition and order fields empty. Name the result total_revenue.

The generated SQL:

SUM(revenue) OVER() AS total_revenue

Now every row has the same grand total value. Next, add a column for the percentage:

ROUND(revenue * 100.0 / total_revenue, 2) AS revenue_pct

Two pipeline steps. Each order now shows its share of total revenue.

order_id category revenue total_revenue revenue_pct
101Electronics125004500027.78
102Clothing82004500018.22
103Electronics98004500021.78
104Sports145004500032.22

Step 1: SUM(revenue) OVER() AS total_revenue - same grand total on every row. Step 2: ROUND(revenue * 100.0 / total_revenue, 2) AS revenue_pct.

Percentage within a group

The question changes slightly. Instead of "what percentage of all revenue is this order?", you want "what percentage of its category's revenue is this order?"

Same Window operation, but this time set the partition by field to category. This tells DuckDB to compute a separate sum for each category.

SUM(revenue) OVER(PARTITION BY category) AS category_total

Now a row in "Electronics" gets the Electronics total, and a row in "Clothing" gets the Clothing total. Divide as before:

ROUND(revenue * 100.0 / category_total, 2) AS pct_of_category

This is genuinely hard to do in a spreadsheet. You'd need SUMIF or a pivot table. Here it's two steps.

Handling division by zero

If any group has zero total revenue (or your denominator column has zeros), you'll get an error or infinity. Protect against this with a conditional expression in Add Column:

CASE WHEN total_revenue = 0 THEN 0 ELSE ROUND(revenue * 100.0 / total_revenue, 2) END AS revenue_pct

This returns 0 instead of blowing up. Not glamorous, but necessary for real data where edge cases always show up.

Verifying the percentages add up

A good sanity check: your percentage column should sum to 100 (or close to it, given rounding). Run a quick Group By on the percentage column with SUM to confirm. If you partitioned by category, group by category and sum the percentages - each group should total 100.

If it doesn't, check for NULL revenue values. NULLs are excluded from SUM but still produce rows, which means the percentages of the non-NULL rows won't add up to 100. Fill missing revenue values with 0 first if this matters for your analysis.

Start calculating percentages →

Try it yourself

No sign-up, no upload, no tracking.

Open ExploreMyData