Splitting Multi-Value Cells into Separate Rows
Your product data has a "tags" column. Each cell contains something like "electronics, sale, featured". Three values crammed into one cell, separated by commas.
This is fine for display. It's terrible for analysis. You can't count how many products are tagged "sale" because "sale" isn't a distinct value - it's buried inside a string. You can't filter by a single tag, group by tag, or join on tag. The data is locked up.
What you need is one row per tag, where the product appears three times (once for each tag) with all its other columns duplicated. This operation is called unnesting, and it's one of the most useful transformations for anyone working with real-world data.
What multi-value cells look like
Here's the starting point:
| product_id | name | tags |
|---|---|---|
| 101 | Wireless Mouse | electronics, sale, featured |
| 102 | Running Shoes | sports, new-arrival |
| 103 | Coffee Maker | kitchen, sale, bestseller, featured |
Three rows, but the tags column is doing too much. You can't answer "how many products are on sale?" without parsing strings.
Unnest: one row per value
In ExploreMyData, open the
toolbar and select Unnest from the Transform group.
Choose the column with delimited values (e.g., "tags") and set the delimiter - in this case,
a comma (,).
Click Apply. The three rows become nine:
| product_id | name | tags |
|---|---|---|
| 101 | Wireless Mouse | electronics |
| 101 | Wireless Mouse | sale |
| 101 | Wireless Mouse | featured |
| 102 | Running Shoes | sports |
| 102 | Running Shoes | new-arrival |
| 103 | Coffee Maker | kitchen |
| 103 | Coffee Maker | sale |
| 103 | Coffee Maker | bestseller |
| 103 | Coffee Maker | featured |
Each tag is now its own row. The product_id and name columns are duplicated for each tag - which is exactly what you want.
Before Unnest (3 rows)
| product_id | tags |
|---|---|
| 101 | electronics, sale, featured |
| 102 | sports, new-arrival |
| 103 | kitchen, sale, bestseller, featured |
After Unnest (9 rows)
| product_id | tags |
|---|---|
| 101 | electronics |
| 101 | sale |
| 101 | featured |
| 102 | sports |
| 102 | new-arrival |
| 103 | kitchen |
| 103 | sale |
| 103 | bestseller |
| 103 | featured |
The SQL behind it
Open the pipeline to see what's happening:
SELECT product_id, name,
UNNEST(STRING_SPLIT(tags, ',')) AS tags
FROM data
STRING_SPLIT(tags, ',')
turns "electronics, sale, featured" into a list: ['electronics', ' sale', ' featured'].
UNNEST() takes that list
and expands it into separate rows. DuckDB handles the duplication of the other columns
automatically.
Trim the whitespace
Look at the result table again. The split values have leading spaces: " sale" instead of "sale". That's because the original data was "electronics, sale, featured" with spaces after the commas. The split kept the spaces.
Fix this by adding a Text Transform step right after the unnest. Select the tags column and choose "trim". This strips leading and trailing whitespace from every value.
Pipeline steps to split and clean tags:
- 1Unnest on
tags- delimiter: comma (,). Expands 3 rows into 9. - 2Text Transform on
tags- trim. Removes leading spaces left by ", " separators.
After trim: "electronics", "sale", "featured" - clean values ready for group-by or filter.
Now your tags are clean: "electronics", "sale", "featured" - no stray spaces.
What you can do after unnesting
With tags in separate rows, analysis becomes straightforward:
Count tag frequency: Group by tags, count rows. You'll see that "sale" appears in 2 products, "featured" in 2, and everything else in 1.
Filter by tag: Add a filter where
tags = 'sale' to see
only products on sale.
Pivot by tag: Create a cross-tabulation of product categories by tag to see which tags appear in which categories.
None of this was possible when the tags were stuck together in one cell.
Other delimiters
Commas are the most common delimiter, but the same technique works for any separator:
- Semicolons: "skill1; skill2; skill3" - set delimiter to
; - Pipes: "red|blue|green" - set delimiter to
| - Slashes: "US/CA/MX" - set delimiter to
/ - Spaces: "tag1 tag2 tag3" - set delimiter to a space
Understanding the row count increase
Unnesting increases your row count. If you started with 1,000 products and each has an average of 3 tags, you'll end up with roughly 3,000 rows. This is expected and correct - you're not duplicating data, you're normalizing it.
Keep the original row count in mind when doing aggregations after unnesting. If you sum revenue after unnesting tags, a product with 3 tags will contribute its revenue 3 times. Group by product_id first if you need accurate totals, or do your revenue calculations before the unnest step.
When not to unnest
Unnest is the right tool when you need to analyze or filter by individual values within a
multi-value cell. It's overkill if you just need to check whether a cell contains a
specific value - for that, a filter with
CONTAINS(tags, 'sale')
is simpler and doesn't change your row count.
Use unnest when you need to count, group, or join on individual values. Use string matching when you just need to filter.