Removing Unnecessary Columns from a Large Export
You export a report from Salesforce. It has 83 columns. You need 12 of them. The other 71
are system fields, audit timestamps, internal IDs, and columns with names like
__c_custom_field_37 that nobody remembers creating.
Opening this in Excel means scrolling sideways for eternity. Opening it in Google Sheets hits the column limit warning. The file is 200MB when it could be 30MB. Every operation you run is slower because it's processing data you don't care about.
The fix is simple: cut it down to the columns you actually need. Here's how to do that in ExploreMyData.
Two approaches, different situations
ExploreMyData gives you two operations for managing columns, and which one you reach for depends on the math:
- Select Columns - pick the ones you want to keep. Best when you need a small number of columns from a large set.
- Delete Columns - remove specific columns. Best when you want most columns but need to drop a few.
If you need 12 out of 83 columns, use Select Columns. If you need 78 out of 83, use Delete Columns. It's about doing less clicking.
Using Select Columns
Open the toolbar and choose Select Columns from the Columns group. You'll see every column in your dataset listed. Check the ones you need. Uncheck the rest.
Here's the part that matters: you can also drag to reorder the selected columns.
This means you're not just trimming - you're arranging the output to match how you actually
think about the data. Put company_name first, then
deal_stage, then
amount, then
close_date. Whatever makes sense for your report.
The generated SQL is straightforward:
SELECT company_name, deal_stage, amount, close_date, owner, region, source, created_at, contact_email, contact_name, industry, arr FROM source
That's it. A SELECT with exactly the columns you chose, in the order you arranged them. The other 71 columns are gone.
Select Columns - 12 checked from 83 total, arranged in the order they should appear in the output:
| Column name | Keep? | Position in output |
|---|---|---|
| company_name | Yes | 1 |
| deal_stage | Yes | 2 |
| amount | Yes | 3 |
| close_date | Yes | 4 |
| created_by_id | No | - |
| last_modified_by_id | No | - |
| __c_custom_field_37 | No | - |
| system_modstamp | No | - |
71 columns dropped. File size: 200 MB → ~30 MB. All unwanted columns gone in a single SELECT.
Using Delete Columns
Sometimes you're happy with most of what's in the file but a few columns are junk. Maybe the
export includes created_by_id,
last_modified_by_id,
system_modstamp, and
is_deleted - internal system fields that add nothing
to your analysis.
Select Delete Columns from the Columns group. Pick the columns to remove. Done. Everything else stays in its original order.
Delete Columns - selecting specific system fields to drop while keeping everything else:
| Column to delete | Why it's junk |
|---|---|
| created_by_id | Internal Salesforce user ID, not meaningful for analysis |
| last_modified_by_id | Same - internal audit field |
| system_modstamp | System timestamp, not the business date you care about |
| is_deleted | Always "false" in this export - no signal |
| __c_custom_field_37 | Deprecated field, all values NULL |
Why this matters more than you think
Trimming columns isn't just about aesthetics. It has real effects:
- Smaller exports. When you download the cleaned file, it's a fraction of the original size. A 200MB CRM dump becomes a 25MB focused dataset.
- Faster operations. Every filter, sort, and group-by runs faster when there's less data to scan. DuckDB is fast, but 71 unnecessary columns still cost time on large files.
- Clearer analysis. When your table fits on screen without scrolling, you spot patterns faster. You see the data instead of hunting for it.
- Easier sharing. Sending a clean 12-column file to your team is better than sending the 83-column monster and saying "just ignore most of it."
Combine with other cleanup
Column selection is usually your first pipeline step. Trim the fat, then work with what's left. A typical workflow looks like:
- Select Columns - keep only the 12 you need.
- Rename a few columns for clarity (
acct_nmbecomescompany_name). - Filter out cancelled deals.
- Sort by close date.
- Export the clean result.
Each step is visible in your pipeline. You can remove any step, reorder them, or see the SQL for the entire chain.
A typical cleanup pipeline - column selection first, then progressively refine the data:
| Step | Operation | What it does |
|---|---|---|
| 1 | Select Columns | Keep only the 12 needed columns (83 → 12) |
| 2 | Rename Column | acct_nm → company_name, opp_amt → amount |
| 3 | Filter | Exclude deals where stage = "Cancelled" |
| 4 | Sort | Sort by close_date descending |