January 2, 2026 by Arif Aslam 4 min read

Removing Unnecessary Columns from a Large Export

You export a report from Salesforce. It has 83 columns. You need 12 of them. The other 71 are system fields, audit timestamps, internal IDs, and columns with names like __c_custom_field_37 that nobody remembers creating.

Opening this in Excel means scrolling sideways for eternity. Opening it in Google Sheets hits the column limit warning. The file is 200MB when it could be 30MB. Every operation you run is slower because it's processing data you don't care about.

The fix is simple: cut it down to the columns you actually need. Here's how to do that in ExploreMyData.

Two approaches, different situations

ExploreMyData gives you two operations for managing columns, and which one you reach for depends on the math:

Select Columns - pick the ones you want to keep. Best when you need a small number of columns from a large set.
Delete Columns - remove specific columns. Best when you want most columns but need to drop a few.

If you need 12 out of 83 columns, use Select Columns. If you need 78 out of 83, use Delete Columns. It's about doing less clicking.

Using Select Columns

Open the toolbar and choose Select Columns from the Columns group. You'll see every column in your dataset listed. Check the ones you need. Uncheck the rest.

Here's the part that matters: you can also drag to reorder the selected columns. This means you're not just trimming - you're arranging the output to match how you actually think about the data. Put company_name first, then deal_stage, then amount, then close_date. Whatever makes sense for your report.

The generated SQL is straightforward:

SELECT company_name, deal_stage, amount, close_date, owner, region, source, created_at, contact_email, contact_name, industry, arr FROM source

That's it. A SELECT with exactly the columns you chose, in the order you arranged them. The other 71 columns are gone.

Select Columns - 12 checked from 83 total, arranged in the order they should appear in the output:

Column name	Keep?	Position in output
company_name	Yes	1
deal_stage	Yes	2
amount	Yes	3
close_date	Yes	4
created_by_id	No	-
last_modified_by_id	No	-
__c_custom_field_37	No	-
system_modstamp	No	-

71 columns dropped. File size: 200 MB → ~30 MB. All unwanted columns gone in a single SELECT.

Using Delete Columns

Sometimes you're happy with most of what's in the file but a few columns are junk. Maybe the export includes created_by_id, last_modified_by_id, system_modstamp, and is_deleted - internal system fields that add nothing to your analysis.

Select Delete Columns from the Columns group. Pick the columns to remove. Done. Everything else stays in its original order.

Delete Columns - selecting specific system fields to drop while keeping everything else:

Column to delete	Why it's junk
created_by_id	Internal Salesforce user ID, not meaningful for analysis
last_modified_by_id	Same - internal audit field
system_modstamp	System timestamp, not the business date you care about
is_deleted	Always "false" in this export - no signal
__c_custom_field_37	Deprecated field, all values NULL

Why this matters more than you think

Trimming columns isn't just about aesthetics. It has real effects:

Smaller exports. When you download the cleaned file, it's a fraction of the original size. A 200MB CRM dump becomes a 25MB focused dataset.
Faster operations. Every filter, sort, and group-by runs faster when there's less data to scan. DuckDB is fast, but 71 unnecessary columns still cost time on large files.
Clearer analysis. When your table fits on screen without scrolling, you spot patterns faster. You see the data instead of hunting for it.
Easier sharing. Sending a clean 12-column file to your team is better than sending the 83-column monster and saying "just ignore most of it."

Combine with other cleanup

Column selection is usually your first pipeline step. Trim the fat, then work with what's left. A typical workflow looks like:

Select Columns - keep only the 12 you need.
Copy columns with new names for clarity (Copy Columns: acct_nm becomes company_name, then Delete Columns to remove the originals).
Filter out cancelled deals.
Sort by close date (use Top / Bottom Rows or Limit Rows with a sort field).
Export the clean result.

Each step is visible in your pipeline. You can remove any step, reorder them, or see the SQL for the entire chain.

A typical cleanup pipeline - column selection first, then progressively refine the data:

Step	Operation	What it does
1	Select Columns	Keep only the 12 needed columns (83 → 12)
2	Copy Columns + Delete Columns	acct_nm → company_name, opp_amt → amount (copy with new name, delete original)
3	Filter	Exclude deals where stage = "Cancelled"
4	Top / Bottom Rows	Sort by close_date descending

Clean up your next export →

Removing Unnecessary Columns from a Large Export

Two approaches, different situations

Using Select Columns

Using Delete Columns

Why this matters more than you think

Combine with other cleanup

Try it yourself

Related tools

More from the blog

5 Ways to Clean Messy Data Without Writing Code

Extracting Email Domains to Identify Company Accounts

A Complete Guide to Handling NULL and Empty Values