Copying a Column and Transforming the Copy
You need to normalize company names to lowercase for deduplication. But you also need to keep the original casing for the report that goes to the client. You want to strip special characters from a product SKU for matching, but the original SKU still needs to appear in the export. You need both versions of the same data, side by side.
The solution is simple: copy the column first, then transform the copy. The original stays untouched. Here's how to do it in ExploreMyData.
Step 1: Copy the column
Open the toolbar and select Copy Column from the Columns group.
Choose the source column (e.g., company_name)
and give the copy a name that describes its purpose, like
company_name_normalized.
The generated SQL is straightforward:
SELECT *, company_name AS company_name_normalized
The new column appears at the end of your table with identical values to the original.
| customer_id | company_name (original) | company_name_normalized (copy) |
|---|---|---|
| C001 | Acme Corp. | Acme Corp. |
| C002 | ACME Corp | ACME Corp |
| C003 | Johnson & Johnson | Johnson & Johnson |
After Copy Column: both columns are identical. The original stays untouched while transformations are applied only to the copy.
Step 2: Transform the copy
Now apply your transformations to the copy, leaving the original alone. Select
Text Transform from the Transform group, choose the
company_name_normalized column,
and pick "lowercase".
After this step, your table has both columns:
| company_name | company_name_normalized |
|---|---|
| Acme Corp. | acme corp. |
| ACME Corp | acme corp |
| Johnson & Johnson | johnson & johnson |
| McKinsey & Company | mckinsey & company |
The original column is preserved exactly as it was. The normalized copy is ready for matching.
Chaining more transformations on the copy
Lowercasing is just the first step. You can keep stacking transformations on the copy:
- Apply Text Transform with "trim" to remove leading and trailing whitespace.
- Use Find & Replace on the copy to remove punctuation. For example, replace "." with "" to strip periods, or replace " Corp" with "" to normalize company suffixes.
- Apply another Find & Replace to collapse double spaces into single spaces.
Each operation targets only the company_name_normalized
column. The original company_name stays pristine through every step.
Pipeline for normalizing company names:
- 1Copy Column -
company_name→company_name_normalized - 2Text Transform on
company_name_normalized- lowercase - 3Text Transform on
company_name_normalized- trim whitespace - 4Find & Replace on
company_name_normalized- replace "." with ""
The original company_name column is untouched at every step.
Real-world example: deduplication prep
Say you're trying to find duplicate companies in a CRM export. The original names have inconsistent formatting: "Acme Corp.", "ACME Corp", "acme corp", and " Acme Corp " are all the same company but look different to any exact-match comparison.
The workflow:
- Copy Column:
company_nametocompany_match_key. - Text Transform: lowercase
company_match_key. - Text Transform: trim
company_match_key. - Find & Replace: replace "." with "" in
company_match_key. - Remove Duplicates: deduplicate on the
company_match_keycolumn.
Now "Acme Corp.", "ACME Corp", and " Acme Corp " all collapse into one row, but the row you keep
still has the original formatting in the company_name column.
Other use cases for copy-then-transform
This pattern shows up everywhere:
- Email normalization: Copy
emailtoemail_clean, then lowercase and trim the copy. Keep the original for display. - Phone formatting: Copy
phonetophone_digits, then use Find & Replace to strip parentheses, dashes, and spaces. The original stays human-readable. - URL slugs: Copy a
titlecolumn, lowercase it, and replace spaces with hyphens to generate URL-friendly slugs. - Currency stripping: Copy a
price_displaycolumn (like "$1,299.00"), strip the $ and commas in the copy, then convert the copy to a numeric type for calculations.
Why not just transform in place?
You could apply Text Transform directly to the original column. But then you've lost the original data. In ExploreMyData, pipeline steps are non-destructive and you can delete them, but if your workflow needs both the original and the transformed version in the final output, transforming in place doesn't work. Copy first, transform second.
It's two pipeline steps instead of one, and the SQL is clear about what's happening: first a copy, then a modification of the copy. Anyone reviewing the pipeline later can see exactly what was done and why.