Converting Yes/No/True/False/1/0 to a Consistent Boolean
You're looking at an is_active column. It should
be simple - either the account is active or it isn't. But the values tell a different story:
"Yes", "Y", "TRUE", "true", "1", "yes", "True". And that's just the truthy side. The
false values are equally chaotic: "No", "N", "FALSE", "0", "false", "no".
This happens when data comes from multiple sources, or when different people entered it manually, or when someone exported from a system that uses "Y/N" and another that uses "true/false". The result: you can't filter for active accounts reliably. A filter for "TRUE" misses the rows that say "Yes". A filter for "1" misses the rows that say "true".
Here's how to collapse all of that into a clean boolean column in ExploreMyData.
What the data looks like
Here's a sample from a user accounts export:
| user_id | name | is_active | email_verified |
|---|---|---|---|
| 1001 | Alice Okonkwo | Yes | TRUE |
| 1002 | Ben Torres | Y | 1 |
| 1003 | Clara Johansson | true | false |
| 1004 | David Kim | No | FALSE |
| 1005 | Elena Petrov | 1 | 0 |
| 1006 | Farhan Ali | FALSE | True |
| 1007 | Grace Nakamura | 0 | yes |
Two columns, both supposed to be boolean, both a mess. The
is_active column alone has seven
distinct values that represent just two states.
Step 1: Map all values with Bulk Replace
The fastest approach is Bulk Replace. This operation lets you define groups
of values and what each group should become. Open the toolbar, select Bulk Replace from the
Transform group, and choose the is_active column.
Define two groups:
- Group 1: "Yes", "Y", "TRUE", "true", "True", "1", "yes" → replace with
true - Group 2: "No", "N", "FALSE", "false", "False", "0", "no" → replace with
false
The SQL that ExploreMyData generates is a CASE WHEN with every variation mapped:
CASE WHEN is_active IN ('Yes', 'Y', 'TRUE', 'true', 'True', '1', 'yes') THEN 'true' WHEN is_active IN ('No', 'N', 'FALSE', 'false', 'False', '0', 'no') THEN 'false' ELSE is_active END
That ELSE is_active END at the end is a safety net.
If there's a value you didn't account for, it stays as-is rather than silently becoming
NULL. You'll spot it when you check the results.
Bulk Replace groups - all truthy variations map to "true", all falsy to "false":
| Source values (is_active) | Target |
|---|---|
| Yes, Y, TRUE, true, True, 1, yes | true |
| No, N, FALSE, false, False, 0, no | false |
After applying, all rows show either "true" or "false" as a string. Convert Type (BOOLEAN) in the next step makes them real booleans.
Result after Bulk Replace:
| user_id | is_active (before) | is_active (after) |
|---|---|---|
| 1001 | Yes | true |
| 1002 | Y | true |
| 1003 | true | true |
| 1004 | No | false |
| 1005 | 1 | true |
| 1006 | FALSE | false |
| 1007 | 0 | false |
Step 2: Convert to actual BOOLEAN type
After the Bulk Replace, every row in
is_active says either "true" or
"false". But it's still a text column. The values are strings, not booleans.
Select Convert Type from the Transform group. Choose the
is_active column and set the target type to
BOOLEAN.
The SQL:
TRY_CAST(is_active AS BOOLEAN)
TRY_CAST is important here. If there's a stray
value that didn't get caught by the Bulk Replace - say someone typed "active" instead of
"true" - TRY_CAST returns NULL for that row instead
of throwing an error. A regular CAST would fail the
entire operation.
Convert Type step - column type changes from VARCHAR to BOOLEAN:
| Column | Type before | Type after | SQL |
|---|---|---|---|
| is_active | VARCHAR | BOOLEAN | TRY_CAST(is_active AS BOOLEAN) |
| email_verified | VARCHAR | BOOLEAN | TRY_CAST(email_verified AS BOOLEAN) |
TRY_CAST returns NULL for any value that can't convert rather than crashing the query.
Now it works like a boolean
With a real BOOLEAN column, everything just works:
- Filtering - filter where
is_active = truecatches every active user, not just the ones who happened to have "TRUE" in uppercase. - Counting - group by
is_activeand you get two rows: true and false. Not seven. - Logic - you can use it in conditions with other boolean columns, like finding users where
is_active = true AND email_verified = false.
Handling multiple boolean columns
In the sample data, email_verified has the same
problem. Repeat the Bulk Replace and Convert Type steps for that column. If you have five
or six boolean columns in your dataset - which is common in user exports - you'll do this
for each one.
Each pass adds two pipeline steps. For three boolean columns, that's six steps total. It takes about a minute. The alternative - writing the SQL by hand, testing it, handling edge cases - takes significantly longer.
Watch for edge cases
A few things to check after converting:
- NULLs. If some rows had no value to begin with, they'll remain NULL after conversion. That's correct - a missing value isn't true or false, it's unknown.
- Unexpected values. Check Column Explorer after the Bulk Replace. If you see any value besides "true" and "false", you missed a variant. Add it to the appropriate group.
- Numeric booleans. "1" and "0" are common but so are "2" or "-1" in some systems. Decide if "2" means true or if it's an error.
Column Explorer for is_active after cleanup - exactly two values, clean and ready for analysis:
| value | count | % of total |
|---|---|---|
| true | 8,342 | 83.4% |
| false | 1,541 | 15.4% |
| NULL | 117 | 1.2% |
Before: 7 distinct values for a yes/no field. After: 2 values plus NULL. Filters and group-bys now work correctly.