
A Complete Guide to Handling NULL and Empty Values

You filter a column for missing values and get 200 results. You filter for empty strings and get 47 more. You search for "N/A" and find another 83. Then there's "n/a", "null", "None", and a handful of dashes. They all mean the same thing - "this data is missing" - but your database treats every single one as a different value.

This is one of the most common problems in real-world data, and it's frustrating because it's invisible. The cells aren't obviously wrong. They just quietly break your counts, your filters, and your group-bys.

Here's how to find every flavor of "missing" in your data and get them all consistent using ExploreMyData.

The five types of "missing"

Before you fix anything, you need to understand what you're dealing with. In a typical messy dataset, "missing" can look like:

| What you see | What it actually is | SQL test |
|---|---|---|
| (empty cell) | NULL - the cell has no value | col IS NULL |
| (looks empty) | Empty string - the cell has a value, it's just "" | col = '' |
| N/A | The text "N/A" | col = 'N/A' |
| null | The text "null" (not actual NULL) | col = 'null' |
| - | A dash used as a placeholder | col = '-' |

The distinction between NULL and an empty string trips up even experienced analysts. NULL means "no value was provided." An empty string means "a value was provided, and it's nothing." Most of the time you want to treat them the same way. But WHERE col IS NULL won't catch empty strings, and WHERE col = '' won't catch NULLs.
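A minimal sketch of this gotcha, using Python's built-in sqlite3 as a stand-in SQL engine (the table and column names here are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (region TEXT)")
# One actual NULL, one empty string, one real value.
con.executemany("INSERT INTO t VALUES (?)", [(None,), ("",), ("West",)])

nulls = con.execute("SELECT COUNT(*) FROM t WHERE region IS NULL").fetchone()[0]
empties = con.execute("SELECT COUNT(*) FROM t WHERE region = ''").fetchone()[0]
both = con.execute(
    "SELECT COUNT(*) FROM t WHERE region IS NULL OR region = ''").fetchone()[0]

# Each filter alone finds one row; only the combined condition finds both.
print(nulls, empties, both)  # 1 1 2
```

Neither filter is wrong - they just match disjoint sets of rows, which is exactly why a single IS NULL check undercounts missing data.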

Step 1: Find the missing values

Start by figuring out what you're dealing with. Use Filter to check:

  1. Filter the column with IS NULL to see actual NULLs.
  2. Filter with is Empty to catch empty strings.
  3. Filter with equals and type "N/A" to find that variant. Repeat for "n/a", "null", "None", "-", and any other placeholders your data source uses.

Or, faster: click the column header to open Column Explorer. It shows the distribution of values including a count of NULLs. If you see "N/A" or "null" in the value list, you know what needs fixing.
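In SQL terms, the audit Column Explorer performs is roughly a GROUP BY over the column, so every flavor of "missing" surfaces as its own row. A hedged sketch with sqlite3 (the sales data below is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT)")
con.executemany("INSERT INTO sales VALUES (?)", [
    ("West",), ("West",), (None,), ("N/A",), ("n/a",), ("-",), ("null",), ("",),
])

# NULLs group together in GROUP BY; label them so they show up in the output.
rows = con.execute("""
    SELECT COALESCE(region, '(NULL)') AS value, COUNT(*) AS n
    FROM sales
    GROUP BY region
    ORDER BY n DESC
""").fetchall()
for value, n in rows:
    print(value, n)
```

Every placeholder - "N/A", "n/a", "-", "null", the empty string, and the real NULL - gets its own count, which is the full inventory you need before fixing anything.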

Column Explorer on the region column - five different representations of "missing", all treated as distinct values:

| value | count | what it actually means |
|---|---|---|
| Northeast | 4,102 | Real value |
| West | 3,847 | Real value |
| Midwest | 2,910 | Real value |
| NULL | 200 | Actual NULL |
| N/A | 83 | Missing (text) |
| n/a | 41 | Missing (text) |
| - | 19 | Missing (placeholder) |
| null | 12 | Missing (text "null") |

355 rows represent missing data, but they're spread across 5 different values. A filter for IS NULL only catches 200 of them.

Step 2: Normalize placeholder text to NULL

The goal is to collapse all the different representations of "missing" into one consistent thing. Usually that means converting them all to actual NULL, because NULL is what SQL functions like COALESCE, COUNT, and IS NULL understand.

Use Find & Replace to convert each placeholder. Select the column, enter the find value (e.g., "N/A"), and leave the replacement empty. When the replacement is empty, ExploreMyData replaces the value with NULL.

Do this for each variant: "N/A", "n/a", "null", "None", "-". Each one becomes a pipeline step. Under the hood, the SQL looks like:

CASE WHEN col = 'N/A' THEN NULL ELSE col END

For empty strings, use Update Values with a condition:

CASE WHEN col = '' THEN NULL ELSE col END
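All of those per-variant steps amount, in SQL terms, to mapping a list of placeholders to NULL. A sketch of that one-pass equivalent with sqlite3 (the table and placeholder list are illustrative, not ExploreMyData's internals):

```python
import sqlite3

PLACEHOLDERS = ("N/A", "n/a", "null", "None", "-", "")

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (region TEXT)")
con.executemany("INSERT INTO t VALUES (?)",
                [("West",), ("N/A",), ("n/a",), ("-",), ("null",), ("",), (None,)])

# One UPDATE replaces every placeholder with a real NULL.
marks = ",".join("?" * len(PLACEHOLDERS))
con.execute(f"UPDATE t SET region = NULL WHERE region IN ({marks})", PLACEHOLDERS)

missing = con.execute("SELECT COUNT(*) FROM t WHERE region IS NULL").fetchone()[0]
print(missing)  # 6 - the five placeholders plus the original NULL
```

Keeping the placeholders in one list also documents, in a single place, exactly what your data source uses to mean "missing".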

Five Find & Replace steps convert all placeholder text to actual NULL - one step per variant:

| Step | Find | Replace with | Rows affected |
|---|---|---|---|
| 1 | N/A | NULL | 83 |
| 2 | n/a | NULL | 41 |
| 3 | - | NULL | 19 |
| 4 | null | NULL | 12 |
| 5 | (empty string) | NULL | 47 |

After all steps, every representation of "missing" is an actual NULL - including the region column's 355 missing rows. Now IS NULL catches every missing value in one condition.

Step 3: Decide what to do with NULLs

Now all your missing values are actual NULLs. But you still need to decide: keep them as NULL, or fill them with something? It depends on the column and what you're doing with it.

Option A: Leave them as NULL. This is often the right call. NULLs are excluded from AVG(), SUM(), and COUNT(col) automatically. If the data is genuinely missing, it's honest to keep it that way.
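Option A in action, sketched with sqlite3 (the orders table and numbers are invented): aggregate functions skip NULLs on their own, so a genuinely missing discount doesn't drag the average toward zero.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (discount REAL)")
con.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (None,), (20.0,)])

# COUNT(*) counts rows; COUNT(discount) and AVG(discount) skip the NULL.
total_rows, counted, avg = con.execute(
    "SELECT COUNT(*), COUNT(discount), AVG(discount) FROM orders").fetchone()
print(total_rows, counted, avg)  # 3 2 15.0
```

Had the missing value been a placeholder 0 instead of NULL, the average would have been 10.0 - a silently wrong answer.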

Option B: Fill with a default. Use Fill Missing and pick "literal" mode. For a region column, maybe you fill with "Unknown". For a numeric discount column, maybe you fill with 0. The SQL uses COALESCE(col, 'Unknown').

Option C: Fill from adjacent rows. For time-series data where a value carries forward (like a subscription tier that doesn't change every month), use forward fill. The previous non-NULL value fills in the gaps.
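Options B and C can be sketched with sqlite3 (the subscription data is invented; the forward fill is done in a Python loop here, since SQLite lacks IGNORE NULLS window support - it stands in for whatever ExploreMyData does internally):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE subs (month INTEGER, tier TEXT)")
con.executemany("INSERT INTO subs VALUES (?, ?)",
                [(1, "pro"), (2, None), (3, None), (4, "basic"), (5, None)])

# Option B: fill with a literal default at query time.
filled = [r[0] for r in con.execute(
    "SELECT COALESCE(tier, 'Unknown') FROM subs ORDER BY month")]

# Option C: forward fill - carry the last non-NULL value down through gaps.
ffill, last = [], None
for (tier,) in con.execute("SELECT tier FROM subs ORDER BY month"):
    last = tier if tier is not None else last
    ffill.append(last)

print(filled)  # ['pro', 'Unknown', 'Unknown', 'basic', 'Unknown']
print(ffill)   # ['pro', 'pro', 'pro', 'basic', 'basic']
```

Notice how the two options answer different questions: "Unknown" marks the gaps, while forward fill asserts the old value still applies - only the second is right for a subscription tier.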

Conditional cleanup with Update Values

Sometimes the logic is more nuanced. Maybe you want to fill NULLs in the state column with "Unknown", but only when the country column is "US". For international rows, you want to leave them NULL.

Use Update Values for this. It generates a CASE WHEN expression:

CASE WHEN state IS NULL AND country = 'US' THEN 'Unknown' ELSE state END

This gives you precise control without having to write the SQL yourself.
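Here's that generated expression run as an UPDATE, again with sqlite3 as the stand-in engine (the addresses are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE addr (country TEXT, state TEXT)")
con.executemany("INSERT INTO addr VALUES (?, ?)",
                [("US", "CA"), ("US", None), ("FR", None)])

# Fill NULL states with 'Unknown', but only for US rows.
con.execute("""
    UPDATE addr SET state = CASE
        WHEN state IS NULL AND country = 'US' THEN 'Unknown'
        ELSE state
    END
""")

rows = con.execute("SELECT country, state FROM addr ORDER BY country, state").fetchall()
print(rows)  # [('FR', None), ('US', 'CA'), ('US', 'Unknown')]
```

The French row keeps its NULL - exactly the precision a blanket Fill Missing step can't give you.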

The order matters

A good cleanup sequence for null handling:

  1. Normalize first. Convert "N/A", "n/a", "null", "None", "", "-" to actual NULL. Now you have one type of missing value, not six.
  2. Then decide per column. Some columns should stay NULL. Some should get a default. Some should be forward-filled. Handle each one based on what makes sense for that data.
  3. Verify. Open Column Explorer again. The NULL count should match what you expect. The phantom values like "N/A" should be gone from the value distribution.

Column Explorer after normalization - only real values and proper NULLs remain:

| value | count | % of total |
|---|---|---|
| Northeast | 4,102 | 36.8% |
| West | 3,847 | 34.5% |
| Midwest | 2,910 | 26.1% |
| NULL | 355 | 3.2% |

No more "N/A", "n/a", "-", or "null" as categorical values. One filter catches all missing data.
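The verification step itself can be sketched as two quick checks with sqlite3 (placeholder list and data are illustrative): no placeholder text should survive, and the NULL count should match what you expect.

```python
import sqlite3

PLACEHOLDERS = ("N/A", "n/a", "null", "None", "-", "")

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (region TEXT)")
# Post-cleanup state: real values and proper NULLs only.
con.executemany("INSERT INTO t VALUES (?)", [("West",), (None,), (None,)])

marks = ",".join("?" * len(PLACEHOLDERS))
leftovers = con.execute(
    f"SELECT COUNT(*) FROM t WHERE region IN ({marks})", PLACEHOLDERS).fetchone()[0]
nulls = con.execute("SELECT COUNT(*) FROM t WHERE region IS NULL").fetchone()[0]

print(leftovers, nulls)  # 0 placeholders remain, 2 proper NULLs
```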

Every step is reversible

All of this lives in your pipeline. If you realize you were too aggressive - maybe "-" actually means something in one particular column - just delete that pipeline step. The data rebuilds without it. You're never making destructive changes to the original file.

Clean up your null values →

Try it yourself

No sign-up, no upload, no tracking.

Open ExploreMyData