Opening and Exploring JSON Files as Tables
You just downloaded an API response. Maybe it's a list of users from Stripe, events from Mixpanel, or records from your company's internal API. The file is JSON - an array of objects, some with nested fields, some with arrays inside arrays. You open it in a text editor and see 4,000 lines of curly braces.
What you actually want is a table. Rows and columns. Something you can scan visually, sort, filter, and make sense of without parsing nested structures in your head. That's what this guide covers.
Loading JSON and JSONL files
Open ExploreMyData and drag your JSON file onto the page (or click to browse). The tool handles two formats:
- JSON: A single array of objects, like [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
- JSONL (JSON Lines): One JSON object per line, no wrapping array. Common in log files and streaming exports.
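The difference between the two formats can be sketched with Python's standard json module (a conceptual illustration, not the tool's internals):

```python
import json

def load_rows(text: str) -> list[dict]:
    """Parse either a JSON array of objects or JSONL (one object per line)."""
    stripped = text.strip()
    if stripped.startswith("["):
        # JSON: one top-level array of objects
        return json.loads(stripped)
    # JSONL: each non-empty line is its own JSON object
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]

json_text = '[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]'
jsonl_text = '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}'

# Both formats parse to the same list of row objects
assert load_rows(json_text) == load_rows(jsonl_text)
```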
DuckDB reads both formats natively. Each top-level key in the objects becomes a column, and each object becomes a row. If your file has 500 objects with keys "id", "name", "email", and "address", you get a table with 500 rows and 4 columns.
| id | name | email | tags | address |
|---|---|---|---|---|
| BIGINT | VARCHAR | VARCHAR | VARCHAR[] | STRUCT |
| 1 | Alice Chen | alice@example.com | ["premium","enterprise"] | {"street":"123 Main","city":"Portland","state":"OR"} |
| 2 | Bob Lund | bob@example.com | ["starter"] | {"street":"40 Oak Ave","city":"Seattle","state":"WA"} |
| 3 | Priya Shah | priya@example.com | ["premium","early-adopter"] | {"street":"7 Pine St","city":"Austin","state":"TX"} |
JSON loaded as a table. Top-level keys become columns. The tags column is a VARCHAR[] (array) and address is a STRUCT - both need further processing.
The nesting problem
The table loads, but some columns look wrong. Instead of plain values, you see things like {"street": "123 Main", "city": "Portland", "state": "OR"} in the address column. That's because JSON objects can nest, and DuckDB preserves nested structures as STRUCT or JSON types.
This is technically correct but not useful when you want to filter by city or sort by state. You need to flatten the nested fields into their own columns.
Flattening nested objects with JSON Extract
Use the JSON Extract operation in dictionary mode. Select the nested column (e.g., "address") and choose the keys you want to extract. Each key becomes a new column.
For example, extracting "street", "city", and "state" from the address column gives you three new columns: address_street, address_city, and address_state.
After JSON Extract (dictionary mode) on the address column, extracting street, city, state:
| id | name | address_street | address_city | address_state |
|---|---|---|---|---|
| 1 | Alice Chen | 123 Main St | Portland | OR |
| 2 | Bob Lund | 40 Oak Ave | Seattle | WA |
| 3 | Priya Shah | 7 Pine St | Austin | TX |
The nested STRUCT is flattened into three plain columns. Now you can filter by city, group by state, or sort by street - just like any other column.
Under the hood, this generates SQL like json_extract_string(address, '$.city') for each key. You don't need to write this - just pick the keys from the list.
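Conceptually, dictionary-mode extraction pulls named keys out of each nested object into new prefixed columns. A rough Python sketch, assuming the column_key naming scheme shown above (this models the behavior, not the tool's actual implementation):

```python
def extract_keys(rows: list[dict], column: str, keys: list[str]) -> list[dict]:
    """Flatten selected keys of a nested-object column into new top-level columns."""
    out = []
    for row in rows:
        # Keep every column except the nested one being flattened
        flat = {k: v for k, v in row.items() if k != column}
        nested = row.get(column) or {}
        for key in keys:
            # Missing keys become None (NULL in the table view)
            flat[f"{column}_{key}"] = nested.get(key)
        out.append(flat)
    return out

rows = [{"id": 1, "address": {"street": "123 Main", "city": "Portland", "state": "OR"}}]
print(extract_keys(rows, "address", ["street", "city", "state"]))
# [{'id': 1, 'address_street': '123 Main', 'address_city': 'Portland', 'address_state': 'OR'}]
```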
Expanding arrays
Some JSON fields contain arrays. A user might have a "tags" field like ["premium", "enterprise", "early-adopter"], or an "orders" field that's a list of order objects.
Use JSON Extract in array mode for these columns. This unnests the array, creating one row per array element. A user with 3 tags becomes 3 rows (one per tag), with all the other columns duplicated.
This is the JSON equivalent of a SQL UNNEST().
It's the right approach when you need to analyze the individual elements - count how many users
have each tag, or look at order-level details.
Be careful with large arrays. If each row has 50 array elements and you have 1,000 rows, unnesting gives you 50,000 rows. That's usually fine for analysis, but it's worth knowing the row count will jump.
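The row-multiplying behavior is easy to see in a small sketch (assumed semantics: one output row per array element, all other columns copied; rows with empty arrays drop out, matching default UNNEST behavior):

```python
def unnest(rows: list[dict], column: str) -> list[dict]:
    """Expand an array column into one row per element, duplicating other columns."""
    out = []
    for row in rows:
        for element in row.get(column) or []:
            new_row = dict(row)
            new_row[column] = element  # array replaced by a single element
            out.append(new_row)
    return out

rows = [{"id": 1, "tags": ["premium", "enterprise", "early-adopter"]}]
print(unnest(rows, "tags"))
# One input row with 3 tags -> 3 output rows:
# [{'id': 1, 'tags': 'premium'}, {'id': 1, 'tags': 'enterprise'}, {'id': 1, 'tags': 'early-adopter'}]
```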
Exploring the structure with Column Explorer
After loading a JSON file, open the Column Explorer to understand what you're working with. It shows you:
- The data type of each column (VARCHAR, INTEGER, STRUCT, JSON, etc.)
- How many NULL values exist (missing keys in the source JSON)
- Value distributions for categorical columns
- Min/max for numeric columns
Columns with type STRUCT or JSON are the ones that need flattening. Columns with type VARCHAR[] or JSON[] are arrays that might need expanding. The Column Explorer helps you identify which columns need further processing before you can analyze them.
| Column | Type | Nulls | Action needed |
|---|---|---|---|
| id | BIGINT | 0 | None |
| name | VARCHAR | 0 | None |
| email | VARCHAR | 12 | None |
| tags | VARCHAR[] | 8 | JSON Extract (array mode) |
| address | STRUCT | 3 | JSON Extract (dictionary mode) |
Column Explorer view. STRUCT and VARCHAR[] types flag the columns that need flattening or expansion before analysis.
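The NULL counts in that view include missing keys, not just explicit nulls. A minimal sketch of that counting over a list of row dicts (an illustration of the idea, not the tool's code):

```python
def null_counts(rows: list[dict]) -> dict[str, int]:
    """Count NULLs per column; a missing key counts the same as an explicit null."""
    columns = {key for row in rows for key in row}
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

rows = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": None},   # explicit null
    {"id": 3},                  # email key missing entirely
]
print(null_counts(rows))
# {'id': 0, 'email': 2} (key order may vary)
```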
Trimming to the columns you need
API responses tend to include everything. A Stripe customer object has dozens of fields, but you probably only care about five or six. After loading and flattening, use Select Columns to keep only what you need.
Pick the columns you want, in the order you want them. Everything else is dropped from the view. This makes the table manageable and focused on the data you actually need to analyze.
The pipeline preserves the full data - Select Columns just filters the output. If you realize later you need another field, delete the Select Columns step and the original columns reappear.
Common JSON exploration workflows
Here are a few patterns that come up regularly:
- API response audit: Load the JSON, check Column Explorer for structure, extract nested fields, Select Columns to focus on what matters. Good for understanding what an API actually returns vs. what the docs say.
- Log file analysis: Load a JSONL log file, filter by timestamp range or log level, group by error type. Each log entry becomes a row.
- Config comparison: Load two JSON config files as tables, join on the key name, compare values side by side.
- Nested data flattening: Extract all nested objects, drop the original nested columns, export as CSV. Useful when downstream tools can't handle JSON nesting.
JSON vs. CSV: when each makes sense
JSON is better at representing complex, hierarchical data. CSV is better for flat, tabular data. When you load JSON into ExploreMyData, you're essentially converting from one to the other - flattening a hierarchy into a table.
This works well when the data is mostly tabular with a few nested fields. It works less well when the structure is deeply nested or highly variable (different objects have completely different keys). For those cases, you might need to extract specific paths rather than trying to flatten everything.
The point is to get from "I have a JSON file" to "I can see and explore this data" as fast as possible. Load, flatten what needs flattening, trim to the columns you need, and start asking questions.