# Converting Between CSV, Parquet, JSON, and Excel
Your data warehouse wants Parquet. Your team has CSVs. Someone emailed you an .xlsx and your ETL script only reads CSV. The API returns JSON and your colleague wants a spreadsheet. Every one of these is a format conversion problem, and none of them should require installing a tool or writing a script.
ExploreMyData reads six formats and exports to two. Open a file, optionally clean it up, and export in the format you need. The whole thing runs in the browser.
## Supported formats
Input (what you can open):
- CSV - comma-separated values, the universal default
- TSV - tab-separated, common in bioinformatics and database exports
- Parquet - columnar binary format, used in data warehouses and analytics pipelines
- JSON - array of objects or nested structures, common from APIs
- JSONL / NDJSON - newline-delimited JSON, one record per line, used in logging and streaming
- Excel (.xlsx, .xls) - spreadsheet files, the format non-technical teams default to
Output (what you can export):
- CSV - maximum compatibility
- Parquet - maximum compression and type preservation
## The conversion process

It's three steps:

1. Open your file in any supported format. Drag and drop or use the file picker.
2. Preview the data. Make sure it loaded correctly and check the column types.
3. Export as CSV or Parquet from the export menu.
That's it. The conversion happens in-browser using DuckDB WASM. No server, no upload, no waiting for a job to finish.
## Export options
The exported file reflects the final state of your pipeline, including all filters, type conversions, and added columns.
## Clean before you convert
The real power is that you can transform the data between open and export. Every pipeline step you apply is included in the exported file. This means you can:
- Open an Excel file, fix column names, remove empty rows, then export as clean CSV
- Open a JSON API response, flatten nested fields, filter to relevant records, then export as Parquet
- Open a messy CSV, fix data types, remove duplicates, then export as a typed Parquet file for your warehouse
This turns a simple format conversion into a lightweight ETL step.
## Why convert to Parquet?
If you're only working with small files and non-technical collaborators, CSV is fine. But Parquet has real advantages when files get bigger or data integrity matters:
- Compression. A 100 MB CSV might become 15-30 MB as Parquet. Columnar storage compresses similar values together extremely well.
- Types are preserved. A Parquet file knows that a column is an integer, a date, or a boolean. CSV stores everything as text, so downstream tools have to guess (and often guess wrong).
- Faster to read. Tools like DuckDB, Pandas, and Spark can read specific columns from a Parquet file without scanning the whole thing. With CSV, you read the entire file even if you only need two columns.
## Why convert to CSV?
CSV is the lowest common denominator. Every tool, language, and spreadsheet app can read it. Convert to CSV when:
- You need to share data with someone who uses Excel or Google Sheets
- A script or legacy system only accepts CSV input
- You want a human-readable file you can open in a text editor
- You're converting from Excel or JSON to something simpler
## Format comparison
| Format | Human readable | Compressed | Typed | Best for |
|---|---|---|---|---|
| CSV / TSV | Yes | No | No | Sharing, compatibility |
| Parquet | No | Yes | Yes | Warehouses, analytics, archival |
| JSON | Yes | No | Partial | APIs, nested data |
| JSONL | Yes | No | Partial | Streaming, logs |
| Excel | Yes (in Excel) | Somewhat | Yes | Business teams, reporting |
## Common conversions
Excel to CSV: Open the .xlsx file. ExploreMyData reads the first sheet. Preview it, verify the columns look right, and export as CSV. Useful when a colleague sends you a spreadsheet and your code expects CSV.
CSV to Parquet: Open the CSV. Check that column types were detected correctly (use Convert Type if needed). Export as Parquet. The result is smaller, faster to query, and preserves types. Ideal before loading into a data warehouse.
JSON to CSV: Open the JSON file. DuckDB automatically flattens a top-level array of objects into rows and columns. If you have nested fields, use the SQL operation or Add Column to extract them with json_extract_string(). Then export as CSV.
Parquet to CSV: Sometimes you need to make a Parquet file human-readable, or hand it to a tool that only reads CSV. Open the Parquet file (DuckDB handles large Parquet files with streaming) and export as CSV.