How to Run SQL Queries on CSV Files in Your Browser
SQL is the most powerful language for working with tabular data, but traditionally you need a database server to use it. Not anymore. With DuckDB WASM, you can run full SQL queries on CSV files directly in your browser.
How it works
ExploreMyData embeds DuckDB, a high-performance analytical database, compiled to WebAssembly. When you open a CSV file, DuckDB creates an in-memory table from it. You can then query that table using standard SQL syntax.
This isn't a simplified SQL subset. it's full DuckDB SQL with support for:
- SELECT, WHERE, GROUP BY, ORDER BY, HAVING
- JOIN (inner, left, right, full outer) across multiple files
- Window functions (ROW_NUMBER, RANK, LAG, LEAD, running totals)
- CTEs (WITH clauses) for complex multi-step queries
- Aggregate functions (SUM, AVG, MIN, MAX, COUNT, COUNT DISTINCT)
- String functions (UPPER, LOWER, TRIM, REPLACE, SUBSTRING, CONCAT)
- Date functions (EXTRACT, DATEDIFF, date arithmetic)
- Type casting with TRY_CAST for safe conversions
Two ways to use SQL
1. Visual operations (no SQL needed)
Most operations in ExploreMyData have a visual interface. Click "Filter" to add conditions, "Pivot" to create pivot tables, "Join" to combine files. Behind the scenes, each operation generates SQL and adds it to the pipeline. You can click "Show SQL" on any pipeline step to see the generated query.
2. Raw SQL query
For anything the visual operations don't cover, use the SQL operation directly. Select "SQL Query" from the Advanced group in the toolbar, write your query, and apply it. Your query becomes a pipeline step like any other.
Example queries you might write:
-- Top 10 customers by total revenue
SELECT customer_name, SUM(amount) AS total
FROM "sales"
GROUP BY customer_name
ORDER BY total DESC
LIMIT 10
-- Monthly revenue trend
SELECT
EXTRACT(YEAR FROM order_date) AS year,
EXTRACT(MONTH FROM order_date) AS month,
SUM(amount) AS revenue,
COUNT(*) AS orders
FROM "sales"
GROUP BY year, month
ORDER BY year, month
-- Running total with window function
SELECT *,
SUM(amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM "sales"
AI-assisted SQL
If you're not comfortable writing SQL, ExploreMyData can write it for you. In the pipeline sidebar, type a description like "show me the top 5 products by revenue per quarter" and the AI generates the SQL. You can review and edit it before applying.
Only your column names and types are sent to the AI. your actual data stays in your browser.
Works with any format
SQL queries work identically whether your source file is CSV, Parquet, JSON, JSONL, or Excel. DuckDB reads all of them natively. You can even join a CSV file with a Parquet file in a single query.