Blog
Guides and tutorials for exploring data in your browser.
Featured
How to Analyze a CSV File Without Uploading It Anywhere
Learn how to open, filter, sort, and transform CSV files entirely in your browser. No server uploads, no accounts, no data leaving your device.
Parquet vs CSV: When to Use Which Format
A practical comparison of Apache Parquet and CSV formats. When columnar storage wins, when plain text is enough, and how to convert between them.
How to Run SQL Queries on CSV Files in Your Browser
Use DuckDB WASM to write SQL queries against CSV, Parquet, and JSON files directly in your browser. No database setup required.
5 Ways to Clean Messy Data Without Writing Code
Fix common data quality issues like missing values, inconsistent text, duplicates, and wrong types using ExploreMyData's visual operations.
How to Create Pivot Tables in Your Browser
Create pivot tables from CSV, JSON, or Excel files without Excel or Google Sheets. A step-by-step guide using ExploreMyData's pivot operation.
How to Join Multiple CSV Files Without Coding
Combine data from two or more CSV files using inner, left, right, or full outer joins. All in your browser with ExploreMyData.
Working with Dates: Extract, Calculate, and Transform
Extract year, month, and day from dates. Calculate differences between dates. Add intervals. A complete guide to date operations in ExploreMyData.
Cleaning
Fixing messy, inconsistent, or broken data.
A Complete Guide to Handling NULL and Empty Values
Understand the difference between NULL, empty strings, "N/A", and "null" and handle each one consistently.
How to Standardize Inconsistent Company Names in a CSV
Fix variations like "IBM", "I.B.M.", and "International Business Machines" using Find & Replace and Bulk Replace.
How to Remove Test Data and Junk Rows from a CSV
Filter out TEST rows, placeholder emails, and zero-value entries using the condition builder.
Converting "$1,234.56" Strings to Actual Numbers
Strip dollar signs and commas, then convert to a numeric type you can sum and filter.
Fixing Dates When Your CSV Has Three Different Formats
Normalize MM/DD/YYYY, Jan 15 2024, and ISO dates into one consistent date type.
Converting Yes/No/True/False/1/0 to a Consistent Boolean
Group all truthy and falsy variations into a consistent boolean column.
Splitting a Full Address into Street, City, State, Zip
Break apart single-column addresses into separate fields for geographic analysis.
Fixing Messy Phone Number Formats in Your Data
Standardize phone numbers from 8+ formats into one consistent format.
Finding Near-Duplicate Rows in Your Data
Normalize text, then deduplicate records that differ only by case, spacing, or abbreviations.
Extracting Email Domains to Identify Company Accounts
Pull the domain from email addresses to segment users by company.
Filling Missing Category Labels Using Forward Fill
Fix report data where category names only appear on the first row of each group.
Removing Unnecessary Columns from a Large Export
Trim a bloated 80-column export down to just the columns you need.
Exploring
Understanding your data, finding patterns, and filtering.
What to Do First When You Open a New CSV File
A systematic approach to understanding an unfamiliar data file in under 5 minutes.
How to Find Outliers in Numeric Data Without Code
Spot suspicious values using histograms, top/bottom filters, and summary statistics.
Understanding Category Distributions in Your Data
See how many records fall into each category and find dominant groups or misspelled values.
Checking Date Ranges and Gaps in Time Series Data
Verify your data covers the expected date range and find missing weeks or months.
Finding Patterns in Missing Data
Discover whether missing values are random or systematic - by date, category, or source.
Comparing Two Versions of the Same Data File
Find new rows, deleted rows, and changed values between two exports using a FULL JOIN.
Quick Top-10 Analysis: Finding the Biggest Items
Find top customers, most common categories, or highest values in 30 seconds.
Opening and Exploring JSON Files as Tables
Load JSON or JSONL files, flatten nested objects, and browse API responses visually.
Spotting Columns Where the Data Type is Wrong
Identify numbers stored as text, dates as strings, and other type mismatches.
Profiling an Excel Export to Understand Data Quality
Assess the completeness and consistency of an .xlsx file before using it.
How to Count Unique Values Per Column
Find how many distinct customers, products, or categories exist in each column.
Working with Large Files: Sampling and Limiting Rows
Explore multi-million row files efficiently by limiting, filtering, or sampling.
Creating Charts and Visualizations from CSV Data
Build bar charts, line charts, and more from your data using the built-in Chart Builder.
Transforming
Reshaping, converting, and computing new values.
Adding a Calculated Column to Your CSV
Create profit from revenue minus cost, totals from quantity times price, and other computed columns.
Calculating Percentages and Ratios Across Columns
Calculate what fraction of total revenue each product or region represents.
Adding Running Totals and Cumulative Sums
Add a column that accumulates values over time for revenue tracking or reconciliation.
Ranking Rows by Value: Top Salespeople, Best Products
Assign rank numbers with ROW_NUMBER, RANK, and DENSE_RANK, including per-group ranking.
Creating Conditional Columns: If-Then Logic Without Code
Add "High", "Medium", "Low" tier columns based on revenue thresholds or custom rules.
Extracting Data From Messy Text Columns
Pull product codes, invoice numbers, or IDs out of free-text descriptions.
Doing Math Across Columns: Add, Multiply, Round
Calculate tax, apply discounts, convert units, and round results.
Remapping Category Codes to Readable Labels
Replace "CAT_01" with "Electronics" and "REG_NE" with "Northeast" using Bulk Replace.
Splitting Multi-Value Cells into Separate Rows
Turn "electronics, sale, featured" in one cell into three separate rows.
Combining First Name and Last Name into Full Name
Merge two name columns into one for mailing lists or reports.
Extracting Fields from JSON Columns into Regular Columns
Parse a JSON metadata column into separate, typed columns you can filter and sort.
Copying a Column and Transforming the Copy
Keep the original intact while creating a normalized version alongside it.
Calculating Age from a Date of Birth Column
Turn a birthdate column into current age in years using Date Difference.
Advanced
Window functions, joins, SQL, JSON, and AI features.
Window Functions Without Writing SQL: A Visual Guide
Understand ROW_NUMBER, RANK, running totals, LAG, and LEAD through visual examples.
Building Complex Filters with AND, OR, and NOT
Combine multi-condition filters with nesting, type-specific operators, and keep/remove modes.
Group By and Aggregate Without Writing SQL
Summarize totals by category using Pivot, Window functions, or the SQL editor.
Building a Multi-Step Data Pipeline
Chain operations into a reproducible pipeline with editable, reorderable steps.
Using AI to Generate SQL Queries for Your Data
Describe what you want in English and get DuckDB SQL back from Claude.
DuckDB SQL Cheatsheet for CSV Analysis
Quick reference for aggregations, string functions, dates, window functions, and more.
Analyzing Data Across Multiple Files
Load multiple CSVs, join them, and build cross-file analysis pipelines.
Converting Between CSV, Parquet, JSON, and Excel
Open any format and export as another. Clean your data before converting.
Finding the Nth Largest or Smallest Values
Find the 3rd highest salary, 5th lowest price, or any Nth extreme value.
Using Regex to Extract Patterns from Text
Pull zip codes, invoice numbers, or any pattern using regular expressions.
Using Multiple Views to Compare Transformations
Create parallel analysis branches without losing your current pipeline.
Grouping and Aggregating Data: A Practical Guide
Use the Group & Aggregate operation to compute SUM, AVG, COUNT, and more across grouped rows.
Extracting Tables from PDF Files in Your Browser
How PDF table extraction works, what types of PDFs produce good results, and tips for cleanup.
Working with XML Data: Import, Explore, and Convert
Import XML files, flatten nested elements into columns, and export as CSV or JSON.
Exporting Data as a Self-Contained HTML Table
Generate a standalone HTML file with your data table. Opens in any browser, no dependencies.
Workflows
End-to-end scenarios chaining multiple operations.
Building a Monthly Sales Report from Raw Transactions
Turn flat order data into a summary by month, product, and region.
Segmenting Customers by Purchase Behavior
Create customer tiers based on total spend, frequency, and recency.
Cleaning and Normalizing a CRM Export
Fix messy names, remove duplicates, fill gaps, and trim a Salesforce or HubSpot export.
Merging Survey Responses with Respondent Data
Join survey exports with your customer database to analyze by company or tier.
Analyzing Expense Reports for Policy Violations
Flag over-limit expenses, weekend charges, and duplicate submissions.
Reconciling Two Inventory Lists
Compare system counts against physical counts to find discrepancies.
Calculating Time-to-Resolution from Support Tickets
Measure resolution time and analyze by priority, category, and assignee.
Cleaning an Email Marketing List Before a Campaign
Remove duplicates, fix formatting, filter fake addresses, and segment by domain.
Tracking Customer Cohorts by First Purchase Month
Build a cohort retention table to track how many customers return each month.
Preparing Data for Migration to a New System
Reformat, rename, reorder, and type-convert columns to match a target schema.
Parsing and Analyzing Server Log Files
Extract timestamps, log levels, and service names from raw log entries.
Comparing Year-To-Date Financial Data Across Years
Find which products or regions grew by comparing this year vs. last year.
Summarizing Sales Data by Region and Product Category
Import, filter, group, and aggregate order data into a regional summary. Export to Excel.
Converting PDF Invoices to a Clean Spreadsheet
Extract line items from PDF invoices, clean up the data, and export as Excel.