Blog

Guides and tutorials for exploring data in your browser.

Cleaning

Fixing messy, inconsistent, or broken data.

5 min read

A Complete Guide to Handling NULL and Empty Values

Understand the difference between NULL, empty strings, "N/A", and "null" and handle each one consistently.

5 min read

How to Standardize Inconsistent Company Names in a CSV

Fix variations like "IBM", "I.B.M.", and "International Business Machines" using Find & Replace and Bulk Replace.

4 min read

How to Remove Test Data and Junk Rows from a CSV

Filter out TEST rows, placeholder emails, and zero-value entries using the condition builder.

4 min read

Converting "$1,234.56" Strings to Actual Numbers

Strip dollar signs and commas, then convert to a numeric type you can sum and filter.

5 min read

Fixing Dates When Your CSV Has Three Different Formats

Normalize MM/DD/YYYY, Jan 15 2024, and ISO dates into one consistent date type.

4 min read

Converting Yes/No/True/False/1/0 to a Consistent Boolean

Group all truthy and falsy variations into a consistent boolean column.

5 min read

Splitting a Full Address into Street, City, State, Zip

Break apart single-column addresses into separate fields for geographic analysis.

5 min read

Fixing Messy Phone Number Formats in Your Data

Standardize phone numbers from 8+ formats into one consistent format.

5 min read

Finding Near-Duplicate Rows in Your Data

Normalize text, then deduplicate records that differ only by case, spacing, or abbreviations.

4 min read

Extracting Email Domains to Identify Company Accounts

Pull the domain from email addresses to segment users by company.

4 min read

Filling Missing Category Labels Using Forward Fill

Fix report data where category names only appear on the first row of each group.

4 min read

Removing Unnecessary Columns from a Large Export

Trim a bloated 80-column export down to just the columns you need.

Exploring

Understanding your data, finding patterns, and filtering.

5 min read

What to Do First When You Open a New CSV File

A systematic approach to understanding an unfamiliar data file in under 5 minutes.

5 min read

How to Find Outliers in Numeric Data Without Code

Spot suspicious values using histograms, top/bottom filters, and summary statistics.

4 min read

Understanding Category Distributions in Your Data

See how many records fall into each category and find dominant groups or misspelled values.

5 min read

Checking Date Ranges and Gaps in Time Series Data

Verify your data covers the expected date range and find missing weeks or months.

5 min read

Finding Patterns in Missing Data

Discover whether missing values are random or systematic - by date, category, or source.

5 min read

Comparing Two Versions of the Same Data File

Find new rows, deleted rows, and changed values between two exports using a FULL JOIN.

4 min read

Quick Top-10 Analysis: Finding the Biggest Items

Find top customers, most common categories, or highest values in 30 seconds.

5 min read

Opening and Exploring JSON Files as Tables

Load JSON or JSONL files, flatten nested objects, and browse API responses visually.

5 min read

Spotting Columns Where the Data Type is Wrong

Identify numbers stored as text, dates as strings, and other type mismatches.

4 min read

Profiling an Excel Export to Understand Data Quality

Assess the completeness and consistency of an .xlsx file before using it.

4 min read

How to Count Unique Values Per Column

Find how many distinct customers, products, or categories exist in each column.

4 min read

Working with Large Files: Sampling and Limiting Rows

Explore multi-million row files efficiently by limiting, filtering, or sampling.

4 min read

Creating Charts and Visualizations from CSV Data

Build bar charts, line charts, and more from your data using the built-in Chart Builder.

Transforming

Reshaping, converting, and computing new values.

4 min read

Adding a Calculated Column to Your CSV

Create profit from revenue minus cost, totals from quantity times price, and other computed columns.

5 min read

Calculating Percentages and Ratios Across Columns

Calculate what fraction of total revenue each product or region represents.

4 min read

Adding Running Totals and Cumulative Sums

Add a column that accumulates values over time for revenue tracking or reconciliation.

5 min read

Ranking Rows by Value: Top Salespeople, Best Products

Assign rank numbers with ROW_NUMBER, RANK, and DENSE_RANK, including per-group ranking.

5 min read

Creating Conditional Columns: If-Then Logic Without Code

Add "High", "Medium", "Low" tier columns based on revenue thresholds or custom rules.

5 min read

Extracting Data From Messy Text Columns

Pull product codes, invoice numbers, or IDs out of free-text descriptions.

4 min read

Doing Math Across Columns: Add, Multiply, Round

Calculate tax, apply discounts, convert units, and round results.

4 min read

Remapping Category Codes to Readable Labels

Replace "CAT_01" with "Electronics" and "REG_NE" with "Northeast" using Bulk Replace.

4 min read

Splitting Multi-Value Cells into Separate Rows

Turn "electronics, sale, featured" in one cell into three separate rows.

4 min read

Combining First Name and Last Name into Full Name

Merge two name columns into one for mailing lists or reports.

5 min read

Extracting Fields from JSON Columns into Regular Columns

Parse a JSON metadata column into separate, typed columns you can filter and sort.

4 min read

Copying a Column and Transforming the Copy

Keep the original intact while creating a normalized version alongside it.

4 min read

Calculating Age from a Date of Birth Column

Turn a birthdate column into current age in years using Date Difference.

Advanced

Window functions, joins, SQL, JSON, and AI features.

6 min read

Window Functions Without Writing SQL: A Visual Guide

Understand ROW_NUMBER, RANK, running totals, LAG, and LEAD through visual examples.

5 min read

Building Complex Filters with AND, OR, and NOT

Combine multi-condition filters with nesting, type-specific operators, and keep/remove modes.

5 min read

Group By and Aggregate Without Writing SQL

Summarize totals by category using Pivot, Window functions, or the SQL editor.

5 min read

Building a Multi-Step Data Pipeline

Chain operations into a reproducible pipeline with editable, reorderable steps.

5 min read

Using AI to Generate SQL Queries for Your Data

Describe what you want in English and get DuckDB SQL back from Claude.

6 min read

DuckDB SQL Cheatsheet for CSV Analysis

Quick reference for aggregations, string functions, dates, window functions, and more.

5 min read

Analyzing Data Across Multiple Files

Load multiple CSVs, join them, and build cross-file analysis pipelines.

4 min read

Converting Between CSV, Parquet, JSON, and Excel

Open any format and export as another. Clean your data before converting.

4 min read

Finding the Nth Largest or Smallest Values

Find the 3rd highest salary, 5th lowest price, or any Nth extreme value.

5 min read

Using Regex to Extract Patterns from Text

Pull zip codes, invoice numbers, or any pattern using regular expressions.

4 min read

Using Multiple Views to Compare Transformations

Create parallel analysis branches without losing your current pipeline.

5 min read

Grouping and Aggregating Data: A Practical Guide

Use the Group & Aggregate operation to compute SUM, AVG, COUNT, and more across grouped rows.

5 min read

Extracting Tables from PDF Files in Your Browser

How PDF table extraction works, what types of PDFs produce good results, and tips for cleanup.

4 min read

Working with XML Data: Import, Explore, and Convert

Import XML files, flatten nested elements into columns, and export as CSV or JSON.

3 min read

Exporting Data as a Self-Contained HTML Table

Generate a standalone HTML file with your data table. Opens in any browser, no dependencies.

Workflows

End-to-end scenarios chaining multiple operations.

5 min read

Building a Monthly Sales Report from Raw Transactions

Turn flat order data into a summary by month, product, and region.

5 min read

Segmenting Customers by Purchase Behavior

Create customer tiers based on total spend, frequency, and recency.

5 min read

Cleaning and Normalizing a CRM Export

Fix messy names, remove duplicates, fill gaps, and trim a Salesforce or HubSpot export.

5 min read

Merging Survey Responses with Respondent Data

Join survey exports with your customer database to analyze by company or tier.

5 min read

Analyzing Expense Reports for Policy Violations

Flag over-limit expenses, weekend charges, and duplicate submissions.

5 min read

Reconciling Two Inventory Lists

Compare system counts against physical counts to find discrepancies.

5 min read

Calculating Time-to-Resolution from Support Tickets

Measure resolution time and analyze by priority, category, and assignee.

5 min read

Cleaning an Email Marketing List Before a Campaign

Remove duplicates, fix formatting, filter fake addresses, and segment by domain.

5 min read

Tracking Customer Cohorts by First Purchase Month

Build a cohort retention table to track how many customers return each month.

5 min read

Preparing Data for Migration to a New System

Reformat, rename, reorder, and type-convert columns to match a target schema.

5 min read

Parsing and Analyzing Server Log Files

Extract timestamps, log levels, and service names from raw log entries.

5 min read

Comparing Year-To-Date Financial Data Across Years

Find which products or regions grew by comparing this year vs. last year.

4 min read

Summarizing Sales Data by Region and Product Category

Import, filter, group, and aggregate order data into a regional summary. Export to Excel.

4 min read

Converting PDF Invoices to a Clean Spreadsheet

Extract line items from PDF invoices, clean up the data, and export as Excel.