Arif Aslam

Senior software/data engineer · Bangalore, India · LinkedIn

I'm Arif Aslam, a senior software/data engineer based in Bangalore, India. I've spent the last several years working on data tooling: backend systems, analytics platforms, and the infrastructure that supports them. By day I'm at Mammoth Analytics building a data analytics platform. ExploreMyData is what I work on after hours.

The reason I built it is simple. Every time I needed to take a quick look at a Parquet file or convert a CSV to another format, I'd hit the same wall: upload it to a sketchy online tool, install Python, or fire up Jupyter. None of that works when the data is sensitive or when you just want to glance at it for thirty seconds. DuckDB-WASM changed the math. The same analytical engine that powers serious data pipelines now runs inside a browser tab, and that means real exploration without a server in the loop.

I write here about what I've learned from shipping ExploreMyData and from years of working with data in production. Most posts are practical: a specific problem, the way I solved it, and the tradeoffs to weigh if you're solving the same thing.

What I write about

The posts here cluster around four themes:

  • Cleaning and transforming data: fixing dates that come in five different formats, deduping with fuzzy matches, splitting address fields, fixing boolean columns. The unglamorous half of every real analysis.
  • Exploration without writing SQL: counting unique values, profiling a fresh CSV, finding outliers, comparing two files. Operations that should take a click and usually don't.
  • SQL and DuckDB: window functions, cross-file joins, regex extraction, group-by patterns. DuckDB is the engine; I try to make its sharper edges feel approachable.
  • File formats and conversion: when Parquet is worth it, when it isn't, what breaks in PDF tables, how to handle nested JSON sensibly.

Where to find me

The fastest way to reach me is email at support@exploremydata.com. Otherwise I'm on LinkedIn.

All posts

See the full archive on the blog index. Posts are organised by category (cleaning, exploring, transforming, advanced, workflows) and by recency.