# Working with XML Data: Import, Explore, and Convert
XML remains a common export format across enterprise systems, SOAP APIs, RSS feeds, and government open data portals. If you have ever tried to open an XML file in a spreadsheet, you know the result is usually a mess of nested tags rather than usable rows and columns.
ExploreMyData can import XML files directly and flatten them into a table you can filter, transform, and export. The entire process runs in your browser. Nothing is uploaded to a server.
## How the import works
When you load an XML file, the parser walks the document tree and converts it into a JSON intermediate representation. That JSON is then loaded into DuckDB as a table. Each repeating element becomes a row, and its child elements become columns. Attributes are preserved as separate columns with an `@` prefix.
For example, an element like `<item id="42">` produces a column called `@id` alongside any child element columns. Nested structures are flattened using dot notation: `author.name`, `author.email`.
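The flattening convention can be sketched in a few lines of Python. This is an illustration of the rules described above, not ExploreMyData's actual implementation; the `flatten` helper and the use of `xml.etree.ElementTree` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Flatten one XML element into a {column: value} row.

    Attributes get an @ prefix; nested children use dot notation.
    """
    row = {}
    for name, value in elem.attrib.items():
        row[prefix + "@" + name] = value
    for child in elem:
        if len(child) == 0 and not child.attrib:
            # Leaf element: its text becomes a plain column value.
            row[prefix + child.tag] = (child.text or "").strip()
        else:
            # Nested element: recurse, extending the dotted prefix.
            row.update(flatten(child, prefix + child.tag + "."))
    return row

xml = '<item id="42"><title>Hello</title><author><name>Ada</name></author></item>'
row = flatten(ET.fromstring(xml))
# row == {"@id": "42", "title": "Hello", "author.name": "Ada"}
```

Each repeating element in the source document would be passed through a function like this, and the union of the resulting keys becomes the table's column set.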
## Example: importing an RSS feed
Consider a standard RSS feed exported from a blog or news site. The XML contains a `<channel>` element with multiple `<item>` children:
```xml
<rss version="2.0">
  <channel>
    <title>Engineering Blog</title>
    <item>
      <title>Deploying to Production</title>
      <link>https://example.com/deploy</link>
      <pubDate>Mon, 01 Apr 2026 08:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Database Migration Patterns</title>
      <link>https://example.com/migrations</link>
      <pubDate>Fri, 28 Mar 2026 10:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```
After import, the table looks like this:
| title | link | pubDate |
|---|---|---|
| Deploying to Production | https://example.com/deploy | Mon, 01 Apr 2026 08:00:00 GMT |
| Database Migration Patterns | https://example.com/migrations | Fri, 28 Mar 2026 10:30:00 GMT |
Each `<item>` becomes a row, and its child elements become columns. From here you can sort by date, filter by title keyword, or convert the `pubDate` string to a proper date type for time-based analysis.
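RSS dates use the RFC 822 format shown above, which Python's standard library parses directly. A sketch of the string-to-date conversion and a newest-first sort, with the imported rows represented as plain dicts for the example:

```python
from email.utils import parsedate_to_datetime

rows = [
    {"title": "Deploying to Production", "pubDate": "Mon, 01 Apr 2026 08:00:00 GMT"},
    {"title": "Database Migration Patterns", "pubDate": "Fri, 28 Mar 2026 10:30:00 GMT"},
]

# Cast the pubDate strings to real datetimes, then sort newest first.
for row in rows:
    row["pubDate"] = parsedate_to_datetime(row["pubDate"])

rows.sort(key=lambda r: r["pubDate"], reverse=True)
print([r["title"] for r in rows])
# → ['Deploying to Production', 'Database Migration Patterns']
```

Once the values are real datetimes rather than strings, range filters ("last 30 days") and chronological sorts behave correctly.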
## Common XML sources
The same import process works for several common XML formats:
- RSS and Atom feeds - blog posts, podcast episodes, news articles. Each entry becomes a row with title, link, date, and description columns.
- SOAP API responses - many legacy enterprise APIs return XML. Save the response body as a file, then import it to inspect the payload as a table.
- CRM and ERP exports - Salesforce, SAP, and similar systems often provide XML export options. These tend to be deeply nested, but the flattening logic handles multiple levels.
- Government open data - regulatory filings, public records, and statistical datasets are frequently published as XML. The U.S. SEC EDGAR system and EU open data portals are typical examples.
## After import: transform and export
Once your XML data is loaded as a table, you have access to the full pipeline. Common next steps:
- Rename columns - XML element names are often verbose. Rename `publicationDate` to `date` for clarity.
- Convert types - dates arrive as strings. Cast them to a date type so you can sort and filter by date range.
- Filter rows - remove entries that don't match your criteria. Keep only items from the last 30 days, or only records with a specific status.
- Aggregate - count entries by category, sum values by group, or compute averages. The grouping and aggregation guide covers this in detail.
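In SQL terms, the rename, filter, and aggregate steps above often collapse into a single query. ExploreMyData's engine is DuckDB, but the statement below is standard enough to run anywhere; the table name `feed` and its columns are assumptions, and the sketch uses Python's built-in `sqlite3` so it is self-contained:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE feed (title TEXT, category TEXT, publicationDate TEXT)")
con.executemany(
    "INSERT INTO feed VALUES (?, ?, ?)",
    [
        ("Deploying to Production", "ops", "2026-04-01"),
        ("Database Migration Patterns", "data", "2026-03-28"),
        ("Old Post", "ops", "2024-01-15"),
    ],
)

# Rename (AS), filter by date range (WHERE), and aggregate (GROUP BY) in one pass.
rows = con.execute(
    """
    SELECT category, COUNT(*) AS n, MAX(publicationDate) AS latest_date
    FROM feed
    WHERE publicationDate >= '2026-01-01'
    GROUP BY category
    ORDER BY category
    """
).fetchall()
print(rows)
# → [('data', 1, '2026-03-28'), ('ops', 1, '2026-04-01')]
```

The same query shape works whether the grouping key is a category, a status field, or a truncated date.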
When you are done, export the result in the format your downstream system expects:
- XML to CSV for spreadsheet or database import
- XML to Excel for sharing with colleagues who use Excel
- XML to JSON for modern APIs and web applications