CSV Hell: How I Normalized 15 Broker Exports into a Single Local-First Schema
If you have ever tried to build anything on top of retail investing data, you already know the truth:
Every broker CSV export is garbage.
Not a bit inconsistent.
Not mostly usable.
Garbage.
This post is about how I spent three months normalizing 15 different broker CSV formats into a single, predictable, local-first schema, and why I extracted the whole thing into a standalone npm package so no one else has to suffer through this again.
This is not a marketing post.
This is an engineer-to-engineer write-up about pain, tradeoffs, and why CSV parsing is a harder problem than most people admit.
The Problem: Broker CSVs Are a Nightmare
Let us start with the reality.
You download a CSV from a broker thinking:
"Great, I will just parse this and move on."
Then you open the file.
Here are real examples I dealt with.
Robinhood
- Ticker is in column B
- Column name is "Instrument" instead of "Symbol"
- Buys and sells are distinguished only by the sign of the quantity, and fees are baked into the price
Fidelity
- Adds three blank rows at the top of every export
- Column headers do not start until row 4
- Dates are formatted inconsistently across exports
eToro
- Dates are formatted as DD/MM/YYYY even for US users
- Numbers use commas as decimal separators in some locales
- Asset names include emojis and marketing strings
And that is before we even talk about:
- Missing headers
- Renamed columns between years
- Silent format changes
- CSVs that are technically valid but semantically useless
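To give a taste of what handling this looks like, here is a minimal sketch of the kind of header-row detection an adapter ends up needing. The `findHeaderRow` helper and the expected-column list are illustrative, not part of the package API:

// Hypothetical helper: scan the first rows of a parsed CSV for the one
// that actually holds the column headers (Fidelity buries them under
// three blank rows; other brokers rename columns between years).
function findHeaderRow(rows: string[][], expectedColumns: string[]): number {
  const wanted = expectedColumns.map((c) => c.toLowerCase());
  for (let i = 0; i < Math.min(rows.length, 10); i++) {
    const cells = rows[i].map((c) => c.trim().toLowerCase());
    const hits = wanted.filter((col) => cells.includes(col)).length;
    // Treat a row as the header if it matches most expected columns.
    if (hits >= Math.ceil(wanted.length / 2)) return i;
  }
  return -1; // no plausible header: the adapter should refuse the file
}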
I looked for an existing library to handle this.
There was not one.
So I built it.
The Goal: One Schema to Rule Them All
Before writing a single line of parsing logic, I defined a rule:
Every broker must normalize into the same internal shape, no exceptions.
If the schema is not strict, everything downstream breaks:
- P&L calculations
- Portfolio timelines
- Tax logic
- Visualizations
So I designed a Standard Trade Schema that every adapter must output.
At a high level, every row becomes something like this:
{
  "id": "uuid",
  "symbol": "AAPL",
  "assetType": "equity",
  "side": "buy",
  "quantity": 10,
  "price": 172.34,
  "currency": "USD",
  "fees": 1.25,
  "executedAt": "2024-01-12T15:31:00Z",
  "broker": "robinhood",
  "raw": {}
}
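In TypeScript terms, that shape might be declared like this. This is a sketch inferred from the JSON above; the asset-type union members beyond "equity" are illustrative, and the real package may type things differently:

// Sketch of the normalized shape, inferred from the example above.
export interface StandardTrade {
  id: string;                   // UUID generated at import time
  symbol: string;               // normalized ticker, e.g. "AAPL"
  assetType: "equity" | "crypto" | "other"; // members beyond "equity" are illustrative
  side: "buy" | "sell";
  quantity: number;             // always positive; direction lives in `side`
  price: number;                // per-unit price in `currency`
  currency: string;             // ISO 4217 code, e.g. "USD"
  fees: number;                 // total fees for the fill, 0 if unknown
  executedAt: string;           // ISO 8601 timestamp, UTC
  broker: string;               // source adapter id, e.g. "robinhood"
  raw: Record<string, unknown>; // original row, kept for debugging
}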
Once everything looks like this, the rest of the system becomes boring, which is exactly what you want.
The Architecture: Adapter Pattern or Nothing
The only sane way to solve this was the Adapter Pattern.
Each broker gets:
- Its own adapter
- Its own parsing rules
- Its own ugly edge cases
But all adapters implement the same interface.
The Interface
export interface BrokerAdapter {
  canParse(file: File): boolean;
  parse(file: File): Promise<StandardTrade[]>;
}
That is it.
No shared assumptions.
No "almost compatible" logic.
If an adapter cannot confidently parse a file, it refuses to.
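Dispatching a file then becomes a short loop over the registered adapters. A minimal sketch, where the adapter list and the `importFile` helper are illustrative rather than the package's public API:

// Hypothetical registry: each broker ships one adapter, all behind
// the same interface. The first adapter that recognizes the file wins.
const adapters: BrokerAdapter[] = [robinhoodAdapter, fidelityAdapter, etoroAdapter];

async function importFile(file: File): Promise<StandardTrade[]> {
  const adapter = adapters.find((a) => a.canParse(file));
  if (!adapter) {
    throw new Error(`No adapter recognizes ${file.name}`);
  }
  return adapter.parse(file);
}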
From Raw CSV to Structured Data
Here is what a real raw row might look like coming out of a CSV parser:
{
  "Instrument": "Apple Inc.",
  "Quantity": "10",
  "Price": "$172.34",
  "Trade Date": "01/12/2024",
  "Fees": ""
}
Every adapter’s job is to:
- Map column names
- Normalize numbers
- Fix dates
- Infer missing fields
- Validate the output
Example (simplified):
function normalizeRobinhoodRow(row: any): StandardTrade {
  return {
    id: crypto.randomUUID(),
    symbol: mapInstrumentToSymbol(row["Instrument"]),
    assetType: "equity",
    side: Number(row["Quantity"]) > 0 ? "buy" : "sell",
    quantity: Math.abs(Number(row["Quantity"])),
    price: parseCurrency(row["Price"]),
    currency: "USD",
    fees: parseCurrency(row["Fees"] || "0"),
    executedAt: parseDate(row["Trade Date"]),
    broker: "robinhood",
    raw: row
  };
}
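The helpers referenced above do the character-level dirty work. Minimal sketches, assuming US-style formats; the real versions need per-broker locale rules, e.g. eToro's DD/MM/YYYY dates and comma decimal separators:

// Strip currency symbols and thousands separators: "$1,172.34" -> 1172.34.
function parseCurrency(value: string): number {
  const cleaned = value.replace(/[^0-9.,-]/g, "").replace(/,/g, "");
  const n = Number(cleaned);
  if (Number.isNaN(n)) throw new Error(`Unparseable amount: "${value}"`);
  return n;
}

// Turn a US-style "MM/DD/YYYY" date into an ISO 8601 UTC timestamp.
// DD/MM/YYYY brokers like eToro need a per-adapter flag; omitted here.
function parseDate(value: string): string {
  const [month, day, year] = value.split("/").map(Number);
  return new Date(Date.UTC(year, month - 1, day)).toISOString();
}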
Every adapter does ugly work so the rest of the app never has to.
The Aha Moment: Do It in the Browser
Originally, I planned to do this on the backend.
That was a mistake.
Why Local-First Wins
- CSV files can be large
- Financial data is sensitive
- Users do not want to upload raw trade history to a server
- Latency kills onboarding
So I moved everything into the browser.
- PapaParse for streaming CSV parsing
- Web Workers to avoid blocking the UI
- Zod-style validation for schema safety
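Glued together, the in-browser pipeline looks roughly like this. A sketch, not the package internals: `StandardTradeSchema` is an assumed Zod schema mirroring the shape above (abridged), and `normalizeRow` stands in for the per-adapter mapping shown earlier:

import Papa from "papaparse";
import { z } from "zod";

// Assumed Zod schema mirroring StandardTrade (abridged field list).
const StandardTradeSchema = z.object({
  symbol: z.string().min(1),
  side: z.enum(["buy", "sell"]),
  quantity: z.number().positive(),
  price: z.number().nonnegative(),
  executedAt: z.string().datetime(),
});

// Stream-parse a CSV entirely in the browser, off the main thread.
function parseCsvLocally(file: File): Promise<StandardTrade[]> {
  return new Promise((resolve, reject) => {
    const trades: StandardTrade[] = [];
    Papa.parse<Record<string, string>>(file, {
      header: true,        // first row becomes object keys
      skipEmptyLines: true,
      worker: true,        // PapaParse spawns its own Web Worker
      step: (result) => {
        // normalizeRow stands in for the per-adapter mapping above.
        const trade = normalizeRow(result.data);
        StandardTradeSchema.parse(trade); // throws if the shape is wrong
        trades.push(trade);
      },
      complete: () => resolve(trades),
      error: reject,
    });
  });
}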
The result:
- Drag and drop CSV import
- Zero backend processing
- No credentials, no accounts, no risk
The user keeps their data.
The browser does the work.
At some point it became obvious:
This logic should not be locked inside an app.
So I extracted it into a standalone package:
@pocket-portfolio/importer
https://www.npmjs.com/package/@pocket-portfolio/importer
Install it:
npm install @pocket-portfolio/importer
Use it in your own project:
import { importTrades } from "@pocket-portfolio/importer";
const trades = await importTrades(file);
That is it.
No signup.
No API keys.
No backend.
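For a concrete feel, wiring it to a plain file input takes a few lines of browser glue (the `#csv-input` element is hypothetical):

// Browser glue: feed a user-selected CSV straight into the importer.
import { importTrades } from "@pocket-portfolio/importer";

const input = document.querySelector<HTMLInputElement>("#csv-input")!;
input.addEventListener("change", async () => {
  const file = input.files?.[0];
  if (!file) return;
  const trades = await importTrades(file); // parsing happens locally, in the browser
  console.log(`Imported ${trades.length} trades from ${file.name}`);
});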
Why This Is the Gateway
Most developers do not wake up wanting a portfolio tracker.
They wake up angry that:
- Their broker export is unusable
- Their script broke again
- Their data is inconsistent
This library solves that problem.
And once you have clean, normalized trade data, the next question is obvious:
"Okay, now what do I do with it?"
That is where Pocket Portfolio comes in: a free, local-first dashboard that already understands this schema.
You can use the library alone.
Or you can see it working end to end.
https://www.pocketportfolio.app
Lessons Learned
- CSV parsing is not a simple task
- Normalization is a product feature
- Local-first is not a buzzword, it is UX
- The best open-source tools start by solving boring pain
What Is Next
- More brokers
- Better locale handling
- Schema versioning
- Community adapters via PRs
If you have ever screamed at a CSV file, this is for you.
Links
This is part of the Rolling Thunder launch for Pocket Portfolio.
Every post ships something real.