| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by throw03172019 181 days ago
	So CSV with a “typed” header?

1 comments

maheshvaikri99 181 days ago

Essentially yes, but with a few additions CSV lacks:

1. Multiple tables in one document (table.users, table.orders) 2. References between tables (:user:42 links to id 42) 3. Object blocks for config/metadata 4. Streaming format (ISONL) for large datasets

The type annotations are optional - they help LLMs understand the schema without inference.

You could think of it as "CSV that knows about relationships" - which is exactly what multi-agent systems need when passing state around.

link

throw03172019 181 days ago

Got it. Thanks.

Any data on how LLMs like this format? Are they able to make the associations etc?

link

maheshvaikri99 181 days ago

Yes - I ran a 300 Questions benchmark comparing ISON vs JSON vs JSON-COMPACT etc on the same tasks.

ISON: 88.3% accuracy JSON: lower (can share exact numbers if interested)

Tested across Claude, GPT-4, DeepSeek, and Llama 3.

The key finding: LLMs handle tabular formats natively because they've seen billions of markdown tables and CSVs in training. No special prompting needed.

For associations, I tested with multi-table ISON docs like:

table.users id name 1 Alice 2 Bob

table.orders id user_id product 101 :1 Widget 102 :2 Gadget

Prompt: "What did Alice order?"

All models correctly resolved :1 → Alice → Widget without explicit instructions about the reference syntax.

The 30-70% token savings come from removing JSON's structural overhead (braces, quotes, colons, commas) while keeping the same semantic density.

Haven't published formal benchmarks on this yet - that's good feedback. I should.

link