Essentially yes, but with a few additions CSV lacks:
1. Multiple tables in one document (table.users, table.orders)
2. References between tables (:user:42 links to id 42)
3. Object blocks for config/metadata
4. Streaming format (ISONL) for large datasets
The type annotations are optional - they help LLMs understand the schema without inference.
You could think of it as "CSV that knows about relationships" - which is exactly what multi-agent systems need when passing state around.
I tested this across Claude, GPT-4, DeepSeek, and Llama 3.
The key finding: LLMs handle tabular formats natively because they've seen billions of markdown tables and CSVs in training.
No special prompting needed.
For associations, I tested with multi-table ISON docs like:
table.users
id name
1 Alice
2 Bob
table.orders
id user_id product
101 :1 Widget
102 :2 Gadget
Prompt: "What did Alice order?"
All models correctly resolved :1 → Alice → Widget without explicit instructions about the reference syntax.
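For anyone who wants to check the same lookup in code rather than by prompt, here is a rough sketch of the resolution step. It is not the official ISON parser: `parse_ison` and `resolve` are made-up helper names, and it assumes whitespace-separated fields with no quoting, plus that a `:1` in the `user_id` column points at the `users` table.

```python
def parse_ison(text):
    """Parse 'table.<name>' blocks into {name: [row_dict, ...]}.

    Assumption: fields are whitespace-separated and never quoted.
    """
    tables = {}
    name, header = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("table."):
            # A new table starts; the next non-empty line is its header.
            name, header = line[len("table."):], None
            tables[name] = []
        elif name is None:
            continue  # ignore anything before the first table
        elif header is None:
            header = line.split()
        else:
            tables[name].append(dict(zip(header, line.split())))
    return tables


def resolve(tables, target_table, ref):
    """Follow a ':<id>' reference to the matching row in target_table."""
    wanted = ref.lstrip(":")
    for row in tables[target_table]:
        if row["id"] == wanted:
            return row
    return None


DOC = """\
table.users
id name
1 Alice
2 Bob
table.orders
id user_id product
101 :1 Widget
102 :2 Gadget
"""

tables = parse_ison(DOC)

# "What did Alice order?": order -> user_id reference -> users row -> name
alice_orders = [
    order["product"]
    for order in tables["orders"]
    if (resolve(tables, "users", order["user_id"]) or {}).get("name") == "Alice"
]
print(alice_orders)  # ['Widget']
```

The point is only that the reference chain is a plain id lookup, which is the same step the models did implicitly.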
The 30-70% token savings come from removing JSON's structural overhead (braces, quotes, colons, commas) while keeping the same semantic density.
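To make the overhead point concrete, here is a small sketch that serializes the same rows as JSON and as a table block, then compares sizes. Character counts are only a rough proxy for tokens (real numbers need the model's tokenizer), and `to_table` is a hypothetical helper that skips the reference prefix for simplicity, so treat this as an illustration, not a benchmark.

```python
import json

rows = [
    {"id": 101, "user_id": 1, "product": "Widget"},
    {"id": 102, "user_id": 2, "product": "Gadget"},
]


def to_table(name, rows):
    """Render rows as a 'table.<name>' block: header once, then one line per row."""
    header = list(rows[0])
    lines = [f"table.{name}", " ".join(header)]
    lines += [" ".join(str(row[key]) for key in header) for row in rows]
    return "\n".join(lines)


as_json = json.dumps(rows)
as_table = to_table("orders", rows)

# Count the purely structural JSON characters: braces, brackets, quotes, colons, commas.
structural = sum(as_json.count(ch) for ch in '{}[]":,')

print("JSON chars: ", len(as_json))
print("table chars:", len(as_table))
print("JSON structural chars:", structural)
```

JSON repeats every key on every row, while the table form states the header once, so the gap should widen as the row count grows.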
Haven't published formal benchmarks on this yet - that's good feedback. I should.