by djhn
87 days ago
LLMs will often helpfully predict made-up tokens for the content of the data fields. For 100% of my jq use cases, the data wouldn't fit into context. But even for smaller inputs, I have never, not even once, had an LLM not mangle data that was fed into it. Take a feed of blog posts (and select only the first 50 or so, just to give the model a fighting chance). I'd put the likelihood of the output being invalid JSON at 80%. And if you do manage to get valid JSON out of it, the actual dates, times and text content will have changed.
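If you want to check this yourself rather than take my word for it, here is a minimal sketch of the two failure modes described above: output that is not valid JSON, and output that parses but whose field values differ from the input. The function name `check_llm_output` and the field names `date`, `title` and `content` are assumptions for illustration, not part of any real feed schema; swap in whatever your feed actually uses.

```python
import json


def check_llm_output(source_posts, llm_text, fields=("date", "title", "content")):
    """Compare an LLM's JSON output against the posts it was given.

    source_posts: list of dicts that were fed to the model.
    llm_text: raw text the model returned.
    fields: hypothetical field names that must come back unchanged.
    Returns a list of human-readable problems (empty if none were found).
    """
    # Failure mode 1: the output is not valid JSON at all.
    try:
        output_posts = json.loads(llm_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(output_posts, list):
        return ["top-level value is not a list"]

    problems = []
    if len(output_posts) != len(source_posts):
        problems.append(
            f"expected {len(source_posts)} posts, got {len(output_posts)}"
        )

    # Failure mode 2: valid JSON, but the values were silently altered.
    for i, (src, out) in enumerate(zip(source_posts, output_posts)):
        if not isinstance(out, dict):
            problems.append(f"post {i}: not an object")
            continue
        for field in fields:
            if src.get(field) != out.get(field):
                problems.append(f"post {i}: field '{field}' changed")
    return problems
```

Run it on the model's raw response together with the same list of posts you put in the prompt. An empty list means the checked fields survived the round trip; it says nothing about fields you did not list.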
|
One possibility: Claude Code subagents get their own 1-million-token context window, so they should cope better with large JSON files than an approach that keeps everything in the same context window.