Thanks! The key insight: don't fight the model's limitations, design around them.
Our agents never touch retrieval or search; that's all deterministic code (full-text search, sparse regression, power-law fitting). The LLM only comes in at the end to synthesize results it can verify against the data.
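To make that split concrete, here's a rough sketch of the shape, not our actual code. The `DOCS` data, the `search` ranking (plain keyword overlap standing in for real FTS), and the `unsupported_numbers` check are all made up for illustration. The point is that retrieval is ordinary deterministic code, and whatever the model writes afterwards gets checked against what was retrieved.

```python
import re

# Hypothetical toy corpus; a real system would use a proper FTS index.
DOCS = {
    "d1": "Latency fell from 120 ms to 80 ms after the cache change.",
    "d2": "Error rate stayed at 2 percent across both releases.",
    "d3": "The team moved offices in March.",
}

NUMBER = r"\d+(?:\.\d+)?"


def search(query, docs, k=2):
    """Deterministic keyword-overlap ranking (a stand-in for real FTS)."""
    terms = set(re.findall(r"\w+", query.lower()))
    scored = []
    for doc_id, text in docs.items():
        words = set(re.findall(r"\w+", text.lower()))
        score = len(terms & words)
        if score:
            # Negative score so a plain sort puts the best match first;
            # ties break on doc_id, which keeps the order reproducible.
            scored.append((-score, doc_id))
    return [doc_id for _, doc_id in sorted(scored)[:k]]


def unsupported_numbers(summary, doc_ids, docs):
    """Return numbers in the summary that appear in none of the retrieved docs."""
    evidence = set()
    for doc_id in doc_ids:
        evidence.update(re.findall(NUMBER, docs[doc_id]))
    return [n for n in re.findall(NUMBER, summary) if n not in evidence]


hits = search("latency after cache change", DOCS)

# Stand-ins for model output; no LLM is called in this sketch.
good = "Latency dropped from 120 ms to 80 ms."
bad = "Latency dropped from 120 ms to 60 ms."

print(hits)
print(unsupported_numbers(good, hits, DOCS))
print(unsupported_numbers(bad, hits, DOCS))
```

A number check like this is deliberately crude. It only catches figures the model invented, but it's cheap and it never depends on the model grading itself.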
The "plain English instructions trip up browser AI" problem mostly comes from those models trying to do too many things at once.
Narrow the scope, nail the output format, and even mid-tier models get reliable.
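One way to "nail the output format" is to give the model a single tiny job and reject anything that doesn't match the expected shape. A minimal sketch, assuming a made-up schema (`label` plus `evidence_id`) and a hypothetical `parse_reply` helper; the real schema would depend on the task.

```python
import json

# Hypothetical schema: the model may only answer with one of these labels.
ALLOWED_LABELS = {"supported", "contradicted", "unknown"}


def parse_reply(raw):
    """Return (label, evidence_id), or raise ValueError if the format is off."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != {"label", "evidence_id"}:
        raise ValueError("expected exactly the keys 'label' and 'evidence_id'")
    label, evidence_id = data["label"], data["evidence_id"]
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if not isinstance(evidence_id, str):
        raise ValueError("evidence_id must be a string")
    return label, evidence_id


print(parse_reply('{"label": "supported", "evidence_id": "d1"}'))
```

Anything that raises here can be retried or dropped by plain code, so a chatty or off-format reply never leaks downstream.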