by vinibrito 214 days ago

For JSON I agree: now I can just mention JSON and provide examples, and the response always comes back in the right format. But for tool calling and information retrieval I have never seen a system actually work, and they have not worked in my own tests either.

Now, I'm open to the idea that I am just using it wrong, but I have seen several reports around the web that the best tool calling accuracy people have gotten is 80%, which is unusable for any production system. For info retrieval, I have also seen it lose coherence as the total amount of available data grows.
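To put a rough number on why 80% is a problem, here is a small sketch of how per-call accuracy compounds when a task needs several tool calls in a row. It assumes each call succeeds independently with the same probability, which is a simplification of mine and not something the reports state; `chain_success_rate` is a made-up helper name.

```python
def chain_success_rate(per_call_accuracy: float, num_calls: int) -> float:
    """Probability that every call in a chain succeeds.

    Assumption (illustrative only): calls fail independently of each other
    and all share the same per-call accuracy.
    """
    return per_call_accuracy ** num_calls


# Print the chance that a whole chain of tool calls succeeds at 80% per call.
for n in (1, 3, 5, 10):
    print(n, round(chain_success_rate(0.8, n), 3))
```

Under that assumption, three chained calls all succeed only about half the time, and ten chained calls about one time in ten.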

Is there a model that actually achieved 100% tool calling accuracy?

So far I have built systems for that myself, surrounding the LLM, and only that way has it worked well in production.
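For anyone wondering what "surrounding the LLM" can look like, here is a minimal sketch of one common pattern: validate every proposed tool call against a fixed spec and retry with feedback when it is rejected. This is only an illustration under my own assumptions, not a description of any particular production system. The `TOOLS` registry, the `{"tool": ..., "args": ...}` output shape, and the `model` callable (any function that takes a prompt string and returns a string) are all hypothetical.

```python
import json
from typing import Any, Callable, Optional

# Hypothetical tool registry: tool name -> required argument names and types.
TOOLS: dict[str, dict[str, type]] = {
    "get_weather": {"city": str},
    "add": {"a": int, "b": int},
}


def validate_call(raw: str) -> tuple[Optional[dict[str, Any]], str]:
    """Return (call, "") if raw is a well-formed tool call, else (None, reason)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc}"
    if not isinstance(call, dict):
        return None, "top-level value must be an object"
    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        return None, f"unknown tool: {name!r}"
    args = call.get("args")
    if not isinstance(args, dict):
        return None, "args must be an object"
    spec = TOOLS[name]
    if set(args) != set(spec):
        return None, f"expected args {sorted(spec)}, got {sorted(args)}"
    for key, expected in spec.items():
        if not isinstance(args[key], expected):
            return None, f"arg {key!r} must be {expected.__name__}"
    return call, ""


def call_with_retries(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict[str, Any]:
    """Ask the model for a tool call, re-prompting with the rejection reason."""
    feedback = ""
    for _ in range(max_attempts):
        call, reason = validate_call(model(prompt + feedback))
        if call is not None:
            return call
        feedback = f"\nPrevious output was rejected ({reason}). Reply with JSON only."
    raise ValueError(f"no valid tool call after {max_attempts} attempts")


def demo() -> dict[str, Any]:
    """Run the wrapper against a stand-in model: one bad reply, then a good one."""
    replies = iter(
        ["Sure, calling the tool!", '{"tool": "add", "args": {"a": 2, "b": 3}}']
    )
    return call_with_retries(lambda prompt: next(replies), "Add 2 and 3.")
```

Calling `demo()` rejects the first reply as invalid JSON and accepts the second. A wrapper like this only guarantees that a malformed call never reaches the tool; it cannot tell whether the model picked the right tool or the right arguments, so it does not by itself get anyone to 100%.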