by thot_experiment 560 days ago
YMMV. It's a negative effect in terms of "reasoning", but the delta isn't super significant in most cases. It really depends on the LLM and on whether your prompt is likely to produce a JSON response to begin with: the more you have to coerce the LLM, the less likely it is to generate sane output. With smaller models you more quickly end up at the edge of the space where the LLM has meaningful predictive power, so the outputs start getting closer to random noise.
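
To make the "coercion" point concrete, here is a toy sketch of the general logit-masking idea behind constrained decoding. It is not how ollama, jsonformer, or any particular tool is implemented; the token names, logit values, and the `constrain` helper are all made up for illustration. It masks out every token the format disallows and reports how much of the model's original probability mass was on the allowed tokens. When that mass is tiny, the constraint is forcing the model into a region it considered very unlikely.

```python
import math

def softmax(logits):
    """Turn a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def constrain(logits, allowed):
    """Drop tokens outside `allowed` and renormalize.

    Returns (constrained_probs, allowed_mass), where allowed_mass is the
    share of the unconstrained distribution that sat on allowed tokens.
    """
    probs = softmax(logits)
    allowed_mass = sum(p for t, p in probs.items() if t in allowed)
    masked = {t: v for t, v in logits.items() if t in allowed}
    if not masked:
        raise ValueError("no allowed token is present in the vocabulary")
    return softmax(masked), allowed_mass

# Hypothetical next-token logits for two prompts (illustrative numbers only).
json_friendly = {"{": 4.0, "Sure": 1.0, "The": 0.5, "[": 0.0}
prose_leaning = {"{": -2.0, "Sure": 4.0, "The": 3.0, "[": -3.0}

# Assumption: a JSON value must start with an object or array opener.
allowed = {"{", "["}

friendly_probs, friendly_mass = constrain(json_friendly, allowed)
prose_probs, prose_mass = constrain(prose_leaning, allowed)

print(f"prompt that already wants JSON: allowed mass = {friendly_mass:.3f}")
print(f"prompt that wants prose:        allowed mass = {prose_mass:.3f}")
```

In both cases the constrained output is valid by construction, but in the second the sampler is choosing among tokens the model had almost ruled out, which is where I'd expect quality to drop off.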

FWIW, this is measured by me using a vibes-based method: nothing rigorous, just a lot of hours spent on various LLM projects. I have not used these particular tools yet, but ollama was previously able to guarantee JSON output through what I assume are similar techniques, and my partner and I previously worked on a jsonformer-like thing for oobabooga, another LLM runtime tool.