Hacker News new | ask | show | jobs
by pizza 607 days ago
Here's a practical in this vein but much simpler - if you're trying to answer a question with an LLM, and have it answer in json format within the same prompt, for many models the accuracy is worse than just having it answer in plaintext. The reason is that you're now having to place a bet that the distribution of json strings it's seen before meshes nicely with the distribution of answers to that question.

So one remedy is to have it just answer in plaintext, and then use a second, more specialized model that's specifically trained to turn plaintext into json. Whether this chain of models works better than just having one model all depends on the distribution match penalties accrued along the chain in between.

2 comments

I wrap the plaintext in quotes, and perhaps a period, so that it knows when to start and when to stop, you can add logit biases for the syntax and pass period as a stop marker to chatgpt apis.

Also you don't need to use a model to build a json from plaintext answers lol, just use a programming language.

So developing solutions with ai is like trying to build stuff with family feud.