Hacker News new | ask | show | jobs
by BoorishBears 606 days ago
Why not preprompt with ```json {
1 comments

Yes, you can pre-fill the assistant's response with "```json {" or even "{" and that should increase the likelihood of getting a proper JSON in the response, but it's still not guaranteed. It's not nearly reliable enough for a production use case, even on a bigger (8B) model.

I could recommend using ollama or VLLm inference servers. They support a `response_format="json"` parameter (by implementing grammars on top of the base model). It makes it reliable for a production use, but in my experience the quality of the response decreases slightly when a grammar is applied.

Grammars are best but if you read their comment they're apparently using ollama in a situation that doesn't support them.