Hacker News new | ask | show | jobs
by firejake308 561 days ago
Remember that you still need to include an instruction to produce a CSV to get the prompt into the right context to generate a CSV that makes sense. Otherwise, you may get output that is technically in the CSV format but doesn't make any sense because the model was actually trying to write a paragraph response and the token sampler just selected really low-probability tokens that the model didn't really want to say.
1 comments

It seems ollama only supports JSON Schema.

Interestingly, JSON Schema has much less of this problem than say CSV - when the model is forced to produce `{"first_key":` it will generally understand it's supposed to continue in JSON. It still helps to tell it the schema though, especially due to weird tokenization issues you can get otherwise.

> It seems ollama only supports JSON Schema.

"Encoding" CSV as JSON is trivial though, so make it output JSON then parse the array-of-arrays into CSV :)