Hacker News
by jacky2wong 1073 days ago
Point 1 doesn't feel like a good enough reason. The number of tokens spent on JSON syntax is very small if you tell GPT to output it properly.
1 comments

Costs add up surprisingly quickly. A quote-colon-space-quote combo alone is four tokens wasted. Now scale that up...
According to tiktokenizer, that is only two tokens: quote-colon is token 498 and space-quote is token 330 (see https://tiktokenizer.vercel.app/). But I agree with the general argument.
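
To put rough numbers on "scale that up", here is a back-of-the-envelope sketch. The two-tokens-per-separator figure is the tiktokenizer reading above; the field count, object count, and price per 1,000 tokens are made-up placeholders, not real pricing.

```python
def separator_overhead(tokens_per_separator, fields_per_object, objects, price_per_1k_tokens):
    """Return (overhead_tokens, cost) for the quote-colon-space-quote separators alone."""
    overhead_tokens = tokens_per_separator * fields_per_object * objects
    cost = overhead_tokens / 1000 * price_per_1k_tokens
    return overhead_tokens, cost


# Hypothetical workload: 10 fields per object, 1,000,000 objects,
# and an assumed price of 0.002 per 1,000 output tokens.
tokens, cost = separator_overhead(2, 10, 1_000_000, 0.002)
print(f"{tokens} overhead tokens, roughly {cost:.2f} in cost")
```

With the four-token estimate instead of two, both numbers double. Whether that counts as a lot depends entirely on the assumed volume and price.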

I think what matters even more when you use the API is that you have no fine-grained control over the generation process. If you follow the MS guidance approach, you fill in the structured text yourself and let the model generate only the value parts, e.g. up to the next quote. To do that more or less word by word through the API, you need multiple calls and have to be very careful about providing the right stop tokens.
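
A minimal sketch of that fill-in idea, assuming a hypothetical `complete(prompt, stop)` function that returns the generated text up to (but not including) the first stop string. This is not the guidance library's API, and `fake_complete` only stands in for a real API call so the example runs on its own.

```python
from typing import Callable, Dict, List

# Hypothetical model interface: prompt so far plus stop strings in,
# generated text (cut before the first stop string) out.
CompleteFn = Callable[[str, List[str]], str]


def fill_json(instruction: str, keys: List[str], complete: CompleteFn) -> Dict[str, str]:
    """Write the JSON scaffolding ourselves and ask the model only for the values."""
    prompt = instruction + "\n{"
    result: Dict[str, str] = {}
    for i, key in enumerate(keys):
        # We emit the key, colon, and opening quote, so the model generates none of them.
        prompt += ("" if i == 0 else ", ") + f'"{key}": "'
        value = complete(prompt, ['"'])  # one call per value, stopping at the closing quote
        result[key] = value
        prompt += value + '"'  # close the value ourselves before the next key
    return result


def fake_complete(prompt: str, stop: List[str]) -> str:
    """Demonstration stand-in for a real model call (assumption, not a real API)."""
    canned = {"name": "Ada", "city": "London"}
    # The prompt always ends with '"<key>": "', so the key sits between
    # the third-last and second-last quote characters.
    last_key = prompt.rsplit('"', 3)[-3]
    return canned[last_key]


print(fill_json("Describe a person as JSON.", ["name", "city"], fake_complete))
```

Note the trade-off this makes visible: there is one call per value, and each call resends the whole prompt so far. A value that itself contains an escaped quote would also be cut short by the `"` stop string, which is part of why the stop tokens need care.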