Hacker News
by freilanzer 683 days ago
I'm in the process of rolling out an LLM to a user-facing feature, and it's difficult. The scaling is not obvious, and the quality fluctuates even with Llama-3.1 (8B) when compared to GPT-4o. We're probably going with 4o since the JSON return works much more reliably, it follows instructions for text generation more directly, etc.