Hacker News
by Tepix 1180 days ago
Have you tried bigger models? Llama-65B can indeed compete with GPT-3 according to various benchmarks. The next thing would be to get the fine-tuning as good as OpenAI's.

I wonder how accurate those benchmarks are in terms of actual problem-solving capability. I think there's a major threshold past which an LLM becomes actually useful: it feels like you are speaking to something intelligent, and it can really help your productivity.
They aren't accurate at all. They are synthetic benchmarks that bear little resemblance to real-world experience.