Hacker News
by great_psy 889 days ago
This is a pretty strong claim with zero data to back it up.
2 comments

Every small model that has outperformed GPT-4 has proven to be an overfit, so I would say it is the obvious claim, and any claim opposite that is what we should be skeptical of.
With the exception of task specialization. Fine-tuning a small model such as Mistral 7B on a specific set of tasks can outperform using GPT-4 on those tasks, and with cheaper and faster inference.
Not on the leaderboards mentioned here. That’s my point: you can overfit for specific tasks, but you can’t beat them on multi-task leaderboards without training on the test data.
While I lack specific data, my intuition is based on observed trends in AI model development. I believe some other models that claimed such numbers excelled in benchmarks but fell short in real-world applications. Further research can validate this claim, and I welcome a balanced discussion.
It does seem incredible that ChatGPT has so much expertise in literally everything. Does this mean you can beat ChatGPT by creating smaller "experts" and directing questions to each?
See mixture of experts. It’s likely what ChatGPT does in the backend.
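
For anyone unfamiliar with the term, here is a rough sketch of the routing idea behind mixture of experts: a gate scores each expert, only the top-k experts run, and their outputs are blended by the renormalized gate weights. Everything here is a toy assumption for illustration (the names `route_top_k` and `moe_output`, the three lambda "experts", and the hard-coded gate scores). In a real model the gate is a learned function of the input and routing usually happens per token inside the network, not per question across separate models. This says nothing about what ChatGPT actually does.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_output(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical toy experts; real experts would be neural sub-networks.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]

# Hypothetical fixed gate scores; a real gate computes these from the input.
gate_scores = [0.1, 2.0, 2.0]

print(route_top_k(gate_scores, k=2))
print(moe_output(3.0, experts, gate_scores, k=2))
```

The point of the sketch is that only k of the experts do any work per input, which is where the inference savings come from. Whether a set of small experts can match a large general model still depends on how good the gate and the experts are.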