|
|
|
|
|
by gbickford
551 days ago
|
|
Small models don't "know" as much so they hallucinate more. They are better suited for generations that are based in a ground truth, like in a RAG setup. A better comparison might be Flash 2.0 vs 4o-mini. Even then, the models aren't meant to have vast world knowledge, so benchmarking them on that isn't a great indicator of how they would be used in real-world cases. |
|