by dust42
54 days ago
The output of any LLM is always 100% hallucination in principle. On top of that, most benchmarks are at best an approximation of LLM quality. Your use case decides which model to use. That said, I haven't tested v4 yet, but the old 3.2 is still a decent model. And concerning use cases, I've had coding problems that Opus couldn't solve but a local 35B model did. All the talk about frontier and SOTA is to dig deeper and deeper into the pockets of VCs and finally do an IPO.