by mbesto
222 days ago
HN members do too. Look at my comment history. The general populace doesn't care to question how benchmarks are formulated or what their known (and unknown) limitations are. That being said, benchmarks are likely decent proxies. For example, I think the average user isn't going to notice a difference between Claude Sonnet and OpenAI Codex.