by NateEag
44 days ago
> Perhaps a widely recognized but not overly optimized for benchmark for this class of problems?

I don't see how this could be achieved. Any widely-recognized benchmark is going to be gamed by the genAI companies. They have a strong financial incentive to do so, and their products' nature shows that they are not influenced by ethical or societal-good incentives.