by freediver
666 days ago
The short answer is no, because there is no 'standardized' use case. One thing is certain: the commonly used benchmarks are mostly polluted and worthless, so you have to go to niche ones. For example, the one I check for coding is the Aider LLM leaderboard [1]. We maintain the Kagi LLM Benchmarking Project [2], which is optimized for the use case of using LLMs in search.

[1] https://aider.chat/docs/leaderboards/
[2] https://help.kagi.com/kagi/ai/llm-benchmark.html