by shihab
501 days ago
Here [1] is the leaderboard from Chatbot Arena, where users vote on the outputs of two anonymous models. DeepSeek R1 needs more data points, but it has already climbed to No. 1 in the Style Control ranking, which is pretty impressive. Link [2] has the results on more standard LLM benchmarks. They conveniently placed the results on the first page of the paper.
[1] https://lmarena.ai/?leaderboard
[2] https://arxiv.org/pdf/2501.12948 (PDF)