by VulgarExigency 2 days ago
Any chance of also benchmarking a couple of more affordable Chinese models? (specifically Deepseek and Xiaomi's MiMo)

I think <third party evals platform> will help us do that best on their standardized model matrix. For frontiercode’s launch we were focused on... the frontier models.
What qualifies as a frontier model? From my personal "taste tests", I wouldn't have placed Sonnet or Kimi above Deepseek Pro or MiMo, or Gemini 3.1 Flash Lite above Deepseek Flash, but they're listed in the benchmark.