| HN Mirror



	by VulgarExigency 2 days ago
	Any chance of also benchmarking a couple of the more affordable Chinese models? (Specifically DeepSeek and Xiaomi's MiMo.)


swyx 2 days ago

i think <third party evals platform> is best placed to do that on their standardized model matrix. for frontiercode’s launch we were focused on the frontier models


VulgarExigency 2 days ago

What qualifies as a frontier model? From my personal "taste tests", I wouldn't have placed Sonnet or Kimi above DeepSeek Pro or MiMo, or Gemini 3.1 Flash Lite above DeepSeek Flash, yet those are the ones listed in the benchmark.
