by golfer
23 days ago
Arena.ai: > Gemini 3.5 Flash’s pricing shifts the Pareto frontier in Text. 8 models from GoogleDeepMind dominate the Text Arena Pareto curve where only 4 labs are represented for top performance in their price tiers. https://x.com/arena/status/2056793180998361233
Artificial Analysis's "Cost to run" metric (i.e. num_tokens_used * price_per_token) is much better, but even that is likely problematic, since it's not clear whether token use across a bunch of benchmarks maps cleanly to real-world token use.
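
To make the num_tokens_used * price_per_token point concrete, here is a minimal sketch. The model names and numbers are made up for illustration, not real benchmark data, and quoting the price per million tokens is my assumption. It shows how a model with a lower per-token price can still cost more to run if it burns more tokens on the same workload.

```python
def cost_to_run(num_tokens_used: int, price_per_million_tokens: float) -> float:
    """Total cost in dollars: tokens used times the per-token price."""
    return num_tokens_used * price_per_million_tokens / 1_000_000


# Hypothetical models and figures, for illustration only.
models = {
    "terse_model": {"tokens": 20_000_000, "price_per_m": 10.0},
    "verbose_model": {"tokens": 120_000_000, "price_per_m": 2.0},
}

costs = {
    name: cost_to_run(m["tokens"], m["price_per_m"])
    for name, m in models.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")
```

With these made-up figures, the model that is 5x cheaper per token ends up more expensive overall because it uses 6x the tokens. The same arithmetic is why the result depends entirely on whether the benchmark's token counts resemble your own workload.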