| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by XCSme 150 days ago

> 3.1 Flash-Lite (reasoning)

(reasoning) doesn't say much. Is it low/med/high reasoning? I ran my own benchmarks, and 3.1 Flash-Lite on high costs A LOT: https://aibenchy.com/compare/google-gemini-3-1-flash-lite-pr...

Do not use 3.1 Flash-Lite with HIGH reasoning, it reasons for almost max output size, you can quickly get to millions of tokens of reasoning in a few requests.

1 comments

vlmutolo 150 days ago

Wow, that’s very interesting. I wish more benchmarks were reported along with the total cost of running that benchmark. Dollars per token is kind of useless for the reasons you mentioned.

link

XCSme 150 days ago

Yup, MiniMax M-2.5 is a standout in that aspect. It's $/token is very low, because it reasons forever (fun fact, that's also the reason why it's #1 on OpenRouter, because it simply burns through tokens, and OpenRouter ranking is based on tokens usage)...

link

XCSme 150 days ago

https://aibenchy.com/compare/google-gemini-3-1-flash-lite-pr...

link