| HN Mirror


by InitialPhase55 128 days ago

Curious, how did you settle on Haiku/Sonnet? There are much cheaper models on OpenRouter that probably perform comparably...

Consider:

Haiku 4.5: $1/M input tokens | $5/M output tokens

MiniMax M2.7: $0.30/M input tokens | $1.20/M output tokens

Kimi K2.5: $0.45/M input tokens | $2.20/M output tokens

I haven't tried it, so I can't say for sure, but from personal experience I think M2.7 and K2.5 can match Haiku, and probably exceed it, on most tasks at a much lower price.
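To make the price gap concrete, here is a rough sketch of the arithmetic using the per-million-token prices quoted above. The workload (100M input and 10M output tokens per month) is a hypothetical assumption chosen for illustration, not a figure from the original post, and the function and variable names are made up for this example.

```python
# Per-million-token prices in USD, as quoted in the comment above:
# (input price, output price)
PRICES = {
    "Haiku 4.5": (1.00, 5.00),
    "MiniMax M2.7": (0.30, 1.20),
    "Kimi K2.5": (0.45, 2.20),
}


def monthly_cost(input_tokens, output_tokens, price_in, price_out):
    """Return the USD cost of a token volume at per-million-token prices."""
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out


# Hypothetical workload (an assumption, not from the thread's original post).
INPUT_TOKENS = 100_000_000
OUTPUT_TOKENS = 10_000_000

costs = {
    name: monthly_cost(INPUT_TOKENS, OUTPUT_TOKENS, p_in, p_out)
    for name, (p_in, p_out) in PRICES.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f} per month")
```

Under these assumed volumes, the cheaper models come out at well under half the cost of Haiku 4.5; the exact ratio depends on the input/output mix of the real workload.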

6 comments

lanyard-textile 127 days ago

Since they're opening it publicly on IRC here, the safety rails might be a consideration. I've made an agent recently and that's why I'm paying a premium to Anthropic atm, though I'm still experimenting to see if it's really necessary.

It's getting some organic usage -- 100M input tokens for just chats this month -- and I've seen enough users try to throw Haiku against the wall and fail to trick it into misbehaving. It "pumps the brakes" a lot and imitates annoyance when you ask it repeatedly :) It handles emotionally driven real-life questions mid-conversation well. It just works.

Not seeing all that consistently with other models I've tried so far -- but I've assumed it's not a completely fair comparison with (e.g.) open weights, since these safety rails are presumably not always arising from the natural model calls.


nickthegreek 127 days ago

Agreed, and I feel like this is a commonly overlooked and important point. Once you have people who are not you interacting with these bots, the necessity of using a SOTA model to protect against multi-step attacks increases. I don't believe IRC provides a layer for ignoring a user and preventing their commands from continuing to be received.


InitialPhase55 127 days ago

Good point! Didn't consider that aspect, agree.


nl 127 days ago

Xiaomi Mimo v2-Flash is fantastic.

I have a relatively hard personal agentic benchmark, and Mimo v2-Flash scores 8% higher in 109 seconds for $0.003 (0.3 cents!) vs. Haiku, which took 262 seconds for $0.24 (24 cents).

Gemini 3.1 Flash Lite Preview (yes that is its name) is also a solid choice.
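As a quick sanity check on the benchmark figures quoted above, the ratios can be worked out directly. This is only arithmetic on the single run reported in the comment; the dictionary and variable names are illustrative, and nothing here says how the benchmark itself was run.

```python
# Figures as reported in the comment above (one benchmark run each).
haiku = {"seconds": 262, "usd": 0.24}
mimo_v2_flash = {"seconds": 109, "usd": 0.003}

# How many times more Haiku cost, and how many times longer it took.
cost_ratio = haiku["usd"] / mimo_v2_flash["usd"]
time_ratio = haiku["seconds"] / mimo_v2_flash["seconds"]

print(f"Haiku cost about {cost_ratio:.0f}x as much")
print(f"Haiku took about {time_ratio:.1f}x as long")
```

On these numbers Haiku cost roughly 80 times as much and took roughly 2.4 times as long for that run.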


efromvt 127 days ago

The Gemini models are fantastic for the price, but the naming scheme is ridiculous; I have to triple-check it every time.


ruguo 128 days ago

MiniMax M2.7 is actually pretty solid. I’ve been using it for coding lately and it handles most tasks just fine, but Opus 4.6 is still on another level.


jeremyjh 127 days ago

MiniMax's Token Plan is even less expensive and agent usage is explicitly allowed.


faangguyindia 127 days ago

just use gemini flash3, it's better than haiku


0123456789ABCDE 127 days ago

unless gp really cares about lower hallucination rates

https://artificialanalysis.ai/?omniscience=omniscience-hallu...


attentive 127 days ago

or better yet 3.1 Flash-Lite at $0.25/1M input


ls612 128 days ago

Because this is probably paid marketing by Anthropic?
