by gnulinux 313 days ago
Wow, that's significantly cheaper than o4-mini ($1.10/M input tokens, $4.40/M output tokens), which seems to be on par with gpt-oss-120b. o4-mini is almost 10x the price.
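To make the comparison concrete, here is a rough sketch of the per-request arithmetic. The o4-mini prices are the ones quoted above; the cheaper provider's prices and the token counts are placeholders made up for illustration (chosen only so the ratio lands near "almost 10x"), not real quotes.

```python
# Prices in USD per million tokens.
O4_MINI = {"input": 1.10, "output": 4.40}  # figures quoted in the comment above

# Hypothetical placeholder, NOT a real quote: substitute the provider's actual prices.
CHEAP_PROVIDER = {"input": 0.12, "output": 0.48}


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million-token prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


def price_ratio(expensive, cheap, input_tokens, output_tokens):
    """Return how many times more the expensive option costs for the same request."""
    return (request_cost(expensive, input_tokens, output_tokens)
            / request_cost(cheap, input_tokens, output_tokens))


if __name__ == "__main__":
    # Assumed workload: 10k input tokens and 2k output tokens per request.
    ratio = price_ratio(O4_MINI, CHEAP_PROVIDER, 10_000, 2_000)
    print(f"o4-mini costs about {ratio:.1f}x as much for this request")
```

The ratio depends on the input/output mix whenever the two price lists are not proportional to each other, so it's worth plugging in your own token counts.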

LLMs are getting cheaper much faster than I anticipated. I'm curious whether it's still the hype cycle and Groq/Fireworks/Cerebras are taking a loss here, or whether things are actually getting cheaper. At this rate we'll be able to run Qwen3-32B level models on phones/embedded devices soon.

2 comments

It's funny because I was thinking the opposite: the pricing seems way too high for a model with 5B active parameters.
Sure, you're right, but if I can squeeze o4-mini level utility out of it at less than a quarter of the price, does it really matter?
Yes
Are the prices staying aligned to the fundamentals (hardware, energy), or is this a VC-funded land grab pushing prices to the bottom?