Hacker News
by golergka 314 days ago
4o-mini costs ~$0.26 per Mtok; running qwen-2.5-7b on a rented 4090 will cost you about $0.8 per Mtok (you can probably get better numbers on a beefier GPU). But 3.5-turbo was $2 per Mtok in 2023, so IMO actual technical progress in LLMs drives prices down just as hard as venture capital does.

When Uber did it in the 2010s, cars didn't get twice as fast and twice as cheap every year.
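
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The three prices are the ones quoted above. The `cost_per_mtok` helper and the hourly rate and throughput fed into it are my own assumptions for illustration, not measured numbers, so plug in your own.

```python
# Prices quoted in the comment, in USD per million tokens (Mtok).
PRICE_GPT35_TURBO_2023 = 2.00
PRICE_4O_MINI = 0.26
PRICE_QWEN_7B_ON_4090 = 0.80


def price_drop_factor(old_price: float, new_price: float) -> float:
    """How many times cheaper new_price is compared to old_price."""
    return old_price / new_price


def cost_per_mtok(gpu_usd_per_hour: float, tokens_per_second: float) -> float:
    """USD per million tokens for a rented GPU.

    Hypothetical helper: assumes the GPU is fully utilized and that
    rental is the only cost.
    """
    tokens_per_hour = tokens_per_second * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    drop = price_drop_factor(PRICE_GPT35_TURBO_2023, PRICE_4O_MINI)
    print(f"3.5-turbo (2023) -> 4o-mini: {drop:.1f}x cheaper")

    gap = price_drop_factor(PRICE_QWEN_7B_ON_4090, PRICE_4O_MINI)
    print(f"self-hosted qwen-2.5-7b vs 4o-mini: {gap:.1f}x more expensive")

    # Placeholder inputs, NOT real benchmarks: substitute your own
    # rental rate and measured throughput.
    example = cost_per_mtok(gpu_usd_per_hour=0.36, tokens_per_second=125)
    print(f"example self-hosting cost: ${example:.2f} per Mtok")
```

With the quoted prices, the drop from 3.5-turbo to 4o-mini works out to a bit under 8x, and self-hosting on the 4090 comes out roughly 3x the 4o-mini price.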