by sfifs 30 days ago
Going by the cost savings I'm seeing moving from Sonnet and Gemini Flash to local models, inference runs at 90-95% gross margins at least. So they are probably still gross-margin profitable.
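
Rough arithmetic behind that kind of estimate, as a sketch only. It assumes a provider's serving cost is roughly what it costs to run a comparable model locally, which is a simplification. The function name and the dollar figures are hypothetical placeholders, not measurements.

```python
def implied_gross_margin(api_price: float, serving_cost: float) -> float:
    """Gross margin as a fraction of price: (price - cost) / price."""
    if api_price <= 0:
        raise ValueError("api_price must be positive")
    return (api_price - serving_cost) / api_price


# Hypothetical placeholder figures in dollars per million tokens.
# These are illustrative, not real prices or measured costs.
api_price_per_mtok = 10.00
local_cost_per_mtok = 0.75

margin = implied_gross_margin(api_price_per_mtok, local_cost_per_mtok)
print(f"Implied gross margin: {margin:.1%}")
```

If running locally costs 5-10% of the API price, the implied margin lands in the 90-95% range.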

BTW, from my benchmarking, open-weight models are good enough for many agentic tasks, starting with the Qwen 3.5/6 family and the DeepSeek v4 family, so it's likely we'll see API usage displaced from the premium-priced providers. Yes, training is expensive, but this isn't training.