by jrandolf 72 days ago
20 tok/s is an average. It can be more, it can be less. If you are running off-peak I'm sure you'd get some crazy number.

That doesn’t matter when you have the average. Even if you somehow get 10,000 tok/s during off-peak times, by virtue of how averages work, you’re still only getting about 52M tokens per month (as calculated above).
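To make the arithmetic concrete, here is a rough sketch of why the burst rate drops out. It assumes a 30-day month and a sustained average of 20 tok/s; the function names and the burst numbers (10,000 tok/s for 0.1% of the time) are hypothetical, picked only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def monthly_tokens(avg_tok_per_s: float, days: int = 30) -> float:
    """Total tokens over `days` at a given time-averaged rate (30-day month assumed)."""
    return avg_tok_per_s * SECONDS_PER_DAY * days


def time_weighted_average(segments: list[tuple[float, float]]) -> float:
    """Average rate over (tok_per_s, fraction_of_time) segments."""
    total_fraction = sum(fraction for _, fraction in segments)
    return sum(rate * fraction for rate, fraction in segments) / total_fraction


# Hypothetical usage pattern: a huge off-peak burst for a tiny slice of the month.
burst_rate, burst_fraction = 10_000.0, 0.001
# The rate for the rest of the month is whatever keeps the overall average at 20 tok/s.
rest_rate = (20.0 - burst_rate * burst_fraction) / (1.0 - burst_fraction)

bursty_average = time_weighted_average(
    [(burst_rate, burst_fraction), (rest_rate, 1.0 - burst_fraction)]
)

# 20 * 86,400 * 30 = 51,840,000, i.e. roughly 52M tokens.
print(f"steady 20 tok/s: {monthly_tokens(20.0):,.0f} tokens/month")
print(f"bursty profile:  {monthly_tokens(bursty_average):,.0f} tokens/month")
```

Both profiles land on the same monthly total, because the total depends only on the average rate and the length of the month. The burst just means the rest of the month runs slower than 20 tok/s.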
Why wouldn't developers just do LLM arbitrage against OpenRouter if it is a better deal?
The problem is different. OpenRouter is a router to LLMs. It doesn't solve GPU underutilization.
What I am saying is: if your system lets me pay $x/token and OpenRouter lets me pay $y/token, and x < y, then someone could make money just by providing those tokens through the OpenRouter API. That would either drive up demand for your system, increasing costs, or drive up supply on OpenRouter, decreasing costs. Eventually the costs would converge, no?
For the same reason people don’t do server arbitrage even though Hetzner is cheaper than AWS.