by rbitar 249 days ago
Interesting, and if they are using speculative decoding, that variance would make sense. Your numbers also line up with what OpenRouter is now publishing: 169.1 tps [1].

Anthropic mentioned this model is more than twice as fast as Claude Sonnet 4 [2], which OpenRouter averaged at 61.72 tps [3]. If these numbers hold, we're really looking at a roughly 2.7x improvement in throughput (169.1 / 61.72) and less than half the initial latency.
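
For anyone who wants to check the ratio, here is a quick sketch using only the two throughput figures quoted above. The variable names are mine, and the numbers are point-in-time averages from OpenRouter, so treat the result as a rough estimate and not a benchmark.

```python
# Throughput figures quoted in this comment (tokens per second, per OpenRouter).
haiku_45_tps = 169.1   # Claude Haiku 4.5 [1]
sonnet_4_tps = 61.72   # Claude Sonnet 4 [3]

# Relative throughput: how many times faster Haiku 4.5 generates tokens.
speedup = haiku_45_tps / sonnet_4_tps
print(f"speedup: {speedup:.2f}x")  # prints "speedup: 2.74x"
```

That lands between Anthropic's "more than twice as fast" and a full 3x.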

[1] https://openrouter.ai/anthropic/claude-haiku-4.5
[2] https://www.anthropic.com/news/claude-haiku-4-5
[3] https://openrouter.ai/anthropic/claude-sonnet-4