by bigbuppo
118 days ago
Maybe make that intelligence per token per relative unit of hardware per watt. If you're burning 30 tons of coal to be 0.0000000001% better than the 5 tons of coal option because you're throwing more hardware at it, well, it's not much of a real improvement.
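
To put rough numbers on the coal comparison, here is a quick back-of-the-envelope sketch. The baseline quality score of 1.0 and the helper name `score_per_ton` are made up for illustration; only the 5 tons, 30 tons, and 0.0000000001% figures come from the comment above.

```python
def score_per_ton(score, tons_of_coal):
    """Quality score normalized by the resources burned to get it (hypothetical metric)."""
    return score / tons_of_coal

# Assumption: the 5-ton option has a baseline quality score of 1.0.
baseline_score = 1.0
# 0.0000000001% expressed as a fraction is 1e-12.
improved_score = baseline_score * (1 + 1e-12)

cheap = score_per_ton(baseline_score, 5)       # 5 tons of coal
expensive = score_per_ton(improved_score, 30)  # 30 tons of coal

# How many times more quality per ton the cheaper option delivers.
advantage = cheap / expensive
print(f"5-ton option delivers about {advantage:.1f}x the score per ton")
```

Under those assumptions the 5-ton option comes out roughly 6x better per ton burned, which is the whole point: the raw score barely moves while the cost goes up sixfold.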
But to be clear, 1000 tokens/second is WAY better. Anthropic's Haiku serves at ~50 tokens per second.