by woadwarrior01, 86 days ago
> Sorry to shatter your bubble, but this is patently false, LLMs are far more efficient on hardware that simultaneously serves many requests at once.
You might want to read this:
https://arxiv.org/abs/2502.05317v2