Hacker News
by drbscl 286 days ago
Distributed compute is cool, but $320 for 13 tokens/s on a tiny input prompt, with 4-bit quantization and a model with only 3B active parameters, is very underwhelming.