Hacker News
by
gunalx
36 days ago
They probably use it on all models. Fast is probably just a resource pool with less congestion, so each user gets faster throughput, but the hardware is used less efficiently.
1 comment
cma
36 days ago
If it speeds up prefill too, I guess so.