by ghughes 1071 days ago
But given the rumored architecture (MoE), it would make complete sense for them to dynamically scale down the number of models used in the mixture during periods of peak load.
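
To make the idea concrete, here is a rough sketch of what load-dependent routing could look like in a top-k mixture-of-experts layer. Everything in it is an assumption for illustration: the names `experts_for_load` and `moe_forward`, the 0.8 load threshold, and the toy experts are all made up, and nothing here reflects how any real provider actually serves its models.

```python
import math

def experts_for_load(load, k_normal=2, k_peak=1, peak_threshold=0.8):
    """Hypothetical policy: use fewer experts per token once load crosses a threshold.

    `load` is an assumed utilization figure in [0, 1].
    """
    return k_peak if load >= peak_threshold else k_normal

def moe_forward(x, gate_logits, experts, k):
    """Route x to the k experts with the highest gate logits.

    The output is a softmax-weighted sum over the selected experts only.
    Returns the output and the indices of the experts that were used.
    """
    top = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in top)  # subtract the max for numerical stability
    weights = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(weights)
    y = sum(w / total * experts[i](x) for i, w in zip(top, weights))
    return y, top

# Toy experts: each one just scales its input (stand-ins for real sub-networks).
experts = [lambda x: 1.0 * x, lambda x: 2.0 * x, lambda x: 3.0 * x, lambda x: 4.0 * x]
gate_logits = [0.1, 2.0, 0.5, 1.0]  # would normally come from a learned gating network

y_normal, used_normal = moe_forward(1.0, gate_logits, experts, experts_for_load(0.3))
y_peak, used_peak = moe_forward(1.0, gate_logits, experts, experts_for_load(0.95))
```

Under peak load only the single highest-scoring expert runs, so compute per token drops roughly in proportion to k. The output also changes, which is the kind of quality difference people would presumably notice if this were happening.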