Hacker News
by cma 121 days ago
When you draft tokens with the small model, the big model can verify them as a batch, which is closer in speed to processing input tokens than to generating one token at a time. That only pays off if the predictions are good; any drafted token the big model rejects has to be redone.
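A rough sketch of that draft-then-verify loop, in case it helps. Everything here is a toy assumption for illustration: `target_model` and `draft_model` are made-up deterministic functions standing in for the big and small models, decoding is greedy, and the list comprehension in the verify step stands in for what would be a single batched forward pass of the big model.

```python
from typing import List, Tuple


def target_model(ctx: List[int]) -> int:
    # Hypothetical "big" model: next token is the sum of the last two, mod 10.
    return (ctx[-1] + ctx[-2]) % 10


def draft_model(ctx: List[int]) -> int:
    # Hypothetical "small" model: agrees with the big one unless the last token is 7.
    guess = (ctx[-1] + ctx[-2]) % 10
    return guess if ctx[-1] != 7 else (guess + 1) % 10


def speculative_step(ctx: List[int], k: int) -> List[int]:
    # 1. The small model drafts k tokens one at a time (cheap).
    drafted: List[int] = []
    for _ in range(k):
        drafted.append(draft_model(ctx + drafted))

    # 2. The big model scores all k + 1 positions. In a real transformer this
    #    would be one batched pass, much like processing input tokens.
    verified = [target_model(ctx + drafted[:i]) for i in range(k + 1)]

    # 3. Keep drafted tokens up to the first disagreement.
    accepted: List[int] = []
    for i in range(k):
        if drafted[i] != verified[i]:
            break
        accepted.append(drafted[i])

    # 4. Add the big model's own token at the mismatch (or a bonus token if
    #    every draft was accepted), so each step yields at least one token.
    accepted.append(verified[len(accepted)])
    return accepted


def generate(prompt: List[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < n_new:
        out.extend(speculative_step(out, k))
        target_passes += 1
    return out[: len(prompt) + n_new], target_passes


def generate_baseline(prompt: List[int], n_new: int) -> List[int]:
    # Plain greedy decoding: one big-model pass per generated token.
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_model(out))
    return out
```

With greedy verification like this, the output matches what the big model would produce on its own; the only thing that changes is how many big-model passes it takes. When the draft is wrong often, most steps accept few tokens and the savings shrink.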