Hacker News
by mluo 488 days ago
Quantization has a very big impact on small models: scores can drop by as much as 10% on AIME. Our model does best in bfloat16 ;)
Come check out our repo at:
https://github.com/agentica-project/deepscaler