Hacker News
by bildung 509 days ago
Well, there's the practical reason that the model is natively fp8 ;) Apparently that's one of the innovative ideas that make it so much less compute-intensive.