Hacker News
by pama, 490 days ago
I am sure DeepSeek did optimize the inference cost of R1. They have not yet released an efficient MoE downscaling of it, i.e., an R1-mini.