Hacker News
by
bottled_poe
503 days ago
It will be very interesting to see if they can reproduce a similar model on the shoestring budget claimed by DeepSeek.
anshumankmr
502 days ago
But DeepSeek hasn't claimed the figure touted by everyone for this particular R1 model, because that 5.6mn was apparently for DeepSeek's coder model.
boroboro4
502 days ago
The 5.6mn figure is for the base DeepSeek V3 model. Both instruction tuning and reasoning tuning of it have negligible cost in comparison.