Hacker News
by
siwakotisaurav
514 days ago
None of the models other than the 600B one are R1. They’re just previous-generation models like Llama or Qwen trained on R1 output, making them slightly better.
2 comments
int_19h
514 days ago
"Slightly" is an understatement, though. Distillations of R1 are significantly better than the underlying models.
doctorpangloss
514 days ago
Yeah, but the second comment you see believes they are, and belief is truth when it comes to stock market gambling.