Hacker News
by s1artibartfast 9 days ago
My understanding is that they didn't do any distillation. Every weight is a 60/40 element-wise average of QWEN and NEX. Is this possible if the rio contractor did their own post-training as claimed?
https://x.com/tenobrus/status/2066243352211996728/photo/1
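For anyone wanting to sanity-check a claim like this, here is a rough sketch of the test, not the method used in the linked analysis. It assumes the two parent checkpoints and the merged one have been flattened into equal-length lists of floats; `fit_blend` and the toy values are hypothetical. If a model really is an element-wise blend, one coefficient should reproduce every weight with near-zero residual, while genuine post-training would generally leave a residual that no single coefficient removes.

```python
def fit_blend(a, b, merged):
    """Least-squares fit of merged ~= alpha * a + (1 - alpha) * b.

    a, b, merged: equal-length flat lists of floats (hypothetical inputs).
    Returns (alpha, max_abs_residual).
    """
    # Project (merged - b) onto (a - b) to get the best single coefficient.
    num = sum((m - y) * (x - y) for x, y, m in zip(a, b, merged))
    den = sum((x - y) ** 2 for x, y in zip(a, b))
    if den == 0.0:
        raise ValueError("a and b are identical; alpha is not identifiable")
    alpha = num / den
    # Largest deviation from the fitted blend across all weights.
    max_resid = max(
        abs(m - (alpha * x + (1 - alpha) * y))
        for x, y, m in zip(a, b, merged)
    )
    return alpha, max_resid


# Toy values, not real model weights.
parent_a = [1.0, 2.0, -3.0, 0.25]
parent_b = [0.5, -1.0, 4.0, 0.75]
pure_merge = [0.6 * x + 0.4 * y for x, y in zip(parent_a, parent_b)]
alpha, resid = fit_blend(parent_a, parent_b, pure_merge)
print(alpha, resid)
```

A fitted coefficient near 0.6 with a residual at floating-point noise level would be consistent with a plain 60/40 average. A residual well above that would suggest further changes on top of the merge.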