Hacker News
by sp332, 792 days ago
Why is the 3B model worse than the 450M model on MMLU and TruthfulQA?