Hacker News
by solverist 1023 days ago
Why are ablations useful? Their release report seemed very informative to me without getting bogged down in jargon.
1 comment

Ablations are important because they tell you why the model is better. Here the model is similar in size to LLaMA 7B and was trained on a third as much data, yet their claim is that it performs better. That could be due to a lot of things: ReLU squared, a better dataset, or the 16k tokens. Without ablations, we just don't know which of these made it perform better.
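
To make that concrete, here is a rough sketch of what an ablation grid for this case might look like: start from a baseline, flip one factor at a time, and train each variant separately. All the names and baseline values below are placeholders I made up for illustration, not the actual settings from the report, and the training and evaluation step is left out entirely.

```python
# Hypothetical factors; the values are placeholders, not real settings from any report.
BASELINE = {
    "activation": "baseline_activation",
    "dataset": "baseline_dataset",
    "context_length": 2048,  # placeholder baseline context length
}
CHANGES = {
    "activation": "relu_squared",
    "dataset": "new_dataset",
    "context_length": 16384,
}


def one_at_a_time(baseline, changes):
    """Return (name, config) pairs: the baseline, each single change
    applied alone, and all changes applied together."""
    runs = [("baseline", dict(baseline))]
    for key, value in changes.items():
        cfg = dict(baseline)
        cfg[key] = value  # change exactly one factor
        runs.append((f"only_{key}", cfg))
    runs.append(("all_changes", {**baseline, **changes}))
    return runs


ABLATION_RUNS = one_at_a_time(BASELINE, CHANGES)

for name, cfg in ABLATION_RUNS:
    # Each config would be trained and evaluated separately (not shown here).
    print(name, cfg)
```

Comparing each "only_..." run against the baseline is what would tell you how much a single change contributes on its own; comparing against "all_changes" hints at whether the changes interact.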