by YetAnotherNick 1023 days ago
This is the least detailed foundation model release I have seen. The Llama paper offers a lot more detail, like ablations and loss curves. Falcon has data preparation details. Google's model release papers, like T5, are some of the best and include many ablations.

I mean the "I am become death, destroyer of worlds" bullshit about AI safety/ethics/etc. that is included in every press release from Google/Meta/OpenAI and even much smaller players.
Yes, but in many cases the paper is written by many people, including ethicists who believe that and add it to the paper. It doesn't diminish the value of the people who actually made it work.
Why are ablations useful? Their release report seemed very informative to me without getting bogged down in jargon.
Ablations are important because they tell you why the model is better. Here the model is similar in size to Llama 7B and trained on 1/3rd of the data, yet their claim is that the performance is better. This could be due to a lot of things, like ReLU squared, a better dataset, or 16k tokens. We just don't know why it performed better.
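
To make that concrete, here is a rough sketch of how a one-factor-at-a-time ablation grid could be laid out. It only builds the list of training runs; it does not train anything. All config names and values are hypothetical placeholders for illustration (including reading "16k tokens" as a context length), not anything taken from the release.

```python
from typing import Dict, List, Optional


def one_at_a_time_ablations(
    full: Dict[str, object], baseline: Dict[str, object]
) -> List[Dict[str, object]]:
    """Return the full config plus one variant per factor.

    Each variant reverts exactly one factor to its baseline value, so a
    difference in eval score between the full run and that variant can be
    attributed to that single factor.
    """
    first: Dict[str, object] = dict(full)
    ablated: Optional[str] = None
    first["ablated"] = ablated  # the full run ablates nothing
    runs = [first]
    for factor, base_value in baseline.items():
        variant = dict(full)
        variant[factor] = base_value  # revert just this one factor
        variant["ablated"] = factor
        runs.append(variant)
    return runs


# Hypothetical factor values, for illustration only.
full_config = {
    "activation": "relu_squared",
    "dataset": "new_mix",
    "context_tokens": 16384,  # assumption: "16k tokens" means context length
}
baseline_config = {
    "activation": "baseline_activation",
    "dataset": "baseline_mix",
    "context_tokens": 2048,
}

runs = one_at_a_time_ablations(full_config, baseline_config)
for run in runs:
    print(run)
```

With three factors you get four runs: the full model, and one run each with the activation, dataset, or context length reverted. Comparing each variant against the full run is what would say which change actually mattered.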