Hacker News
by benxh 981 days ago
It's missing a lot of crucial details. Nothing on the dataset used, nothing on the data mix, nothing on their data cleaning procedures, nothing on the number of tokens trained on.
2 comments

This is what we get when a paper goes on arXiv before being peer reviewed.
BERT was on arXiv before being peer reviewed. As were T5, BART, LLaMA, OPT and GPT-NeoX-20B. The Pile and FLAN were also on arXiv before being peer reviewed. Of course, the original Transformer paper was also on arXiv before being peer reviewed.

Being on arXiv before being peer reviewed is not the problem, or even a problem.

I could almost tell this would be the case when the title of the paper was simply "Mistral 7B". A little more info would be useful!