

by Trapais 879 days ago

You can grep for bad words. What you can't do (unless you jump through hoops) is verify that the weights came from the same dataset. You can set the same random seed and still get different results, because the calculations are not that deterministic (https://pytorch.org/docs/stable/notes/randomness.html#reprod...).
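One likely source of this, as far as I understand it, is that floating-point addition is not associative, so a parallel reduction that adds the same numbers in a different order can round differently. Here is a minimal sketch in plain Python (no PyTorch; `sum_in_order` and the sample values are made up for illustration) showing the same four numbers giving two different sums with no randomness involved at all:

```python
def sum_in_order(values, order):
    """Add the floats in `values`, visiting them in the index order given."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Hypothetical data: two huge terms that cancel, plus two small ones.
values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 (the gap between doubles there is 2).
forward = sum_in_order(values, [0, 1, 2, 3])

# Order B: the big terms cancel first, so both small terms survive.
reordered = sum_in_order(values, [0, 2, 1, 3])

print(forward, reordered)  # 1.0 2.0
```

On a GPU the order of additions can change from run to run, so a fixed seed alone doesn't pin down the result bit for bit.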

>I am overall skeptical that this is true in the case of LLMs

This skepticism seems reasonable. EleutherAI have documentation for reproducing training (https://github.com/EleutherAI/pythia#reproducing-training). So far I haven't seen it lead to anything. Lots of arXiv papers I've seen complain about time and budget constraints even for finetunes, never mind pretraining.