Hacker News
by joshvm 1152 days ago
There are efforts to provide an open-source replica of the training dataset, along with independently trained models. So far, the dataset has been recreated following the original paper (allowing for some details that the Meta researchers left unspecified):

https://github.com/togethercomputer/RedPajama-Data/

https://twitter.com/togethercompute/status/16479179892645191...