|
|
|
|
|
by lhl
1139 days ago
|
|
Namespace collisions are inevitable, especially w/ how fast-moving the LLM space is right now, just wanted to point out that besides this "Open-Llama" project (which looks really interesting, and well documented in the Github repo), there is also another group training "OpenLLaMA" https://github.com/openlm-research/open_llama (which looks like an effort by two Berkeley PhD students, https://www.haoliu.site/ and http://young-geng.xyz/ to reproduce LLaMA using the 1.2T token Together RedPajama dataset. They've released up to a 300B checkpoint so far.) Feedback for /u/bayes-song - it'd be great to have a more info on the model card on HF - right now it's unclear the parameter count, # of total tokens you're planning on training on/how many you've trained on so far. An Evaluation section (maybe using lm-evaluation-harness) might be good as well? |
|