Hacker News
Releasing Common Corpus: the largest public domain dataset for training LLMs (huggingface.co)
1 point by ororm 816 days ago
1 comment