by kenan_warren 962 days ago
The Pile is still used to train LLMs and it's still very much available on the net. I agree it's a risk to train your models on the dataset until the legal implications are worked out, but it doesn't seem to be stopping people.