Hacker News new | ask | show | jobs
by supern0va 27 days ago
That's correct, and their recent work on natural language autoencoders has given extremely compelling evidence of that...which is why their data collection practices for pre-training have almost certainly evolved, particularly since they've already scraped most of the internet.