by esafak 1056 days ago
That is not so certain. Microsoft's "Textbooks are all you need" is a case in point. https://news.ycombinator.com/item?id=36413768

That paper more or less does what my comment above proposed: start with as large a dataset as they can get, then filter it down to a much smaller dataset focused on a specific task, one that is still larger than all of English Wikipedia.