Hacker News
FinePDFs: 3T token dataset made from internet PDFs
3 points by hynky 292 days ago
1 comment