|
|
|
|
|
by ks2048
673 days ago
|
|
I don’t have 8TB laying around, but we can be a bit more clever.... In particular I cared about a specific column called url. I really care about the urls because they essentially tell us a lot more from a website than what meats the eye.
I'm I correct that it is only only using the URL of the PDF to do classification? Maybe still useful, but that's quite a different story than "classifying all the pdfs". |
|
The legwork to classify PDFs is already done, and the authorship of the article can go to anyone who can get a grant for a $400 NewEgg order for an 8TB drive.