Hacker News
by
deweller
673 days ago
Is it possible that the 8 TB is just the extracted text?
1 comment
tokai
673 days ago
No, the Safedocs dataset is unprocessed PDFs.