by daemonologist
674 days ago
I'm doing some work for a company that handles scanned documents (PDFs that are purely images), and they accumulate about 15 TB / year. The actual amount of information is relatively small; it's just inflated by being scanned. Probably 80% of them were typed up, printed, and then scanned or faxed, and of course the first thing we do is OCR them to try to recover the original text and formatting...