by gwern
506 days ago
This would benefit from examples. What's a gnarly set of documents that this will process into clean, useful Markdown, but which a much simpler stack like 'pdftotext' would fail on? And what would this buy me over just running Zerox or another OCR tool directly?