| HN Mirror


	by lou1306 691 days ago
	If the PDFs are textual or already have an OCR text layer, then pdftotext from the Poppler suite ought to be enough? If not, add Tesseract/ocrmypdf to the pipeline?
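
A rough sketch of that pipeline in Python, in case it helps. It assumes the `pdftotext` and `ocrmypdf` binaries are on the PATH. The function names, the 20-character threshold for "this PDF has real text", and the `.ocr.pdf` output name are my own assumptions for illustration, not anything those tools prescribe.

```python
import subprocess
from pathlib import Path


def has_text(text, min_chars=20):
    """Heuristic (assumed threshold): enough non-whitespace characters means a usable text layer."""
    return len("".join(text.split())) >= min_chars


def pdftotext_cmd(pdf):
    # "-" as the output file makes pdftotext write to stdout; -layout keeps the page layout.
    return ["pdftotext", "-layout", pdf, "-"]


def ocrmypdf_cmd(src, dst):
    # --skip-text leaves pages that already contain text untouched.
    return ["ocrmypdf", "--skip-text", src, dst]


def extract_text(pdf, run=subprocess.run):
    """Try plain extraction first; fall back to OCR only if almost nothing comes out."""
    result = run(pdftotext_cmd(pdf), capture_output=True, text=True, check=True)
    if has_text(result.stdout):
        return result.stdout

    # No usable text layer: let ocrmypdf (which drives Tesseract) add one, then extract again.
    ocr_pdf = str(Path(pdf).with_suffix(".ocr.pdf"))
    run(ocrmypdf_cmd(pdf, ocr_pdf), check=True)
    return run(pdftotext_cmd(ocr_pdf), capture_output=True, text=True, check=True).stdout
```

Calling `extract_text("scan.pdf")` would run the real tools. The `run` parameter is only there so the subprocess call can be swapped out when testing without Poppler installed.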