Hacker News
by mmetzger 5164 days ago
Think of it as pulling the text out of PDFs, Office docs, and similar files into a form (usually plain text) that can be fed to a search engine, a catalog, or whatever else you'd like to do with it.
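
To make that concrete, here is a rough sketch of the idea in Python using only the standard library. It is not the API of the tool being discussed; `extract_docx_text` and `build_index` are hypothetical helper names, and it only handles the simple case of a .docx file, which is a zip archive with the body text stored in `word/document.xml`. PDFs and older Office formats need real parsers.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# XML namespace used by WordprocessingML elements inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(source):
    """Return the plain text of a .docx file (a path or a file-like object).

    Hypothetical helper: it reads only the main document body and ignores
    headers, footers, footnotes, and comments.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph is split into runs; each run's text lives in a w:t node.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def build_index(documents):
    """Map each lowercase word to the set of document names containing it.

    Hypothetical stand-in for "feeding the text to a search engine".
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index
```

You would call `extract_docx_text("report.docx")` on each file (the filename here is made up), collect the results in a dict keyed by filename, and hand that to `build_index`. The point is the shape of the pipeline: many formats in, plain text out, then index.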