Hacker News
by 0bit 1390 days ago
I would recommend using Apache Tika to extract the text from the PDFs and using Solr (or Elasticsearch) to index and search them.
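A rough sketch of that pipeline, not a definitive setup: it assumes a Tika Server running locally on its default port 9998 and an Elasticsearch node on localhost:9200, and talks to both over plain HTTP using only the Python standard library. The index name `pdfs`, the `content` field, and all function names here are made up for illustration. Solr would work similarly, but through its own update and select endpoints.

```python
import json
import urllib.parse
import urllib.request

# Assumed local defaults; adjust to your own deployment.
TIKA_URL = "http://localhost:9998/tika"
ES_URL = "http://localhost:9200"
INDEX = "pdfs"  # hypothetical index name


def tika_request(pdf_bytes, tika_url=TIKA_URL):
    """Build a PUT request asking Tika Server for the plain text of a PDF."""
    return urllib.request.Request(
        tika_url,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf", "Accept": "text/plain"},
        method="PUT",
    )


def index_request(doc_id, text, es_url=ES_URL, index=INDEX):
    """Build a PUT request that stores the extracted text as one document."""
    url = f"{es_url}/{index}/_doc/{urllib.parse.quote(doc_id, safe='')}"
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(query, es_url=ES_URL, index=INDEX):
    """Build a full-text match query against the 'content' field."""
    url = f"{es_url}/{index}/_search"
    body = json.dumps({"query": {"match": {"content": query}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def index_pdf(path, doc_id):
    """Extract text from one PDF via Tika, then index it in Elasticsearch."""
    with open(path, "rb") as f:
        text = send(tika_request(f.read()))
    return send(index_request(doc_id, text))
```

With both services up, `index_pdf("report.pdf", "report.pdf")` followed by `send(search_request("invoice"))` would return the raw JSON search response. The request builders are kept separate from `send` so they can be inspected without either server running.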