Hi all! My friend and I (MIT alumni and longtime engineers) created Papermint (papermintai.com), a web application that lets you easily extract multiple key phrases from any PDF document that has searchable text. No consultation needed: you can click the link and start experimenting.

What are key phrases: Phrases in a document that fall under some "type" or "category". For example, in a restaurant menu, key phrases could be "item names", "item prices", "item descriptions", etc.

Easy to use: Papermint does not require annotating multiple documents or writing complex rules to extract data. To achieve good extraction accuracy, all Papermint needs is the name and description of your key phrase. E.g. "item names": "Names of items that can be ordered in a restaurant menu, e.g. 'Slice of Pizza', 'Coke', etc."

How is this different from existing tools? Current tools cannot extract all key phrases when the number of key phrases varies from document to document. They also do not work well with document types they don’t directly support unless you do a lot of manual annotation.

Please reach out to me with any questions or feedback. I'm excited to hear how you use it!