Show HN: Annotate any document and train extraction by example, not prompts
(deeptagger.com)
2 points
by avloss11
276 days ago
Hi HN, I built a tool for teaching LLMs how to extract structured data from documents by annotating, not prompt engineering. I’d love your feedback.

How it works:
- Upload a document (DOCX, PDF, image, etc.)
- Select and tag parts of it (supports nesting, arrays, custom tag structures)
- Upload another document → click "predict" → see editable annotations
- Amend them and save as a new example
- Call the API with a third document → get JSON back

Use cases:
- Identify "important clauses" in contracts
- Extract "total value" from invoices
- Anything subjective, like "healthy ingredients" on a label
- Anything objective, like "postcode" or "phone number"
You could even tag things like "good rhymes" in a poem: basically anything an LLM can understand.

The key idea: instead of iterating endlessly on prompts (and sometimes regressing), you just iterate on examples. Each example improves accuracy in a concrete way, and you often need far fewer examples than traditional ML approaches require.

We’re also on Product Hunt today (currently #5), but feedback from HN would be much appreciated.
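To make the "iterate on examples" idea concrete, here is a minimal Python sketch. It is an illustration only, not DeepTagger's actual data model or API: the names `Tag`, `to_json`, and `build_prompt` are hypothetical, and the few-shot prompt layout is just one plausible way that annotated examples could be fed to an LLM. It shows nested tags collapsing into JSON (repeated labels become arrays) and saved examples being placed in front of a new document.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical annotation: a label, the selected text, and nested tags."""
    label: str
    text: str = ""
    children: list["Tag"] = field(default_factory=list)


def to_json(tags: list[Tag]) -> dict:
    """Collapse tags into a dict; a label that repeats becomes an array."""
    out: dict = {}
    for tag in tags:
        # A tag with children is a nested object, otherwise plain text.
        value = to_json(tag.children) if tag.children else tag.text
        if tag.label in out:
            # Second occurrence of a label: promote the entry to a list.
            if not isinstance(out[tag.label], list):
                out[tag.label] = [out[tag.label]]
            out[tag.label].append(value)
        else:
            out[tag.label] = value
    return out


def build_prompt(examples: list[tuple[str, list[Tag]]], new_document: str) -> str:
    """Assemble a few-shot prompt: each saved example, then the new document."""
    parts = [
        f"Document:\n{doc}\nJSON:\n{json.dumps(to_json(tags))}"
        for doc, tags in examples
    ]
    parts.append(f"Document:\n{new_document}\nJSON:")
    return "\n\n".join(parts)


# One annotated invoice (made-up data) used as the single saved example.
invoice = "Invoice 17. Widget x2 $10. Gadget x1 $5. Total $25."
tags = [
    Tag("line_item", children=[Tag("name", "Widget"), Tag("price", "$10")]),
    Tag("line_item", children=[Tag("name", "Gadget"), Tag("price", "$5")]),
    Tag("total_value", "$25"),
]
result = to_json(tags)
prompt = build_prompt([(invoice, tags)], "Invoice 18. Gizmo x3 $9. Total $27.")
print(json.dumps(result, indent=2))
```

In this sketch, correcting a wrong prediction and saving it simply appends another `(document, tags)` pair to `examples`, so the instruction text never has to change.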