by casvc
3157 days ago
Thanks for asking. The main difference is a focus on depth rather than breadth: instead of supporting a multitude of output formats, it supports only a few (PDF/HTML/TXT/IMG), but with some added features. Just a few examples:
- bulk search and autoredactions (marking / blacking out parts of documents that match certain queries)
- signature and handwriting detection
- tokenization (for TXT output)
- language detection (for TXT/PDF output)
- named entity detection (for TXT/PDF output)

Potential customers are people developing systems for GDPR (data protection), fraud detection, eDiscovery and content management.
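
To make the autoredaction idea concrete, here is a minimal sketch for plain-text (TXT) output only. It is not how the product works internally; the `redact` function, the mask character and the sample queries are all hypothetical, and redacting PDF or image content would need considerably more than a regex pass.

```python
import re


def redact(text, queries, mask="█"):
    """Black out every case-insensitive match of any query in `text`.

    Hypothetical helper for illustration; not part of any real product API.
    """
    if not queries:
        return text
    # Try longer queries first so a short query cannot pre-empt a longer one
    # that starts at the same position.
    ordered = sorted(queries, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(q) for q in ordered), re.IGNORECASE)
    # Replace each match with a mask of the same length to preserve layout.
    return pattern.sub(lambda m: mask * len(m.group()), text)


sample = "Contact John Doe at john@example.com."
redacted = redact(sample, ["John Doe", "john@example.com"])
print(redacted)
```

Keeping the mask the same length as the match preserves character offsets, which matters if other annotations (entities, tokens) refer to positions in the same text.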
|
I have been thinking about universal annotation, and the formats I find most interesting are PDF (because so much content exists in PDF) and HTML (open and easy to work with).