by benjamind 5414 days ago
Not sure how true that is.

At their most basic they do text classification and extraction, as well as document comparison. So you can index a whole load of documents, then train the system to recognise a particular type of document (based on any number of other training documents) and give it a specific classification.
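To make that concrete, here's a rough sketch of the "train on example documents, then classify" idea. This is not how their product works internally (I don't know that); it's just a bag-of-words profile per classification compared by cosine similarity, and the labels, sample texts and function names are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labelled_docs):
    """Sum the term counts of the training documents for each label."""
    profiles = defaultdict(Counter)
    for label, text in labelled_docs:
        profiles[label].update(tokenize(text))
    return dict(profiles)


def classify(profiles, text):
    """Return the label whose term profile is most similar to the text."""
    vector = Counter(tokenize(text))
    return max(profiles, key=lambda label: cosine(profiles[label], vector))


# Hypothetical training set: (classification, document text) pairs.
training = [
    ("invoice", "invoice number total amount due payment terms"),
    ("invoice", "please pay the amount due on this invoice"),
    ("resume", "work experience education skills references"),
    ("resume", "skills and experience in software engineering"),
]
profiles = train(training)
label = classify(profiles, "payment for the attached invoice is due friday")
print(label)  # prints "invoice"
```

The same cosine function covers the document comparison part: call it on the term counts of any two documents to get a similarity score between 0 and 1. A real system would weight terms (TF-IDF or similar) and use far more training data, but the shape of the approach is the same.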

The marketing spin is that you can extract 'meaning' from a whole load of text and deduce what a document is 'about'. It's not strictly true, but you can get a close approximation of that idea with decent training sets and classifications.