> How would any non-domain-specific tool (i.e. a voice recorder app for real estate, or even "real estate in PR") even know what "opportunity" means?

> It could do a loose keyword match but unless you used the words "Tulum" or "Evan" how would it know to link notes together without context on who Evan is?

Does it need to know? Fairly vanilla NLP can provide the data to categorize (or index) by identified parts of speech, such as verbs, proper nouns, etc. If you have a large enough pile of notes, categorizing or subcategorizing by combinations would be useful.
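To make that concrete, here is a rough sketch of indexing notes by proper nouns and by pairs of them. It assumes a crude capitalization heuristic in place of a real part-of-speech tagger (a proper NLP library would do this step much better), and the names `proper_nouns` and `build_index` and the sample notes are made up for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def proper_nouns(note):
    """Crude stand-in for a POS tagger (an assumption, not real NLP):
    treat capitalized words that are not sentence-initial as proper nouns."""
    found = set()
    for sentence in re.split(r"[.!?]\s+", note):
        words = re.findall(r"[A-Za-z']+", sentence)
        for word in words[1:]:  # skip the first word, which is always capitalized
            if word[0].isupper():
                found.add(word)
    return found


def build_index(notes):
    """Map each proper noun, and each pair of proper nouns, to note ids."""
    index = defaultdict(set)
    for note_id, text in enumerate(notes):
        tags = proper_nouns(text)
        for tag in tags:
            index[(tag,)].add(note_id)
        for pair in combinations(sorted(tags), 2):
            index[pair].add(note_id)
    return index


# Hypothetical sample notes.
notes = [
    "Met with Evan about the Tulum opportunity.",
    "Need to call Evan back on Friday.",
    "Flights to Tulum are cheap in May.",
]
index = build_index(notes)
print(sorted(index[("Evan",)]))          # notes mentioning Evan
print(sorted(index[("Evan", "Tulum")]))  # notes mentioning both
```

The tool never learns who Evan is; it only needs to notice that the same token keeps showing up, alone or alongside "Tulum".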

There are pitfalls, such as lacking sufficient context to disambiguate between identically named people (e.g. your sister Mary vs. Mary from work), but that doesn't negate the utility of such a feature.

Further refinements for association and disambiguation would be highly contingent, but that very contingency can be modeled with Bayesian classification (or more advanced attentional mechanisms) that learns when to apply them. For example, a bit of sentiment analysis could help associate the Mary you're often mad at with the words 'project' and 'report', and the Mary you like with 'barbecue' and 'holiday', for clustering purposes.
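As a sketch of the Bayesian part, a tiny multinomial naive Bayes classifier with add-one smoothing can guess which Mary a note is about from its context words. The class name, the labels `mary_work` and `mary_sister`, and the training words are all hypothetical; a sentiment signal could be fed in as just another feature word (here, "annoyed").

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of notes
        self.vocab = set()

    def train(self, words, label):
        self.label_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def classify(self, words):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in words:
                score += math.log((counts[w] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labels, as if confirmed by the user on earlier notes.
nb = NaiveBayes()
nb.train(["project", "report", "deadline"], "mary_work")
nb.train(["report", "late", "annoyed"], "mary_work")
nb.train(["barbecue", "holiday", "fun"], "mary_sister")
nb.train(["holiday", "beach"], "mary_sister")

print(nb.classify(["report", "project"]))
print(nb.classify(["holiday", "barbecue"]))
```

With only a handful of labeled notes the estimates are noisy, which is why the smoothing matters and why user confirmation remains part of the loop.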

These supplementary techniques necessarily operate on 'small data', and the real challenge is finding natural UI flows and affordances that suggest them to the user when appropriate and solicit feedback without overwhelming them.