by tostitos1979
3238 days ago
Is there an open source NLP engine out there? I've been trying to learn this area and there are so many potholes and wrong paths ... I've looked at OWL/SPARQL, graph DBs, logic programming, rule-based systems. I feel like I'm dancing around the real topic and I don't know what "it" is :'(
- You can do the named entity tagging based on the categorical data: Text/String columns with low-ish relative cardinality make good candidates. The cardinality check also filters out text fields such as email addresses (which shouldn't be in a DWH as categoricals in the first place).
- FLOAT/decimal/integer columns are good candidates for the values somebody is looking for, and the name of the column would be the 'trigger' in the query.
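
To make the two heuristics above concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the function name `classify_columns`, the 0.5 cardinality threshold, and the toy rows (with `year` stored as text, as a dimension would be) are made up, not taken from the original setup.

```python
from numbers import Number

def classify_columns(rows, max_relative_cardinality=0.5):
    """Split column names into entity (categorical) and measure (numeric) candidates.

    Hypothetical helper: the threshold is an arbitrary illustrative value.
    """
    entity_cols, measure_cols = [], []
    n = len(rows)
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            # Numeric column: a value somebody might ask for.
            measure_cols.append(col)
        elif all(isinstance(v, str) for v in values):
            # Text column: keep it only if few distinct values relative to row count.
            if len(set(values)) / n <= max_relative_cardinality:
                entity_cols.append(col)
    return entity_cols, measure_cols

# Toy fact table (invented data).
rows = [
    {"country": "US", "year": "2016", "email": "a@example.com", "revenue": 120.0},
    {"country": "US", "year": "2017", "email": "b@example.com", "revenue": 80.0},
    {"country": "DE", "year": "2016", "email": "c@example.com", "revenue": 50.0},
    {"country": "DE", "year": "2017", "email": "d@example.com", "revenue": 70.0},
]

entity_cols, measure_cols = classify_columns(rows)
print(entity_cols, measure_cols)
```

On this toy data the `email` column is dropped because every value is distinct, while `country` and `year` survive as tagging candidates and `revenue` ends up as a measure.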
All in all, with a bit of logic, good OLAP design and a lot of up-front configuration, I got it to answer basic questions like 'revenue in the US in 2016' in a weekend's time, using NLTK back in the day. Today I would probably give spaCy a try as the NLP engine.
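
For a feel of what "a bit of logic" could look like, here is a minimal sketch that answers that kind of question with only the standard library. It is not the NLTK code from back then: the regex tokenizer stands in for a real NLP engine, and the names `ROWS`, `ENTITY_COLUMNS`, `MEASURE_COLUMNS`, `build_gazetteer` and `answer` are all hypothetical, as is the data.

```python
import re

# Invented toy data and hand-written configuration (the "up-front configuration").
ROWS = [
    {"country": "US", "year": "2016", "revenue": 120.0},
    {"country": "US", "year": "2017", "revenue": 80.0},
    {"country": "DE", "year": "2016", "revenue": 50.0},
    {"country": "DE", "year": "2017", "revenue": 70.0},
]
ENTITY_COLUMNS = ["country", "year"]   # categorical columns used for tagging
MEASURE_COLUMNS = ["revenue"]          # numeric columns; the name triggers the query

def build_gazetteer(rows, entity_columns):
    """Map each lower-cased categorical value to (column, original value)."""
    gazetteer = {}
    for row in rows:
        for col in entity_columns:
            gazetteer[str(row[col]).lower()] = (col, row[col])
    return gazetteer

def answer(question, rows=ROWS):
    """Tag tokens as filters or a measure, then sum the measure over matching rows."""
    tokens = re.findall(r"\w+", question.lower())
    gazetteer = build_gazetteer(rows, ENTITY_COLUMNS)
    filters, measure = {}, None
    for tok in tokens:
        if tok in gazetteer:
            col, value = gazetteer[tok]
            filters[col] = value
        elif tok in MEASURE_COLUMNS:
            measure = tok
    if measure is None:
        return None  # no known measure mentioned in the question
    return sum(r[measure] for r in rows
               if all(r[c] == v for c, v in filters.items()))

print(answer("revenue in the US in 2016"))
```

A lookup this naive will also tag the pronoun "us" as the country, which is where a proper tokenizer and entity recognizer would start to earn their keep.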