Hacker News
by makmanalp 4359 days ago
This is cool!

We've been working on something similar, but from the opposite direction (for world trade data). Instead of trying to NLP our way out of the problem, we pre-generate and index a bunch of possible questions, and let full text search handle the rest.
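
To make the "pre-generate and index" idea concrete, here is a minimal sketch in Python. It is not the Atlas code: the country and product lists, the question templates, and the function names are all made up for illustration, and a toy inverted index stands in for a real full text search engine.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample data; a real site has far more countries and products.
COUNTRIES = ["italy", "france", "germany", "spain"]
PRODUCTS = ["wine", "cars"]


def generate_questions():
    """Enumerate every question the system knows how to answer."""
    questions = []
    for country in COUNTRIES:
        questions.append(f"what did {country} export")
    for country, product_name in product(COUNTRIES, PRODUCTS):
        questions.append(f"where did {country} export {product_name} to")
    return questions


def build_index(questions):
    """Inverted index: map each word to the ids of questions containing it."""
    index = defaultdict(set)
    for qid, text in enumerate(questions):
        for token in text.split():
            index[token].add(qid)
    return index


def search(query, questions, index):
    """Rank questions by how many of the query's words they contain."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for qid in index.get(token, ()):
            scores[qid] += 1
    # Highest score first; ties broken by generation order.
    ranked = sorted(scores, key=lambda qid: (-scores[qid], qid))
    return [questions[qid] for qid in ranked]


QUESTIONS = generate_questions()
INDEX = build_index(QUESTIONS)
```

With this, `search("wine italy", QUESTIONS, INDEX)` ranks "where did italy export wine to" first, because it is the only pre-generated question containing both words. The user never has to write a full sentence, and anything the system cannot answer simply is not in the index.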

It's interesting, because theoretically NLP should be able to "understand" what you mean but in reality I find that even if you parse sentence structure and extract some meaning, you're still at some level hardcoding the possible things that can be queried into the code.

So it's a neat tradeoff: is it more worthwhile to create a mini query language, go full natural language, or land somewhere in between?

...

Anyway, try it out by clicking on the title (keeping it a bit hidden for now for testing purposes): http://atlas.cid.harvard.edu/explore/tree_map/export/usa/all...

Things you can try (mix and match too!):

- "wine italy"
- "france"
- "germany spain"
- "germany export wine 2002 to 2012"
- "turkey feasible"

...

If you want to see the code, check out our github:

https://github.com/cid-harvard/atlas-economic-complexity/blo... (search view)

https://github.com/cid-harvard/atlas-economic-complexity/blo... (indexer)

Apologies for any mess, I recently joined and we're undergoing a huge overhaul right now.

2 comments

Couple of years back, I implemented a natural-language-ish solution for querying ERP data. My approach was to

a. have structured queries that can be precisely parsed

b. provide query completion to guide the user while entering the query

It turned out quite well, if I say so myself. But I never got the time to market it.

It can be seen in action here: http://nlq.lavadip.com/servlet/demo

A cool system, but I'd suggest this is NLP only to the extent that SQL or a similar (or somewhat better) query language is NLP. You've made queries in your query language easy, but it's more like interactive programming than free-form NLP.

Not that it's a bad application; it's nice. It's just that if one extended this model, one would wind up with a query language, not something new.

Thanks for the praise!

Totally agree; it's not true NLP.

Moreover, true NLP is currently not achievable. We would need an algorithm that passes the Turing test to infer the meaning of a free-form statement.

Like my parent post said, every current NLP system uses some hard-coded assumptions. They just differ in the number of assumptions.

This is pretty cool! Could you talk about the parser generator and the ER relationships? I was looking to build something similar for a webapp and was stuck on how to generate the autocompletes.

The autocomplete is provided by my parser-combinator library, which is designed in a very generic way. It can be used for any application, not just natural language queries.

About the parser generator: the ER description provides natural language phrases for every entity and relationship. From this the parser builder is able to create parser combinators. There are some hard-coded assumptions and parsers for common data types such as dates and numerical figures.
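
For anyone stuck on the same autocomplete problem, here is a rough Python sketch of how parser combinators can produce completions from an ER description. It is not the commenter's actual library: the `ER` dictionary, the "show ..." query shape, and every function name are assumptions for illustration, and parsers for dates and numbers are left out. Each parser returns the possible remaining inputs plus a set of suggestions for where the input ran out.

```python
# Hypothetical ER description: a phrase for every entity and relationship.
ER = {
    "entities": {"customer": "customers", "invoice": "invoices"},
    "relationships": [("customer", "with unpaid", "invoice")],
}


def phrase(words):
    """Parser for a fixed phrase; suggests the rest if input ends early."""
    expected = words.split()

    def parse(tokens):
        for i, word in enumerate(expected):
            if i == len(tokens):
                return [], {" ".join(expected[i:])}
            if tokens[i] != word:
                return [], set()
        return [tokens[len(expected):]], set()

    return parse


def seq(first, second):
    """Run `second` on whatever `first` leaves over."""
    def parse(tokens):
        rests, suggestions = first(tokens)
        out, suggestions = [], set(suggestions)
        for rest in rests:
            more_rests, more_suggestions = second(rest)
            out.extend(more_rests)
            suggestions |= more_suggestions
        return out, suggestions

    return parse


def alt(*parsers):
    """Try every alternative and merge results and suggestions."""
    def parse(tokens):
        rests, suggestions = [], set()
        for parser in parsers:
            r, s = parser(tokens)
            rests.extend(r)
            suggestions |= s
        return rests, suggestions

    return parse


def build_parser(er):
    """Generate the query parser from the ER description."""
    entities = er["entities"]
    branches = [seq(phrase("show"), phrase(p)) for p in entities.values()]
    for subject, rel_phrase, obj in er["relationships"]:
        head = seq(phrase("show"), phrase(entities[subject]))
        tail = seq(phrase(rel_phrase), phrase(entities[obj]))
        branches.append(seq(head, tail))
    return alt(*branches)


PARSER = build_parser(ER)


def complete(text):
    """Suggestions for what may be typed next."""
    return sorted(PARSER(text.lower().split())[1])


def accepts(text):
    """True if the text is a complete, valid query."""
    return any(rest == [] for rest in PARSER(text.lower().split())[0])
```

Here `complete("show")` offers "customers" and "invoices", and `complete("show customers")` offers "with unpaid", while `accepts("show customers")` is already true. The point of the design is that completions fall out of the same grammar that does the parsing, so adding an entity or relationship to the description extends both at once.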

It's interesting, because theoretically NLP should be able to "understand" what you mean but in reality I find that even if you parse sentence structure and extract some meaning, you're still at some level hardcoding the possible things that can be queried into the code.

This is a good point.

One thing that I'd claim is that NLP is less useful than you'd think for single queries. I'd just suggest the thought experiment of being able to ask a highly knowledgeable person one question with no follow-ups. Even someone with a mastery of English and the data probably won't start out with the same idioms and approaches, so you'd have to phrase the question carefully and exactly ... just the way you have to spend a lot of effort writing a SQL or similar query.

If you can interact with such a knowledgeable person, you'd learn their expression style and they'd learn yours, and after some interaction the combination becomes very powerful. Until then, things are rather limited.