by jhpriestley 3874 days ago
The translation of unrestricted human language into a restricted, "logical" language that can be understood mechanically is another long-standing problem, going back as far as Leibniz, and it is unlikely to be solved by a few gigabytes of chat logs.
3 comments

This is not a problem of structural semantics; each phrase is itself a semantic atom -- think Zork.

It doesn't need to see "What|WP 's|VBZ it|PRP like|IN outside|IN ?|."

It just needs to see "What's it like outside?" and know to always respond to that with the same call.
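To make the "phrase as atom" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `normalize` helper, the `get_weather` handler, and the `PHRASE_TABLE` name are hypothetical and don't describe how M or any real system works.

```python
import string

def normalize(phrase: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so trivial variants match."""
    stripped = phrase.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())

def get_weather() -> str:
    # Hypothetical handler; a real one would call a weather service.
    return "call:get_weather"

# Each whole phrase is a key: no parsing and no part-of-speech tags.
PHRASE_TABLE = {
    normalize("What's it like outside?"): get_weather,
}

def respond(phrase: str):
    """Return the handler's result for a known phrase, or None if the phrase is unknown."""
    handler = PHRASE_TABLE.get(normalize(phrase))
    return handler() if handler is not None else None
```

The obvious limit is that anything not already in the table falls through to `None`, which is where a human or a smarter fallback would have to take over.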

(Can't edit.)

Warning: This is mere conjecture.

All true, but it doesn't need to solve that bigger problem to have succeeded at building a reliable M: it just needs a high success rate at converting requests into desired responses, not translating everything into some master language for arbitrary use.
Are you claiming that chat logs won't contribute at all? Or are you claiming that progress is irrelevant unless someone launches a 100% solution all at once?
I think the chat logs will help, but it won't just be about that data.

The data around how users interact will also be important:

Do they prefer a back-and-forth conversation, or do they want to say everything in one go?

Do they want to start a conversation, drop it, and come back to it several hours later, or do they like completing it in one go?

How do they handle switching back and forth between different contexts when certain requests take time, or do users not switch context at all?

What data are they happy to share, and what are they not?

What response times does a user consider acceptable: 5 seconds, 1 minute, 5 minutes, 60 minutes? Does it vary depending on the scenario?

Are there particular services or kinds of information that requests trend towards, for example local search, research/information, or particular types of purchase?
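As a rough sketch of how a few of these questions could be measured from a message log (in Python; the `Message` fields, the `summarize` function, and the one-hour pause threshold are all assumptions made up for illustration, not anything from a real platform):

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str        # "user" or "service"
    timestamp: float   # seconds since the conversation started
    topic: str         # hypothetical label for the request context

def summarize(messages: list[Message], pause_threshold: float = 3600.0) -> dict:
    """Count user turns, long pauses, context switches, and the longest wait for a reply."""
    user_msgs = [m for m in messages if m.sender == "user"]
    pairs = list(zip(user_msgs, user_msgs[1:]))
    # A gap between consecutive user messages above the threshold counts as "dropped and resumed".
    resumes = sum(1 for a, b in pairs if b.timestamp - a.timestamp > pause_threshold)
    topic_switches = sum(1 for a, b in pairs if a.topic != b.topic)
    # Wait time: delay from the first unanswered user message to the next service message.
    waits = []
    pending = None
    for m in messages:
        if m.sender == "user" and pending is None:
            pending = m.timestamp
        elif m.sender == "service" and pending is not None:
            waits.append(m.timestamp - pending)
            pending = None
    return {
        "user_turns": len(user_msgs),
        "resumes": resumes,
        "topic_switches": topic_switches,
        "max_wait": max(waits) if waits else None,
    }
```

This only shows what users tolerated, not what they would consider acceptable, so it would need to be paired with drop-off or satisfaction data to answer the response-time question properly.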

We've just spent 6 months going through a very similar process, which has helped drive the development of our Converse platform. It allows people to build semi- or fully automated conversational messaging services, so this is fairly closely related.

(From our point of view, the NLP data we gathered was useful, but it wasn't the most important part.)