| > enough training data
> good chat interface

I'm going to try to do that here: https://github.com/bennyschmidt/llimo

I totally understand your skepticism, and am also surprised it works so well as a sentence-finisher, even with hardly any data (a handful of books).

Think about it like this: If you had a file with billions of sentences:

"Paris is in France"
"Apples grow on trees"
"Gold is worth more than silver"
"I know where to go!"
etc...

Then you can complete virtually any sentence. Say the user enters "Paris" - you can easily find and return "is in France" by simply searching for "Paris" in the text and then slicing out the rest of the sentence, "is in France" -- or just return the next word, "is", for more of an auto-suggest feel.

But if there were 4 sentences starting with "Paris", it gets slightly more complex. Now I have to rank them to know which one is the best suggestion:

"Paris is in France."
"Paris is nice this time of year."
"Paris Hilton liked my post!"
"Paris was my favorite city overall." In this case, "is" is still the best because it scores higher than "Hilton" and "was" - because it follows "Paris" more frequently. So in addition to the text file of billions of sentences, I need to make a ranking system to give a point score to every possible word in the file at every possible position it might be in. To make it all faster (because a super massive text file is not feasible), the billions of sentences are not actually represented in a single text file, but as a deeply nested JavaScript object our computers are more optimized to traverse and lookup, along with the point values for each word. At this point, _with enough sentences on file_ with ranked suggestions and fast lookups, you can complete almost anything a user could input. This is what I shared today. > good chat interface To put the sentence-finisher to use as a chat bot, all you need to do is convert the user's question or "prompt" into the beginning of a sentence. For example: "Where is Paris?" -> "Paris is" and let the completer give you "in France". So the answer is: "Paris is in France." Does this make sense? |