Hacker News
by viktour19 2981 days ago
Interesting. I wonder what the dataset used to train the machine learning model looks like. Pairs of book passages and conversational-style queries?
2 comments

I wouldn't be surprised if they were actually using some techniques from NLP. In particular, they may have run sentence parsers over their entire Books collection and then manipulated the parse trees to find sentences in books that look like good answers because their sentence structure is similar to that of the query.
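To make the parse-tree idea concrete, here is a rough sketch of one way to score structural similarity. It is only an illustration of the speculation above, not a description of what they actually do. The trees are hand-written nested tuples standing in for real parser output, and `productions` and `structure_similarity` are hypothetical helper names.

```python
def productions(tree):
    """Yield (parent_label, child_labels) for every internal node.

    A tree is a tuple: (label, child, child, ...). Leaf words are plain
    strings and are ignored, so only the sentence structure is compared.
    """
    label, *children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if kids:
        yield (label, tuple(k[0] for k in kids))
        for k in kids:
            yield from productions(k)


def structure_similarity(a, b):
    """Jaccard overlap between the sets of grammar productions of two trees."""
    pa, pb = set(productions(a)), set(productions(b))
    if not pa and not pb:
        return 1.0
    return len(pa & pb) / len(pa | pb)


# Hand-made example trees (assumed, not produced by a real parser).
cat_sat = ("S", ("NP", ("DT", "The"), ("NN", "cat")), ("VP", ("VBD", "sat")))
dog_ran = ("S", ("NP", ("DT", "A"), ("NN", "dog")), ("VP", ("VBD", "ran")))
she_saw = ("S", ("NP", ("PRP", "She")),
           ("VP", ("VBD", "saw"), ("NP", ("DT", "the"), ("NN", "dog"))))

same_shape = structure_similarity(cat_sat, dog_ran)
different_shape = structure_similarity(cat_sat, she_saw)
```

Sentences with the same shape but different words score 1.0, while a differently shaped sentence scores lower. A real system would presumably need a proper parser and a much smarter tree comparison.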
Basically, yes. This area is called Question Answering (QA). One of the popular datasets is SQuAD https://rajpurkar.github.io/SQuAD-explorer/
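For a feel of what such a dataset looks like, here is a minimal sketch of a SQuAD v1.1-style record and a loop over it. The nesting (`data`, `paragraphs`, `context`, `qas`, `answers`, `answer_start`) follows the published SQuAD JSON layout, but the article, question, and id below are made up for illustration, and `iter_examples` is a hypothetical helper.

```python
# Illustrative record in the SQuAD v1.1 layout (contents are invented).
squad_like = {
    "version": "1.1",
    "data": [{
        "title": "Example_Article",
        "paragraphs": [{
            "context": "The Eiffel Tower is located in Paris.",
            "qas": [{
                "id": "example-0001",
                "question": "Where is the Eiffel Tower located?",
                # answer_start is a character offset into the context.
                "answers": [{"text": "Paris", "answer_start": 31}],
            }],
        }],
    }],
}


def iter_examples(dataset):
    """Flatten the nested structure into (question, context, answer, start) tuples."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    yield qa["question"], context, answer["text"], answer["answer_start"]


examples = list(iter_examples(squad_like))
for question, context, text, start in examples:
    # Each answer is a literal span of the passage.
    print(question, "->", context[start:start + len(text)])
```

So each training example is a passage, a question about it, and the answer marked as a span of that passage.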