Hacker News new | ask | show | jobs
by derf_ 445 days ago
> Recognising the flow of a conversation is just pattern recognition. That is what machine learning is good at.

And surprisingly hard to do well in practice. My guess is that the problem is that there is very little information in your training dataset (because only the transition from "talking" to "done talking" matters), but the actual knowledge required to perform well is large (up to and including full speech recognition, in theory). So even with over a terabyte of training data, your choices are a small model that performs badly or a large(r) model that overfits severely.

It's possible there was something I was overlooking when I tried it, though. I couldn't think of a good way to confirm my guess experimentally.