by stepvhen 2952 days ago
this paper is about lip reading, which is indeed a difficult task (i can't do it well, to be honest) and worthy of study. however, lip reading is not body language. lip reading maps movements to words in a natural language. recognizing body language requires knowing e.g. that crossed arms map to the idea of closed-ness, and thus that the interlocutor is closing themselves off from the conversation to some degree. or maybe the interlocutor has very long arms and has never known what to do with them in a conversation or something.
2 comments

Multimodality isn't really my field, but there's also a lot of research on emotion detection. E.g. https://arxiv.org/pdf/1801.07481.pdf is a recent survey of commonly used methods that I found with some quick googling.

We can definitely combine things like detecting crossed arms (and knowing that this is correlated with closedness) with signs of emotion and stress in your voice, sentiment analysis of the words you say, micro-movements, your pulse rate (which a machine can detect from video if it's sufficiently good), and various other signals to infer your likely emotional state.

The trouble is that in-depth analysis requires extensive external context and a shared worldview, i.e. "being of the same tribe" and knowing how a particular real-world event "should" make one feel (and why), which is pretty much a general AI problem. Purely reading what your body language at this moment says about your emotions, though, is a hard task but somewhat solvable even right now.

You’re right. I guess lip-reading is quite the stretch to be called body language.