by rck 4868 days ago

It looks like the research paper this is based on is here:

http://schererstefan.net/assets/files/scherer_etal_FG2013.pd...

I've only skimmed it, but the vision work looks sound, and it looks like it makes pretty essential use of the depth map from the Kinect. But I don't think there's any speech recognition going on, so that part is just acting (from both the humans and the virtual platform). I'll bet an untrained user could get the system to break pretty quickly...

Neat proof of concept, though.

a_bonobo 4868 days ago

Another thing is that this needs a properly trained interviewer, and it's not based on the Kinect's measurements alone but also on the questions asked (the reaction to each question is what gets measured). The behaviour of the interviewer is a major confounding factor in this.

Reminds me of a certain someone asking another certain someone why he flipped a tortoise in the desert...

rck 4868 days ago

I think the problem isn't with the published paper - it looks like they used a trained human interviewer and recorded both humans to evaluate their descriptors. The real problem is with the PR video, which looks really impressive, but seriously oversells the capabilities of the system. Not that the problem is unique to this situation - I think a lot of the videos in AI oversell the research. It's a pity, because the research is usually very good, but by itself isn't "exciting enough for general consumption," so they add bells and whistles that have nothing to do with the science.

a_bonobo 4868 days ago

You're right, posting the paper and not that blog-post would have led to a much more interesting discussion.

One of the strangest things about SimSensei is the automated interviewer - when I talk to a recorded voice instead of a real person, I behave very differently! Why should I fidget while answering when no-one is really listening anyway? Why should I give proper answers? Why should I exhibit signs of shame when I talk to no-one about myself?

This doesn't work the way it's described in the paper.
