| HN Mirror


	by isaacfung 855 days ago
	There is a lot of video content with audio. We could train a facial expression classification model to detect the speaker's emotion (a multimodal model could also take the language context into account). Another potential source of data is voice acting scripts for animation. I've always thought that storyboards for films and animation would make great annotated training data, but there seem to be no open datasets, probably because of copyright issues.
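
A rough sketch of the weak-labeling idea, assuming per-frame emotion predictions already exist from some facial expression classifier (not shown here). The `Segment` type, the `label_segments` helper, and the label names are all hypothetical and only illustrate how frame-level predictions could be turned into one emotion label per spoken segment by majority vote:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """A stretch of speech in the video, e.g. taken from subtitles."""
    start: float  # seconds
    end: float    # seconds
    text: str


def label_segments(segments, frame_predictions, min_frames=1):
    """Attach the majority per-frame emotion to each speech segment.

    frame_predictions is a list of (timestamp_seconds, emotion_label)
    pairs, assumed to come from a facial expression classifier.
    Segments with fewer than min_frames predictions get None.
    """
    labeled = []
    for seg in segments:
        # Count the predicted emotions for frames inside this segment.
        votes = Counter(
            label
            for t, label in frame_predictions
            if seg.start <= t < seg.end
        )
        if sum(votes.values()) < min_frames:
            labeled.append((seg, None))
        else:
            labeled.append((seg, votes.most_common(1)[0][0]))
    return labeled


segments = [Segment(0.0, 2.0, "hi there"), Segment(2.0, 4.0, "oh no")]
frame_predictions = [
    (0.5, "happy"), (1.0, "happy"), (1.5, "neutral"), (2.5, "sad"),
]
labels = [emotion for _, emotion in label_segments(segments, frame_predictions)]
```

The resulting (audio segment, text, emotion) triples are noisy labels, so in practice one would probably filter out segments where the vote is close or the face is not visible.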