by jjk166 1677 days ago
I honestly wonder which came first, the viewer preference or the algorithm.

Like are there people out there that see a whacky face in the thumbnail and think "Oh that's a video I want to watch!" and the algorithm just got trained on that?

Or did whacky faces just resemble something (from the algorithm's perspective) that was briefly popular and then a positive feedback loop of algorithm recommends it -> gets popular -> algorithm updated to recommend it more often.

3 comments

I've discussed this with my brother a bit (he's a neurologist, though not a specialist in vision), and he seems to think part of it goes back to our brain's instinct to react more to 'face' than other stimuli. Therefore thumbnails with a discernible face will do better.

And on top of that, an expression outside of "stock photo smile" will stand out more, and if you combine that, plus a catchy title that piques your interest, and have a video that can back it up, it will do a lot better than just having a video that has good content.

This explanation never really made sense to me. It's not like videos are hidden in a background and the thumbnail needs to stand out in our field of vision. Videos are presented front and center, and you're only looking at a few at a time. I presume everyone looks at all the video thumbnails; I just go left to right, top to bottom. Maybe if you were presented with hundreds of options at once and had to click on something quickly, something that draws the eye might get that sort of response, but that's not how YouTube works.

Instead you are presented with a small number of options and given unlimited time to choose, and the consequence of choosing a bad video is several minutes of wasted time. I presume most people are asking themselves "do I want to watch this" and deciding "yes, this video looks interesting/entertaining/etc." Maybe if the effect were small, like 1-2%, I could believe it's just a slight nudge due to psychological hacking, but the reported differences in the tens of percent and the extreme importance that seems to be placed on it by the algorithm indicate this is a major factor in people's decision-making process.

> I just go left to right, top to bottom

I doubt this is how most people's eyes actually scan a visual field. I expect there's a lot more randomness going on subconsciously, and that face-like features cause a higher subconscious dwell time.

> had to click on something quickly...but that's not how Youtube works.

I think for most people it is. Scroll scroll scroll tap, all within a few seconds.

It's how we scan text. I'm also reading the titles of the videos. Maybe there's a very quick scan of the page beforehand, but if I click on any of the options presented, it's only after I've done that main scan.

When you're only presented with a tiny bit of information, it doesn't take long to evaluate a decision. A few seconds is a long time. No one, at least to my knowledge, is opening YouTube and clicking on the first thing they see in milliseconds. Hell, just variation in page loading time probably has a bigger effect on what we see first. Critically though, there's no deadline - even if you see something you want to watch quickly, you don't have to click on something if you don't see something you want to watch. Without time pressure, there's no need to make an impulsive click.

> he seems to think part of it goes back to our brain's instinct to react more to 'face' than other stimuli.

I've noticed this same thing with advertisements and dancing. Why are people always dancing and singing in advertisements? Seems like people just are wired to pay attention to certain human behaviors.

I think it's because people look at images more than text and the author's face immediately identifies the video.

It's happened multiple times before that a YouTube video keeps following me and I ignore it, but once I realize who it's by I see it.

Have you A/B tested mouth open and mouth closed?

I'm sure that's part of it.

I also think the graphics, title, face; these are signaling some level of effort and focus. It might not be the individual face, graphics, or title, it could be just their presence.

I think people's eyes are just naturally drawn to faces, so more people end up seeing the thumbnail among the grid of other videos, and are more likely to watch it as a result.

That's the base narrative, but I find it difficult to believe without proper controlled testing. People are 25% more likely to click on a technical video when the creator is trying to look like a buffoon? It just doesn't sound correct.

People don't click on the video because of the dumb face. They watch the video if they think they'll be interested in it. But the dumb face makes it more likely that you will initially notice the thumbnail.

The dumb face makes the algorithm more likely to put the video on people's feeds.

Weirdly enough, the algorithm works wonders for me when I'm not logged in. It does what it is supposed to do. Give me more of what I might want while throwing in a healthy dash of interesting things I might be interested in. YT to me, bon appétit.

But when I'm logged in, it shows me the exact videos I've been avoiding watching these past few weeks, and when I do watch a video, instead of showing me similar vids or vids from the same creator - it just shows me videos I've watched or the video updates I've been avoiding. And the odd Taylor Swift MV, an artist whose works (is it even music?) I've never even...not even in a mix.

The team at YT has jokes it seems. Comedians.