by geerlingguy 1678 days ago
I've discussed this with my brother a bit (he's a neurologist, though not a specialist in vision), and he seems to think part of it goes back to our brain's instinct to react more to 'face' than other stimuli. Therefore thumbnails with a discernible face will do better.

On top of that, an expression other than a "stock photo smile" will stand out more. If you combine that with a catchy title that piques your interest and a video that can back it up, it will do a lot better than a video that just has good content.

4 comments

This explanation never really made sense to me. It's not like videos are hidden in a background and the thumbnail needs to stand out in our field of vision. Videos are presented front and center, and you're only looking at a few at a time. I presume everyone looks at all the video thumbnails; I just go left to right, top to bottom. Maybe if you were presented with hundreds of options at once and had to click on something quickly, something that draws the eye might get that sort of response, but that's not how YouTube works.

Instead you are presented with a small number of options and given unlimited time to choose, and the consequence of choosing a bad video is several minutes of wasted time. I presume most people are asking themselves "do I want to watch this" and deciding "yes, this video looks interesting/entertaining/etc." Maybe if the effect were small, like 1-2%, I could believe it's just a slight nudge due to psychological hacking, but the reported differences in the tens of percent and the extreme importance the algorithm seems to place on it indicate this is a major factor in people's decision-making process.

> I just go left to right, top to bottom

I doubt this is how most people's eyes actually scan a visual field. I expect there's a lot more randomness going on subconsciously, and that face-like features cause a higher subconscious dwell time.

> had to click on something quickly ... but that's not how YouTube works.

I think for most people it is. Scroll scroll scroll tap, all within a few seconds.

It's how we scan text. I'm also reading the titles of the videos. Maybe there's a very quick scan of the page beforehand, but if I click on any of the options presented, it's only after I've done that main scan.

When you're only presented with a tiny bit of information, it doesn't take long to evaluate a decision. A few seconds is a long time. No one, at least to my knowledge, is opening YouTube and clicking on the first thing they see in milliseconds. Hell, just variation in page loading time probably has a bigger effect on what we see first. Critically, though, there's no deadline: even if you scan the page quickly, you don't have to click on anything if you don't see something you want to watch. Without time pressure, there's no need to make an impulsive click.

> he seems to think part of it goes back to our brain's instinct to react more to 'face' than other stimuli.

I've noticed this same thing with advertisements and dancing. Why are people always dancing and singing in advertisements? Seems like people are just wired to pay attention to certain human behaviors.

I think it's because people look at images more than text and the author's face immediately identifies the video.

It's happened multiple times that a YouTube video keeps following me around and I ignore it, but once I realize who it's by, I watch it.

Have you A/B tested mouth open and mouth closed?

I'm sure that's part of it.

I also think the graphics, title, and face are signaling some level of effort and focus. It might not be the individual face, graphics, or title that matters; it could be just their presence.