by grabcocque 3390 days ago
The problem is, we don't have a clear definition of what clickbait is.

noun, informal: (on the Internet) content whose main purpose is to attract attention and encourage visitors to click on a link to a particular web page.

But that's basically everything on the web.

6 comments

I'm not sure that quite addresses the problem here.

After all, there is no clear definition of 'what dogs look like' (in the sense of a collection of logical rules), but deep learning models excel at detecting them when provided with enough positive examples.

If it's possible for humans to agree on whether a given article is clickbait or not, we should be able to put together an adequate dataset for training a system to classify them too. From the linked article I am unable to discern how the training dataset was labelled.

In other words, the fact that 'clickbait' is a nebulous concept shouldn't preclude machine learning from being able to detect it.
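
As a rough sketch of that idea: given a handful of headlines that people have labelled by hand, even a tiny naive Bayes model can learn a "clickbaityness" score without anyone writing down a rule. Everything here is an assumption for illustration (the headlines, the labels, the 0.5 threshold, and the choice of naive Bayes itself); it says nothing about how the linked article's system was trained.

```python
import math
from collections import Counter

# Hypothetical hand-labelled headlines (1 = clickbait, 0 = not), invented for this sketch.
TRAIN = [
    ("you will never guess this one secret ingredient", 1),
    ("10 reasons why this trick will shock you", 1),
    ("what happened next will shock you", 1),
    ("central bank raises interest rates by a quarter point", 0),
    ("study finds link between sleep and memory", 0),
    ("city council approves new transit budget", 0),
]


def train(examples):
    """Count words per label; returns (word_counts, label_counts, vocab)."""
    word_counts = {0: Counter(), 1: Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, label_counts, vocab


def clickbait_score(text, model):
    """Estimate P(clickbait | text) with naive Bayes and add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_prob = {}
    for label in (0, 1):
        lp = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                lp += math.log((word_counts[label][word] + 1) / denom)
        log_prob[label] = lp
    # Normalise the two log-probabilities into a probability for label 1.
    top = max(log_prob.values())
    weights = {k: math.exp(v - top) for k, v in log_prob.items()}
    return weights[1] / (weights[0] + weights[1])


model = train(TRAIN)
THRESHOLD = 0.5  # assumed cut-off; where to draw the line is a judgment call


def is_clickbait(text):
    return clickbait_score(text, model) > THRESHOLD
```

The model never sees a definition of clickbait, only examples, and the final yes/no answer depends entirely on where the threshold is set.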

Just as "dogness" is a factor, so is "clickbaityness". You're right, this is all about thresholds.

I often wonder what Wittgenstein would have made of today's models of machine learning / deep learning: https://en.m.wikipedia.org/wiki/Family_resemblance

Me too. My reading of the Blue and Brown Books led me to believe Wittgenstein's conception of meaning is inextricably tied up with the notion of "learning" and exposure to language and its use, rather than being contingent on 'hard' logico-mathematical derivations of formal semantics.

This contrast seems somewhat reminiscent of the complementary approaches of hard-coded rule based AI vs machine learning.

The core of the definition is subtly wrapped in "main purpose" -- once the attention is attracted and the link is clicked, the clickbait's job is done. So the content of the article will be lower quality and less intellectually satisfying than non-clickbait articles.

For example, if you charted "interest in clicking this link text" vs "satisfaction with article after reading", I think clickbait would be clearly in the high interest vs low satisfaction quadrant.
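
To make the quadrant idea concrete, here is a minimal sketch. It assumes both quantities have somehow been measured and scaled to the range 0 to 1; the `midpoint` parameter and the names of the three non-clickbait quadrants are made up for illustration.

```python
def quadrant(interest, satisfaction, midpoint=0.5):
    """Place an article on the interest/satisfaction chart.

    Both scores are assumed to lie in [0, 1]; `midpoint` is a hypothetical
    dividing line between "low" and "high".
    """
    high_interest = interest >= midpoint
    high_satisfaction = satisfaction >= midpoint
    if high_interest and not high_satisfaction:
        return "clickbait"  # enticing link, disappointing article
    if high_interest and high_satisfaction:
        return "delivers on the headline"
    if high_satisfaction:
        return "undersold"  # dull link, rewarding article
    return "ignored"
```

Under these assumptions, `quadrant(0.9, 0.2)` lands in the clickbait quadrant, while `quadrant(0.9, 0.8)` does not, however snappy the headline was.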

One problem with that definition is that intellectual satisfaction is disproportionately affected by confirmation bias. For instance, an article entitled "10 reasons why <politician> getting elected means the end of America" might be clickbait to some but not to others, depending on who <politician> is.

In 2014, Jon Stewart offered an interesting definition of clickbait:

"I scroll around, but when I look at the internet, I feel the same as when I’m walking through Coney Island. It’s like carnival barkers, and they all sit out there and go, 'Come on in here and see a three-legged man!' So you walk in and it’s a guy with a crutch."

The thing is, he was talking about BuzzFeed when he said that, and that is not what BuzzFeed does at all. BuzzFeed's editor wrote about the distinction here, and it's the most insightful article I've read on the topic:

https://www.buzzfeed.com/bensmith/why-buzzfeed-doesnt-do-cli...

People tend to consider things like lists clickbait, even though those articles usually deliver exactly what the headline suggests. (If you click on "23 photos of kittens that are just too adorable," that is what you will get.) But because it's an article that was made specifically to get traffic, people incorrectly call it clickbait.

And it often goes even further than that. On Reddit and Hacker News, commenters constantly call articles clickbait. Sometimes it's true, and there's a sensational headline that leads to a bullshit story. But just as often, the story delivers on what the headline promises, but commenters call it clickbait because the headline is slightly hyperbolic, snappy, or just plain well-written.

I would define clickbait as articles which intentionally try to disguise what you'll get out of reading them. The information is banal, but the headline makes it out to be revolutionary or shocking.

You might disagree with the details of the formulation, but I think there's pretty broad agreement that something similar is going on with clickbait.

I guess it's an inherently fuzzy concept, so quite a good fit for machine learning.

But my definition of clickbait is any link I follow where I feel like I've been tricked into the click. The link looked interesting, but I feel regret once I see the actual content.

A definition would need a bit more fleshing out, mostly about the (lack of) actual content: a long-winded page (not just text) that eventually leads to a core which could be summarised in one line, or even in the article title itself (like 'peanut butter is made out of peanuts' instead of "you'll NEVER guess this ONE SECRET peanut butter ingredient!").

Is that a deeply buried lede with a teaser headline?

The thing that irritates me the most is those absurd ads that make bold claims and never deliver on what they're advertising. Even if you're interested in what they're offering and willing to take the bait, it's a lost cause: they never fulfill their promise. It's just carousels of bullshit jammed full of more ads.