by xiii1408 494 days ago
It's another sock puppet study: https://arxiv.org/abs/2501.17831

Very similar methodology to an earlier study the government cited in their case against TikTok: https://networkcontagion.us/wp-content/uploads/NCRI-Report_-...

There are a number of issues with these studies, one being that the way the sock puppet bots interact with content is not exactly organic. Typically they search for content in a conditioning phase, followed by random scrolling during which the recommended videos are collected and classified by an LLM. Modern recommendation algorithms famously work by examining how, and for how long, users engage with content, and there's none of that going on here. Still, the methodology itself and the use of LLMs to classify content is clever and probably about the best we can get.

Also, even if there _is_ a bias, it doesn't tell us why. Are the recommendations intentionally spiked, or is this simply the recommendation strategy that maximizes profit? (Or that the recommendation model thinks will maximize profit?) It's very difficult to tell, which is part of what makes these models dangerous and also part of what makes them difficult to regulate.

On a side note, TikTok (and presumably other content platforms) _really_ does not like these studies, as demonstrated by TikTok nerfing its search functionality after the second study above was released, to prevent researchers from using these techniques in the future. I haven't read the study in detail yet, but it will be interesting to see how the team at NYU Abu Dhabi adapted their methodology.


While I am skeptical of what reasonable conclusions can be drawn from a study like this, they explain the methodology in the article. You said:

>Typically [...] followed by random scrolling during which the recommended videos are collected and classified [...] Modern recommendation algorithms famously work by examining how long and how users engage with content, and there's none of that going on here.

But they claim that videos are watched, not just collected from the recommendation page.

"The accounts watched 10 videos, followed by a one-hour pause, and repeated this process for six days"

Perhaps I should have been more clear. It's TikTok, so of course the only way to collect recommendations is to watch videos. Some studies watch the whole video, some just watch part of it, but it's TikTok, so fundamentally you're watching a video.

I might just not be reading it properly. I've never used TikTok, so I assumed from your description that they scraped video titles/transcripts/etc. from the recommendation page without any engagement with the video. (I suppose I should read the study you linked!)

When you say "how users engage with content, and there's none of that going on here", by "none of that", do you just mean likes/comments, that sort of thing?

I would usually consider watching as engaging with content, but if you mean additional engagement (as I would call it, anyways), that would make a lot more sense to me.

I think the key metric missing here is how long the user watches each video. Likes and replies are probably helpful too, but when I've used short-form video content apps like TikTok, Reels, and YouTube Shorts before, they've gotten a pretty good measure of me without me ever liking, replying, or following.

With the current methodology, the bot either watches the whole video, a fixed duration of it, or a random duration before swiping. The bot doesn't organically watch or swipe based on its interests like a human user would.
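
As a rough sketch of those three watch policies (hypothetical names and defaults throughout; this is not code from either study):

```python
import random

def watch_seconds(video_length, policy, fixed=5.0, rng=None):
    """How long a hypothetical sock puppet bot watches one video.

    policy is one of "full", "fixed", or "random". The 5-second default
    for the fixed policy is an arbitrary placeholder.
    """
    rng = rng or random.Random()
    if policy == "full":
        # Watch the whole video.
        return video_length
    if policy == "fixed":
        # Watch a fixed duration, capped at the video's length.
        return min(fixed, video_length)
    if policy == "random":
        # Watch a uniformly random portion before swiping.
        return rng.uniform(0.0, video_length)
    raise ValueError(f"unknown policy: {policy}")
```

Note that none of the branches looks at what the video is about, so the watch time carries no signal about the bot's supposed interests.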

> Also, even if there _is_ a bias, it doesn't tell us why

Because that is a political discussion that will inevitably derail the point.

If bias is identified, then the platform must address it. Otherwise, it should be labelled a national security threat.

I think this is a valid point and a really interesting question. If that's the standard, we need to regulate all recommendation algorithms. (i.e. put limits on Twitter, Instagram, and YouTube as well.)

How could we regulate this? I can think of two ways:

- Results-based enforcement. i.e., companies are free to use whatever recommendation model they like, but have to recommend content within ideological bounds. For example, you can't bias toward one side of the partisan divide by more than X%. There's some precedent for this with the equal-time rule (https://en.wikipedia.org/wiki/Equal-time_rule) and FCC fairness doctrine (https://en.wikipedia.org/wiki/Fairness_doctrine).

- Algorithm-based enforcement. i.e., there are limits on the algorithm itself. Perhaps you have to present your algorithm to a government agency and provide a proof that it obeys certain properties. But the enforcement here is analytical rather than empirical.
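
To make the results-based option a bit more concrete, here's a minimal sketch of what an empirical check might look like, assuming an auditor already has per-video labels ("left", "right", "neutral") from some classifier. The function names, the label set, and the 10% default threshold are all made up for illustration; nothing here comes from an actual regulation or from the studies above.

```python
from collections import Counter

def partisan_skew(labels):
    """Skew in [-1, 1]: positive means more right-leaning recommendations,
    negative means more left-leaning. Neutral videos are ignored."""
    counts = Counter(labels)
    left, right = counts["left"], counts["right"]
    total = left + right
    if total == 0:
        return 0.0  # no partisan content at all
    return (right - left) / total

def within_bounds(labels, max_skew=0.10):
    """True if the absolute skew does not exceed the allowed threshold."""
    return abs(partisan_skew(labels)) <= max_skew

# Example: a sample of recommendations collected by an audit account.
sample = ["left"] * 45 + ["right"] * 55 + ["neutral"] * 20
print(partisan_skew(sample), within_bounds(sample, max_skew=0.20))
```

Even this toy version shows where the hard parts are: someone has to decide how content gets labelled, what counts as neutral, and what baseline the skew is measured against.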

> Twitter, Instagram, and YouTube

Can you provide research for each of these?

Otherwise it's just muddying the waters to act like the bias is inherent to all platforms.

People do these same sock puppet studies on Twitter/YouTube/etc. and find biases there as well. There's a lot of literature out there.

Here's a recent study on YouTube from the same author as this TikTok study finding left-leaning bias in US recommendations: https://academic.oup.com/pnasnexus/article/2/8/pgad264/72424....

Here's a somewhat older study from Twitter itself where they determined that their recommendations were biased toward right-leaning accounts: https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter....

IMO the interesting question is not whether an individual platform is biased and what its biases are, but rather how we might regulate recommendations given that there is always a risk of bias.

The problem is that the algorithms are programmed to show people what they want to see OR what the platform wants them to see.

If it's organic then this is no different than any other form of organic popularity. Seeing as Trump won the popular vote and the Electoral College, people were interested in Republican content. By the same token, it's very easy to astroturf and claim it was organic. From my personal conversations in meatspace I lean organic.

Is there funny business going on? Absolutely, all the time everywhere in every way. Can we say this was funny business? Not without the code.

TLDR: Popularity algorithms push popular content algorithmically.

> TLDR: Popularity algorithms push popular content algorithmically

Not aware of any major platform that solely uses a popularity algorithm.