by CathalMullan 1174 days ago
Only used for metrics, apparently. [0]

  /**
   * These author ID lists are used purely for metrics collection. We track how often we are
   * serving Tweets from these authors and how often their tweets are being impressed by users.
   * This helps us validate in our A/B experimentation platform that we do not ship changes
   * that negatively impacts one group over others.
   */
[0]: https://github.com/twitter/the-algorithm/blob/7f90d0ca342b92...
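For anyone wondering what "purely for metrics collection" could look like in practice, here is a minimal sketch in Python (not the repo's actual code). The group names, the IDs, and `count_served_by_group` are all made up for illustration; the only thing taken from the quoted comment is the idea of counting how often tweets from listed authors are served.

```python
from collections import Counter

# Hypothetical author-ID lists; the real lists and IDs are not reproduced here.
AUTHOR_GROUPS = {
    "group_a": {101, 102},
    "group_b": {201},
}


def count_served_by_group(served_author_ids, author_groups=AUTHOR_GROUPS):
    """Count how many served tweets were written by an author in each group.

    This only observes what was served; it does not change any ranking score.
    """
    counts = Counter({name: 0 for name in author_groups})
    for author_id in served_author_ids:
        for name, ids in author_groups.items():
            if author_id in ids:
                counts[name] += 1
    return dict(counts)


# Author IDs of tweets served in one (made-up) timeline response.
print(count_served_by_group([101, 201, 999, 102, 101]))
```

A counter like this is inert on its own; whether it stays inert depends on what people do with the numbers afterwards.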
9 comments

... Metrics tracked in AB test. So even if it's not explicitly encoded in the algo (or implicitly through some of the features plugged in), they'll pick the winning cell as long as it doesn't hurt Elon's metrics (I'm just parroting the comment you quoted).

It doesn't have to be in the algorithm for the systems to be tweaked to please Elon's vanity metrics.

[I've been running lots of ML AB tests over the years, some in organizations of similar size & complexity as Twitter]

That lines up with reporting from Casey Newton a few days ago where a handful of VIPs e.g. Musk, LeBron James, AOC were being used as weather vanes to understand what the algorithm was doing.

It definitely isn't just metrics. Any algorithm change that negatively affected Musk was clearly not going live.

Do you think the code looked like that prior to Elon's purchase? I suspect that there was another name there before.

Separately, which of these groups do you think that they use as a control?

> I suspect that there was another name there before

Who? Musk is unique in being obsessed with being liked and relevant.

All of the other social CEOs including Porag and Jake have never really cared that much. And none of them contributed content at anything close to the rate Musk does.

>Who? Musk is unique in being obsessed with being liked and relevant.

This is a meta-level bias!

I miss Jake
Do you mean Jacky Dorsy?
Why is Musk obsessed with being liked? There was a hoax last month about the algorithm being tweaked to make everyone view Elon’s tweets which was provably false.

It is much more likely to be the CEO of a company wanting to understand his company's product.

> a hoax last month about the algorithm being tweaked to make everyone view Elon’s tweets

Didn't this just reveal that they're A/B testing on Musk's tweet performance? At the very least they're avoiding regressions, and I guess any incidental improvements to it won't be reversed, so isn't it fundamentally the same? Unless we're taking the word "everyone" literally, I guess.

We can't comment on why, but there's no rational way to watch his behavior and assume he isn't obsessed with being liked. He constantly tweets about his own tweets' performance, makes humiliating appearances on stages, and is pretty much terrified that someone might do something 1/100000th as awful to him as he and his family have done to others.

he isn't a CEO, he is a whining baby

Hard to see the difference between that or someone so obsessed with twitter that he paid a huge premium to own it.
> which of these groups do you think that they use as a control?

When you run an A/B test you randomly divide your users into groups, one (treatment) getting the new behavior and one (control) getting the current production behavior. So your question doesn't make much sense?
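A minimal sketch of that random split, for illustration only. It uses hash-based bucketing, which is a common way to get a stable pseudo-random assignment, but that is my assumption and not something the released code is known to do; `assign_bucket` and the experiment name are hypothetical.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing (experiment, user_id) gives a stable pseudo-random split:
    the same user always lands in the same bucket for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % 10_000  # slot in 0..9999
    return "treatment" if slot < treatment_share * 10_000 else "control"


# Hypothetical experiment name and user IDs.
for uid in ("user-1", "user-2", "user-3"):
    print(uid, assign_bucket(uid, "ranking-change-x"))
```

The point is that the control group is defined by the assignment, not by who the authors are, so the author lists are something you measure within both buckets.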

Depends on the question. If you want to answer a question like “does change X increase engagement?” then a straight A/B test works. But if you want to answer one like “does change X increase engagement while (favoring/not favoring) group 1 over group 2?”, then an A/B test plus measuring groups 1 and 2 will not work without a control group, because without controls you don’t know if any changes to engagement for your measured groups are significant. There is some threshold of change to the engagement for the measured groups which is too small to be significant, and you should ignore results that only measure that noise.
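To make the "too small to be significant" threshold concrete, here is a rough sketch using a pooled two-proportion z-test on one measured group's engagement rate in control versus treatment. The test choice, the 1.96 cutoff (roughly 95% two-sided), and all the numbers are assumptions for illustration, not anything from Twitter's tooling.

```python
import math


def two_proportion_z(success_a, total_a, success_b, total_b):
    """z-score for the difference between two rates (b minus a), pooled variance."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0  # no variance at all, so no detectable difference
    return (p_b - p_a) / se


def is_significant(z, threshold=1.96):
    """True if |z| clears the (assumed) two-sided threshold."""
    return abs(z) >= threshold


# Made-up engagement counts for one author group: control vs treatment.
small_shift = two_proportion_z(500, 10_000, 505, 10_000)
large_shift = two_proportion_z(500, 10_000, 600, 10_000)
print(is_significant(small_shift), is_significant(large_shift))
```

Without the control arm there is nothing to compare the treatment rate against, which is the whole argument above.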
You’re describing a multivariate test. Two groups under two different algorithm variants. Multivariate tests still require control groups.
I don't think so. If the token was "author_is_jake" for example, it would have changed to "author_is_ceo" on second pass
Our old friend Singleton looks up from the bench, hope in their eyes.
Possible, but clearly not the only possibility.
I think a lot of it is lies... I used to observe that tweeting anything remotely political went straight to trending timelines, so did tweets about crypto and NFTs...

I doubt that Twitter and most of these social platforms are driven by AI... Maintaining and changing complex algos could not be done at a rapid pace like what occurs now... I think Twitter has moderators, and scripts that control everything, and it can be more easily adjusted to tweak what is visible on the platform based on whatever agenda they want to represent (politics, revenue, PR/damage control).

Twitter frequently bends the rules to serve celebrities, politics, and sponsors and for that they need to be able to quickly adjust scripts. True AI is meant to function on its own with minimal intervention, and therein a simple change to a massive logic scheme would completely FUBAR everything.

I think there are rooms full of people that filter and either promote or suppress posts on Twitter every day, namely suppressing any tweets critical of the platform and its owner... I've noticed that at night time hours, moderation of Tweets is less restrictive as a cue to what goes on there.

There are certain topics that are not moderated as heavily as others (3d Design and other non-controversial topics), and some topics (for example music and porn) that are restricted heavily visibility-wise because they force people to run ads to rank in trending (because it generates a lot of opportunistic money for Twitter as a platform).

Users that are critical of the platform and Elon and his allies (for example) can easily be neutered by moderators and put on shadow ban for any period of time in order to preserve the illusion of calm concerning Twitter operations. Complaints about Twitter only become visible when the majority of the audience tweets about problems (as that can't be moderated out without exposing moderation).

It's pretty much all smoke and mirrors there in order to maintain order in my opinion, and it's pretty much futile and torturous to people who just want to create and seize opportunity for their business without spending tons of money on platform marketing...

There is absolutely no reason to believe there was another Single user getting this treatment before. The Elon-case was just copy & pasted as an ego-stroking hack.
i think he means trump, and i think it would've been strategically wise to give trump special treatment (tho probably not ideal)
Actually, my first thought was the previous owner, or perhaps Twitter itself. But I left the question open, since it doesn’t matter who.
It’s just a two-pass EM (Elon Maximization) algorithm!
So many unnecessarily cynical takes here. Let's say you were in charge of a large legacy system that some segment of customers complain about it not working for them as well as other segments. How would you know whether their complaints are valid unless you measured it? You have to know first. So measure it.
Yeah, but then what do you do after you measure it? Nothing? No, you make decisions differently so as not to offend whoever is part of the criteria. For example, can we agree that we don't want an "author_is_flat_earther" flag? Because who gives a shit if Twitter makes a change to their recommendation engine that negatively affects flat earthers? Just because something is only used for A/B testing doesn't make it completely inert.
There could be an argument for an "author is a flat earther" flag with the intention of those tweets being repeated by twitter less.
Do you think Elon bought Twitter so he can bury his own tweets?
Actually, that's a refreshing take.

(No).

flat_earther can be a proxy for conspiracy theories. They may want to know if there is a swing.
You can dismiss the complaint without measurement if you are confident in two things:

1. Your system does nothing to actually segment this specific group by their identity.

2. You are confident that the systems you have set up to reward good behavior and punish bad behavior are accurate.

If both of those are true, you know that even if the group is being disproportionately negatively impacted by some form of recommendation/moderation, it is only because that group disproportionately participates in behavior that is bad for the platform. That isn't a problem. It would actually be worse for the platform overall if you did anything to appease that group.

> ...it is only because that group disproportionately participates in behavior that is bad for the platform. That isn't a problem.

That is exactly what Twitter's stance has been all along (in the pre-Elon era) and it IS a problem for the product because people being silenced due to their own bad behavior (example: misgendering transgender people) feel an injustice is being done. The rule-makers get to set the range of acceptable discourse on Twitter and those to the right of center have felt unfairly disadvantaged by the way it was done in the past.

Over time this has eroded trust in the product. Even though people aren't being labeled and ranked based on whether they are red team or blue team, the people deciding what "good" and "bad" behavior looks like on the platform have the power to disproportionately impact these groups.

Data isn't going to tell Twitter whether to allow or disallow misgendering people. You either think that is bad behavior that shouldn't be allowed or you don't. Disallowing it is not disadvantaging Republicans. It is stopping behavior Twitter has deemed is bad for the platform. As I said in point 2 above, either Twitter is confident in those decisions or not. Data is worthless when it comes to a moral decision like that.

And if we accept that Twitter believes (or more accurately did believe) that misgendering people is wrong, who cares whether people who want to do it feel an injustice is being done? Would anyone say that deleting spam is an injustice to spammers? You break the rules and you get punished.

> Data isn't going to tell Twitter whether to allow or disallow misgendering people. You either think that is bad behavior that shouldn't be allowed or you don't. Disallowing it is not disadvantaging Republicans.

If one of the defining characteristics of a political/religious/cultural group is having a particular ethical view, then enforcing a contrary ethical view against them is disadvantaging them and discriminating against them. Now, it may in some cases be morally and/or legally permissible, or even justifiable, discrimination, but it still is discrimination, and it is still disadvantaging them.

> Would anyone say that deleting spam is an injustice to spammers? You break the rules and you get punished.

Worldwide, many jurisdictions have laws against discrimination on the basis of religion; although it is less common, some jurisdictions also have laws against discrimination on the basis of political belief. A law prohibiting discrimination on some ground, is evidence that some people believe discrimination on that ground to be immoral. By contrast, I've never heard anyone suggest that spammers should constitute a "protected class", and I'm not aware of any jurisdiction which treats them as one.

Some people believe that there is nothing morally wrong with discrimination on the basis of religion and/or politics. Other people think there is something morally wrong with it, but if there is a conflict between the right to be free from religious and/or political discrimination, and the rights of LGBT people, the rights of the latter morally ought to take priority. Spam is irrelevant to that ethical debate.

>If one of the defining characteristics of a political/religious/cultural group is having a particular ethical view, then enforcing a contrary ethical view against them is disadvantaging them and discriminating against them.

I don't think misgendering people is a "defining characteristic" of Republicans and if it is, the Republican Party is in a pretty sad state considering all the bigger problems in the world. And if that qualifies as a "defining characteristic", there are plenty of other counter examples of society accepting discrimination as you define it. Banning polygamy, which would be discriminatory against Mormons, is one. You could even argue that a full abortion ban is discriminatory against Jewish people.

>some jurisdictions also have laws against discrimination on the basis of political belief.

Notably not in the US where Twitter is based and where most of these complaints originate.

Am I misunderstanding? Are you suggesting that penalising those who misgender transgender people is unfairly disadvantaging people whose political views are right of centre?
They obviously feel that it is.

Likewise, if Twitter actioned people for saying "kill all men" or "all cops are bastards", this would be seen as having an obvious partisan impact.

It's not about what I personally believe or think is unfair. It's about what Republicans (broadly speaking) believe. There is massive resentment from people on the right who think the Twitter rules unfairly elevated some political opinions as good/correct/acceptable while treating others as unacceptable.

The handling of trans issues is just one example to illustrate the problem here. People on the left think trans rights are human rights while people on the right think a lot of trans issues should be open for discussion, legislation, persecution, etc. I think if we're being intellectually honest most would acknowledge that as a country we are far from consensus on many of the details here (bathrooms, girls sports, etc), and yet Twitter's rules and enforcement actions behaved as if the leftist view of transgender people is the only valid and permissible view.

The handling of January 6 and the banning of Trump is another example.

These things are Rorschach tests; people apply their biases and reach very different conclusions about what should be done. I don't claim to know the solution, I'm just trying to sketch out the problem with the way things were creating a climate where a big segment of the US felt unwelcome and resentful toward the platform.

This presents a problem for the platform, since you can't afford to alienate large double digit percents of the population if your mandate from shareholders is to grow mDAU by any means necessary. In that context, having some metrics tracking in place to measure the impact of algorithm changes on democrats and republicans to see whether impact is disproportionate is a completely rational thing to do.

While I can understand there may be debate around various trans issues, purposefully calling someone something when they’ve politely asked you to call them something else isn’t up for much debate. Seems like common courtesy/politeness.

I see your point about losing users potentially but I would argue that Twitter’s intense focus on the US (as shown by the democrat/republican metrics) and trying to placate everyone is actually a negative for their business. There’s billions of other internet users outside the US. Shifting focus to serve them instead of focussing intensely on trying to please both sides in the US (and failing) would probably deliver better value for their shareholders.

>It's not about what I personally believe or think is unfair. It's about what Republicans (broadly speaking) believe.

No, it is about what Twitter believes. That was what I was referring to with point 2 in my original comment. Not every customer complaint is valid. It is ok to hear a complaint and dismiss it without further investigation. Twitter doesn't have some obligation to get all of society to think its rules are fair.

It is fine for a company to tell some potential customers to "fuck off" as long as that company isn't discriminating against a protected class. Twitter isn't discriminating against a protected class here.

If Twitter thinks misgendering people is wrong, it is impossible to come to an agreement with a group that thinks properly gendering people is wrong without Twitter compromising its own morals. Twitter is allowed to stick to its own morals and tell the people who disagree to "fuck off".

> Twitter's rules and enforcement actions behaved as if the leftist view of transgender people is the only valid and permissible view.

Twitter has changed in that regard since Musk took over. You can pretty much say what you like on trans issues now, as long as it doesn't break other rules. Loads of gender critical feminists have had their accounts restored in the past few months - usually having been suspended for 'misgendering' or some such nonsense.

> There is massive resentment from people on the right who think the Twitter rules unfairly elevated some political opinions as good/correct/acceptable while treating others as unacceptable.

Would it surprise you to find out that this resentment is in fact, conveniently manufactured, politically useful outrage? Because it's simply not true on its face, and the only thing we need to know to understand this is to see that it took Trump launching a coup to be banned on the platform. He violated the TOS every day, and he was allowed to spread his message to his millions of followers by Twitter. You want to talk about unfairly elevating political opinions? Trump used the platform to violate citizens' first amendment rights, and we had to take him to court to get those rights back. Twitter didn't do shit to protect us from him.

But it's not just Trump. It's right wing political opinions writ large. Far from sinking right wing conservative voices, Twitter research found they actually amplify right wing voices in every one of their top 6 countries except Germany [1]. Yes, that includes the US.

Is your mind blown? Have you heard of this once? I bet all you've heard from Musk and right wing politicians is that Twitter is going hard on conservatives and deplatforming them. Blocking their messages. Being unfair to conservatives and right wing opinions.

Yet what has actually happened? Twitter was actually deferential to conservative voices! It boosted conservatives and right wing voices at the expense of liberals. How did this happen? This is conservative messaging 101: complain about bias loudly enough and the other side will go so far out of their way to seem unbiased, they will be biased in the other direction. Conservatives managed to complain so loud about Twitter being biased against them that you not only believe it, but reality is actually completely the opposite.

[1] https://www.theguardian.com/technology/2021/oct/22/twitter-a...

Folks like Jordan Peterson would like you to think so, yes.

Of course, the man is essentially a walking “old man yells at clouds” meme at this point, so I’m not sure you should give anything he says any merit.

Elon has said he wanted more reach and experimented with getting it. He also loves attention and tweets a ton.

There’s benefit of the doubt then there’s just… whatever the polar opposite of that is.

The penalty of the certain
Presumption of guilt.
I understand the value of measurements but how does measuring tweets from an individual user help?
If engagement on the tweets of that user goes down after a change has been implemented, you can roll back the change to prevent that user from being negatively impacted.
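As a rough sketch of that rollback rule, written as a guardrail check on the A/B results: compare the tracked user's engagement rate in treatment against control and flag the change if it drops by more than some tolerance. The function name, the 5% tolerance, and the numbers are all hypothetical; nothing here comes from the released code.

```python
def should_roll_back(control_rate, treatment_rate, max_relative_drop=0.05):
    """Flag a change if the guardrail metric fell by more than the tolerance.

    Rates are engagement per impression for the tracked account's tweets,
    measured in the control and treatment buckets over the same period.
    """
    if control_rate <= 0:
        return False  # no baseline to compare against
    relative_change = (treatment_rate - control_rate) / control_rate
    return relative_change < -max_relative_drop


# Made-up rates: a 20% relative drop trips the guardrail, a 1% drop does not.
print(should_roll_back(0.10, 0.08))
print(should_roll_back(0.10, 0.099))
```

Because both buckets are measured over the same period, a decline that affects the account everywhere shows up in control too and does not trip the check.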
What if engagement around that user naturally declined, perhaps due to that user going off the deep end? Wouldn’t this just serve to bias the algorithm toward propping up the exposure of that user? Do they even care about the control so long as that user’s engagement is up and to the right?
Come on, think this through. It’s trivial to tell the difference between a gradual and natural decline and a drastic decline immediately after rolling out a change. Especially when the change is rolled out region by region and only exists in regions running the update. You have to be able to measure the effect of changes, and the most popular accounts are the obvious low-hanging fruit for doing that.
You do an A/B test, so even for the same tweet or same time period, you’re just comparing the new “treatment” group against the old “control” group.
He's very likely the user on the platform with the most engagement, and probably by a long distance.

From that viewpoint, it does make some sense to use his account as measurement point.

I've never seen as many "experts" speaking from a position of complete ignorance as I do on Hacker News.
I expect they're tracking the red team/blue team metrics because of the political shitstorm that's been the GOP's assertions they're being silenced by The Algorithm.
The fallacy of false equivalence systematized in code.

Now one side can spew as much disinfo and incitement to violence as it likes, and any algorithm change that prevents this shit from getting amplified will be rejected as bias.

BSaaS = Both Sides as a Service

This shouldn't really be a surprise to anyone. It was reported years ago that Twitter was unable to cut down on hate speech because the automated systems they developed triggered too many [debatably false] positives on Republican politicians and that was bad for the company's reputation. If Twitter wanted to prevent future code changes from impacting that approach, there needed to be something like this in the code or tests.
Not sure why you got downvoted, you're absolutely correct.
Well, technically they are looking for relative changes, not equal total exposures.
I don't see an unbiased way to tell which "side" releases more disinformation and incitement to violence. Even deciding what counts as disinformation is hard (e.g. does it have to be literally false or just cause false beliefs in the reader?).
One way to tell would be to look at which side is incited to violence more often.

It turns out, according to the FBI (which is a conservative organization historically and exclusively run by conservatives), right wing extremism and violence is in fact the biggest domestic terror threat in the US, and it's currently growing [1]. FBI Director Wray gave this testimony after a right wing domestic terror attack was carried out that aimed to topple the US government. Not much has changed since then [2]. Since the former President's indictment the other day, the right-wing violent rhetoric has also ratcheted up a notch, so we can expect right-wing violence to follow.

Notably, we can confidently say this doesn't happen on the left, as when Hillary lost they did not launch an assault against the Capitol as the right did. Instead, they knit pink hats and had a march.

(PS before anyone whattabouts the George Floyd protests, the FBI doesn't see them the same way [3])

[1] https://apnews.com/article/fbi-chris-wray-testify-capitol-ri...

[2] https://news.yahoo.com/right-wing-extremists-responsible-for...

[3] https://www.fbi.gov/file-repository/fbi-dhs-domestic-terrori...

Yeah but despite all that they should still give leftists a platform to tweet.
Ahh, the group of Elons.

I was wondering why I see so many tweets by him, and what his "Group's" impression quota is.

This is actually pretty hilarious.

Thankfully they haven't added a "no mute Elon" feature. Yet.
They don't have to, because he effectively can't be muted. People quote-tweet him with an image, and it's not blocked even though I have him blocked as an account. This behavior is pervasive enough that you can still see his tweets all the time.
That’s not how it works. See the parent.
They said that they use it for metrics, so clearly there must be an "elon impression" metric.
I imagine it's the largest metric in a mission control style room

it starts dropping, klaxons start blaring, the room drops to red only lighting, engineers on the floor start pulling out their hair knowing the shitstorm that's coming

Yes, it’s not making anyone see any extra Elon tweets as your comment alleged.
It means they won't ship features that hurt Elon's reach. So in a sense, it is biasing code changes in favour of Elon.
Why does it mean that?
The original code is a part of the home-mixer service, which is the "Main service used to construct and serve the Home Timeline."

I suspect the flag corresponds to weights not present in the repo.

Per the original source, the code that was released today doesn't show the parts that actually alter the scores of Elon and other users. The part of the code referenced below just tracks Elon stats (from what I know). Employees removed most PII before the code was released.
Correct. It's a binary metric. Did the number go up, yes/no (kept job / not).
Interesting which "groups" they care about (e.g. mainstream political parties).
But who chooses the users to be metrics…