Hacker News
I got 10k post karma on Reddit with (and without) fast.ai (a8b.io)
126 points by ameerkat 2060 days ago
18 comments

A couple of years back, it was announced that the trailer for one of the new Star Wars movies would debut during halftime of Monday Night Football. There's always a rush to post stuff like links to new movie trailers "first" on Reddit, so as to get the most fake internet points.

I guessed that the trailer would probably get added to the Star Wars YouTube page at around the same time it aired on TV, so I wrote a script to scrape the YouTube page every few seconds, and if a new video was added, submit the link to one of the big subreddits.
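For anyone curious what a script like that looks like, here is a minimal sketch of the polling idea, not the original script. The `fetch_video_urls` and `submit` callables are hypothetical stand-ins for the page scrape and the Reddit submission, which are deliberately left out; only the "notice something new and act on it once" logic is shown.

```python
import time


def new_items(seen, current):
    """Return entries of `current` not yet in `seen`, in order, and remember them."""
    fresh = []
    for item in current:
        if item not in seen:
            seen.add(item)
            fresh.append(item)
    return fresh


def poll(fetch_video_urls, submit, interval_seconds=5.0, max_polls=None, sleep=time.sleep):
    """Call `submit(url)` once for every URL that appears after the first snapshot.

    `fetch_video_urls` and `submit` are hypothetical callables supplied by the
    caller; `max_polls=None` means poll forever.
    """
    seen = set(fetch_video_urls())  # baseline: whatever is already on the page
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval_seconds)
        for url in new_items(seen, fetch_video_urls()):
            submit(url)
        polls += 1
```

Passing `sleep` and the two callables in as parameters keeps the loop easy to try out with fake data before pointing it at anything real.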

Sure enough, my hunch was correct, and my submission of the trailer on YouTube wound up being the first. The upvotes and comments poured in, skyrocketing the post briefly to the #1 link on Reddit's front page. This felt pretty cool and resulted in several thousand post karma. I didn't use the program after that, having satisfied my curiosity as to whether or not it would actually work. But it did make me think about just how easy it is to farm karma on Reddit, and how useless it is as a proxy for "trust" or "reputation" or anything other than what it is - fake internet points.

>But it did make me think about just how easy it is to farm karma on Reddit, and how useless it is as a proxy for "trust" or "reputation" or anything other than what it is - fake internet points.

I have to say, do people really take karma, upvotes, credit, fake internet points, etc. as a measure of trustworthiness? I know reddit pushes this idea but do people actually participating in communities with point systems see those with high karma points or whatever as trustworthy?

Personally, I've always assumed people with high amounts of karma or stars are just people who've spent a significant amount of their lives on a site and participated a lot.

"Trust" in this context means that you can be more certain that there is a real person with a certain level of time investment and legitimate intent behind the account. Compare that to a new account whose only purpose is to post an ad before getting banned.

Karma thresholds are highly effective spam filters.

> Karma thresholds are highly effective spam filters.

And that's the reason why a lot of sites sell high-karma accounts, whether those were hijacked or karma-farmed for that purpose. People can make quite a few bucks if they find an effective way to farm it.

Not necessarily. There's a site I use with a star based karma system. A bunch of years ago there was a glitch in the site that led to a bunch of random members getting ridiculous amounts of stars.

People still ask questions about it to this day as to why there are random members with this.

You'd be surprised. Even if you're aware of it, you may start noticing a post you like made by a high-karma person. Confirmation bias sets in, and very soon, your subconscious is engaged. This stuff was built that way by design.
I dunno, I usually go out of my way to avoid looking at people's scores on places like that. Even my own if I can help it. HN is hard, it's right in the corner there, but I really don't tend to look at anyone else's scores. Same with reddit or other sites I use with those systems.

Don't get me wrong, I get the little dopamine rush and enjoy getting points and stuff, but I look at the whole thing the same way as a high score board on a video game or something.

Maybe it's also because you guys know how to write scripts like that. I don't know how to do that so I earn my karma in the way it was intended. If I happen to notice someone's karma on Reddit, usually their history will quickly show you what's up.

Now you made me curious about your opinion of karma on places like stack overflow.

Stack Overflow I've got mixed feelings about. I've seen it work well sometimes, with the highest quality answer being voted to the top. Other times, it seems the top answers are not the best or just a bunch of pointless arguing.

I've never participated much in any of the stack overflow communities other than passively. I tried asking a question once that I hadn't seen on there but I got downvoted and told off so meh...I just stick to searching and passively reading mostly. I have thought of updating some older answers but they were usually locked and closed to discussion.

I've never looked at anyone's points. It's pointless. Pun intended. It's not a descriptor of anything of value. Essentially I take information on reddit as it is, and use my own critical thinking skills to judge the opinion or value the opinion has to me. Sometimes there is great information, sometimes it's just people trolling. Everything is taken with a grain of salt.
Tangential, but one of my stock interview questions is to ask candidates how they decide whether to use a new dependency in production code.

The vast majority of people say GitHub stars and leave it at that. Occasionally they also mention checking how active the project is. Very, very rarely, someone will say they scan the code, maybe run the tests, check the open issues etc.

This is production code!

Sadly it's no surprise after we experienced the left-pad case.
On Reddit it depends on how the karma was obtained.

Example questions:

- Did they gain all their karma in one post, or was it gradual? Gradual karma is better because it implies that they're probably not a troll.

- Did their karma come from the current subreddit or a different one? Just because you have karma from a popular sub doesn't mean you're a good contributor to a niche sub.

- How frequently are they downvoted? Frequent negative karma in even-handed subreddits is a bad sign.

- Does the user have good reputation in related subreddits?

So it's usually not the number that matters, but the context around that number and how it came to be. That said, usually these things only matter when a user is misbehaving: if you're a mod you need to know whether they're having a bad day, or if they have a history of breaking rules and trolling.

You get karma for posting pithy and non-controversial (pessimists might say groupthink agreeable) comments or posts.

As much as everyone claims the system is for rewarding on topic discussion, in reality it is used to award points to people who echo whatever is popular and to put wrong think at the bottom.

You can convey unpopular opinions on Reddit without getting downvoted. It just takes good wording and replying to the right post-- essentially what's required in real life: timing and tactfulness.
Most mod positions are offered first to contributing members of a community, so with fake points might eventually come real fake power. See for example the GallowBoob story: he farmed points by literally stealing other people's content, and went on to be a mod on a lot of subreddits.
Years ago, I saw an auction website where bidders placed bids in penny amounts. Supposedly, you could win an item at a fraction of the price of the product. Bidding was real time, and you won the auction if you held the bid for 15 seconds. You "spent" your bid and never got it back if you lost the auction.

You never got to see how many people were bidding against you, and I always wondered how you could know it wasn't the website jacking up the bid.

I am reminded of that scam every time I see a karma/upvote system.

> But it did make me think about just how easy it is to farm karma on Reddit, and how useless it is as a proxy for "trust" or "reputation" or anything other than what it is - fake internet points.

More than fake internet points, it's a distilled popularity contest, and we know popular people aren't necessarily trustworthy. Reddit calling it karma gives it more credit than it deserves.

Reddit calling it karma is a throwback to its early days when "upvote posts that meaningfully contribute real value to the discussion, even if you disagree with the content" was a closely held ideal of the reddit community. Sort of closer to how Stack Overflow upvotes are currently intended.
> even if you disagree with the content

yeah, the intention was good, but that's not how it works, unfortunately.

I get unreasonably annoyed at seeing posts get downvoted just because they don’t fit the mainstream opinion, especially if the author clearly put in a lot of effort to articulate their views. Bonus points if there are zero replies too.

Anyone who downvotes like that is just lazy, close minded and cowardly. It’s sad that it happens so often.

> Bonus points if there are zero replies too.

This frustrates me to no end. If someone can type out a long comment explaining their position, the least you can do is reply with why you disagree, in addition to or instead of hitting the downvote.

Yeah, I tend to upvote in this situation, even when I disagree.
> Reddit calling it karma gives it more credit than it deserves.

I remember PhpBB and vBulletin forums having karma plug-ins back in 2005, this isn't something invented by reddit at all

Slashdot had something similar many years earlier
I think they even called it Karma--back in the late 90's. I bet that's where the term originated.
Gee, Slashdot. That brings back memories
Slashdot had karma, Kuro5hin had mojo.
I wonder how many people are popular among the automated audience.
Whuffie.
I am not sure this is such a bad thing. You did good work for the community -- they wanted the link to the trailer as quickly as possible and you engineered a solution to give them that. They said "thank you" in return.

I understand the "anyone could have done that" or "they could have just waited 3 seconds longer" arguments... but you actually did it, and did it 3 seconds faster than anyone else, so they got a better "product" and you got some internet points. Not a bad deal, in my mind. It's almost a little heartwarming.

Very cool - but in a lot of ways I think that if you believed in the system of Karma then it worked?

You posted a high quality post (being first), and I assume that if you wrote the script you also took the time to write an accurate title.

While it is arbitrary and noisy that you beat out those you posted 30 seconds ahead of, in those same hours a bunch of people posted home videos, self-promotion, or old clips and didn't get any karma.

It is imperfect but the karma system had some desired effect.

> it did make me think about just how easy it is to farm karma on Reddit

I think you put in far more effort in writing that script, than the average karma-farmer puts in.

Far easier to just repost photos without attribution.

Yep. Early on during the famous Rob Ford drug controversy (in Toronto), I realized that any time an article on him would drop with new details on his shady dealings, this would easily gather thousands of upvotes on reddit as the most controversial thing of the day. Soon enough, I started just posting on reddit every time a new news drop would come. Journalists typically share their own articles on twitter as soon as they are online so it wasn't hard to be the first.

Ended up with thousands of points in karma for something as easy as copy pasting the URL on reddit as soon as it was available. Never cared about posting articles on reddit before or after those events, but I figured this was an easy way to make my score way high and keep it that way for the next few years.

Reddit etc are basically games.
Thank you! I said this on Reddit just a few days ago and got downvoted for it. Specifically, I said that karma was an element of gamification. Of course, they did not like that. :P
I'm sure you could do similar with certain blogs and posting to HN. EG PG's blog, anything written by Patio11 etc.
HN is small enough that timing matters even more than on Reddit. So a script would have to take that into consideration. HN also works against duplicate posts, so a post at midnight US eastern time wouldn't see much activity, and then reposts later in the morning would only add to its upvotes rather than getting new (better timed) listings.
But if you were polling, say, PG's blog and posted it here within minutes of it being published you'd accrue all the benefits of the dupes, no?
Yes, but limited. It'd depend on whether the mods reupped the post the next morning so it could have a chance of hitting the front page or if a dupe managed to get through (which does happen). Either because the system permits it, or because the submitter added a #nonce to the end.

If the mods reupped it, then it'd work out really well for the original submitter. If karma mattered, I suppose it'd be worth trying a few times to see how it worked out.

Yeah, I'm not going to bother with it just for imaginary internet points, but I bet if you wanted to take it further you could analyse the top sites that hit the front page and poll all of them.
It "works" as long as it isn't taken too seriously by everyone. I also believe that you can become addicted to karma and some people seem to completely compromise their character in the hunt for more points.
Are HN karma points fake too? I hope not
I think that they are a measure of how much you participate in the community here. I've been a member for 6 years, but as I usually lurk, my karma is low. I don't use karma as a measure of other posters but I can see why the management locks certain features for those with higher points.
> I hope not

Why?

I appreciate karma points as a measure of how much my thoughts resonate with others, or don't... They are real enough for me.
> fake internet points

What are real internet points

I only trust posts on Reddit from usernames that are > 5y old and have less than 3000 karma (unfortunately, this means I wouldn't trust my own account(s) lol). Similar to how 4 star and 2 star reviews on Amazon are infinitely more informational to understand the real pros and cons.
This is a reminder that account karma on Reddit and Hacker News doesn't really mean anything and doesn't make your posts more valuable, or give it any additional benefit in the algorithms. (I say that as someone with 153k Reddit karma and a user ranked #57 on the HN leaderboards: https://news.ycombinator.com/leaders )

Going truly viral on Reddit/HN is still ultimately random as well, which is why reposting (within reasonable amounts) is allowed on both platforms.

As someone who works on social media automation/tooling for their day job (2018 rough overview of my work: https://tech.buzzfeed.com/how-were-building-superpowers-for-... ), I do agree that ML/AI can be helpful to aid human curation, but definitely not a magic end-all-be-all that growth hackers nowadays want it to be. Simple heuristics can be very powerful as well.

HN bestows various abilities once you pass certain karma thresholds. It's the only site I know of that does this and I think it really helps keep a better community.

It's a direct counterexample to the other comments saying karma is meaningless and isn't or shouldn't be used as a proxy for trust. It might be misplaced trust in some cases, but it is trust nonetheless.

Slashdot and StackOverflow give you more capabilities with a higher reputation. Slashdot likes to give moderator points to people with higher reputations, and to people who metamoderate frequently which is itself guarded by reputation. Your votes are recorded at StackOverflow with no real reputation, but they only count for or against a post if you yourself have a high enough score - I forget what they call your reputation/karma/whatever there.
Ah, how could I forget StackOverflow?? Thanks for reminding me.
The highest threshold for unlocking features on HN is at 501 karma (comment downvoting), which is not a lot and is mostly there to limit abuse.

More importantly, it doesn't affect the submission ranking algorithm at all.

I hear that if you reach rank 50 or above on the HN leaderboard you gain the ability to have your posts appear with <blink> in rainbow colours but those who have it are too wise to use this power.

More seriously these posts got me interested in what features besides downvoting would be added at various karma levels which led me to your git page on the topic. Linked here for others interested:

https://github.com/minimaxir/hacker-news-undocumented

I appreciated your joke because the latest trend I hate about reddit are the party-parrot-esque avatars. I immediately scroll them off the screen, couldn't care less what they have to say. (I say this as a lover of the party parrot, but, time & place people...)
> ...doesn't really mean anything and doesn't make your posts more valuable..

True for posts, and true for comments if they get upvotes. Downvotes however make comments less valuable. I imagine some people see the gray/faded comments and just presume low value content, which may not be the case.

I have showdead turned on, and, while you're right that most greyed out comments are low quality, there are a few that are just inexplicably greyed out. I vouch for these whenever possible, even if I don't necessarily agree with the content of the comment.
I think he meant that your karma on your account won't affect your posts. (And I agree.. you can't even see the karma of the poster without clicking into his username.)

The points you got for a particular post of course matter, since they affect the visibility.

Many subreddits have automod set to hide (or hold for moderation) posts/comments from posters with below some threshold karma.

You may not even know this is being done to you because it uses a shadowban-esque mechanism where the comments remain visible to you.

Username checks out.
I see this kind of comment quite often. What is the point of checking out the username? Is it an underground internet meme I've been missing? (Not a native English speaker here)
It means that the username is appropriate to the body of the comment. It's very common on Reddit.
It's a Reddit meme.
As someone who got a photo of a grizzly in the wild: it was surprisingly easy. Here's how you can do it too, for probably under 1k.

1. Fly out to Anchorage, Alaska

2. Drive to the Denali visitor center (Denali National Park)

3. Take the bus to Denali (~$50)

You are almost guaranteed to see wild grizzlies along the way. The bus will stop so you can take a photo. You can even get off and go for a hike if you like. They will pick you up on the way back. (or what's left of you...) If you decide not to, that's okay too. You will have many near-death experiences, as your school bus drives along a dirt road along steep mountain cliff edges, for many miles. The bus takes you to the heart of Denali, where you can see Mount McKinley (highest mountain in North America), and where temperatures can dip to minus 60 degrees in winter.

> For probably under 1k. 1. Fly out to Anchorage, Alaska

You are greatly underestimating that cost

I drove by car and camped along the way. Definitely don't stay in hotels in Alaska, if you're not rich. It's crazy expensive (like $400/night for Holiday Inn)

The way to do it cheaply is to fly in late June/July and camp in Denali park. The weather is warm enough for camping and it's fantastic. I actually feel camping is the best way to do it!

Assuming you already own camping gear. And can pack some food.

Here's how it can be done:

$100 - Camping in Denali Park: $45 (entrance fee) + $17/night x 3

$200 - $50/day * 4 days car rental

$50 - Fuel / Snacks ( 300 miles round trip from Anchorage)

$50 - Denali Bus Ticket

$50 - Firewood/Misc

_____

$450 - 4 days/3 nights.

Budget left for flight: $550.

Depending where you fly in from you could come in under 1k. Seattle round trip flight in June $400. NY/Alaska is $620 round trip. Which would push you over 1k. But if you convince a friend to go with you, you could split the car/camping costs.
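The arithmetic above is easy to sanity-check. This is just a sketch that reuses the figures quoted in this comment (the $1000 budget, the rounded line items, and the two example fares); none of the prices are verified.

```python
# Ground costs as listed above (4 days / 3 nights), in USD.
GROUND_COSTS = {
    "camping": 100,        # $45 entrance + 3 nights x $17 = $96, rounded up
    "car rental": 4 * 50,  # $50/day for 4 days
    "fuel and snacks": 50,
    "bus ticket": 50,
    "firewood and misc": 50,
}
BUDGET = 1000


def trip_total(round_trip_fare):
    """Ground costs plus the round-trip flight."""
    return sum(GROUND_COSTS.values()) + round_trip_fare


for city, fare in {"Seattle": 400, "New York": 620}.items():
    total = trip_total(fare)
    verdict = "under" if total <= BUDGET else "over"
    print(f"{city}: ${total} ({verdict} budget)")
# prints:
# Seattle: $850 (under budget)
# New York: $1070 (over budget)
```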

You can also do wild camping for free, and if you're into trying fly fishing, I definitely recommend getting a fishing license.

Fishing license costs: 1 day - $25 / 3 days - $45. A basic fly fishing kit can be bought for under $100.

Thank you for the detailed info, but at this point you could have started your list by:

1- Live close to Alaska.

I could try driving there, but the Bering strait might be an obstacle.

Thanks for the tip!
> Back in 2006-2007 my friend and I put together a spreadsheet of 20 or so high-level achievements called “Everything’s a Contest”. This included goals like “Photograph a live grizzly bear in the wild”, “Have something named after you”, and “Get 10,000 (post) karma on Reddit”.

I can't be the only one who wants to see the full spreadsheet.

One of the bots in /r/SubredditSimulator was the "top today" bot, which would just take the top 500 posts from /r/all in the last 24 hours, pick one of the links at random, and repost it with a gibberish title made from a markov chain of those 500 posts' titles. With such a small set of input, the titles almost never made any sense at all.

I had to shut that bot down, because eventually just reposting a random popular meme/image from the last 24 hours with a nonsensical title started becoming far too successful and it ended up getting 2,648,254 post karma, while often getting its own posts near the top of /r/all: https://www.reddit.com/r/SubredditSimMeta/comments/c5besk/th...
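For readers who haven't seen one, a word-level Markov chain title generator of the kind described above can be sketched in a few lines. This is an illustrative first-order chain, not the actual SubredditSimulator code; the function names and the start/end markers are made up for the example, and it assumes at least one input title.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentinel tokens marking title boundaries


def build_chain(titles):
    """Map each word to the list of words that followed it in the input titles."""
    chain = defaultdict(list)
    for title in titles:
        words = [START] + title.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate_title(chain, rng, max_words=20):
    """Walk the chain from START, picking a random successor at each step."""
    word, out = START, []
    while len(out) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    sample = ["cat sits on the mat", "dog sits on the couch", "the cat chases the dog"]
    print(generate_title(build_chain(sample), random.Random()))
```

With only a few hundred short titles as input, most words have just a handful of successors, which is why the output tends to be stitched-together fragments rather than anything coherent.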

What do you assume the source and cause of the upvotes were? Browsing /r/all it is easy to attribute the post and comments as pre-programmed.

Faulty algorithms? Algorithms working as intended? Vote purchase? Intentional spam?

Do you scrape, log, track, and compare user comments and profiles on reddit or any other site? What data do you collect?

I don't understand what you're asking.
They appear to be asking "exactly why do you think your bot got so many upvotes (i.e. which tactics or techniques caused it to get so many upvotes)?"
The fact that a ton of AI work was beaten by a flash of inspiration just goes to show how far AI is behind the human brain. But excitingly, it also suggests a crazy future where machines can actually do that.
I think it goes to show how easy it is to get reddit karma. I wrote a bot that uses 2 regular expressions to generate reddit comments. In the 16 days since I created it, it's amassed more than 7k karma (despite substantial downtime due to bugs)
I don't think that a current limitation is an indicator of future success. I think a more likely future is one governed by diminishing returns, and prohibitively large computation.
> Conclusion

Building a reddit post bot with fast.ai was an fun project, however a bit of manual effort beat out my months of web crawling and compute time.

Oh well... (63% of the 10k points were achieved manually)

The real story here is that domain expertise can lead to superior outcomes. The hybrid curation pipeline still worked, but it was a low risk and low reward solution that would eventually reach the goal. In general, there's a lot of unanswered questions regarding how to optimize in the risk reward ML space.
I don't understand why the author took the approach he did.

He creates a classifier with 5 labels, and gets 50% accuracy on the cross-val (this may be ok, may be terrible, we don't know), but then only uses the labels for one of those classes. What's the precision/recall on that class?

Also, he's taken a multi-class classification approach to what is essentially a regression task. He's trying to predict the # of upvotes a post will get, and there are approaches that'll do that, but this isn't one of them.

It's fun to see NLP applied to these sorts of problems, but this isn't a good example of how to do it well.
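To make the precision/recall point concrete: overall accuracy says little about the one class that actually gets used. A minimal sketch of the per-class calculation, with made-up labels (not the article's data) and a hypothetical helper name:

```python
def precision_recall(y_true, y_pred, target):
    """Precision and recall for a single class in a multi-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for illustration: class 4 stands for the "high karma" bucket.
y_true = [4, 4, 0, 1, 2, 3, 4, 0]
y_pred = [4, 0, 4, 1, 2, 3, 4, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred, target=4)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# prints: accuracy=0.75 precision=0.67 recall=0.67
```

If the bot only acts on posts predicted as the top class, precision on that class is the number that decides how many of its submissions flop.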

Hmm, I'm actually interested in that kind of ML application, for filtering news I'd be interested in, fine-tuned to my preference.

Maybe crawl the net to find some interesting news, yes, but most of all assign a score to all the news that comes my way (RSS, aggregators, etc), and filter out the useless ones.

And while you are at it, apply this to e-mail too. Maybe with a small chance of letting some through to avoid getting stuck in a local maximum, or locking someone out.

At least this would make it much easier to read the important news first, most of the time. Though important != interesting.
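The "small chance of letting some through" part is essentially epsilon-style exploration on top of a score threshold. A rough sketch, assuming some scoring function already exists; `score`, `threshold`, and `epsilon` are hypothetical names, and the scoring model itself is out of scope here:

```python
import random


def filter_with_exploration(items, score, threshold, epsilon, rng):
    """Keep items scoring at or above `threshold`, plus a random `epsilon` share of the rest.

    `score` is any callable mapping an item to a number (assumed to exist);
    `epsilon` is the probability that a low-scoring item is let through anyway.
    """
    kept = []
    for item in items:
        if score(item) >= threshold or rng.random() < epsilon:
            kept.append(item)
    return kept


if __name__ == "__main__":
    # Toy example: the "score" is just a number attached to each headline.
    inbox = [("rust release notes", 0.9), ("celebrity gossip", 0.1), ("new paper", 0.7)]
    picked = filter_with_exploration(
        inbox, score=lambda item: item[1], threshold=0.5, epsilon=0.05, rng=random.Random()
    )
    print([title for title, _ in picked])
```

The occasional low-scoring item that slips through is what gives you feedback on whether the filter has drifted, at the cost of a little noise.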

2 things

1. Why was this formed as classification instead of as regression? Seems much more like a regression problem (predict how many upvotes you will get given a reddit post)

2. Seems like it could have been effective with better pre-training. I'd love to see the author rerun the experiments with the likes of GPT-2 or other pre-trained models vectorizing the text first.

Wouldn't you have to bucket and rank content and subreddits, respectively, anyways?

Otherwise you will probably have a one-size-fits-all algorithm.

Seems like an incremental decision tree problem with weights around karma and time of day.

tldr: It didn't work and the bot never made a popular post.
Thanks mate. Nearly wasted life reading that.
Did the end of the article say he only showed he WOULD get 10k, not actually did?
Yes, what I was trying to articulate at the end was that the bot which finds urls to post only generated 3.7k karma after 1.5 months. Given additional time there's nothing stopping it from generating up to the 10k. I actually achieved the 10k with some additional manual posts rather than extend the bot or run it for additional time.
Well, you won your merit badge, that's the main thing. I can't help feeling using mechanical turk was logistically the same as buying the karma: you could have paid them to do things which either directly or almost directly earned you the score, not just grading inputs. But I don't mean to quibble: it was an interesting/entertaining read!

I think if you'd gone into an investment space, you might have earned the score faster by virtue of earning other people money from your picks.

I've been meaning to write a bot that randomly posts the Always Sunny in Philadelphia quote "Because of the implication.." every ten minutes. It might get gummed up a bit with RIP or self-harm threads but I am certain after enough time it would be net karma positive.
I know that spammers, PR firms, marketing teams, etc will buy accounts with good karma (for how much, I don't know). I wonder approximately how much $/hour a good karma farmer could generate with techniques like this? Could it be an alternative to mechanical turk for some?
How do you know they purchase reddit accounts?
constructive ai bot, reminds me of this xkcd comic: https://xkcd.com/810/
why is that important?
So basically you used unethical trickery to cheat yourself 10k meaningless internet points?

You also shoehorned AI into the story so you could generate some meaningless internet points with the impressionable youth on Hacker News, too?

How about you do something meaningful and ethical with AI instead. Then you might even earn an upvote from me.

The intention was not to write a bot which uses unethical trickery, but one that was a productive poster. I include the posts to reddit the bot made in the article. I understand how you may be disillusioned by this article, and I concur that there are many more meaningful things I would like to apply AI to than a contest between friends.
Aw, please don't flame people like this here. It's not sporting and not in the intended spirit of the site.

https://news.ycombinator.com/newsguidelines.html

You may be confused about the meaning of "flame". I did not flame anybody. Flaming is an insult directed against a person.

I rephrased what they confessed about their deeds.

The deeds are, indeed, in my estimation, unethical.

I assumed that part is not controversial, as the cheated system in question is called the Karma system.

Surely it is not news to you that cheating karma is unethical?

It was instituted to foster trust and good behavior among visitors of Reddit.

The posting is about cheating that system.

I must be missing something here. Please explain.