by lfkdev 715 days ago
What is going on with the comment section on this post?
3 comments

I actually thought some of the comments were funny. Especially the one about the crab in the shell! No idea why they thought it was related to QNX, but an insight into the mind of spammers nonetheless.
That particular snippet was posted four times if you ^F.

It's really interesting to see the current "state of the art" in terms of the types of bots that get past the particular CAPTCHA implementation this site uses.

It's a very rudimentary type of CAPTCHA, the kind that anything developed within the past 5 years would probably get past with at least 30% accuracy (climbing steeply to >90% within the past ~2 years).

So the post quality is spread across a spectrum: at one end, dumb CAPTCHA OCR/processing paired with text that is "slightly better than Markov chain model replication", and at the other end, clearly more sophisticated systems that pass the CAPTCHA more easily and generate more interesting posts.

What's curious is that 90% of the comments are rudimentary. There are very few interesting spam posts. I'm trying to figure out what to make of this.

I'm picturing a majority of utterly outdated spambot infra, still out there, scanning the Web for WordPress/XSS-level stuff, and finding success on blogs like these... and that these old bots are the only systems of their kind out there, because all the spammers collectively gave up once reCAPTCHA and Cloudflare were protecting almost all meaningful concerns, with moderation following not too far behind.

Kind of makes sense.

But it's really depressing to compare these old clunky bots that are kind of cute (in a way) to the upgraded versions: the current-era tech that gets past moderation... and effectively passes the Turing test :'(

I hadn't noticed multiple copies of the crab one, but I had noticed duplicates of various others. To me, the fact that there are so many duplicates suggests those texts, if not written by humans, were selected by humans.

If it was truly AI-generated, I'd expect a random seed as part of its input, and unless there was very little entropy, I'm not sure it would chance upon the same exact formulations over and over. Maybe they hadn't tested the randomness aspect well in the training and it'd learned not to attach much weight to that beyond the first word or two.
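To make the entropy point concrete, here is a toy sketch (not a claim about how any real spambot works). The vocabulary, the weights, and the `temperature` knob are all made up for illustration; temperature just stands in for "how much entropy the generator has". It samples short posts from a seeded generator and measures what fraction of them are exact copies of another post.

```python
import math
import random
from collections import Counter

# Hypothetical toy vocabulary with made-up weights (illustration only).
TOY_WEIGHTS = {"great": 0.4, "post": 0.3, "thanks": 0.2, "crab": 0.1}


def sample_text(rng, weights, length, temperature):
    """Draw `length` tokens; a lower temperature sharpens the distribution."""
    tokens = list(weights)
    # Rescale log-weights by 1/temperature, then exponentiate (softmax-style).
    scaled = [math.exp(math.log(weights[t]) / temperature) for t in tokens]
    return " ".join(rng.choices(tokens, weights=scaled, k=length))


def duplicate_fraction(weights, temperature, n_posts=200, length=8, seed=0):
    """Fraction of generated posts whose exact text appears more than once."""
    rng = random.Random(seed)
    posts = [sample_text(rng, weights, length, temperature) for _ in range(n_posts)]
    counts = Counter(posts)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / n_posts


if __name__ == "__main__":
    for temperature in (0.05, 1.0):
        frac = duplicate_fraction(TOY_WEIGHTS, temperature)
        print(f"temperature={temperature}: duplicate fraction {frac:.2f}")
```

With very little entropy nearly every post collapses to the same string, while at a normal temperature exact repeats become rare, which is the intuition behind expecting few verbatim duplicates from a properly randomized generator.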

> What's curious is that 90% of the comments are rudimentary. There are very few interesting spam posts. I'm trying to figure out what to make of this.

They mostly make positive comments about the website/article/author so they have less chance of being deleted. The end goal is to link to their website to improve its SEO.

The problem with CAPTCHAs is that spammers will just pay a human in a very low cost-of-living country to actually solve them.
Link farming. Check out the usernames: they are links to third-party sites, in an attempt to game search engine rankings.
Clearly they are not implementing good bot protection. The results are not very surprising IMO.