Hacker News
by simonw 2070 days ago
"The receiver SHOULD perform a HTTP GET request on source to confirm that it actually links to target"

We did that in pingback too. It turned out to be trivial for spammers to circumvent.
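
For anyone wondering what that check amounts to, here is a rough sketch of the "does source link to target" step. It assumes the source page has already been fetched (the GET itself is left out), and the names `source_links_to_target` and `_LinkCollector` are made up for illustration, not taken from any spec or library. It only does an exact URL match after resolving relative links, which is about the simplest reading of the quoted sentence.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collects href/src attribute values from a fetched source document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def source_links_to_target(source_url, source_html, target_url):
    """Return True if the source document contains a link to target_url."""
    collector = _LinkCollector()
    collector.feed(source_html)
    # Resolve relative links against the source URL before comparing.
    return any(urljoin(source_url, link) == target_url for link in collector.links)
```

As the comment above says, a spammer only has to publish a page containing the link to pass this, so it confirms the link exists and nothing more.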


I've had them on my sites for a few years now, and even with bridgy passing tweet replies through, the spam hasn't been too bad, but certainly adding an allow-list makes sense too. Having built Technorati, which effectively did this at scale, I do appreciate the spam problem, but decentralising implementations has so far worked out OK.
Marty writes up some of the variety of implementations here https://martymcgui.re/2020/07/15/what-we-talk-about-when-wer... (which I saw because he webmentioned me in it).
That's a bummer. Did you find any more robust ways to filter out spam?
I gave up and stopped using it (well, I didn't bother reimplementing it on one of my various blog engine rewrites).

If I were to implement pingback or webmention today I'd use a moderation queue with the ability to allow-list trusted domains so they get to skip moderation in the future.
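
A minimal sketch of what that could look like, assuming mentions are just (source, target) pairs held in memory. The `Moderator` class and its method names are hypothetical, and a real setup would need persistence and an actual review UI.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Moderator:
    """Hypothetical in-memory moderation queue with a domain allow-list."""

    allowed_domains: set = field(default_factory=set)
    pending: list = field(default_factory=list)
    published: list = field(default_factory=list)

    def receive(self, source, target):
        """Publish mentions from trusted domains; queue everything else."""
        mention = (source, target)
        if urlsplit(source).hostname in self.allowed_domains:
            self.published.append(mention)
        else:
            self.pending.append(mention)

    def approve(self, mention, trust_domain=False):
        """Publish a queued mention, optionally trusting its domain from now on."""
        self.pending.remove(mention)
        self.published.append(mention)
        if trust_domain:
            self.allowed_domains.add(urlsplit(mention[0]).hostname)
```

Approving with `trust_domain=True` is the "skip moderation in the future" part: later mentions from the same hostname go straight to `published`.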

I've implemented WebMentions in a project that uses it as a push notification system for websites that integrate our widget (which is just a <script> tag they include on their page). That kinda works: if you integrate the widget, you know you can expect WebMentions from https://plaudit.pub, and thus add it to an explicit allowlist.
https://github.com/zerok/webmentiond has allow lists and block lists :) I am very happy using it.
Most POST spam is repetitive; human POSTs very rarely are. If you filter out any POST that arrives identically more than 3 times, you remove most spam. It's not perfect, but it makes it manageable. Of course, this is a lot easier to implement if you batch process.
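
Roughly, in batch form, and assuming each POST is reduced to a string body that can be compared for exact equality (the `filter_repeated` name and the `max_repeats` parameter are just for illustration):

```python
from collections import Counter


def filter_repeated(posts, max_repeats=3):
    """Drop every POST whose identical body occurs more than max_repeats times."""
    counts = Counter(posts)
    # Keep original order; a body seen too often is removed entirely.
    return [body for body in posts if counts[body] <= max_repeats]
```

Counting over the whole batch first is what makes this simple. Doing it on live traffic would mean deciding what to do with the first three copies before you know a fourth is coming.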