by hp6 1007 days ago
While the topic is intriguing, I dislike the use of "public services" for this type of research. For instance, adding substances to a water reservoir to study their effects, without permission or supervision, is unacceptable. Similarly, conducting such research without Wikipedia's permission or supervision should not be accepted.

Someone tried something similar but with higher risk: inserting security backdoors into the Linux kernel. They were caught and (AFAIK) their entire school was permabanned from sending pull requests.
This was also my thought. Search for "hypocrite commits"; there is also an LWN article: https://lwn.net/Articles/853717/ . They did ban their whole school.
The department said they'd report their findings to the community[1]. I wonder if they ever did?

[1] https://cse.umn.edu/cs/statement-cse-linux-kernel-research-a...

Thanks for pointing that out. That second link seems to close things out.
Isn't community plural? Perhaps they meant their own community but didn't say specifically which one?
I'm of quite the opposite opinion. Within reason (importantly), I believe any public service, which is also managed by an anonymous, decentralized community, ought to be under test constantly and by anyone. What's the alternative, really?

Imagine if it was taboo to independently test the integrity of bitcoin for example.

The sibling mentioned the linux kernel case. I admit that one felt wrong. It was a legitimate waste of contributor time and energy, with the potential to open real security holes.

I don't pretend to have reconciled why one seems right to me and the other wrong.

> Imagine if it was taboo to independently test the integrity of bitcoin for example.

> The sibling mentioned the linux kernel case. I admit that one felt wrong.

> I don't pretend to have reconciled why one seems right to me and the other wrong.

The "how" is what matters here, not just the "what". "Testing the integrity of Bitcoin" by breaking the hash on your own machine (and publishing the results, or not) is one thing. "Testing" it by sending transactions that might drain someone else's wallet is quite another. Similarly with Linux, hacking it on your own machine and publishing the result is one thing. Introducing a potential security hole on others' machines is another. Similarly with water: messing with your own drinking water is one thing. Messing with someone else's water is quite another.

> Similarly with Linux, hacking it on your own machine and publishing the result is one thing. Introducing a potential security hole on others' machines is another.

Playing devil's advocate for a moment: how else do you test the robustness of the human process to prevent bad actors? Don't you need someone to attempt to introduce a security hole to know that you are robust to this kind of attack?

You do it with buy-in, e.g. permission from some of the maintainers, so they are aware. If you do not get permission, you do nothing. It's similar to penetration testing.
Interestingly, while I 100% agree with you regarding the parent's question about security holes, I'm actually not sure how an experiment like the one on Wikipedia could be performed even with proper buy-in from all the owning entities (Wikimedia Foundation?). Is it even possible in principle to test this ethically without risking misleading the users (the public)? If not, does that mean it's better if nobody researches it at all? The best I can think of is making edits that are as harmless as possible, but their very inconsequentiality would make them inherently less likely to be removed. Any thoughts?
The usual answer is the chain of trust. However, that might be against Wikipedia's principles. There is an "importance scale" for articles; for anything considered C+ class in importance, editing becomes similar to a pull request, or the page carries a warning that it has unverified info.

It's a hard problem: having storage that anyone can fully edit, while maintaining integrity.

This seems really easy to test ethically.

You sift through the edit log to find edits correcting factual errors.

Then you find the edit where the error was introduced.

You can probably let an LLM do the first pass to identify likely candidates. With maybe 20 hours of work you could probably identify hundreds of factual errors. (Number is drawn from a hat.)
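
A rough sketch of the second step (walking back from a correcting edit to the edit that introduced the error), in case it helps. Everything here is an assumption for illustration: the `Revision` type, the `find_introduction` and `claim_lifetime` helpers, and the toy `HISTORY` are made up, the false claim is matched as a plain substring, and nothing talks to the real Wikipedia edit log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Revision:
    """Hypothetical stand-in for one entry in an article's edit log."""
    rev_id: int
    timestamp: datetime
    text: str


def find_introduction(revisions: List[Revision], fix_index: int,
                      false_claim: str) -> Optional[Revision]:
    """Walk backwards from the correcting edit to the edit that added the claim.

    `revisions` is assumed to be sorted oldest first, and `fix_index` is the
    position of the edit that removed `false_claim`.
    """
    i = fix_index - 1
    if i < 0 or false_claim not in revisions[i].text:
        return None  # the "fix" did not actually remove this claim
    # Keep stepping back while the previous revision still contains the claim.
    while i > 0 and false_claim in revisions[i - 1].text:
        i -= 1
    return revisions[i]


def claim_lifetime(revisions: List[Revision], fix_index: int,
                   false_claim: str) -> Optional[timedelta]:
    """How long the false claim stayed in the article before being corrected."""
    intro = find_introduction(revisions, fix_index, false_claim)
    if intro is None:
        return None
    return revisions[fix_index].timestamp - intro.timestamp


# Toy history: the error appears in revision 2 and is corrected in revision 4.
HISTORY = [
    Revision(1, datetime(2020, 1, 1), "The bridge opened in 1932."),
    Revision(2, datetime(2020, 1, 5), "The bridge opened in 1923."),
    Revision(3, datetime(2020, 1, 6), "The bridge opened in 1923. It is 500 m long."),
    Revision(4, datetime(2020, 1, 8), "The bridge opened in 1932. It is 500 m long."),
]

if __name__ == "__main__":
    print(find_introduction(HISTORY, 3, "opened in 1923").rev_id)
    print(claim_lifetime(HISTORY, 3, "opened in 1923"))
```

Real edits would need a proper diff rather than a substring check, and deciding which edits count as factual corrections is still the manual (or LLM-assisted) part.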

> Playing devil's advocate for a moment: how else do you test the robustness of the human process to prevent bad actors? Don't you need someone to attempt to introduce a security hole to know that you are robust to this kind of attack?

How do you test that the White House perimeters are secure, or that the president is adequately protected by the Secret Service?

I think the key difference is supervision: is there another party keeping an eye on what is tested and how? And maybe ensuring no permanent damage is done at the end.
That's frankly one of the first thoughts that came to my mind.

I've asked the author about ethical review and processes on the Fediverse.

That said, both Wikipedia and the Linux kernel (mentioned in another response to this subthread) should anticipate and defend against either research-based or purely malicious attacks.

If it's a mature product, you should be able to pick it up and rattle it without it breaking. If it's still maturing, then maybe the odd shock here and there will prepare it for maturity?
It's true that the system must be tolerant to these sorts of faults, but that doesn't mean we have a right to stress it. The margin for error is not infinite, and by consuming some of it we increase the likelihood of errors going undetected for longer.

Sometimes it will be worth it anyway, and I don't have an opinion about this Wikipedia example, but I think it's pretty uncontroversial that the Linux example was out of line.

I think one would have to weigh the pros and cons of this kind of research. In particular, the main cons (IMO) are:

* users are misled about facts
* trust is lost in Wikipedia
* other users/organizations use this as a blueprint to insert false information

Harm 3 seems to be the most serious, but I suspect it has happened/will happen irrespective of this research. Compared with the water reservoir example, these harms seem quite small. I would have liked to see a section discussing this in the blog post, but perhaps that's included in the original paper.

Everything was reverted within 48 hours. Your arguments might all apply in theory, but given the scope, size, practice, and handling, I wonder what your opinion is on how they apply in practice to this case.
I didn't make it very clear, but I agree that the specific example isn't problematic. The false claims weren't meant to be any sort of targeted disinformation, and as you mention, they reverted it within 48 hours.