by mhils 1512 days ago
We were one of the 666 repos, and I'm not too happy about having our repo used as advertising space. Some thoughts:

- I'm happy to receive fix-a-typo PRs from human users. In this case the other side demonstrated that they care by putting in a bit of manual effort, and a small PR often paves the way towards larger contributions. I also know that open source beginners get really excited about their first small contributions, and I'm honestly happy to support that.

- In contrast, the marginal effort for bot PRs is ~0. It's very easy to generate a small amount of work for a lot of people, and the nice side effect is that the bot's platform is advertised everywhere. As a maintainer, I have never given consent to this and I have no choice to opt out.

We are very happy users of some GitHub bots, but I feel it needs to be an active adoption decision by the maintainer. If you want to pitch me your service you may send me an unsolicited email, but don't use our public space to advertise your product without asking.

Edit: I don't want to be too harsh to OP here - at least they pointed out a small but valid issue in our case. I very much appreciate their apology at https://news.ycombinator.com/item?id=31210245

7 comments

I just... think you should reconsider your stance on this. If you made a mistake in a public repo and someone else caught it (via scan of your repo or otherwise), it's a pretty bad look to be anything but grateful at that point, PR benefits for the bot aside.
The problem with scanners is that they usually have a pretty high false positive rate. When automatically opening the PR, they are basically putting the human review part on the maintainer (burdening them with additional and possibly useless work) while also using their repo as advertising space without consent. When the scan goes wrong and has a lot of false positives or it looks like they just got lucky, it's easy for a maintainer to feel like most of the cost was handed to them, while most of the upsides (like QA and brand recognition) are reaped by the bot. When a human opens the PR, you at least know that they valued your time and checked the changes beforehand, even if it's based on the results of the bot and contains the same errors.

Now, if the bot catches an actual error and improves the software, the result is obviously net good and the tad of free advertising is deserved. But it can easily feel like a PR campaign paid for with carelessly annexed maintainer time, and in quite a few cases, it simply is.

> The problem with scanners is that they usually have a pretty high false positive rate.

Did that happen in the example being discussed in this thread?

How high is the false positive rate? I would say even at 80%, the bots have at least found enough possible bugs worth attention that wouldn’t have been found by human review alone.
The issue they had is being part of the advertisement, not that the bot did the work.

Everyone is out for notoriety and street cred instead of just doing good for the community.

I understand the sentiment but you should be judging the PR, not the source. Ask yourself: would you have happily accepted the same PR that the bot sent if it came from a human?

By all means, I am not against having bots identify themselves properly, my point is that "effort from bot PR is ~0", "it advertises their platform" are simply not the right reasons to judge this situation by.

Ask yourself: would you treat a PR differently if it came from a regular, trusted contributor, or some random person (or bot)?
Sure, you would probably treat it differently, but isn't it elitist and harmful to the open source community in general to outright shoot down or discourage a PR from a lesser-known or unknown source if it is a good PR? I think we should encourage novices to contribute, and we shouldn't be hostile to them, so that they can get past the novice phase and become trusted contributors. If a bot produces valid, helpful, and well-formed PRs, why would you discriminate against it completely when it improves your codebase?
I would if the content of the PR were complicated. Not in this case.
> I have never given consent to this and I have no choice to opt out.

You have a public repository on GitHub. You are free to switch it to private, but otherwise this is absolutely illogical. No one needs your consent to submit PRs to and highlight a public repository.

This is equivalent to having a website and then getting angry about linking to it, or putting your artwork up for public viewing and then getting angry at someone pointing out a small tear in the fabric.

Actually, now that I think of it, it’s better comparable to someone bringing in a handheld scanner with a company name on it, scanning the artwork and then pointing out the tear.

Which is still totally fine. You’ve given implied consent by making it available to the public. You have decided to make it possible for the public to view it, criticize it and link to it.

To quote Wikipedia:

> Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, *study*, change, and distribute the software and its source code to anyone *and for any purpose*

Case closed.

> Actually, now that I think of it, it’s better comparable to someone bringing in a handheld scanner with a company name on it, scanning the artwork and then pointing out the tear.

No, it's more like somebody sending to your lab, uninvited, an impersonal inspection bot with another company's branding on it, which doesn't only disclose potential issues to you but advertises them across the whole cyberspace.

And in case of OSS this lab may be my tiny garage where me and friends tinker on stuff.

Choosing to make the results of our passion or work free for all to study and use should not come with a liability of having to deal with hordes of such bots.

Only if you establish your lab in a tent on the street and put a sign that says "for public display" on it.

> And in case of OSS this lab may be my tiny garage where me and friends tinker on stuff.

That's not OSS. OSS would be leaving the garage door open, putting your garage on Google Maps and freely allowing anyone to walk in and see what you're doing. That's OSS.

Then getting angry about it is what you and OP are doing.

> Only if you establish your lab in a tent on the street and put a sign that says "for public display" on it.

I don't get how your flawed analogy has evolved now. Care to expound?

Edit: I see your edit, thanks. Yes, if we leave garage doors open we still don't welcome these bots, sorry.

But that is not up to you to decide in the case of OSS.

Public websites get crawled and indexed hundreds of times per day and sometimes linked to even with criticism. Would you not say this is the same concept?

Outside of my system vs. inside of my system. PRs count as the latter in my view. It's filing a task, for me to do, so it should not even seem like it is coming from a non-individual.
> No, it's more like somebody sending to your lab, uninvited, an impersonal inspection bot with another company's branding on it, which doesn't only disclose potential issues to you but advertises them across the whole cyberspace.

GitHub isn't your lab. It's Microsoft's lab. (They just rent out space free of charge.)

I stand corrected. (You forgot that they also mine my lab for data subsequently used in paid solutions.)

My point is, we should cherish the culture that enables progress and learning by routinely opening works of passion to free use and contribution. We could cherish it by employing a sense of ethics and adhering to certain protocols of behavior. These don't need to be spelled out, but those of us who know better can lead by example. Putting OSS maintainers under undue stress has come under fire before, and this looks like one of those cases.

Sounds like we need a robots.txt for GitHub repos.
Isn't having a public GitHub repo consent?
Is having a publicly reachable email address consent to receiving unsolicited emails? Is having a public postal address consent to receiving mailed advertisements? Legally, yeah; morally, less so.
Technically those are forms of abuse; a PR for a valid bug, less so.
If you believe that's a material difference, replace "advertisements" with something more like "mailers informing you about how you can improve the [security/looks/whatever] of your building".
I feel the same way about those bots that tell you about insignificant security vulnerabilities in some project you abandoned. It's basically spam.

That said, this does seem like it is a bit more useful. As long as they actually read the changes and make sure they aren't false positives. Which I'm guessing they didn't do for 666 repos.

> As long as they actually read the changes and make sure they aren't false positives. Which I'm guessing they didn't do for 666 repos.

In the article they say that "(really a bot found the problem and made the PR, but really a human developer at Code Review Doctor did triage the issue before the PR was raised)".

> I feel the same way about those bots that tell you about insignificant security vulnerabilities in some project you abandoned. It's basically spam.

If you "archive" your repos, dependabot and friends won’t bother you.

Or, you could just disable security alerts in your repo's settings.

Dependabot isn’t the only source of vulnerability fatigue; there are plenty of “researchers” who will spam your active projects about pointless “vulnerabilities”. For instance, I recently got one about a parsing issue in gmp from a human user, who probably found it by scanning PyPI. I’m not touching anything adjacent to the supposedly vulnerable codepath, and the fix isn’t even in a gmp release, meaning I would have to carry a patch if I were to “fix” it. I still responded amicably, but I was not happy.
There’s not really anything that can be done about that, yet, unfortunately. But if you’re not committing to the repo anymore, archiving it is an option. It’ll disable the issue tracker and pull request features. And if you change your mind, you can unarchive it.
It's your repo and your choice. You can reject the PR.

Your repo is public so you can't prevent people and bots from looking into it and having opinions on it (even public ones).