Hacker News
by bloppe 201 days ago
If you're publishing your code anywhere, it's getting trained on. MS does not restrict themselves to only training on GH-hosted code.
3 comments

Yet they don't restrict themselves to training only on permissively licensed code.

Code at both ends of the spectrum, source-available and copyleft-licensed, shouldn't be used for training, but who's listening?

The point still stands for private repos, and for not making the job easy for them.
They don't train on private repos; there has been no proof of that, anyway.
> If you're publishing your code anywhere, it's getting trained on

citation needed. first they need to know my code exists... spend time and traffic crawling it because it's sure as hell not going to be hosted on azure... probably get detected and banned.

No citation needed. It should be assumed, and treated as a malicious cybersecurity threat.
> It should be assumed, and treated as a malicious cybersecurity threat.

If you believe in absolute cybersecurity for anything you keep online, boy, I've got news for you. Literally all you can do is make it tougher, but it will never be uncrackable. How tough depends on how much you can invest and suffer.

same here. Codeberg makes it tougher, so it's a measure.