by hnfong 1748 days ago
It's not even clear whether training an AI on OSS code is a violation of those licenses. So unless you publish your code under a proprietary license that explicitly rejects such use, you can't really prevent people from doing that anyway.

Just imagine: there's really nothing preventing people from scraping your blog to train their natural language processing AI or whatever. Why would code be any different? Even if you put up a big sign saying you don't consent to having your data ingested by a neural network, I doubt it would get noticed anyway...

People have been using large OSS codebases (e.g. the Linux kernel) for various statistical analyses. AI is just doing the same thing in a more sophisticated manner.

1 comment

I bet if I trained an AI on some vocalist and released an album, I'd get into some legal mayhem. I do concede it might go differently for code, but none of these issues are crystal clear to me.