by jillyboel
420 days ago
> Well, now I asked for yours, and I'm also still waiting. I asked first and I don't want to influence your response. So, go ahead. You first.
It's simple: I'm not dodging the question, it's just that I don't know. It's complicated. It's easy to punch someone in the face and say "I have harmed this person", but things get into the weeds quickly. Like, can you harm someone through inaction? It's a surprisingly deep philosophical question and I am not a philosopher. I don't think determining exactly what harm is matters in this particular case anyway; any definition I could come up with would probably have holes in it and lead to a large debate that I'd argue isn't relevant to the point(s) being made.
> If your only answer is that plagiarism is bad then I agree with that (in certain settings, such as education), but it's clearly no longer considered to be illegal (if it ever was?) or immoral. Just look at all the bigtech LLMs doing so while raising billions without getting into legal trouble. So apparently society has recently decided that this is fine.
Say we really did crack the code on how human learning works and distilled it into an algorithm. If you were able to use this algorithm to produce a representation of learned skills and knowledge, i.e. something lossy enough to be considered legally distinct rather than just a compressed form of the training data, then surely this would not be considered a derivative work of the copyrighted material used to train it. I think most people would agree with this. (Note the obvious caveats, e.g. if your weights do contain obvious artifacts of direct memorization then it would still be a legal problem.)
Clearly we haven't done that yet, but we did do something that sits between "lossless compression" and "human learning". The courts have the unenviable job of trying to figure out where to draw the line when we still don't really understand what's going on.
I don't really like the heist that occurred with machine learning, but I also lack a satisfactory answer on what exactly it is they did wrong (except for the obvious, e.g. committing massive amounts of piracy and DDoS'ing the entire Internet for the sake of training data). I don't think anybody could have foreseen decades ago what would happen with machine learning well enough to make laws that would adequately cover it, and tech companies always move way too fast for regulators to keep up.
However, I don't believe that this means that all plagiarism is simply okay, either legally or morally. I just think we lack an adequate legal framework to represent our moral quandaries with big tech machine learning operations, as the traditional notion of plagiarism doesn't cover the complexities of model weights or model outputs. I also don't think that the current legal frameworks will last forever; it's a golden era for ML companies, but assuming they haven't cracked and aren't about to crack the code on artificial cognition (I strongly believe they're not near it atm), I believe regulations will eventually catch up some time after the hype has died down.