| HN Mirror


	by jillyboel 420 days ago
	So far I'm just waiting for any definition.

2 comments

saagarjha 420 days ago

You got examples and didn't like them. That's fine, that just means people won't indulge you anymore.

jillyboel 420 days ago

Can you clarify what examples of harm have been provided? Disrespecting someone is not harming them, if that is what you're getting at? Your comment is quite disrespectful towards my genuine question which you refuse to answer, and yet, I am not harmed. In fact, I am amused, since it's clear you don't have a real answer and are just resorting to ad hominem attacks instead.

saagarjha 419 days ago

No, I'm going to let 'jchw do it for me, because they are more patient than I would have been and make me thankful I didn't go down that route. I don't really want to engage with someone whose argument is "there's no harm because the harm is plagiarism and according to OpenAI plagiarism is OK".

jillyboel 418 days ago

> the harm is plagiarism

How is plagiarism harmful outside of an academic setting? Is it illegal? Who is hurt by it? In what way? Does this supposed harm outweigh the benefit it brings to the rest of society? And, mostly unrelated, why are you okay with bigtech doing it, but not a mere human?

Just admit you realized that you don't actually have an argument. It's a simple question, and you're not able to answer it.

It's okay to admit you were wrong. It shows growth.

saagarjha 417 days ago

I don't recall ever saying that plagiarism by big tech was OK.

jchw 420 days ago

I'm more curious what your definition of harm is.

(To be clear, this is a completely pointless tangent, "harm" has nothing to do with whether or not you should condone plagiarism. But you seem rather interested in discussing it, so I am kind of curious what answer you're actually looking for.)

jillyboel 420 days ago

I'm specifically asking you (and other HNers) what definition of harm you think applies here. I'm still waiting.

As for not condoning plagiarism, grow up. We're not kids in school anymore. You're (hopefully) an adult who graduated already.

If you're so against plagiarism, how do you feel about LLMs plagiarizing the whole internet? Didn't all the techbros collectively decide for us that this is the future we want?

jchw 420 days ago

> I'm specifically asking you (and other HNers) what definition of harm you think applies here. I'm still waiting.

Well, now I asked for yours, and I'm also still waiting.

> As for not condoning plagiarism, grow up. We're not kids in school anymore. You're (hopefully) an adult who graduated already.

Look, man, I'm not saying we should go kill people for committing plagiarism, I don't think this is the worst thing ever, but it definitely reflects a lack of integrity even if the original authors explicitly don't care. It's dishonest and can put the legal status of a software library into genuine question.

i.e. I care if people lie to me even if the lie doesn't matter that much.

And it is not just a thing in school. Anyone who publishes or really writes anything (e.g. books, video scripts, blog posts, etc.) can ruin their career through plagiarism. It's a cultural faux pas.

https://en.wikipedia.org/wiki/Plagiarism

> If you're so against plagiarism, how do you feel about LLMs plagiarizing the whole internet? Didn't all the techbros collectively decide for us that this is the future we want?

That's a whole other can of worms.

jillyboel 420 days ago

> Well, now I asked for yours, and I'm also still waiting.

I asked first and I don't want to influence your response. So, go ahead. You first.

If your only answer is that plagiarism is bad then I agree with that (in certain settings, such as education), but it's clearly no longer considered to be illegal (if it ever was?) or immoral. Just look at all the bigtech LLMs doing so while raising billions without getting into legal trouble. So apparently society has recently decided that this is fine.

jchw 420 days ago

> I asked first and I don't want to influence your response. So, go ahead. You first.

It's simple: I'm not dodging the question, it's just that I don't know. It's complicated. It's easy to punch someone in the face and say "I have harmed this person" but things go into the weeds quickly. Like, can you harm someone through inaction? It's a surprisingly deep philosophical question and I am not a philosopher. I don't think determining exactly what harm is would be relevant in this particular case anyway, and any definition I could come up with would probably have holes in it and lead to a large debate that I'd argue isn't actually relevant to the point(s) being made.

> If your only answer is that plagiarism is bad then I agree with that (in certain settings, such as education), but it's clearly no longer considered to be illegal (if it ever was?) or immoral. Just look at all the bigtech LLMs doing so while raising billions without getting into legal trouble. So apparently society has recently decided that this is fine.

Say we really did crack the code on how human learning works and distilled it into an algorithm. If you were able to use this algorithm to produce a representation of learned skills and knowledge, e.g. something lossy enough to be considered legally distinct rather than just a compressed form of the training data, then surely this would not be considered a derivative work of the copyright material used to train it. I think most people would agree with this. (Note the obvious caveats, e.g. if your weights do contain obvious artifacts of direct memorization then it would still be a legal problem.)

Clearly we haven't done that yet, but we did do something that sits between "lossless compression" and "human learning". The courts have the unenviable job of trying to figure out where to draw the line when we still don't really understand what's going on.

I don't really like the heist that occurred with machine learning, but I also lack a satisfactory answer on what exactly it is they did wrong (except for the obvious, e.g. committing massive amounts of piracy and DDoS'ing the entire Internet for the sake of training data). I don't think anybody could have foreseen decades ago what would happen with machine learning, so as to make laws that would adequately cover it, and tech companies always move way too fast for regulators to keep up.

However, I don't believe that this means that all plagiarism is simply okay, either legally or morally. I just think we lack an adequate legal framework to represent our moral quandaries with big tech machine learning operations, as the traditional notion of plagiarism doesn't cover the complexities of model weights or model outputs. I also don't think that the current legal frameworks will last forever; it's a golden era for ML companies, but assuming they haven't cracked and aren't about to crack the code on artificial cognition (I strongly believe they're not near it atm), I believe regulations will eventually catch up some time after the hype has died down.

jillyboel 420 days ago

Alright, my point is that any harm done here is significantly less than what the bigtech LLMs are doing. If plagiarizing code is bad then so are both building and using LLMs. If building and using LLMs is fine, then so is plagiarizing code.

In this case there's a non-commercial open source project that ignored some other projects' licenses. This isn't great, but it doesn't affect me, a third party, in the slightest. I have no reason to be upset about this. It doesn't really affect the other projects either, nor does it negatively affect our society. If anything it adds to our society by giving people something they are clearly interested in having.

In the case of RTEMS the only thing they're missing out on is attribution. Nintendo isn't missing out on anything at all, people will still be buying their hardware to run this software.

So my argument is that any harm that may have been done is insignificant at best. Hardly worth getting upset about, especially as a third party.

As for the legal argument, it's hypocritical at best. If someone wants to condemn what happened here they should first go after the big boys who are making billions by doing the same thing on a massive scale.

> If you were able to use this algorithm to produce a representation of learned skills and knowledge, e.g. something lossy enough to be considered legally distinct rather than just a compressed form of the training data, then surely this would not be considered a derivative work of the copyright material used to train it. I think most people would agree with this.

If it's okay for an algorithm to do then it's okay for a human to do. So in that case copyright would be dead since the conclusion is you (or a machine learning algorithm) are allowed to ingest some content, then produce similar content.

A simple example is using an LLM to draw an image of some Disney characters. If we say the LLM is allowed to do this because it learned to do so, which we aren't considering to be plagiarism, then why are human artists being sued by Disney for doing the same?

Or in this case, let's say the original authors used an advanced LLM to assist their coding. The LLM once happened to ingest Nintendo's binary blobs during training and was advanced enough to learn from them. It uses this knowledge to produce code that can interface with the hardware, which just so happens to look like the original code because that's just how you do it. Is it suddenly not plagiarism anymore? Did it become morally okay because the LLM laundered the code? Is this any different from LLMs ingesting all of GitHub and becoming coding assistants? Why are we okay with that, but not when a human does it?

I know that in the end the legal answer is that if you have enough money you can do whatever you want, but this doesn't answer the moral questions.
