Hacker News
by Gormo 10 days ago
DDoSing websites seems to be an unrelated problem, and one that has traditionally been solved through response throttling and IP blocking.

Attribution is often required even under MIT or BSD licenses when code is redistributed, whether in original or modified form. But that relates to this discussion only to the extent that one regards using an LLM whose training data included a certain bit of code as itself constituting redistribution of that specific code. That, in turn, is a very debatable premise, one which really ought to be argued for, and not merely argued from as though it were already generally recognized as true.

1 comment

Why? You stole my stuff and now are pretending I need to argue for you to stop stealing it. It's a joke.
This is the very question under debate. Training LLMs on publicly available data is a novel situation, and neither law nor public opinion has settled on a consensus on the subject.

Copyright maximalists like to borrow unearned moral weight for their position by conflating copyright infringement with "stealing", but the two are not equivalent in any legal sense. It's not clear that training an AI on publicly available data should even constitute copyright infringement, much less "stealing".

What? What is being "stolen" from you?

Are you now layering the old, tired "copyright infringement = stealing" argument on top of the still-unsubstantiated premise that all LLM training is copyright infringement?