HN Mirror



	by __forward__ 802 days ago
	The copyright situation around all this is very... interesting. It's pretty clear that this dataset is not legal, but what about the resulting models? What if the texts actually were bought 'properly'?

3 comments

sp332 802 days ago

Buying a copy of the book would not give you any copyright license. You could only make copies for personal use.


CaptainFever 802 days ago

If you are in a jurisdiction with TDM exceptions, buying a personal copy does allow you to train on it.


chasd00 802 days ago

The race is on to find a satisfactory way to get LLMs to produce content for training other LLMs. Eventually the dataset question will get figured out in the courts, but if there’s a technique to generate more training data in an automated way, then the court decision doesn’t matter.

Edit: also, I don’t believe court decisions can be enforced retroactively, so existing LLMs would be safe, but I’m most definitely not a lawyer.


DeathArrow 802 days ago

If you steal a PC and use it to build a very successful app, would that app be legal? Would the use of said app by third parties be legal?
