| HN Mirror



	by idle_zealot 1021 days ago
	Is intelligence really a factor here? Say I use the same training set as one of these LLMs, copyright-protected text and all, and use it to derive a compression algorithm that needs very little space to store the tokens and token sequences that are common in that huge collection of text. The resulting compression scheme includes some sort of statistical artifact derived from that copyrighted text. Is that allowed? And if so, why is an LLM different?
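To make the thought experiment concrete, here is a minimal sketch using the preset-dictionary feature of Python's standard `zlib` module. The `corpus` text and the helper names `compress_with` and `decompress_with` are made up for illustration; a real scheme would derive its dictionary from a far larger collection. The point is only that the dictionary is an artifact derived from the source text, and both sides need it.

```python
import zlib

# Hypothetical stand-in for the large training corpus (an assumption for this sketch).
corpus = b"the quick brown fox jumps over the lazy dog. "

# The "statistical artifact": byte sequences from the corpus, used as a
# preset dictionary that compressor and decompressor both hold.
shared_dict = corpus


def compress_with(data: bytes, zdict: bytes) -> bytes:
    """Compress data, letting deflate reference sequences in zdict."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with(blob: bytes, zdict: bytes) -> bytes:
    """Decompress data that was compressed with the same zdict."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


message = b"the lazy dog jumps over the quick brown fox."
with_dict = compress_with(message, shared_dict)
without_dict = zlib.compress(message, 9)

# The dictionary-based output is shorter because common sequences
# are stored as references into the corpus-derived dictionary.
print(len(with_dict), len(without_dict))
```

The compressed message cannot be read back without the dictionary, so the dictionary itself carries part of the information that came from the original text.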

2 comments

cj 1021 days ago

Very good question indeed.

A lot of these questions are somewhat ethical/moral in nature. E.g. is it okay to take someone else's creative work, process it through some algorithm, to create a service like ChatGPT? Or a compression algorithm? I don't know.

It's awesome to see the Copyright Office request input from both sides of the argument.


livrem 1020 days ago

It worries me that so much focus is on two sides that may not have the end users' best interests in mind. The companies building the models may have an incentive to push for regulation that keeps smaller players and open source projects out. Artists mostly seem opposed to any solution at all, since even laws that allow models trained purely on public domain art would be bad for them. If laws around this are shaped primarily by the wishes of those two groups, I am not sure things will end up well for those of us who want the tools to keep improving and remain reasonably free (including applications you can install locally and run on your own GPU).


chii 1021 days ago

> is it okay to take someone else's creative work, process it through some algorithm, to create a service like ChatGPT? Or a compression algorithm?

The test I use is: if a human is currently allowed to perform the same task, then it should be allowed when done using an AI model.


quickthrower2 1021 days ago

LLMs are generative, though, not just compressive.


orbital-decay 1020 days ago

Generation, prediction, and compression are all the same thing; the only difference is the intent.
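A toy sketch of that claim, assuming a character-level bigram model with add-one smoothing (the training string and function names are invented for illustration, not anything an actual LLM uses): the same probability table answers "what comes next?", "how many bits would an ideal coder need for this string?", and "produce more text".

```python
import math
from collections import Counter, defaultdict

# Hypothetical training text (an assumption for this sketch).
text = "abababababababab"
ALPHABET = sorted(set(text))

# Count how often each character follows each other character.
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1


def prob(prev: str, nxt: str) -> float:
    """P(nxt | prev) with add-one smoothing over the alphabet."""
    seen = counts[prev]
    return (seen[nxt] + 1) / (sum(seen.values()) + len(ALPHABET))


def predict(prev: str) -> str:
    """Prediction: the most likely next character."""
    return max(ALPHABET, key=lambda ch: prob(prev, ch))


def code_length_bits(s: str) -> float:
    """Compression: ideal code length of s (after its first character)
    under the model, i.e. the sum of -log2 P(next | prev)."""
    return sum(-math.log2(prob(a, b)) for a, b in zip(s, s[1:]))


def generate(start: str, n: int) -> str:
    """Generation: repeatedly append the predicted next character."""
    out = start
    for _ in range(n):
        out += predict(out[-1])
    return out


print(generate("a", 5))
print(code_length_bits("abab"), code_length_bits("aabb"))
```

Strings the model predicts well get short codes, and strings it predicts poorly get long ones, which is the sense in which one model serves all three purposes.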
