| HN Mirror



	by Ekaros 533 days ago
	Can you launder an AI model by feeding it to some other model or training process? After all, that is how it was originally created. So it cannot be any less legal...

2 comments

benreesman 533 days ago

There is a family of techniques, often called something like “distillation”. There are also various synthetic training data strategies; it’s a very active area of research.

As for the copyright treatment? As far as I know it’s a bit up in the air at the moment. I suspect that the major frontier vendors would mostly contend that training data is fair use but weights are copyrighted. But that’s because they’re bad people.


qup 533 days ago

The weights are my training data. I scraped them from the internet.


benreesman 533 days ago

That sentiment is ethically sound and logically robust and directionally consistent with any uniform application of the law as written.

But there is a group of people, growing daily in influence, who utterly reject such principles as either worthy or useful. This group is defined by the ego necessary to conclude that, when the stakes are this high, the decisions should be made by them, and that the ends justify arbitrarily antisocial means (cf. the behavior of their scrapers) as long as this quasi-religious orgasm of singularity is steered by the firm hand that is willing and able to see it through.

That doesn’t distress me: L Ron Hubbard has that.

It distresses me that HN as a community refuses to stand up to these people.


bangaladore 532 days ago

To some extent this is how many models are being produced today.

Basically it's just a synthetic loop: you use a previously developed (formerly SOTA) model like GPT-4 to generate the data that trains your model.

This can produce models with seemingly similar performance at a smaller size, but to some extent, fewer bits will be less good.
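
For anyone who wants to see the loop concretely, here is a minimal sketch of the classic soft-label form of distillation, assuming you have access to the teacher's logits (with an API-only model like GPT-4 you typically only get sampled text, so people fine-tune on generated outputs instead). The function names, the temperature of 2.0, and the toy logits are all made up for illustration; a real setup would update network weights rather than raw logits.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between the softened output distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill_step(teacher_logits, student_logits, temperature=2.0, lr=1.0):
    """One gradient-descent step on the student logits (hypothetical toy update).

    The gradient of KL(p || q) with respect to student logit z_i is
    (q_i - p_i) / temperature.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return [z - lr * (qi - pi) / temperature
            for z, pi, qi in zip(student_logits, p, q)]

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # toy teacher logits (illustrative only)
    student = [0.0, 0.0, 0.0]    # the student starts with no preference
    print("loss before:", distillation_loss(teacher, student))
    for _ in range(200):
        student = distill_step(teacher, student)
    print("loss after: ", distillation_loss(teacher, student))
```

The student is only ever shown the teacher's outputs, never the teacher's original training data, which is the "laundering" step the question is asking about.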
