by visarga 907 days ago
It's a matter of ensuring the synthetic content is different enough from the referenced content. We can filter out anything that's too similar.
1 comment

Sure, but what matters for copyright is output, not input. For now.

If we make the (poor, imo) decision to prevent training on copyrighted data, that's a restriction on the training process, not on its result.

And in a world where we're making bad decisions to put legal restrictions on the training process, a rule like "can't train on data produced by models that were trained without these restrictions" seems to be on the table too.