by furyofantares
910 days ago
> Synthetic data has many advantages - it is free of copyright issues, the downstream models can't possibly violate copyright if they never saw the copyrighted works to begin with.

I feel like we don't know if this is true or not. If we decide models trained on copyrighted data aren't fair game, it's possible we'll decide "laundered" data also isn't. I mean, maybe that's not feasible. And I hope we don't decide training on copyrighted material is bogus anyway. But I don't think we know yet.

But also - you can totally violate copyright of something you never saw.