by spxneo
811 days ago
Isn't that what Google did? They scraped the internet, but the public and economic advisors felt the benefits outweighed the copyright violations. They were just "indexers"; they weren't scraping "news", they were indexing it lol.

Same thing with emulators and ROMs. Somebody dumped the cartridges (copyrighted software) into ROM files to be played on emulators (copyrighted BIOS), but they were "archiving", and if you owned the original copy you could download them. I still vividly remember seeing this disclaimer on warez websites: "DMCA SAFE HARBOUR NOTICE: YOU MUST OWN THE ORIGINAL GAME OTHERWISE ITS ILLEGAL BUT YES, YOU CAN DOWNLOAD EVERY SINGLE GAME MADE ON THAT CONSOLE FOR FREE"

I feel like the outcome will be the same for LLMs trained on copyrighted material. It will be "training". The net benefit is too great to fret over "training".

tldr: "indexing" ---> "archiving" ---> "training"