| > No one cares how much it might cost you to retrain your models. Playing tough? But it's misguided. "No one cares how much it might cost you to fix the damn internet." If you wanted to retro-fix facts, even if that could be achieved on a trained model, they would still get back in by way of RAG or web search. But we don't ask pure LLMs for facts and news unless we are stupid. If someone wanted to pirate content, it would be easier to use Google search or torrents than generative AI. It would be faster, cheaper and higher quality. AIs are slow, expensive, rate-limited and lossy. AI providers have built-in checks to prevent copyright infringement. If someone wanted to build something dangerous, it would be easier to hire a specialist than to ChatGPT their way into it. Everything LLMs know is also on Google Search. Achieve security by cleaning the internet first. The answer to all AI data issues - PII, copyright, dangerous information - comes back to the issue of Google search offering links to it, and websites hosting this information online. You can't fix AI without fixing the internet. |
We now have a new class of criminals infringing copyright on a grand scale via their models, and they seem desperate to avoid prosecution, hence all this bullshit.