by timschmidt 261 days ago
Absolutely. Though the smarter these things get, and the more layers of additional LLMs get stacked on top to play copyright police, the more challenging I expect it to get.

My comment was intended more to point out that copyright cartels are a competitive liability for AI corps based in "the west". Groups who can train models on all available culture without limitation will produce more capable models with less friction for generating content that people want.

People have strong opinions about whether or not this is morally defensible. I'm not commenting on that either way. Just pointing out the reality of it.

1 comments

It's a matter of time. I imagine they'll get more effect from suppressing activations of specific concepts within the LLM, possibly in real time. I.e., instead of filtering the prompt for "Mickey Mouse" analogies, or unlearning the concept, or even checking the output before passing it to the user, they could monitor the network for specific activation patterns and clamp them during inference.
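
To make the clamping idea concrete, here is a rough sketch of what it might look like on a single hidden-state vector. Everything in it is an assumption for illustration: the name clamp_concept, the idea that a concept corresponds to one known direction in activation space, and the max_strength threshold are all hypothetical, and real models would need this applied per layer and per token with a direction found by some separate probing step.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def clamp_concept(
    hidden: List[float],
    direction: List[float],
    max_strength: float = 0.0,
) -> List[float]:
    """Cap how strongly `hidden` points along a (hypothetical) concept direction.

    The projection of `hidden` onto the normalized `direction` is measured;
    if it exceeds `max_strength`, just enough of that direction is subtracted
    to bring the projection back down to `max_strength`. Components orthogonal
    to the direction are left untouched.
    """
    norm = dot(direction, direction) ** 0.5
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    unit = [d / norm for d in direction]

    strength = dot(hidden, unit)
    if strength <= max_strength:
        # Concept is not active above the threshold: pass through unchanged.
        return list(hidden)

    excess = strength - max_strength
    return [h - excess * u for h, u in zip(hidden, unit)]


# Toy usage: the "concept" is assumed to live along the first axis.
clamped = clamp_concept([3.0, 4.0], [2.0, 0.0])
untouched = clamp_concept([-1.0, 2.0], [2.0, 0.0])
```

In a real system this kind of function would presumably run inside a hook on intermediate layers during inference, which is where the "monitor and clamp in real time" part comes in.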
They might, but we may also find they don’t function as well or as predictably if increasing amounts of their weights are suppressed. Research has so far shown that knowledge is vastly diffuse, as are the causes of different behaviors. There was some research that came out of Anthropic where a model was taught number sequences by another model, and that second model had been fine-tuned with a stated preference for owls. The student model, despite no overt exposure to anything of the sort, expressed the same preference. The subtlety of influence that even very minor things have on the vast network of weights is, at least at present, too poorly understood to know what we’re getting in the bargain when holes are poked.