by saalweachter 1 day ago
I don't even know that I would call it slowing down so much as constraining/focusing innovation.

There are three basic paths for a company hit by this ruling to comply:

  1. Stop showing users generated content.
  2. Figure out how to generate the content with more quotes and attribution to source websites, to regain the protection offered to search engines.
  3. Figure out the hallucination problem, so that every statement in machine-generated content is true, or at least defensible.

If this ruling forces companies to put more money into #3, whereas now they're coasting on good enough, I'd say it was speeding up innovation.
2 comments

There was already a ton of collective incentive for #3; I don't think the companies are choosing to "coast on good-enough."

Rather, they are stuck unable to do that much better, unwilling to admit (especially in a way that might spook shareholders) that it's a hallucination-machine all the way down. They're playing for time and market-share while hoping some unspecified and inherently-unpredictable new discovery arrives which will be compatible with their existing infrastructure and investments.

> If this ruling forces companies to put more money into #3, whereas now they're coasting on good enough, I'd say it was speeding up innovation.

The thing is, no one has the slightest idea how to stop hallucinations.

The models are fundamentally "hallucinatory" at core - they generate what is _probable to follow the string thus far in their training corpus_, modulo RLHF and friends.

Notice that nothing there has any rigorous relationship to truth.
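
To make that concrete, here's a toy sketch. It is a bigram counter, not a transformer, and the corpus and function names are made up for illustration, but the sampling step has the same shape: pick a next word in proportion to how often it followed the previous one in the training text.

```python
import random
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def generate(follows, start, max_words=8, seed=0):
    """Sample a continuation word by word, weighted by observed counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(max_words - 1):
        options = follows.get(out[-1])
        if not options:  # no observed continuation: stop
            break
        words, counts = zip(*options.items())
        out.append(rng.choices(words, weights=counts, k=1)[0])
    return " ".join(out)

# Hypothetical two-sentence "training corpus"; both sentences are true.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]
model = train_bigram(corpus)

# Every seed yields a fluent sentence; nothing checks whether it is true.
for seed in range(4):
    print(generate(model, "paris", seed=seed))
```

Both training sentences are true, yet after "of" the model has seen "france" and "italy" equally often, so "paris is the capital of italy" comes out as readily as the correct sentence. There is no step where truth could enter.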

Sure, the companies could start pumping money into pure research on what models other than transformers might yield something that can reason rigorously, but at that point you're talking about finding a way to throw out LLMs entirely in favor of a less-pathologically-broken model, like Gary Marcus keeps complaining people should be doing.

Eh, I think LLMs will stay a part of what comes next; they will just be used for language I/O instead of everything.