Hacker News new | ask | show | jobs
by Dylan16807 619 days ago
The complaints I see are almost always aimed at the output of an LLM, and that only contains a significant amount of a work when it breaks.

Going after the LLM itself, not the output, is a lot trickier. Anyone can make a big database of public website contents. And if they use it to make a search engine for example, that gets classified as entirely legitimate. If we're excluding the output of the LLM, what's the difference?

Also if you scrunch down into a small model, it mathematically can't contain very much of the input text.

1 comments

> Going after the LLM itself, not the output, is a lot trickier.

Exactly so, and this is why withdrawing from the open web is the only realistic solution at this time.