by mdotk 758 days ago
I know that a lot of content on the web is itself rehashed, but many creators actually review products themselves, speak from firsthand experience, do their own research, etc.

The original knowledge has to come from someone human.

How can Google promote its "1 trillion facts" database of scraped content and just use it to show answers to searchers alongside ads?

Seems like they are doing this without much regard for anything else because OpenAI is a new entrant and about to threaten their search empire?

2 comments

> "1 trillion facts"

https://www.techdirt.com/2006/08/08/turns-out-major-league-b...

You can't copyright facts. There is a legal argument to be made that an LLM is a reduction of copyrighted material into its underlying data (its vectors).

The only people who are going to win in this fight are the lawyers.

Clearly there is a limit. Otherwise, you could circumvent all copyright by saying "The contents of Harry Potter and the Prisoner of Azkaban are <insert novel text here>". While that statement is technically a fact, the text is still protected by copyright.

Since when does original knowledge have to come from humans?