by watwut 26 days ago
It will happen, and it has already started. It started even before LLMs, when Google began hiding smaller personal blogs in its search results. Expectation of a monetary reward has nothing to do with it; discoverability does. A culture of creating content does not exist when people can't see what others created and know that no one will see what they create. A lot of smaller open source was a monkey-see-monkey-do thing: we saw other open source projects and wanted to make something like that. Likewise with tutorials: we saw other people write cool tutorials and felt like creating our own and showing them off.

That is not the dynamic with LLMs. You see the LLM output, but the original creator is hidden. And if you write your own, no one will find it. Worse, other people will tell you "an LLM could have written it" in reaction ... so people won't bother.

> search engines copied and indexed web pages

Notably, search engines sent people toward web pages. And when search engines stopped doing that and started to copy content, those original pages started to die out.

> Printing presses copied manuscripts

The printing press made dissemination easier. It is the equivalent of the early internet, not of LLMs.

> open-source software was incorporated into commercial products

A commercial product using an open source library has a different user than the library it uses. And crucially, it is not hiding that library from the library's users.

> Wikipedia repackaged knowledge produced elsewhere

Yes, and we collectively create fewer encyclopedias. They are no longer worth writing and checking for correctness, so we don't do that much anymore.

1 comments

The centralized choke point of web search is getting relaxed now. Unlike search engines and social networks, you can download an LLM and run it. A small one, but capable of using a library of search stubs to directly fetch information from hubs, feeds and other search engines. You can own the agent that solves the web search part for you.

Imagine you have a 4B model and keep an equal-size corpus of search stubs: small MD documents linking to hubs, feeds and search engines for millions of topics. You can use the LLM to read the stub and perform the search for you, all orchestrated locally, with greater privacy and independence. You can disintermediate the search chokepoint now. You can set the criteria for what to include, what to exclude, and how to rank and present the results.

This works because good entry points for any topic change slowly over time. The construction of search stubs is trivial with existing AI agents, and the stubs can be shared as open source. A few GB for the model, a few for the search routing layer, and you've got a sovereign local agent.
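
To make the routing layer concrete, here is a rough sketch of the non-LLM part: reading one stub and applying your own include/exclude/rank rules before anything gets fetched. The stub layout, the hostnames and the function names are all made up for illustration; nothing here is an existing tool, and the model call and the actual fetching are left out.

```python
import re
from urllib.parse import urlparse

# Hypothetical stub: a small Markdown document listing entry points for one topic.
# The hosts below are placeholders, not real hubs or feeds.
STUB = """\
# example topic
- [Community hub](https://hub.example.org/topic)
- [Reference docs](https://docs.example.org/topic/guide)
- [Forum feed](https://feeds.example.net/topic.rss)
- [Content farm](https://spam.example.com/topic)
"""

# Matches Markdown links of the form [title](http(s)://url).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def parse_stub(markdown):
    """Return (title, url) pairs for every Markdown link in the stub."""
    return LINK_RE.findall(markdown)


def select_entry_points(markdown, exclude_hosts=(), prefer_hosts=()):
    """Apply user-owned criteria: drop excluded hosts, rank preferred hosts first."""
    links = [
        (title, url)
        for title, url in parse_stub(markdown)
        if urlparse(url).hostname not in exclude_hosts
    ]
    # sorted() is stable, so the stub's own order is kept within each group.
    return sorted(links, key=lambda link: urlparse(link[1]).hostname not in prefer_hosts)


results = select_entry_points(
    STUB,
    exclude_hosts={"spam.example.com"},
    prefer_hosts={"docs.example.org"},
)
for title, url in results:
    print(title, url)
```

In a full setup the local model would pick which stub to open for a query and read the fetched pages; the point of the sketch is only that the filtering and ranking rules live in a file you control.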

If this holds, access control shifts from whatever Google thinks maximizes profit to whatever the community thinks has value.

None of that will create a community of people sharing. You won't find what another guy wrote, and he knows that no one will see it if he writes it.

But most crucially, what you described is not an actual thing people do.