Hacker News
by Gigachad 126 days ago
There are countless examples. I often think about the fact that the Google search AI is just rewording news articles from the search results: when you look at the source articles, they make exactly the same points as the AI answers.

So these services depend on journalists to continuously feed them articles, while stealing all of the viewers by automatically copying every article.

4 comments

I actually often have the opposite problem. The AI overview will assert something and give me dozens of links, and then I'm forced to check them one by one to try to figure out where the assertion came from, and, in some cases, none of the articles even say what the AI overview claimed they said.

I honestly don't get it. All I want is for it to quote verbatim and link to the source. This isn't hard, and there is no way the engineers at Google don't know how to write a thesis with citations. How did things end up this way?

I have to say, I suffer from both problems, just not simultaneously.

Depending on what I am searching for, and how important it is to me to verify the accuracy and provenance of the result, I might stop at the AI, or might find, as you have, that there is no there there.

But, no matter what, the AI is essentially reducing the ability of primary sources to monetize their work. In the case where the search stops at the AI, obviously no traffic (except for incessant LLM polling) goes to the primary source.

And in the case you describe, identical traffic (your search) is routed to multiple sources, so if one of them actually was the source of something you were interested in, they effectively wind up sharing revenue with other sources, because the value of every one of your clicks is reduced by how often you click.

ChatGPT was a research prototype thrown at end users as a "product".

It is not a carefully designed product; ask yourself "What is it FOR?".

But the identification of reliable sources isn't as easy as you may think, either. A chat-based interaction really makes most sense if you can rely on every answer; otherwise the user is misled, and both the user and the conversation may go in the wrong direction. The previous search paradigm ("ten snippets + links") did not project the kind of confidence that the chat paradigm does, a confidence that turns out not to be grounded in truth.

Yes, and it's slowly killing those websites. Mine is among them and the loss in traffic is around 60%.

Snippets were already getting Google in legal hot water (with Yelp in the US and news agencies in Australia in particular, IIRC) long before LLMs and AI scraping. It's a debatable gray area of Fair Use growing out of early rulings on DMCA-related cases, and also Google's win over the Authors Guild in the Second Circuit, which SCOTUS declined to review.

Of course, Google has a history of copying articles in whole (cf. Google Cache, eventually abandoned).