Hacker News
by CabSauce 1231 days ago
That's good for users, potentially. But not good for the sites where Google is sourcing the information.

Potentially good for users initially. But I can't see how anyone will be incentivized to create and post content for Google to scrape if they won't have any traffic from Google.

It could actually be a huge benefit in some ways if it chokes out the content mills. However, something tells me that they have little to no overhead compared to the people who actually toil away to post good, original content.

> But I can't see how anyone will be incentivized to create and post content for Google to scrape if they won't have any traffic from Google.

The problem right now is that the incentives have caused most of the output online to be garbage.

Exactly. If Google is able to provide better answers than the garbage websites with their SEO hacks, those garbage websites will not get clicks. I could see this improving the incentive system significantly.
- collect creators' content, make it searchable, become world's largest company

- encourage more content because PROFIT

- content becomes garbage

- have to pay those pesky content creators

- slowly squeeze out entire industries by inlining more and more content

- still, not squeezing the juice all the way

- introduce "AI", it just launders copyrighted content to look original

- bye creators

- for some odd reason people cheer you for this

- creators forced to make their content private

Garbage websites are 'content creators' now?
The incentive to what, be an unpaid content creator for Google? Search results lead to sources which land users on the content creators' "property". Chat results, especially based on what we see right now, won't do that.
How can you call garbage SEO websites 'content creators'? I think we'll be better off with websites which spread knowledge for the sake of spreading knowledge. Down with ads and low-quality copy-paste sites.
So everything being done for free except for Google taking all the profits?

Are you kidding me?

People need to make a living, and creating good content isn't easy. Sure, a few folks do it for free, but wholesale trying to kill off everyone who does it by making it financially non-viable is idiotic in the long term. What are you going to train on once everyone stops writing or letting you scrape their data to train from?

Search engines scraping your content is an agreement that they can look at it and use it, and send you traffic if it matches well with a user. Why would anyone subscribe to a deal in which there is literally zero benefit except training some AI that will repurpose your knowledge?

This may actually be fantastic for the web. The current incentives are terrible anyway: cheat, scam and SEO your way to the first search result page and then do whatever since you'll get visits and decent ad revenue regardless of content.

Most people who make good content don't make it for money anyway. Did people back in the pre-Google days think "oh I'd make this site but gosh darn there's nobody to pay me for it"? They just went and made the site regardless.

> Did people back in the pre-google days think "oh I'd make this site but gosh darn there's nobody to pay me for it".

Google is a huge part of the reason the old web doesn't exist anymore. Artisanal websites cannot compete for visibility against corporate websites that have staff dedicated to figuring out SEO tricks from every imaginable source: page speed, HTTPS, image compression, meta tags.

The hobbyist back then didn't need to know all this. Today, not having HTTPS alone can cause your site to be hidden from search, even if it is read-only. In that kind of world, only the infinitesimally small minority will bother to make a website on their own dime.

I don't see why SSL is that much of an issue these days. Cloudflare does it for free, and lots of hosting providers can handle Let's Encrypt for you.
I think this could be the death knell of sites that need to make money and publish content that can be easily understood in a short chatbot answer, both mills like geeksforgeeks and hobbyists that wouldn't do it without a financial incentive. Is that such a bad thing? We've been complaining about SEO-optimized crap for years.

Sites with complex or lengthy information will not fall to LLMs, IMO. No one interested in reading Paul Graham's blog posts is going to just read the AI summary and move on, for example.

Use robots.txt to block anonymous scraping and offer Google the opportunity to purchase your site's content if they value it. Web traffic can switch to another provider to index. Google search will be nothing but content mill garbage unless they want to pay for their AI fuel.
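
For anyone wondering what that looks like in practice, here is a minimal sketch using Python's standard-library `urllib.robotparser` to check what a robots.txt file permits. The bot name `ExampleAIBot`, the other bot name, and the `example.com` URL are made up for illustration; real crawlers publish their own user-agent tokens. Keep in mind robots.txt is advisory: it only stops crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleAIBot" is an invented user-agent token,
# used here only to illustrate blocking one crawler while allowing the rest.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named bot is disallowed everywhere; any other bot falls through to "*".
blocked = parser.can_fetch("ExampleAIBot", "https://example.com/post/1")
allowed = parser.can_fetch("SomeOtherBot", "https://example.com/post/1")
print("ExampleAIBot may fetch:", blocked)
print("SomeOtherBot may fetch:", allowed)
```

The same check is roughly what a well-behaved crawler would run before requesting a page, so it is a quick way to confirm the rules do what you intended before deploying them.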
Websites are not posting content for the sake of Google scraping it currently anyway. Look at Reddit, for example: fake internet points.
Ya, but if you don't get fake internet points or any responses/comments on your work, what will be the incentive?
That's one of the big problems with these types of AI - piggybacking on everyone else's work, typically with no attribution _whatsoever_.
Yet you better cooperate otherwise you won't get any visitors at all.
Nobody owes you visitors.
Nobody owes you a normal life but it would be a shame if society actually thought that way and we never developed social security and universal healthcare.

Using the monopoly on search to dictate how you shall present your content to the Google god is still a bad thing.

Google brings visitors to your site for free or no one would find it in the first place. You can always opt out via robots.txt and then your information is “secure”.
I've already banned the crawler from mine for the AMP and other crap they've been pulling. If you want to get more complete results, you'll need to use another search engine, which felt like the only thing in my power that I could actually do about their shenanigans. Now I'm glad I took this decision a few years ago: they've already got a search monopoly, but not being able to copy my content to further increase it? Yes please. I just hope the competition, with less deep pockets, is able to follow suit.
This seems like a personal dislike stance rather than principle, which of course you should fully take. But do you mind ChatGPT learning from your website?
Not sure. I don't like that I wasn't informed of them using my content, that there are no credits of any kind. I'd enjoy knowing it was useful and this info has been accessed by a chat user, the way I do when someone visits my site. That's the whole point of making it; I have no commercial interest or expectation but I enjoy helping others.

But from a "this is stealing" type of perspective? Nah, we're all standing on the shoulders of giants and everything we do is a remix of something else we've seen. A human can read my site and take "away" the content, just not at this scale. And they're more likely than a dead chat bot to let me know if my content has been valuable to them.

I’d rather get my information without having to click on a site. I get the need for attribution but Google can cite the sources it gets its content from, and I don’t really care about losing ad money.

But another issue is accuracy. Of course real sites aren’t always accurate, but they’re way more reliable than AI (and sometimes the site is ground truth like official docs so it can be trusted…unless…the official docs are wrong….).

Customers may not care, but content producers absolutely will. Chatbot-like interfaces that lead to "no-click" searches are going to get sued out of existence, OR are going to lead to the establishment of lots of paywalls that lead the chatbot to have blind spots about information behind them.
News sites can reword information from other news sites, AI can train on reworded web text. It will keep the information but not learn the exact original expression, as it should be. Copyright protects expression, not ideas.
One thing I don’t get… I read blogs and articles for the experience I feel when I read them. There’s a look and feel to these sites.

Conversing with ChatGPT is just giving me information and facts. I don’t see it replacing the experience of engaging with a website.

The web was a stepping stone.
If they cite sources, maybe they will create a new way to monetize a website based on the number of times it got sourced for an AI answer?