HN Mirror

by mtlynch 429 days ago

>I think in the future, websites will learn to serve pure markdown to these bots instead of blocking. That way, websites prevent bandwidth overages like in the article, while still informing LLMs about the services their website provides.

Why?

There's no value to the website for a bot scraping all of their content and then reselling it with no credit or payment to the original author.
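For reference, the quoted idea of serving Markdown to bots instead of blocking them could look roughly like the sketch below. It is only an illustration under stated assumptions: the `BOT_MARKERS` list, the `page` dictionary layout, and the function names are hypothetical, and matching on the User-Agent string is just one possible way to detect a crawler.

```python
# Illustrative substrings to look for in a User-Agent header.
# This list is an assumption; a real site would maintain its own.
BOT_MARKERS = ("gptbot", "claudebot", "ccbot")


def pick_representation(user_agent: str, accept: str = "") -> str:
    """Return the content type to serve for this request."""
    ua = user_agent.lower()
    wants_markdown = "text/markdown" in accept.lower()
    looks_like_bot = any(marker in ua for marker in BOT_MARKERS)
    if wants_markdown or looks_like_bot:
        return "text/markdown"
    return "text/html"


def render(page: dict, user_agent: str, accept: str = "") -> tuple[str, str]:
    """Pick the lightweight Markdown body for bots, full HTML for everyone else.

    `page` is a hypothetical dict with pre-rendered "markdown" and "html" bodies.
    """
    content_type = pick_representation(user_agent, accept)
    body = page["markdown"] if content_type == "text/markdown" else page["html"]
    return content_type, body
```

The Markdown body is usually far smaller than the full HTML page, which is where the bandwidth saving would come from. Whether the site gets anything back for serving it is a separate question.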

wongarsu 429 days ago

Unless you're selling something. If you have articles praising your product/service/person and "comparison" articles of the "top 10 X 2025" (your offering happens to be number one) you want the bots to find you.

The LLM SEO game has only just begun. Things will only go downhill from here.

sroussey 429 days ago

Or technical docs. For example:

https://bun.sh/llm.txt

RamblingCTO 429 days ago

I love that! That's one of my biggest pain points: wrong/outdated usage of dependencies.

randunel 429 days ago

OP in this case is by no means the original author. In the linked post, they mention that they scrape third parties themselves. OP's bots might not be as sophisticated, but they're still "borrowing" others' content the same way.

andrethegiant 429 days ago

ChatGPT and others have some sort of attribution, where they link to the original webpage. How or when they decide to attribute is unclear. But websites are starting to pay attention to GEO (generative engine optimization) so that their brand isn’t entirely ignored by ChatGPT and others.

Incipient 429 days ago

I do agree that LLM-as-a-search is likely to become more and more prevalent as inference gets cheaper and faster, and people don't care too much about 'minor' hallucinations.

What I don't see, however, is any way this new way of searching will give back. There is some hand-waving argument about links, but the entire value prop of an LLM is that you DON'T need to go to the source content.

genewitch 429 days ago

could have just left it as SEO and changed the S to "Slop"
