| HN Mirror



	by _heimdall 736 days ago
	What exactly does it mean for the service to provide information about a website without scraping it? How could summaries or LLM responses be generated without scraping pages?

1 comments

dotancohen 736 days ago

Presumably the same way that Firefox makes an HTTP request to the webserver then formats the page for the human user. This is just formatting that page differently. This is no more a scraper than is Firefox's Reader Mode.

That said, lying about the UA is not cool.
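
To illustrate the UA point, here is a minimal sketch of a request that identifies its client honestly instead of imitating a browser. The tool name, the contact URL, and the `build_request` helper are all made up for the example; only `urllib.request.Request` and `get_header` come from the Python standard library.

```python
import urllib.request

# Hypothetical identity: the tool name and info URL are invented for illustration.
HONEST_UA = "ExampleSummarizer/0.1 (+https://example.com/bot-info)"

def build_request(url: str, user_agent: str = HONEST_UA) -> urllib.request.Request:
    """Prepare a GET request whose User-Agent says what the client really is."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

req = build_request("https://example.com/")

# urllib normalizes header names with str.capitalize(), so the key is "User-agent".
print(req.get_header("User-agent"))

# Sending it is one more call (not executed here, to keep the sketch offline):
# body = urllib.request.urlopen(req, timeout=10).read()
```

A site operator can then see in the logs which tool made the request and decide whether to serve it.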


Animats 736 days ago

I have something that sends a UA of "Sitetruth.com site rating system". Many sites won't talk to that.


_heimdall 736 days ago

I've used a reader mode library that I think was created by Mozilla and that handles converting a site to reader mode locally. Does the Firefox browser do it locally, or at least on demand? If so, I wouldn't really consider that scraping, since they aren't parsing the site and storing data for later use.
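
As a rough picture of what "locally, on demand" could look like, the sketch below reformats one page's HTML in memory and keeps nothing afterwards. It is not the Mozilla library or its algorithm, only a stand-in built on Python's standard `html.parser`; the `TextOnly` class, the `reformat` function, and the sample page are assumptions made up for the example.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def reformat(html: str) -> str:
    """Turn one page's HTML into plain text; nothing is written to disk."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)

# Made-up sample page standing in for a response the browser already fetched.
page = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Title</h1><script>var x=1;</script><p>Hello <b>world</b>.</p>"
    "</body></html>"
)
result = reformat(page)
print(result)
```

The text exists only for as long as the caller holds `result`, which is the distinction being drawn here between reformatting for a reader and storing data for later use.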
