by cfcfcf 646 days ago
I’m curious. Scraping seems to come up a lot lately. What is everyone scraping? And why?
3 comments

To add to others’ points, we can do two more things:

1. Pretrain models with any legal, scraped content. That includes updating existing models with recent data.

2. Have our own private collection of pages we’ve looked at. Then, we can search them with a local engine.
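
For point 2, here is a minimal sketch of what a "local engine" over saved pages could look like. It is only an illustration: the class name `LocalPageIndex`, its methods, and the example URLs are all made up, and it assumes each page's text has already been extracted. A real setup would more likely use an existing full-text search tool.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LocalPageIndex:
    """Hypothetical in-memory inverted index over pages we have visited."""

    def __init__(self):
        self.pages = {}                   # url -> stored page text
        self.postings = defaultdict(set)  # term -> set of urls containing it

    def add(self, url, text):
        # Store the page and record every term it contains.
        # Note: re-adding a url with new text does not remove old terms.
        self.pages[url] = text
        for term in tokenize(text):
            self.postings[term].add(url)

    def search(self, query):
        # Return the urls (sorted) whose text contains every query term.
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)
```

Usage would be something like `idx = LocalPageIndex()`, then `idx.add(url, text)` for each saved page, then `idx.search("local python")` to get back the URLs containing both words. There is no ranking here, just boolean AND matching.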

With people making LLMs act as agents in the world, the line between "scraping" and "ordinary web usage" is becoming very blurred.
Mostly context for LLMs, and use cases uniquely enabled by LLMs, I think.