by skeledrew 319 days ago
Crawling is done to discover and index content for search results (to relieve dependence on Google, etc.). Scraping is done to get relevant content into the LLM's context window. Then the LLM generates the output. All three functions are there, so someone may emphasize just a subset to make their point, which can cause issues if relevant context is left out, whether accidentally, ignorantly or maliciously.
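To make the three stages concrete, here is a rough Python sketch. Everything in it is made up for illustration: `PAGES` stands in for the web, the function names `crawl`, `scrape` and `generate` are hypothetical, and `generate` is only a placeholder where a real LLM call would go. It is not how any particular product works.

```python
from collections import defaultdict

# Hypothetical stand-in for the web: URL -> page text.
PAGES = {
    "https://example.com/a": "RAG retrieves documents before generating an answer",
    "https://example.com/b": "Crawlers discover pages and build a search index",
}

def crawl(pages):
    """Crawling: visit every page and record which URLs mention each word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def scrape(index, pages, query):
    """Scraping: collect the text of pages matching the query for the context window."""
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return [pages[url] for url in sorted(hits)]

def generate(query, context):
    """Generation: placeholder for the LLM call; it only echoes its inputs."""
    return f"Q: {query}\nContext used: {' | '.join(context)}"

index = crawl(PAGES)
context = scrape(index, PAGES, "search index")
answer = generate("search index", context)
```

Note that nothing in the sketch stops the scraped text from being stored and reused later, which is the trust issue raised further down.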

> RAG crawlers

Very few people know what "RAG" is, so it makes little sense to mention it to anyone other than a technical audience.

> not LLM training

There's an issue of trust, because once content is scraped, it can also be used to train future models. That's really what ought to be emphasized IMO.

1 comment

Your answer is definitely clearer than the article; I get the point better now. Thanks for the feedback.