
Yeah, that seems to be a big distinction. If I tell my AI to summarize the headlines from my three favorite news sites every morning, it's just carrying out my request, same as if I'd clicked through to them myself, so that seems fine.

But if I say, "Search the web for a low-carb chicken casserole recipe that takes squash and cottage cheese," then it's either going to A) send queries to a search engine like Google, in which case robots.txt should already have been respected by that search engine's crawler, or B) check its own repository of information it spidered before I asked the question, in which case it should have respected robots.txt itself.
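
For case B, here's a rough sketch of what "respecting robots.txt itself" could look like for a crawler, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules in `ROBOTS_TXT`, the example.com URLs, and the helper `may_fetch` are all made up for illustration; a real crawler would fetch each site's actual robots.txt before spidering it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
# A real crawler would download this from the site it is about to spider.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /recipes/

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The made-up AI crawler is barred from /recipes/ but not from /news/.
    print(may_fetch("ExampleAIBot", "https://example.com/recipes/casserole"))
    print(may_fetch("ExampleAIBot", "https://example.com/news/"))
```

The point is just that the check happens at spidering time, before anything lands in the repository, not when I ask my question later.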