
They know you're scraping them because their site is the only source of the data you're scraping. The most common example here is airlines. Airlines that haven't agreed to be included in fare aggregators often have their booking information scraped. Even if your traffic blends in, they know you're pulling fare data from them, because where else would you get it? This is especially true if you follow it up with a link to buy the specific fare on the airline's site. The only plausible way to have that link is to read it off their site (and even if you could build it from a template based on their URL structure, I think there would probably be a case to be made that URLs qualify for copyright and trademark protection).

As for the game of cat and mouse, it lasts until they call in their lawyers. Then it's a game of "quit now or get destroyed".

But yes, if you can scrape the data without ever tipping off the company you're scraping, you can probably continue indefinitely. You still have to consider whether you can plausibly argue that you're getting the data from someplace else. If they sue you on the suspicion that you're scraping them, they'll probably subpoena your code to confirm it (or something similar; IANAL), and then try to make a case on grounds other than CFAA violations.