by mtndew4brkfst 60 days ago
What is the specific concrete purpose of downloading millions of URLs per hour across different domains if it's "not doing anything wrong"?

Mostly ecommerce and pricing data. I work for marketplaces, brands, retail stores and even our own SaaS competitors. We match the EAN (GTIN) to the correct SKU within seconds (Google Shopping, Amazon, etc). Part of it relies on ML models we trained ourselves.
Might it be for scraping content to train an LLM? Oh no, only big tech is allowed to do that...
"The gangsters do it and get away with it so any random person should get to as well"? Not a particularly defensible position if that's an accurate paraphrase.