by shsbdncudx 642 days ago
Presumably “scraped” isn’t the right term here. They already have the raw data; they won’t be “scraping” it from the website, they’ll just be ingesting it from where they store it.
5 comments

It’s interesting to think about what the right verb is.

I’d probably say Meta trained their models using all self-hosted, public AU citizens’ data.

But it doesn’t really sound as scary as “scraped” to non-technical users.

It does feel like “scraped” is being used for its negative connotations.

Perhaps "consumed" would work just as well and be more accurate.

Used
Agreed, scraping is more appropriate for when one gathers data from a 3rd party site.
The authors of the title most likely wanted to suggest a similarity between Meta's use of the data and scraping.

In some jurisdictions there are rules about scraping. And for many less technical people it sounds scary.

Au contraire. By calling it the wrong thing you get the clicks and muddle legislation discussions. The Conversation will start with anti-scraping, but then every user "gave" metabook the images and accepted privacy terms...
Maybe they have no idea what the difference is between “scraping” and “mining”? I mean, for non-tech people, these are just buzzwords…
Maybe it's “mined”, as in data mining.
Feels like they are scraping the near-empty jar of mayonnaise.