| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by adolph 1968 days ago
	In what research contexts is API usage valid instead of scraping a view more similar to what people experience? If the Twitter site and API are retrospectively cleared of removed/suspended accounts with large impact, how does that affect retrospective studies? Are there ethical implications of working with Twitter to gather data? Despite Twitter TOS, legal, IRB ok, are there informed consent issues in studying the artifacts of social media use?

2 comments

fnord123 1968 days ago

Until now, none I think. The API only gave a partial view while scraping offered all tweets for a particular search term. The scraper had to be clever to juke the anti scraping systems but you would get a more complete data set than using the API.

And the streaming API was terrible. Even if there was no data on the stream you could consume tens of gigabytes of bandwidth a day. Dreadful.

link

dharmab 1968 days ago

One easy example is language, for example tracking the spread of new words or other language constructs. You don’t care how the site looks, you care about the text that was previously input.

link