We've been running our startup [1] in this "media research industry" for a little while. We're on the classic media side of the use case.
It is true that the vast majority of "research" is done by non-academics. Lots of companies doing market research want to mine media data.
Still, I believe that this "social media research" is a bit overvalued. There was this wave of "social media is the primary source where information appears". But now many have realized how freaking difficult it is to separate this data from the noise, compared to traditional news published by journalists.
Also, take a look at this article [2] about how Dataminr sells insights from Twitter data to foreign governments (2017). Seems like just a way to punish the opposition channels.
It's a shame access to this API is limited to academic institutions, as many social media/misinformation researchers are now independent or affiliated with journalistic institutions.
So you're telling me APIs are not useful while at the same time telling me that APIs are useful. Trolls want API access for the same reason researchers want access. You could argue research is the first step to becoming an effective troll.
I don't see API access as being necessary for a small-scale trolling operation, and large scale operations have enough resources to work around the lack of API by scraping.
I'm not saying that API access is completely useless. I was raising the question of whether the potential (and relatively small) benefit to trolling outweighs the major upsides of API access being available for all.
In what research contexts is API usage valid instead of scraping a view more similar to what people experience? If the Twitter site and API are retrospectively cleared of removed/suspended accounts with large impact, how does that affect retrospective studies?
Are there ethical implications of working with Twitter to gather data? Even if Twitter's TOS, legal, and the IRB are all OK with it, are there informed consent issues in studying the artifacts of social media use?
Until now, none I think. The API only gave a partial view while scraping offered all tweets for a particular search term. The scraper had to be clever to dodge the anti-scraping systems, but you would get a more complete data set than using the API.
And the streaming API was terrible. Even if there was no data on the stream you could consume tens of gigabytes of bandwidth a day. Dreadful.
One easy example is language, for example tracking the spread of new words or other language constructs. You don’t care how the site looks, you care about the text that was previously input.
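To make the language example concrete, here's a rough sketch of what tracking a new word could look like once you have the raw text. Everything in it is my own assumption for illustration: the `(day, text)` input format, the `term_counts_by_day` name, and the sample posts are made up, and nothing here touches the actual Twitter API.

```python
import re
from collections import Counter
from datetime import date


def term_counts_by_day(posts, term):
    """Count how many posts per day contain `term` as a whole word.

    `posts` is an iterable of (day, text) pairs. This input shape is a
    hypothetical stand-in for whatever the API or a scraper returns.
    Matching is case-insensitive.
    """
    term = term.lower()
    counts = Counter()
    for day, text in posts:
        # Split into lowercase word tokens so punctuation does not block a match.
        words = re.findall(r"[\w']+", text.lower())
        if term in words:
            counts[day] += 1
    # Return the days in chronological order.
    return dict(sorted(counts.items()))


# Made-up sample data, purely for illustration.
sample_posts = [
    (date(2020, 1, 1), "That is so yeet"),
    (date(2020, 1, 1), "nothing to see here"),
    (date(2020, 1, 2), "Yeet! yeet again"),
    (date(2020, 1, 2), "just yeet it"),
]

daily = term_counts_by_day(sample_posts, "yeet")
```

Note that this only needs the text and a timestamp, which is the point: how the site renders the tweet is irrelevant for this kind of study.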
I wonder how they are going to enforce their rules, e.g. non-commercial use. I assume this will require some monitoring to be effective. Large-scale Twitter API access is typically pricey, so malicious actors might try to buy or steal researchers' credentials to cut costs.
I recently tried to sign up for the Twitter API and the process is nothing like what it used to be. You have to give them a lot of information to even qualify, such as what you're going to use it for. It used to be that those were just some fields you needed to fill out and you could sign up immediately. But nowadays the application process requires direct approval from their team, which means they're monitoring every API account like Apple does with their app store. And if you lie about your usage you are probably liable.
> You are either a master’s student, doctoral candidate, post-doc, faculty, or research-focused employee at an academic institution or university.
This is gross. Rather than using the internet as a democratizing force for education, they restrict the program to those already inside credential-granting institutions. So much great research has been done from outside the institution and yet Twitter is actively pushing outsiders to resort to scraping.
I always wondered: how many of these tweets are just Justin Bieber fandom type posts, bots, spam, or other dross? Twitter is infamous for its bad signal-to-noise ratio. These researchers need to write algos to filter out all the noise.
What's wrong with that? If you wanted to investigate, say, the rise of The Beatles - wouldn't you love to have access to the random thoughts of their fans in the 1960s?
Similarly, if you're researching bots and spam and how they manipulate people & markets - this is still useful.
Where are your sources, numbers, methodology? I mean anyone can make a kneejerk statement based on their perception (read: bubble), but that's not science.
I'd think libertarians would love Twitter. Deplatforming is the free market at work, and the government has no right to make them do business with anyone they choose not to.
Twitter should be judged by the way it governs its platform. And from a libertarian perspective it's governed poorly. Sure, under current laws they can get away with deplatforming in the way they do now, but there is nothing commendable or desirable about it for libertarians specifically.
And one might think liberals, putting liberty above order, would be aghast at silencing opponents to maintain order, and yet here we are. Our political theater has gotten pretty weird. The whole thing is looking more and more like unprincipled tribes vying for power.
Except I keep hearing about it constantly, everywhere, including Twitter. And yet I haven't seen a single one being banned for discussing fiscal policy, less regulation, conservative views on social programs etc. "Conservative voices [being] silenced" are almost always some variation of spamming evidently real-world damaging conspiracy theories or clear ToS violations.
You would think from the number of Republican/conservative accounts they have banned that they have some AI or parser dedicated to banning these types of voices.
[1] https://newscatcherapi.com/
[2] https://www.theverge.com/2017/1/27/14412014/dataminr-twitter...