| I work with news data professionally, and I've found two broad categories of suppliers: the buyers and the scrapers. The buyers are people like Meltwater and LexisNexis, who have licensing agreements with publishers. They're often a bit old school and, frankly, far more expensive than they're worth. Also, good luck getting a usable, affordable API. Then there are the scrapers. The one I use is newsapi.ai, and I can broadly recommend them. They've got a decent selection, are happy to add stuff for you, and have lots of nice goodies baked in (e.g., NERD). Most of the others you'll find with a cursory "news api" search also fall into this category AFAICT, but few, if any, provide full text, which is what I need. From conversations I've had with my supplier, I believe they've got a Scrapy box running somewhere, pulling largely off RSS feeds. I wouldn't want their job, to be honest; there's so much to look after. This approach is fine for some needs, but you can literally see the gaps in the time series where something has fallen over. I'm very interested in this space and would love to hear others' experiences. |
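
For anyone who wants to check their own feed for the kind of gaps mentioned above, here is a minimal sketch of how one might flag them. It assumes you already have one publication date per article; `find_gaps` and `min_gap_days` are hypothetical names made up for this illustration, not part of any supplier's API.

```python
from collections import Counter
from datetime import date, timedelta


def find_gaps(article_dates, min_gap_days=1):
    """Return inclusive (start, end) date ranges that contain no articles.

    article_dates: iterable of datetime.date, one entry per article.
    min_gap_days: hypothetical threshold; shorter empty runs are ignored.
    """
    counts = Counter(article_dates)
    if not counts:
        return []

    day, last = min(counts), max(counts)
    gaps = []
    gap_start = None

    # Walk every calendar day between the first and last article.
    while day <= last:
        if counts[day] == 0:
            # Counter returns 0 for missing days without inserting them.
            if gap_start is None:
                gap_start = day
        elif gap_start is not None:
            gap_end = day - timedelta(days=1)
            if (gap_end - gap_start).days + 1 >= min_gap_days:
                gaps.append((gap_start, gap_end))
            gap_start = None
        day += timedelta(days=1)

    return gaps


# Made-up sample: nothing arrives on 3 and 4 January.
sample = [
    date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 2),
    date(2024, 1, 5), date(2024, 1, 6),
]
print(find_gaps(sample))
```

A whole day with zero articles is a crude signal. For a quiet source you would probably want a larger `min_gap_days`, or to compare against that source's typical daily volume, before blaming the scraper.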