I work in the industry and can confirm: the tactics used to dissuade people from aggregating horse racing data, so we can sell $2 PDFs, are extremely counter-productive and reflect the age of the industry. Several orgs, including Equibase (US-based, the gatekeeper of a good portion of handicapping data), will regularly send cease and desist orders to people who attempt to automate aggregation of data, even with free, publicly available content. That's at least half the reason PDFs are used when customers purchase data access: to make aggregation harder (you should see some of the white space and character encoding fuckery they use to throw off aggregators).
I suppose some of this often depends on the quality of the data as well. Most data entry is done by a human at the track during the race. None of the data about races or the horse stats is collected by a computer; it's 95% hand-entered. That also goes for pedigree information and other statistics, including medications, weights, etc. And 100% of that is usually self-reported.
Much of the current handicapping in the industry is everyone trying to protect their personal mountains of data. Tech-minded people would love to provide open, controlled API services so that people can do what they will with our mountains of data. But "giving it away for free" is a non-starter for the good ole boys at the top.
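For anyone wondering what that white space and encoding noise looks like from the aggregator's side, here is a minimal sketch in Python. It assumes the text has already been extracted from the PDF, and it only guesses at the kind of tricks involved (invisible characters, odd spaces, ligatures). The helper name and the sample string are made up for illustration and are not taken from any real vendor's files.

```python
import re
import unicodedata

# Invisible characters that can be sprinkled into text to break naive matching:
# zero-width space/non-joiner/joiner, word joiner, BOM, soft hyphen.
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff\u00ad"))


def normalize_extracted_text(raw: str) -> str:
    """Hypothetical helper: tidy up text pulled out of a PDF."""
    # NFKC folds compatibility forms (ligatures, full-width digits,
    # non-breaking and thin spaces) into their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Delete the invisible characters listed above.
    text = text.translate(_INVISIBLE)
    # Collapse any run of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()


# Made-up sample: zero-width space, non-breaking spaces, thin space, "fi" ligature.
sample = "Sea\u200bbiscuit\u00a0\u00a0 126\u2009lbs  \ufb01rst"
print(normalize_extracted_text(sample))
```

This only handles the text layer. It does nothing about layout tricks such as columns being extracted out of order, which tends to be the harder part.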
I was involved for a number of years with a UK-based horse racing ratings service (handicapping, if you're in the US). This service used to license its base data from the Press Association[1] and then run algorithms on top to produce the ratings.
There are certain things I can't say due to NDAs which are probably still in effect, but the cost of licensing this basic data was in excess of £10k per annum. So, unless you were a serious bettor or were looking to operate a service of some kind, it's beyond the pocket of most individuals.
Timeform in the UK also license some of their own proprietary data via an API[2]. They've published some pricing on their website and you're looking at between £6k and £12k per year. That is for the same data that is available on their website for a subscription fee of £75 per month (about £900 a year), just delivered via their API.
There's even a specific UK organisation which apparently has permission from the British Horseracing Authority to officially license key racing data. This is who sells the data to bookmakers, form guides, racing newspapers etc. They have a rate card published on their website.[3] Private, pro-punter? £8.5k per year please.
It's a bit of a rort really. Most of the data is "freely" available online or in the racing press, but if you want to access it in any usable format, you either build a scraper (good luck staying on top of the website changes) or pay a stack to access things programmatically.
[1] https://pa.media/racing-betting/horseracing/
[2] https://www.timeform.com/commercial/products/api
[3] http://www.racecoursedatacompany.com/