Hacker News
by listenallyall 2477 days ago
As you stated, the vast majority of racing data is collected, measured and entered by hand, by people who are paid to perform this job. It costs enormous amounts of money to employ all these people to watch every race in meticulous detail and gather all the data required to publish the Daily Racing Form. Why would you expect them NOT to protect this proprietary, valuable information?

Almost all tracks publish result charts online for free along with race videos. If you want free, why not compile the data yourself? How long would DRF or Equibase exist if people could access their data for free?

2 comments

The DRF relies on Equibase for program and scratch data for all US and most international tracks. Even Churchill Downs relies on data agreements with Equibase to provide up-to-date information to feed to Totes. Result chart information is also almost exclusively Equibase data, at least in the US. They make closed-door deals with tracks, ADWs and Totes to provide data feeds.

Also, it's important to make the distinction between editorial content (analysis, predictions, subjective descriptions of a horse's or jockey's performance) and empirical information (horse weights, medication, surface conditions, weather, placements, jockey-horse combo win rates, etc.).

The DRF sells its speed ratings as well as analysis of pedigree and past performances. There's value in that and it definitely justifies the cost of their publication and the other publications that perform similar work.

The critical issue with your stance is that users have no easy way to aggregate their own data. The free PPs Equibase offers have been scraped before, and I know of several specific instances where the creators of those scrapers were sent cease-and-desist letters for collecting information Equibase otherwise provides for free. One was even sent to GitHub to remove the repository that contained the code.

I'm not advocating scraping (please don't scrape sites like that), but there isn't any industry interest in providing modern, consumable data. Wouldn't it be in Equibase's best interest to put that information behind an API and sell access to the public? The industry actively discourages using publicly available data.

Charging a lot for the data is self-defeating. For the sport to grow, more people need to be interested in it. One measure of interest is betting turnover, and a proportion of betting turnover is usually used to fund the industry. One strategy to increase betting turnover could be to make the data free and easily accessible in an automated, machine-readable form.

I really do not care about the likes of DRF or Equibase and how long they will or won't exist. I think it is up to the industry itself to ensure this data is free and easily accessible. Look at Hong Kong as the prime example: loads of free data, huge betting turnover, well-funded industry.

You may not care about DRF, but it is the sole source from which a typical horseplayer can get reliable information about the horses. Without it, these players would have zero guidance and would likely abandon the sport.

DRF makes racing data easily accessible. If it were left to the tracks, which are independent entities (unlike NFL/NBA/MLB), a horseplayer would have to compile past performances from dozens of sources. The fields of a single day's race card may have run at 30 or more individual venues, in aggregate. Even if that data were free (well, the result charts and replay videos are already free, so technically this is already possible), it would take a ton of work to assemble it all in a digestible format -- which the DRF does for 6 bucks.

I don't believe HK offers free data that is not available from American tracks. There is no API, and the result charts are less detailed than those from American tracks. If info were so freely available to everyone, how would someone like Bill Benter gain such a huge advantage? Why wouldn't he replicate his methods in the US? Probably because the US makes MORE data available.