by jparker165 4259 days ago
You'll surely be more efficient with handling data, but you don't have digital access to all the data you'd need.

For example, the chance that a player plays at all (if they are in questionable health) is one of the most important signals. But you can't just write a regression from past data on that player/team/coach. An intuitive guess from reading several media reports will be far more accurate. This guy is surely manually entering a "% chance of playing" driver into these models.

But, if you teamed up with a subject matter expert that fed meta-predictions into your model, you'd likely end up with better results.
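
To make the "% chance of playing" driver concrete, here is a rough sketch of how a hand-entered estimate could be combined with a model's output. Everything in it is an assumption for illustration: the names `PlayerProjection`, `expected_points`, and `rank_players` are made up, and simply multiplying the conditional projection by the expert's probability is one plausible way to do it, not necessarily what the person in the article does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlayerProjection:
    # All fields are hypothetical; a real model would carry far more detail.
    name: str
    points_if_playing: float  # model output, conditional on the player actually playing
    chance_of_playing: float  # entered by hand by a subject matter expert, in [0, 1]


def expected_points(p: PlayerProjection) -> float:
    """Weight the conditional projection by the expert's chance-of-playing estimate."""
    if not 0.0 <= p.chance_of_playing <= 1.0:
        raise ValueError(
            f"chance_of_playing must be in [0, 1], got {p.chance_of_playing}"
        )
    return p.chance_of_playing * p.points_if_playing


def rank_players(players: List[PlayerProjection]) -> List[PlayerProjection]:
    """Sort players from highest to lowest expected points."""
    return sorted(players, key=expected_points, reverse=True)
```

With this weighting, a star projected for 20 points but judged only 50% likely to play ranks below a healthy player projected for 12, which is the kind of adjustment a pure regression on past box scores would miss.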


> You'll surely be more efficient with handling data, but you don't have digital access to all the data you'd need.

It occurs to me that collating said data in an accessible format for a modest fee would be a pretty good low hanging fruit business idea. I'm sure there must be something like that out there but last time I looked into this, 2-3 years ago, I couldn't even get the NBA schedule in JSON or some other programming-friendly format, let alone things like injuries and lineup changes.

Everything old is new again

http://en.wikipedia.org/wiki/STATS_LLC

"STATS LLC is a global sports statistics and information company – the company name originated as an acronym for "Sports Team Analysis and Tracking Systems". It was founded on April 30, 1981[1] by John Dewan,[2] who became the company's CEO. STATS was an outgrowth of the grassroots non-profit Project Scoresheet, a volunteer network created to collect baseball statistics, prompted by a suggestion made by Bill James, who later joined STATS for a time. In 1987, STATS developed a reporter network for Major League Baseball and provided research for NBC's postseason baseball coverage, and by 1989 was doing the same for ESPN's broadcasts.[1]"

Sounds like an interesting project. Where would one even get this data from? Scraping ESPN is always an option (a painful one at that) but I'm not sure they would appreciate that.

Edit: After a quick Google search I found several data providers: http://www.quora.com/Are-there-any-APIs-with-game-schedules-...

The services are out there, and they ain't cheap.