by autokad 1785 days ago
Has anyone found a data set that has all years? Football data seems kinda protected. It's really easy to get all baseball data.
1 comment

You can get 2020 data from the same source that the NFL tutorial uses, but that's only two years. It must exist, I guess?
This tool looks promising for keeping it up to date.

> The lack of publicly available National Football League (NFL) data sources has been a major obstacle in the creation of modern, reproducible research in football analytics. While clean play-by-play data is available via open-source software packages in other sports (e.g. nhlscrapr for hockey; PitchF/x data in baseball; the Basketball Reference for basketball), the equivalent datasets are not freely available for researchers interested in the statistical analysis of the NFL. To solve this issue, a group of Carnegie Mellon University statistical researchers, including Maksim Horowitz, Ron Yurko, and Sam Ventura, built and released nflscrapR, an R package which uses an API maintained by the NFL to scrape, clean, parse, and output clean datasets at the individual play, player, game, and season levels.

https://www.kaggle.com/maxhorowitz/nflplaybyplay2009to2016