Here are some sources of freely available data that might be interesting to use in undergraduate or graduate courses. Most of them would suit courses in data science, statistics, math, applied math, or operations research.
If you have any other good ideas, contact us and we’ll add them!
Sports
- Retrosheet.org
- Fangraphs
- Baseball Prospectus
- Baseball-Reference.com
- Basketball-Reference.com
- Hockey-Reference.com
Government sites with data
- NOAA: https://www.ncdc.noaa.gov/data-access
- CDC: http://www.cdc.gov/DataStatistics/
- Data.gov: http://www.data.gov/, https://catalog.data.gov/dataset, https://www.data.gov.uk, https://www.data.gov.fr
- US Energy Information Administration: https://www.eia.gov/opendata/
- Listing of data from various government sources: http://gsociology.icaap.org/data.htm
- Open Payments Data (pharmaceutical company payments to doctors): http://www.cms.gov/OpenPayments/Explore-the-Data/Dataset-Downloads.html
- Open Government Data: http://opengovernmentdata.org/data/
Graphs and Networks
- Stanford SNAP: http://snap.stanford.edu/data/index.html
- Mark Newman: http://www-personal.umich.edu/~mejn/netdata/
- UCINET: http://vlado.fmf.uni-lj.si/pub/networks/data/UciNet/UciData.htm
- The University of Florida Sparse Matrix Collection: http://www.cise.ufl.edu/research/sparse/matrices/
- TwitteR: http://cran.r-project.org/web/packages/twitteR/index.html
Other
- Kaggle: www.kaggle.com
- rOpenSci: https://ropensci.org/packages/ - R packages for scraping data from various sources on the web
- Gapminder: http://www.gapminder.org/data/
- Our World in Data: https://ourworldindata.org/
- Data used in 538 articles and visualizations: https://github.com/fivethirtyeight/data
- KDnuggets: http://www.kdnuggets.com/datasets/
- UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
- StatLib: http://lib.stat.cmu.edu
- rfigshare: http://cran.r-project.org/web/packages/rfigshare/index.html
- Weather: http://climate.psu.edu/data/ida/index.php?t=3&x=faa_raw&id=KPNE
- Gulf Science Data: https://data.gulfresearchinitiative.org/data-discovery
- The International Disaster Database: http://www.emdat.be/database
- Yahoo Finance: http://finance.yahoo.com
- Yelp Data sets