Data Sources

Kaggle

A community of data science professionals that hosts datasets, competitions, and tutorials for learning data science.

UCI Machine Learning Repository

A collection of datasets for machine learning, mostly from academic papers.

US Census Bureau

Data collected by the US government, including population, housing, income, and other socioeconomic statistics.

Data.gov

A collection of data from various departments of the U.S. government.

Unearthed

Hosts a variety of industrial competitions related to energy and natural resource data.

FRED

Federal Reserve Economic Data, a database of economic time series maintained by the Federal Reserve Bank of St. Louis. Useful for finance applications.

Python

Anaconda

A Python distribution that installs Python on your computer and helps manage Python packages.

NumPy

Python package supporting multidimensional arrays for numerical computing.
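As a minimal sketch of what "multidimensional arrays" means in practice, the example below builds a small two-dimensional array and runs a few whole-array operations on it. The values and variable names are made up for illustration.

```python
import numpy as np

# A 2 x 3 array (two rows, three columns) built from nested lists.
# The numbers are illustrative only.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

shape = grid.shape               # dimensions of the array: (rows, columns)
column_sums = grid.sum(axis=0)   # add down each column
doubled = grid * 2               # arithmetic applies to every element at once

print(shape)
print(column_sums)
print(doubled)
```

Operations like these work on the entire array without an explicit Python loop, which is the main reason NumPy is used for numerical work.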

pandas

Python package that serves as the main tool for analyzing tabular data in Python. Also useful for reading data from CSV and TSV files.
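The sketch below shows one way to read CSV and TSV data with pandas. To stay self-contained it reads from in-memory strings rather than files on disk; the column names and values are invented for illustration. With a real file, you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Illustrative data only: the same small table in CSV and TSV form.
csv_text = "name,score\nAda,90\nGrace,85\n"
tsv_text = "name\tscore\nAda\t90\nGrace\t85\n"

# read_csv handles comma-separated data by default.
csv_frame = pd.read_csv(io.StringIO(csv_text))

# For tab-separated data, pass sep="\t".
tsv_frame = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Once loaded, columns can be analyzed directly.
mean_score = csv_frame["score"].mean()

print(csv_frame)
print(mean_score)
```

Both calls produce the same DataFrame here; only the separator differs.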

R

R Project

The official site for downloading R onto your computer.

dplyr

An R package for manipulating data frames: filtering rows, selecting columns, grouping, and summarizing.

ggplot2

A widely used R package for visualizing data, built on the grammar of graphics.

Amazon Web Services

Amazon Web Services

Amazon's cloud computing platform, one of the most widely used cloud services.

Other Resources

Tableau

Software for producing powerful data visualizations without writing any code.