Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python.
In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.
We plan to cover:
pandas data structures
loading data
subsetting and filtering
calculating summary statistics
dealing with missing values
merging data sets
creating new variables
basic plotting
exporting data
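As a rough preview of the topics listed above, here is a minimal sketch of what those steps can look like in pandas. The tables, column names, and values are invented purely for illustration and are not the workshop's actual materials. Plotting is left out because it requires matplotlib in addition to pandas.

```python
import io

import pandas as pd

# Hypothetical example data, invented for this sketch only.
people_csv = io.StringIO(
    "id,name,height_cm,weight_kg\n"
    "1,Ana,160,55\n"
    "2,Ben,175,\n"
    "3,Chen,180,81\n"
)
cities = pd.DataFrame(
    {"id": [1, 2, 3], "city": ["Oakland", "Berkeley", "Berkeley"]}
)

# Loading data: read_csv accepts a file path or any file-like object.
people = pd.read_csv(people_csv)

# pandas data structures: a DataFrame is a table; each column is a Series.
heights = people["height_cm"]

# Subsetting and filtering with a boolean mask.
tall = people[people["height_cm"] > 170]

# Calculating summary statistics (missing values are skipped by default).
mean_height = people["height_cm"].mean()

# Dealing with missing values: fill the gap with the column median.
people["weight_kg"] = people["weight_kg"].fillna(people["weight_kg"].median())

# Merging data sets on a shared key column.
merged = people.merge(cities, on="id", how="left")

# Creating a new variable from existing columns.
merged["bmi"] = merged["weight_kg"] / (merged["height_cm"] / 100) ** 2

# Exporting data: with no path given, to_csv returns the CSV text as a string.
csv_text = merged.to_csv(index=False)

print(merged)
```

Filling the missing weight with the median is only one possible choice; depending on the analysis, dropping the row with `dropna` may be more appropriate.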
This workshop assumes basic familiarity with the Python programming language. New to Python? Consider attending D-Lab's Python Fundamentals series.
Be sure to install Python before the workshop starts (link below). Installation can take up to an hour.
Download and install the Anaconda Distribution (Python 3.7)
Download Workshop Materials