HIDSI

Data Wrangling Part 1

Workshop Description

This workshop introduces attendees to Jupyter Notebooks as a tool for reproducible, well-documented computational workflows. We will cover how to import data from file formats such as .csv and .tsv, then walk through the steps of cleaning and organizing that data for analysis. The workshop concludes with attendees applying these skills to wrangle a real-life dataset and visualize its contents using the Python libraries Pandas and Matplotlib.

Prerequisites:

  • Basic understanding of Python
  • Basic understanding of Unix-like file systems
  • A computer with an internet connection
  • MFA/DUO enabled on your UH Account

Learning Objectives:

By the end of this workshop, attendees will be able to:

  • Understand why Jupyter notebooks are useful and how to document workflows with them
  • Apply the Python library 'Pandas' to load and clean data from different file formats
  • Use built-in tools in 'Pandas' to analyze data that has been loaded
  • Visualize results from 'Pandas' with the 'Matplotlib' plotting library
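
As a rough preview of the load, clean, analyze, and visualize steps listed above, here is a minimal sketch. The inline data, the column names `site` and `temp_c`, and the choice to drop incomplete rows are illustrative assumptions, not the workshop's actual dataset or method.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical tab-separated data standing in for a real .tsv file.
raw_tsv = io.StringIO(
    "site\ttemp_c\n"
    "A\t21.5\n"
    "A\t\n"          # missing reading, parsed as NaN
    "B\t19.0\n"
    "B\t20.0\n"
)

# Load: sep="\t" reads .tsv data; the default sep="," reads .csv data.
df = pd.read_csv(raw_tsv, sep="\t")

# Clean: drop rows where the temperature reading is missing.
clean = df.dropna(subset=["temp_c"])

# Analyze: mean temperature per site.
mean_by_site = clean.groupby("site")["temp_c"].mean()

# Visualize: the Pandas .plot() method draws with Matplotlib.
ax = mean_by_site.plot(kind="bar")
ax.set_ylabel("Mean temperature (°C)")
plt.tight_layout()
```

In a Jupyter Notebook, the bar chart typically renders inline below the cell. With a real file, a path such as a .csv or .tsv filename would replace `raw_tsv`.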

Tools used in this workshop:

  • Python
  • Pandas
  • Matplotlib
  • Jupyter Notebook


Registration Link

COVID-19 Guidelines: For in-person attendance, attendees must be fully compliant with UH COVID guidelines.