A Free, Interactive Course Using Tidy Tools

Predictive modeling, or supervised machine learning, is a powerful tool for using data to make predictions about the world around us. Once you understand the basic ideas of supervised machine learning, the next step is to practice your skills so you know how to apply these techniques wisely and appropriately. In this course, you will work through four case studies using data from the real world; you will gain experience in exploratory data analysis, preparing data so it is ready for predictive modeling, training supervised machine learning models, and evaluating those models.

To take this course, you need some familiarity with tidyverse packages such as dplyr and ggplot2, along with some exposure to machine learning basics. Now let's get started!

Chapter 1: Not mtcars AGAIN

In this first case study, you will predict fuel efficiency using a US Department of Energy data set on real, present-day cars.

Chapter 2: Stack Overflow Developer Survey

Stack Overflow is the world's largest online community for developers, and you have probably used it to find an answer to a programming question. The second chapter of this course uses data from the annual Stack Overflow Developer Survey to practice predictive modeling and find which developers are more likely to work remotely.

Chapter 3: Get out the vote

In the third case study, you will use data on attitudes and beliefs in the United States to predict voter turnout. You will apply your skills in dealing with imbalanced data and explore more resampling options.

Chapter 4: But what do the nuns think?

The last case study in this course uses an extensive survey of Catholic nuns fielded in 1967 to once more put your practical machine learning skills to use. You will predict the age of these religious women from their responses about their beliefs and attitudes.