Let’s build our first data pipeline!

Welcome to the fourth lecture of the Data Science Academy! In this lecture we will combine data preprocessing and ML models into data pipelines, and we will show some improvements over last week’s models. We will then apply these pipelines to two practical examples. The first is the wine dataset, which is tabular. The second is the famous MNIST dataset, which contains handwritten digits and is non-tabular.
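To give a rough idea of what "combining preprocessing and a model" can look like, here is a minimal sketch. It assumes scikit-learn and its bundled `load_wine` dataset, which may differ from the wine data used in the lecture; the step names `"scale"` and `"model"` and the choice of `StandardScaler` with `LogisticRegression` are illustrative assumptions, not the lecture's actual setup.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled wine dataset stands in for the lecture's data.
X, y = load_wine(return_X_y=True)

# Hold out a test set so preprocessing is fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A pipeline chains preprocessing and the model into one estimator.
# Step names and estimators are illustrative choices.
pipeline = Pipeline(
    steps=[
        ("scale", StandardScaler()),                   # standardize each feature
        ("model", LogisticRegression(max_iter=1000)),  # simple classifier
    ]
)

# fit() fits the scaler on X_train, transforms it, then fits the model.
pipeline.fit(X_train, y_train)

# score() applies the already-fitted scaler to X_test before predicting.
accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Because the scaler lives inside the pipeline, its statistics are learned from the training split only, which helps avoid leaking information from the test split into preprocessing.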

The Plan

Section	Time
Intro to today’s topics	10 minutes
Data Pipelines	20 minutes
Wine Dataset	30 minutes
Break	15 minutes
MNIST Dataset	40 minutes

If you missed the class, or you want to revisit some content, download the lecture recording part 1 and part 2.
