Let’s build our first data pipeline!

Welcome to the fourth lecture of the Data Science Academy! In this lecture we will combine data preprocessing and ML models into data pipelines, and we will show some improvements over last week’s models. We will then apply these pipelines to two practical examples. The first is the wine dataset, which is tabular. The second is the famous MNIST dataset, which contains handwritten digits and is non-tabular.
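To make the idea of "combining data preprocessing and ML models" concrete before the lecture, here is a minimal sketch of a pipeline in plain Python. It is only an illustration under stated assumptions: the class names (`Standardizer`, `NearestCentroid`, `SimplePipeline`) and the toy data are hypothetical and not taken from the lecture material. In practice, libraries such as scikit-learn provide a ready-made `Pipeline` class for this purpose.

```python
from statistics import mean, pstdev


class Standardizer:
    """Preprocessing step: rescale each column to zero mean and unit variance."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means = [mean(col) for col in columns]
        # Guard against constant columns, whose standard deviation is 0.
        self.stds = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform(self, rows):
        return [
            [(x - m) / s for x, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


class NearestCentroid:
    """Model step: predict the class whose mean feature vector is closest."""

    def fit(self, rows, labels):
        grouped = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append(row)
        self.centroids = {
            label: [mean(col) for col in zip(*members)]
            for label, members in grouped.items()
        }
        return self

    def predict(self, rows):
        def squared_distance(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(
                self.centroids,
                key=lambda label: squared_distance(row, self.centroids[label]),
            )
            for row in rows
        ]


class SimplePipeline:
    """Chain preprocessing steps and a final model behind one fit/predict interface."""

    def __init__(self, transformers, model):
        self.transformers = transformers
        self.model = model

    def fit(self, rows, labels):
        # Each transformer is fitted on the training data only, then applied to it.
        for step in self.transformers:
            rows = step.fit(rows).transform(rows)
        self.model.fit(rows, labels)
        return self

    def predict(self, rows):
        # New data passes through the already-fitted transformers unchanged.
        for step in self.transformers:
            rows = step.transform(rows)
        return self.model.predict(rows)


# Made-up toy data: two features on very different scales, two classes.
X_train = [[1.0, 1000.0], [1.2, 1100.0], [3.0, 3000.0], [3.2, 3100.0]]
y_train = ["a", "a", "b", "b"]

pipeline = SimplePipeline([Standardizer()], NearestCentroid()).fit(X_train, y_train)
predictions = pipeline.predict([[1.1, 1050.0], [3.1, 3050.0]])
print(predictions)
```

The point of the design is that preprocessing parameters (here the column means and standard deviations) are learned once during `fit` and reused during `predict`, so training and new data are always transformed in the same way.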

The Plan

| Section | Time |
| --- | --- |
| Intro to today’s topics | 10 minutes |
| Data Pipelines | 20 minutes |
| Wine Dataset | 30 minutes |
| Break | 15 minutes |
| MNIST Dataset | 40 minutes |

If you missed the class, or you want to revisit some content, download the lecture recording part 1 and part 2.