Course curriculum

  • 1

    Introduction to Data Prep

    • Introduction to Data Prep for Machine Learning

    • Pre-requisite Knowledge

    • A Quick Guide to Course Structure, Notebooks, and Exercises

  • 2

    Importing & Cleaning Data

    • Chapter Intro - Importing & Cleaning Data

    • Importing Data - CSV, Excel, and SQL

    • Selecting Columns

    • Filtering Rows

    • Exercise - Import & Filter Data

    • Exercise Review - Import & Filter Data

    • Data Types Theory

    • Basic Data Validation

    • Comparing to a Trusted Datasource

    • Exercise - Data Validation

    • Exercise Review - Data Validation

    • Imputation Theory

    • Cleaning Data

    • Data Type Errors

    • Imputation with Zeros

    • Basic Imputation of Values

    • Exercise - Cleaning & Imputation

    • Exercise Review - Cleaning & Imputation

  • 3

    Exploratory Data Analysis

    • Chapter Introduction - EDA

    • Descriptive Stats for Numeric Features

    • Basic Plots for Numeric Features + Combining Axis & Functions

    • Basic Plots for Categorical Features

    • Exercise Review - Visuals for Numeric & Categorical Features

    • Continuous vs Continuous Variable Analysis 1

    • Continuous vs Continuous Variable Analysis 2

    • Categorical vs Continuous Variable Analysis

    • Categorical vs Categorical Variable Analysis

    • Exercise Review - Creating and Analyzing Multivariate Plots

  • 4

    Feature Engineering Part 1 - Encoding & Transformation

    • Chapter Intro - Feature Engineering

    • Training vs Testing in Feature Engineering

    • Encoding Theory (incl. One Hot Encoding)

    • Identifying Categorical Columns & Values

    • One Hot Encoding in Pandas

    • One Hot Encoding in SKLearn

    • Exercise - One Hot Encoding

    • Exercise Review - One Hot Encoding

    • Exercise Review - One Hot Encoding Pt 2

    • GetDummies vs OneHotEncoder

    • Transforming Distributions Theory

    • Identifying Skew in Python

    • Transforming Features in Python

    • Taking Logs Scenarios

    • Exercise - Transformations

    • Exercise Review - Transformations

  • 5

    Feature Engineering Part 2 - Outliers, Binning, and Scaling

    • Outliers Theory

    • Removing Outliers

    • Modifying Outliers

    • Exercise - Outliers

    • Exercise Review - Outliers

    • Binning Theory

    • Categorical Binning

    • Binning by Width & Frequency

    • Manual Binning

    • Final Thoughts on Binning

    • Smoothing

    • Smoothing in Practice

    • Exercise - Binning

    • Exercise Review - Binning

    • Why Feature Scaling Matters

    • Scaling Features Theory

    • Min Max Scaling

    • Scaling Testing Data

    • Standard Scaler

    • Final Thoughts on Scaling

    • Exercise - Scaling

    • Exercise Review - Scaling

    • Making Feature Engineering Decisions

  • 6

    Feature Selection

    • Chapter Intro - Feature Selection

    • Manual Feature Selection

    • Feature Selection with Continuous Target

    • Correlation Coefficients - Continuous Var + Continuous Feature

    • ANOVA - Continuous Target + Categorical Feature

    • Feature Selection with Categorical Target Variable

    • Box Plots - Categorical Var + Continuous Feature

    • Chi-square - Categorical Var + Categorical Feature

    • Summary of Feature Selection Techniques

  • 7

    Course Summary

    • Course Conclusion

  • 8

    Qualified Assessment

    • Qualified Assessment