In today’s lesson, we will be building towards real-world examples for better context and implementation.

Over the Moon on Basic Neural Networks

Let’s open up the file(s) in the 01-Ins_OverTheMoon folder to get started.

Getting Hands On with Model Optimization

Let’s open up the file(s) in the 03-Ins_SynapticBoost folder to get started.

In ML, regardless of whether it is deep learning or another form of ML, it is important to:

  • Weigh the features (input variables, independent variables)
  • Reduce noise (removing columns or rows that might disrupt the real-world effectiveness of the model)

In the above activity, we see that some of the reaction times are extreme outliers. Depending on the business context, it may be necessary to remove these outliers from your input features so that your model can generalize with a high degree of accuracy and precision.

Thus, exploratory data analysis (EDA) and preprocessing matter even with neural networks.
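As a minimal sketch of the kind of preprocessing described above, here is one common way to flag and drop extreme outliers with pandas using the 1.5 × IQR rule. The `reaction_time` column and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical reaction-time data with a few extreme outliers
df = pd.DataFrame({"reaction_time": [0.21, 0.25, 0.19, 0.23, 0.22, 4.80, 0.20, 5.10]})

# Standard 1.5 * IQR rule for flagging outliers
q1 = df["reaction_time"].quantile(0.25)
q3 = df["reaction_time"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the rows inside the whiskers
cleaned = df[df["reaction_time"].between(lower, upper)]
print(len(df), len(cleaned))  # the two extreme values are dropped
```

Whether to drop, cap, or keep outliers is a business-context decision, not a purely statistical one.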

What is model convergence?

When we say a model has converged, we mean that the loss curve has flattened: increasing the number of epochs or iterations will no longer meaningfully reduce the model's error.
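A minimal sketch of what "converged" means in code: training has stopped helping when the per-epoch drop in loss falls below a small tolerance for several epochs in a row (this is essentially what early-stopping callbacks check). The loss values here are made up for illustration:

```python
def has_converged(loss_history, tol=1e-3, patience=3):
    """Return True if loss improved by less than `tol` for the last `patience` epochs."""
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    improvements = [recent[i] - recent[i + 1] for i in range(patience)]
    return all(imp < tol for imp in improvements)

# Illustrative loss curve: big early drops, then a flat tail
losses = [0.90, 0.45, 0.30, 0.2501, 0.2500, 0.2500, 0.2500]
print(has_converged(losses))  # True: the curve has flattened
```

In practice you would rely on a library's early-stopping utility rather than hand-rolling this check, but the underlying logic is the same.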

Take the Guesswork Out of Model Optimization

Let’s open up the file(s) in the 04-Ins_AutoOptimization folder to get started.

What is hyperparameter tuning?

Hyperparameter tuning is how we configure our model so that it performs as well as possible against our training and validation data sets.

This is in the hopes of deploying the best ML model/product that we can offer.

Hyperparameter tuning does not exist only in deep learning. In fact, it exists in almost all known ML models, such as:

  • XGBoost
  • Logistic regression
  • And so on…

Much of the hyperparameter tuning process can be automated: the library brute-forces many configurations to find the one with the best accuracy and lowest loss in your outcome.
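As a minimal sketch of this automation outside of deep learning, scikit-learn's `GridSearchCV` brute-forces every combination in a parameter grid and keeps the configuration with the best cross-validated score. The dataset and grid values here are made up for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic classification data just for the demo
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Every combination in this grid will be tried with 5-fold cross-validation
param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],  # regularization strength
    "penalty": ["l2"],            # kept simple for the sketch
}

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

The same pattern (define a search space, let the library exhaustively or randomly search it) carries over to Keras Tuner, XGBoost, and most other ML tooling.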

How does hyperparameter tuning work in general?

In many cases, it is a brute-force method of finding the optimal configuration parameters for your objective (in our case, usually accuracy).

As shown in the above instructor’s activity, the tuner will iterate across different configurations in terms of:

  • Switching between activation functions
  • Varying the number of neurons in the first layer from 1 to 10
  • Using 1 to 6 hidden layers, where each layer can have 1 to 10 neurons

This means hyperparameter tuning is very time-consuming and resource-heavy. Typically, we will create a proof-of-concept model, run a few tests, ensure we are doing proper feature engineering, and create a summarized report of our model before attempting to tune it.
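To see why tuning is so expensive, here is a rough count of the search space described above. The numbers mirror the activity (1 to 10 neurons in the first layer, 1 to 6 hidden layers with 1 to 10 neurons each); the two candidate activation functions are an assumption for the sketch:

```python
# Assumed: the tuner chooses between two activation functions
activations = ["relu", "tanh"]

# 1 to 10 neurons in the first layer
first_layer_options = 10

# For n hidden layers, each of the n layers independently has 10 neuron
# options, so 10**n combinations; sum over 1 to 6 hidden layers
hidden_layer_combos = sum(10 ** n_layers for n_layers in range(1, 7))

total = len(activations) * first_layer_options * hidden_layer_combos
print(total)  # over 22 million distinct configurations
```

Even this modest search space has millions of configurations, which is why tuners use budgets, early stopping, or smarter search strategies (random search, Hyperband, Bayesian optimization) rather than truly exhaustive search.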

The same concept works for other learning models as well.

Getting Real with Neural Network Datasets

Let’s open up the file(s) in the 06-Ins_GettingReal folder to get started.

Ref: https://static.bc-edx.com/data/dl-1-2/m21/lessons/2/img/NN_Preprocess_Flowchart.pdf

There may be an error in the 4th cell, where the original code was:

  • encode_df.columns = enc.get_feature_names(attrition_cat)

In a recent update to the scikit-learn library, this method was renamed, so the line should read:

  • encode_df.columns = enc.get_feature_names_out(attrition_cat)

What is the difference between pd.get_dummies() and OneHotEncoder from scikit-learn?

  • OneHotEncoder is generally preferred because of persistence.
    • pd.get_dummies() only encodes the categorical values that happen to be present in the current dataset. If a later dataset contains different categorical values, the encoded columns will no longer match the columns your model was trained on, creating a conflict with your training data.
    • OneHotEncoder does not have this flaw because it stores the fitted categories in the encoder object. Applying the same fitted encoder to new data always produces the same columns, and unseen categorical values can be handled gracefully (for example, with handle_unknown="ignore") instead of breaking the schema.
  • pd.get_dummies() is fine for preliminary analysis, but OneHotEncoder is what you should deploy to production.