In my previous post, we discussed how neural networks make predictions and learn from data. Two processes are responsible for this: the forward pass and the backward pass, the latter also known as backpropagation. You can learn more about it here:
This post will dive into how we can optimise this “learning” and “training” process to increase the performance of our model. The areas we will cover are computational improvements and hyperparameter tuning, along with how to implement them in PyTorch!
But, before all that good stuff, let’s quickly jog our memory about neural networks!
Neural networks are large mathematical expressions that try to find the “right” function that can map a set of inputs to their corresponding outputs. An example of a neural network is depicted below:
Each hidden-layer neuron carries out the following computation:
- Inputs: These are the features of our dataset.
- Weights: Coefficients that scale the inputs. The goal of the algorithm is to find the optimal coefficients through gradient descent.
- Linear Weighted Sum: Sum up the products of the inputs and weights and add a bias/offset term, b.
- Hidden Layer: Multiple neurons are stacked together to learn patterns in the dataset. The superscript refers to the layer and the subscript to the index of the neuron in that layer.
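To make the list above concrete, here is a minimal PyTorch sketch of the linear weighted sum, first for a single neuron and then for a whole hidden layer. The sizes (3 input features, 4 neurons) and the names `x`, `W`, `b` and `z` are assumptions chosen purely for illustration, and the values are random rather than learned. It covers only the weighted sum and bias described in the list, nothing more.

```python
import torch

torch.manual_seed(0)  # make the illustrative random values reproducible

# Assumed sizes for illustration: 3 input features, 4 hidden-layer neurons.
x = torch.tensor([1.0, 2.0, 3.0])  # inputs: the features of one sample
W = torch.randn(4, 3)              # weights: one row of coefficients per neuron
b = torch.randn(4)                 # bias/offset: one term per neuron

# Linear weighted sum for a single neuron (the neuron at index 0):
# multiply each input by its weight, sum the products, add the bias.
z_0 = (W[0] * x).sum() + b[0]

# The same computation for every neuron in the layer at once.
z = W @ x + b

# torch.nn.Linear performs this weighted sum; here we copy our
# weights and biases into it so the two results can be compared.
layer = torch.nn.Linear(in_features=3, out_features=4)
with torch.no_grad():
    layer.weight.copy_(W)
    layer.bias.copy_(b)
z_linear = layer(x)

print(z)
print(z_linear)
```

Both printed tensors should match, since `torch.nn.Linear` computes the same weighted sum plus bias for each neuron in the layer.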