The training samples are passed through the network, and the output produced by the network is compared with the actual (target) output. The difference between the two is the error, which is used to adjust the weights of the neurons so that the error gradually decreases. The gradients needed for these adjustments are computed with the backpropagation algorithm, also called backprop.

Iteratively passing batches of data through the network and updating the weights so as to decrease the error is known as stochastic gradient descent (SGD). The size of each weight update is controlled by a parameter called the learning rate.
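As a minimal sketch of the loop described above, the following fits a single linear neuron with mini-batch SGD. The function name `sgd_linear`, the toy data generated from y = 2x + 1, the mean squared error, and the hyperparameter values are all assumptions chosen for illustration, not part of any particular library.

```python
import random

def sgd_linear(data, learning_rate=0.1, batch_size=4, epochs=200, seed=0):
    """Fit y = w*x + b by mini-batch SGD on mean squared error (illustrative)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradients of the mean squared error over this batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # The learning rate scales how far each update moves the weights.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Hypothetical noise-free training data generated from y = 2x + 1.
samples = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_linear(samples)
print(w, b)
```

With this toy data the learned `w` and `b` should end up close to 2 and 1. A larger learning rate takes bigger steps and can overshoot, while a smaller one converges more slowly.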

We adjust the value of each weight so that the error is minimized. First we need to figure out whether to increase or decrease the weight; the sign of the gradient of the error with respect to that weight tells us which. Once we know the direction, we keep updating the weight in that direction until the error reaches a minimum. Eventually we may reach a point where further updates would increase the error. At that point we stop, and the current value is our final weight.
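The idea can be sketched for a single weight as follows. The quadratic `error` curve with its minimum at w = 3, the helper names, and the learning rate are hypothetical choices made only to keep the example small; a real network's error surface is far more complex.

```python
def error(w):
    # Hypothetical error curve with its minimum at w = 3.
    return (w - 3.0) ** 2

def slope(w):
    # Derivative of the error with respect to w.
    return 2.0 * (w - 3.0)

def descend(w, learning_rate=0.1, max_steps=1000):
    current = error(w)
    for _ in range(max_steps):
        # Negative slope: increase w. Positive slope: decrease w.
        candidate = w - learning_rate * slope(w)
        candidate_error = error(candidate)
        if candidate_error >= current:
            break  # a further update would no longer reduce the error
        w, current = candidate, candidate_error
    return w

final_from_left = descend(0.0)    # starts below the minimum, so w increases
final_from_right = descend(10.0)  # starts above the minimum, so w decreases
print(final_from_left, final_from_right)
```

Both starting points should settle near w = 3: the sign of the slope picks the direction, and the loop stops once another step would not lower the error.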

In backpropagation, computation moves from the output layer back toward the input layer. We first calculate the error between the predicted output and the target output using a cost function. This error is then propagated backward through the network to determine how much each weight contributed to it; this process is called backpropagation. Finally, an optimizer such as gradient descent uses these gradients to update the weight values so that the error decreases.
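To make the backward flow concrete, here is a small sketch for a network with one input, one sigmoid hidden neuron, and one linear output. The architecture, the squared-error cost, the names `forward`, `backward`, and `loss`, and all numeric values are assumptions for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    h = sigmoid(w1 * x)  # hidden activation
    y = w2 * h           # network output
    return h, y

def loss(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2  # squared-error cost

def backward(x, target, w1, w2):
    h, y = forward(x, w1, w2)
    d_y = y - target                   # dL/dy at the output
    grad_w2 = d_y * h                  # dL/dw2 for the output-layer weight
    d_h = d_y * w2                     # error passed back to the hidden neuron
    grad_w1 = d_h * h * (1.0 - h) * x  # dL/dw1 via the sigmoid derivative
    return grad_w1, grad_w2

x, target = 1.5, 0.8       # one hypothetical training sample
w1, w2 = 0.4, -0.3         # arbitrary starting weights
learning_rate = 0.5
before = loss(x, target, w1, w2)
for _ in range(100):
    g1, g2 = backward(x, target, w1, w2)
    # Plain gradient descent acting as the optimizer.
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
after = loss(x, target, w1, w2)
print(before, after)
```

The backward pass starts at the output (`d_y`), reuses that value to reach the hidden neuron (`d_h`), and so obtains a gradient for each weight; the optimizer step is kept separate from the gradient computation.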

Backpropagation tells us how much of the error each weight in the neural network is responsible for. Because this is computed for every weight in a single backward pass, all the weights can be adjusted simultaneously.