Artificial Neural Network

Artificial neural networks (ANNs) are the basic functional units of deep learning. The neural network emerged from a very popular machine learning algorithm called the perceptron.

A neuron is the basic unit of computation in a neural network; it is also called a node or unit. The leftmost layer of the network is called the input layer, and the neurons within it are called input neurons. This layer provides the input data, or features, to the network.

The middle layer is called a hidden layer, since the neurons in this layer are neither inputs nor outputs. The number of hidden layers is termed the depth of the neural network, and deeper networks can learn more complex functions.

The rightmost layer, the output layer, contains the output neurons, or, as in this case, a single output neuron. This is the layer that produces the predictions. The activation function used in this layer differs from problem to problem.

For a binary classification problem, we want the output to be either 0 or 1, so a sigmoid activation function is used. For a multiclass classification problem, a softmax (think of it as a generalization of the sigmoid to multiple classes) is used. For a regression problem, where the output is not a predefined category, we can simply use a rectified linear unit (ReLU).
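As a minimal sketch, the three output activations mentioned above can be written in a few lines of NumPy (the test values below are illustrative, not from the original text):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1); used for binary classification outputs.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Generalization of the sigmoid to multiple classes; the outputs sum to 1.
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def relu(z):
    # Rectified linear unit: passes positive values through, zeroes out negatives.
    return np.maximum(0.0, z)

print(sigmoid(0.0))                    # → 0.5
print(softmax(np.array([1.0, 1.0])))   # → [0.5 0.5]
print(relu(np.array([-2.0, 3.0])))     # → [0. 3.]
```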

Neural networks take input data and train themselves to recognize the patterns found in that data. Based on this understanding of the input data, a network predicts the output for a new set of similar data. An ANN consists of interconnected input and output units, and each connection has an associated weight. In this way, a neural network can mimic the behavior of the human brain to solve complex data-driven problems.

We assign weights to the different input values on the basis of their relative importance to the other inputs. Every interconnection in the neural network has a weight associated with it, and every neuron has a bias associated with it. Each node applies a non-linear function to the weighted sum of its inputs; this function is known as the activation function.
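The computation of a single neuron can be sketched as a weighted sum plus a bias, passed through an activation function (the specific weights, inputs, and bias below are made-up values for illustration):

```python
import numpy as np

def sigmoid(z):
    # Non-linear activation applied to the weighted sum.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.2, 0.1])   # input features
w = np.array([0.4, 0.3, 0.9])   # one weight per interconnection
b = 0.25                        # the neuron's bias

z = np.dot(w, x) + b            # weighted sum of inputs plus bias
a = sigmoid(z)                  # the neuron's output (its activation)
print(a)
```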

The weights and bias are known as the parameters of the neural network whereas the activation function, number of layers, number of neurons in the input, hidden, and output layers are the hyperparameters of the Neural network.

A neural network architecture makes the following assumptions:

  1. Neurons are arranged in layers and the layers are arranged sequentially.
  2. Neurons within the same layer do not interact with each other.
  3. All the inputs enter the network through the input layer and all the outputs go out of the network through the output layer.
  4. Neurons in consecutive layers are densely connected, i.e. all neurons in layer l are connected to all neurons in layer l+1.
  5. Every interconnection in the neural network has a weight associated with it, and every neuron has a bias associated with it.
  6. All neurons in a particular layer use the same activation function.

Summarizing an Artificial Neural Network:

  1. Take the inputs.
  2. Assign random weights to the input features.
  3. Add a bias.
  4. Run the forward pass to compute predictions.
  5. Find the error in the predictions.
  6. Update the weights with the gradient descent algorithm.
  7. Repeat the training phase with the updated weights.
  8. Make predictions.
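The steps above can be sketched end-to-end with a single sigmoid neuron trained by gradient descent; the toy OR-style dataset, learning rate, and epoch count are illustrative assumptions, not from the original text:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: take inputs (a tiny toy dataset with OR-style labels).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)

rng = np.random.default_rng(42)
w = rng.standard_normal(2) * 0.1   # step 2: random initial weights
b = 0.0                           # step 3: bias
lr = 0.5                          # learning rate (an arbitrary choice)

for epoch in range(5000):         # step 7: repeat training with updated weights
    pred = sigmoid(X @ w + b)     # step 4: forward pass
    error = pred - y              # step 5: error in the predictions
    # Step 6: update weights and bias by gradient descent
    # (gradient of the cross-entropy loss for a sigmoid output).
    w -= lr * X.T @ error / len(y)
    b -= lr * error.mean()

# Step 8: make predictions on the training inputs.
print((sigmoid(X @ w + b) > 0.5).astype(int))  # → [0 1 1 1]
```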

During the training (learning) phase, the network learns by adjusting its weights so as to minimize the error. Our goal is minimum error and maximum accuracy for our model.

Happy learning!
