Recurrent Neural Network

Recurrent Neural Networks (RNNs) are designed to work with sequential data, i.e. data with a natural ordering, such as text, video (sequences of images), and speech.

An RNN is a type of neural network in which information computed at the previous step is fed back as input to the current step. In a traditional feed-forward network, each input is processed independently of the others. But to predict the next word of a sentence, for example, the network needs the previous words, so it needs some way to remember them. RNNs address this with a hidden state: a vector that is carried from one step to the next and summarizes information about the sequence seen so far. The hidden state is the main and most important feature of an RNN.

In summary, a vanilla neural network transforms a fixed-size input vector into a fixed-size output vector. Such a network becomes "recurrent" when the same transformation is applied repeatedly to a series of input vectors to produce a series of output vectors. There is no pre-set limit on the length of the series. In addition to generating an output that is a function of the current input and the hidden state, the network updates the hidden state itself based on the input and uses the updated state when processing the next input.

Unrolled over time, an RNN can be viewed as a chain of identical units, one per time step, all sharing the same weights and activation function. Each unit has an internal state, called the hidden state. The hidden state represents what the network currently knows about the past at a given time step, and it is updated at every time step as new input arrives.
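To make the recurrence concrete, here is a minimal sketch of a forward pass in plain Python. It assumes the common vanilla formulation h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h) and y_t = W_hy h_t + b_y. The function names, weight names, and all numeric values are illustrative assumptions, not something fixed by the description above.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(x, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One time step: update the hidden state, then compute the output from it."""
    pre = [a + b + c for a, b, c in zip(matvec(W_xh, x), matvec(W_hh, h_prev), b_h)]
    h = [math.tanh(p) for p in pre]
    y = [a + b for a, b in zip(matvec(W_hy, h), b_y)]
    return h, y

def rnn_forward(xs, h0, params):
    """Apply the same step, with the same weights, to every element of the sequence."""
    h, hs, ys = h0, [], []
    for x in xs:
        h, y = rnn_step(x, h, *params)
        hs.append(h)
        ys.append(y)
    return hs, ys

# Illustrative (made-up) weights: 2 input features, 3 hidden units, 1 output.
W_xh = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
W_hh = [[0.2, 0.0, -0.1], [0.0, 0.3, 0.1], [0.1, -0.2, 0.4]]
W_hy = [[1.0, -1.0, 0.5]]
b_h = [0.0, 0.0, 0.0]
b_y = [0.0]
params = (W_xh, W_hh, W_hy, b_h, b_y)

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # a sequence of three input vectors
hs, ys = rnn_forward(xs, [0.0, 0.0, 0.0], params)
for t, (h, y) in enumerate(zip(hs, ys), start=1):
    print(t, h, y)
```

Because the loop reuses one set of weights, the same `rnn_forward` call works unchanged for a sequence of any length. Only the hidden state `h` changes from step to step.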

Training an RNN is similar to training a traditional neural network. Because the parameters are shared by all time steps in the network, the gradient at each output depends not only on the calculations of the current time step but also on those of the previous time steps. For example, to calculate the gradient at t=5, we would need to backpropagate 4 steps and sum up the gradients. This is called Backpropagation Through Time (BPTT).
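The "backpropagate and sum" idea can be sketched with a scalar RNN. This is a toy setup under stated assumptions: a single hidden unit with h_t = tanh(w * h_(t-1) + u * x_t), a squared-error loss on the final hidden state only, and made-up values for `w`, `u`, `target`, and the inputs. The result is checked against a finite-difference estimate.

```python
import math

def forward(w, u, xs, h0=0.0):
    """Scalar RNN: h_t = tanh(w * h_{t-1} + u * x_t). Returns [h_0, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, target):
    """Squared error on the final hidden state only (an assumption of this sketch)."""
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2

def bptt_grad_w(w, u, xs, target):
    """Gradient of the loss with respect to the shared recurrent weight w."""
    hs = forward(w, u, xs)
    dh = hs[-1] - target                  # dL/dh_T
    grad = 0.0
    for t in range(len(xs), 0, -1):       # walk backwards through time
        dpre = dh * (1.0 - hs[t] ** 2)    # back through tanh at step t
        grad += dpre * hs[t - 1]          # step t's contribution to dL/dw
        dh = dpre * w                     # pass the gradient back to h_{t-1}
    return grad

xs = [0.5, -0.1, 0.3, 0.8, -0.4]          # five time steps
w, u, target = 0.9, 0.7, 0.2              # illustrative values
analytic = bptt_grad_w(w, u, xs, target)

# Central finite difference as an independent check.
eps = 1e-6
numeric = (loss(w + eps, u, xs, target) - loss(w - eps, u, xs, target)) / (2 * eps)
print(analytic, numeric)
```

The `grad +=` line is where the sharing shows up: `w` is used at every step, so each step adds its own term to the same gradient.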

Advantages of Recurrent Neural Network

  1. Possibility of processing input of any length
  2. Model size not increasing with the size of the input
  3. Computation takes into account historical information
  4. Weights are shared across time
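Points 2 and 4 can be illustrated by counting parameters. The sketch below assumes a single-layer vanilla RNN with an output layer, and compares it with a hypothetical one-hidden-layer feed-forward network that receives the whole sequence concatenated into one vector. Both helper functions and the layer sizes are illustrative assumptions.

```python
def rnn_param_count(n_in, n_hidden, n_out):
    """Weights and biases of a single-layer vanilla RNN; sequence length does not appear."""
    input_to_hidden = n_hidden * n_in
    hidden_to_hidden = n_hidden * n_hidden
    hidden_to_output = n_out * n_hidden
    return input_to_hidden + hidden_to_hidden + n_hidden + hidden_to_output + n_out

def flattened_mlp_param_count(seq_len, n_in, n_hidden, n_out):
    """One-hidden-layer network fed the entire sequence as a single long vector."""
    input_to_hidden = n_hidden * (seq_len * n_in)
    hidden_to_output = n_out * n_hidden
    return input_to_hidden + n_hidden + hidden_to_output + n_out

for seq_len in (10, 100, 1000):
    print(seq_len,
          rnn_param_count(8, 16, 4),
          flattened_mlp_param_count(seq_len, 8, 16, 4))
```

The RNN count stays the same on every row, while the first layer of the flattened network grows in proportion to the sequence length.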

Disadvantages of Recurrent Neural Network

  1. Computation is slow, because time steps must be processed one after another
  2. Difficulty in accessing information from a long time ago
  3. Cannot consider any future input for the current state
  4. Vanishing and exploding gradient problems
  5. Training an RNN is difficult
  6. Very long sequences are hard to process when tanh or ReLU is used as the activation function
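The vanishing and exploding gradient problems can be seen in a scalar toy recurrence h_t = f(w * h_(t-1)) with no input, where the derivative of h_T with respect to h_0 is a product of one factor per time step. This is a simplified sketch, not a full RNN. The function name, the starting value `h0`, and the weights 0.5 and 1.5 are assumptions chosen for illustration.

```python
import math

def grad_wrt_initial_state(w, steps, h0=0.5, activation="tanh"):
    """d h_T / d h_0 for the scalar recurrence h_t = f(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w * h
        if activation == "tanh":
            h = math.tanh(pre)
            grad *= w * (1.0 - h * h)          # tanh'(pre) = 1 - tanh(pre)^2
        else:                                   # ReLU
            h = max(0.0, pre)
            grad *= w if pre > 0.0 else 0.0     # ReLU'(pre) is 1 for pre > 0, else 0
    return grad

for steps in (1, 10, 50):
    print(steps,
          grad_wrt_initial_state(0.5, steps, activation="tanh"),
          grad_wrt_initial_state(1.5, steps, activation="relu"))
```

With tanh and a small weight, every factor has magnitude below 1, so the product shrinks toward zero as the number of steps grows. With ReLU and a weight above 1, every factor equals the weight, so the product grows without bound. This is also why information from many steps back is hard to learn from.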

In short, an RNN carries information about earlier inputs forward through time in its hidden state. This ability to take previous inputs into account is what makes it useful for time series prediction. Recurrent neural networks are also used with convolutional layers to extend the effective pixel neighborhood.

Happy learning!
