Long Short Term Memory – An Artificial Recurrent Neural Network Architecture


In this post, I will explain an Artificial Neural Network (ANN) architecture known as Long Short Term Memory (LSTM). It is a type of Recurrent Neural Network (RNN).

Comparing Different Types of Artificial Neural Networks (ANNs)

Before discussing LSTM, let us first understand the difference between a traditional Artificial Neural Network (ANN) and a Recurrent Neural Network (RNN). A traditional feedforward ANN processes its input in the forward direction only, from the input layer through the hidden layers to the output layer. A forward pass on its own does not learn from errors, and the network keeps no memory of earlier inputs. Such networks can also be hard to interpret. Therefore, a plain feedforward ANN suits tasks where each input can be handled independently of the others. In contrast, an ANN trained with backpropagation uses an error function, and the algorithm computes the gradient of that error with respect to the weights of the network.

Backpropagation propagates the error backward through the hidden layers of the network and adjusts the weights in the direction that reduces the error. This is how a backpropagation ANN learns.
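The weight adjustment described above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming one hidden layer with a tanh activation, a squared-error function, and a single training example. The weight values, the learning rate, and the name `train_step` are made up for the example.

```python
import numpy as np

def train_step(x, y, W1, W2, lr=0.05):
    """One forward pass, one backward pass, one weight update."""
    # Forward pass: input -> hidden (tanh) -> output (linear).
    h = np.tanh(W1 @ x)
    y_hat = W2 @ h
    err = y_hat - y
    loss = 0.5 * float(err @ err)          # squared-error function

    # Backward pass: push the error from the output back to the hidden layer.
    grad_W2 = np.outer(err, h)
    delta_h = (W2.T @ err) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    grad_W1 = np.outer(delta_h, x)

    # Adjust the weights against the gradient.
    return W1 - lr * grad_W1, W2 - lr * grad_W2, loss

# Illustrative (assumed) starting weights and a single training example.
W1 = np.array([[0.1, -0.2], [0.3, 0.1], [-0.2, 0.2]])  # input -> hidden
W2 = np.array([[0.2, -0.1, 0.3]])                      # hidden -> output
x = np.array([1.0, 2.0])
y = np.array([1.0])

losses = []
for _ in range(100):
    W1, W2, loss = train_step(x, y, W1, W2)
    losses.append(loss)

print(losses[0], losses[-1])  # the error shrinks as the weights are adjusted
```

Real networks are trained on many examples and usually rely on a library's automatic differentiation, but the forward, backward, and update phases are the same.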

Another type of Artificial Neural Network is the Recurrent Neural Network (RNN). RNNs are suitable when the data has a sequential pattern, such as text or speech. An RNN feeds its output from the previous step back in as an input at the next step. Hence, the current output depends on the previous outputs as well as on the current input.
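The feedback loop can be illustrated with a small NumPy sketch. It assumes the common simple RNN formulation, in which the value fed back is the hidden state and the activation is tanh. The sizes, the random weights, and the name `rnn_forward` are assumptions made for the example.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b):
    """Run a simple RNN over a sequence and return every hidden state."""
    h = np.zeros(W_h.shape[0])            # initial state: nothing seen yet
    states = []
    for x in xs:
        # The previous state h is fed back in together with the new input x.
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
D, H, T = 3, 4, 5                          # input size, state size, sequence length
W_x = rng.normal(scale=0.5, size=(H, D))
W_h = rng.normal(scale=0.5, size=(H, H))
b = np.zeros(H)
xs = rng.normal(size=(T, D))

states = rnn_forward(xs, W_x, W_h, b)

# Changing only the first input also changes the last state,
# because every state depends on the one before it.
xs_changed = xs.copy()
xs_changed[0] += 1.0
states_changed = rnn_forward(xs_changed, W_x, W_h, b)
print(np.abs(states[-1] - states_changed[-1]).max())
```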

The Drawback of Recurrent Neural Networks (RNNs)

RNNs suffer from a problem known as the Long-Term Dependency Problem. In other words, the gap between the point where relevant information was seen and the point where it is needed may become very wide. Since Recurrent Neural Networks use backpropagation for learning, the partial derivative of the error is computed and propagated back through the time steps in order to adjust the weights.

Over a long sequence, this partial derivative is a product of many per-step factors. When those factors are smaller than one, the product shrinks rapidly, and once it is multiplied by a small learning rate the resulting weight update becomes negligible. In effect, the gradient vanishes. As a result, the weights barely change and the network stops learning from the earlier steps. This situation is known as the Vanishing Gradient Problem. Long Short Term Memory (LSTM) was created in order to overcome it.
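A scalar example makes the shrinking visible. The sketch below assumes a one-unit RNN of the form h_t = tanh(w * h_{t-1}) with a recurrent weight w = 0.5, and tracks the derivative of the latest state with respect to the initial state. The function name and the chosen values are illustrative only.

```python
import numpy as np

def gradient_through_time(w, steps, h0=0.5):
    """Return |d h_t / d h_0| for t = 1..steps in a one-unit tanh RNN."""
    h = h0
    grad = 1.0
    grads = []
    for _ in range(steps):
        h = np.tanh(w * h)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h ** 2)
        grads.append(abs(grad))
    return grads

grads = gradient_through_time(w=0.5, steps=50)
print(grads[0], grads[9], grads[49])  # each step multiplies by a factor of at most 0.5
```

Because every factor is at most 0.5 here, the derivative after 50 steps is no larger than 0.5 to the power 50, which is far too small to move the weights in any useful way.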

Long Short Term Memory (LSTM)

To mitigate the problem of vanishing gradients, the LSTM network maintains a state known as the Cell State. Each cell in an LSTM has gates that control the flow of information and determine what information is remembered and what is discarded. A cell has the following gates.

  • The Forget Gate takes the output from the previous step together with the current input, and uses a sigmoid function to decide how much of the previous cell state is carried forward.
  • The Input Gate uses a sigmoid function to decide how much new information from the current input vector is added to the cell state.
  • The Output Gate uses a sigmoid function to determine how much of the cell state is provided as the output.
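The three gates can be put together in a single step of an LSTM cell. The following is a minimal NumPy sketch of the standard LSTM formulation, not a production implementation. It assumes that the weights of all gates are stacked in one matrix `W`, and the names `lstm_step`, `W`, and `b` as well as the sizes are chosen only for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*H, H+D) and b has shape (4*H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b

    f = sigmoid(z[0:H])            # forget gate: how much of c_prev to keep
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to add
    g = np.tanh(z[2 * H:3 * H])    # candidate values built from x and h_prev
    o = sigmoid(z[3 * H:4 * H])    # output gate: how much of the cell to expose

    c = f * c_prev + i * g         # updated cell state
    h = o * np.tanh(c)             # output (hidden state) for this step
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                        # input size and state size (assumed)
W = rng.normal(scale=0.5, size=(4 * H, H + D))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(6, D)):  # a sequence of six input vectors
    h, c = lstm_step(x, h, c, W, b)

print(h.shape, c.shape)
```

The cell state is updated by an element-wise multiplication and an addition, so when the forget gate stays close to one, information and gradients can pass across many steps without being squashed at each one.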


This article on Long Short Term Memory describes the different variants of Artificial Neural Networks (ANNs) and compares them. It also describes the Recurrent Neural Network (RNN) in brief and explains the Vanishing Gradient Problem. Finally, it describes the Long Short Term Memory architecture.

