
How to Reshape Input Data for Long Short-Term Memory Networks in Keras

When I started working with LSTM networks, I was quite confused about the input and output shapes. This article will help you understand the input and output shapes of an LSTM network. I assume that you are already familiar with the theory behind LSTMs. I am using the Keras library in this tutorial.


First, let’s understand the input and its shape in a Keras LSTM.

Input shape for LSTM network

You always have to give a three-dimensional array as input to your LSTM network. The first dimension represents the batch size, the second dimension represents the number of time steps in each input sequence, and the third dimension represents the number of input units (features) at each time step. So the input shape looks like (batch_size, time_steps, features). Let’s look at an example in Keras.

Code Snippet 1
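The original code image is missing here, so below is a minimal sketch of what it likely showed, assuming the tf.keras API (the layer size of 3 is an assumption, not from the original snippet):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

# An LSTM whose input has 2 time steps and 10 features per time step.
model = Sequential()
model.add(LSTM(3, input_shape=(2, 10)))

# Keras prepends a flexible batch dimension (None) automatically.
print(model.input_shape)  # (None, 2, 10)
```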

Let’s look at the input_shape argument. Although input_shape looks like it describes a 2D array, the actual input must be a 3D array. In the example above, input_shape is (2, 10), which means the number of time steps is 2 and the number of input units is 10, and the batch size can be anything. So your input array shape looks like (batch_size, 2, 10).

Code Snippet 2
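Again the original image is missing; a sketch of the likely code, assuming the tf.keras API and a unit count of 3:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

# batch_input_shape fixes the batch size to 8, unlike input_shape.
model = Sequential()
model.add(LSTM(3, batch_input_shape=(8, 2, 10)))

print(model.input_shape)  # (8, 2, 10)
```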

You can also pass an argument called batch_input_shape instead of input_shape. The difference is that you now have to give a fixed batch size, and your input array shape will look like (8, 2, 10). If you try to feed a batch of a different size, you will get an error.


Now, let’s look at the output and its shape in the LSTM network.

Code Snippet 3
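The original code image is missing; a sketch of what it likely showed, based on the surrounding text (units is 3, input_shape is (2, 10)):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

# units=3 gives 3 output units; the batch size is left flexible.
model = Sequential()
model.add(LSTM(3, input_shape=(2, 10)))

print(model.output_shape)  # (None, 3)
```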

Let’s look at the other arguments. The units argument is the number of output units of the LSTM, which is 3 here. So the output shape is (None, 3). The first dimension of the output is None because we do not know the batch size in advance; the actual output shape will be (batch_size, 3).

Code Snippet 4
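A sketch of the missing snippet, assuming it combined the fixed batch size from before with units=3:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

# With a fixed batch size of 8, the output shape is fully known.
model = Sequential()
model.add(LSTM(3, batch_input_shape=(8, 2, 10)))

print(model.output_shape)  # (8, 3)
```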

Here you can see that I defined the batch size in advance, and the output shape is (8, 3), which makes sense.

Code Snippet 5
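A sketch of the missing snippet, assuming the same model as before with return_sequences turned on:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

# return_sequences=True returns the output at every time step,
# not just the final one.
model = Sequential()
model.add(LSTM(3, batch_input_shape=(8, 2, 10), return_sequences=True))

print(model.output_shape)  # (8, 2, 3)
```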

Now, look at another argument, return_sequences. This argument tells the layer whether to return the output at each time step instead of only at the final time step. The output is now a 3D array, not a 2D array, and its shape is (8, 2, 3). You can see that there is one extra dimension in the middle, which represents the number of time steps.

Summary

  • The input of the LSTM is always a 3D array: (batch_size, time_steps, features).
  • The output of the LSTM can be a 2D or a 3D array, depending on the return_sequences argument.
  • If return_sequences is False, the output is a 2D array: (batch_size, units).
  • If return_sequences is True, the output is a 3D array: (batch_size, time_steps, units).

It can be difficult to understand how to prepare your sequence data for input to an LSTM model.

Often there is confusion around how to define the input layer for the LSTM model.

There is also confusion about how to convert your sequence data that may be a 1D or 2D matrix of numbers to the required 3D format of the LSTM input layer.

In this tutorial, you will discover how to define the input layer to LSTM models and how to reshape your loaded input data for LSTM models.

After completing this tutorial, you will know:

  • How to define an LSTM input layer.
  • How to reshape one-dimensional sequence data for an LSTM model and define the input layer.
  • How to reshape multiple parallel series data for an LSTM model and define the input layer.


Let’s get started.

How to Reshape Input for Long Short-Term Memory Networks in Keras

Photo by Global Landscapes Forum, some rights reserved.

Tutorial Overview

This tutorial is divided into 4 parts; they are:

  1. LSTM Input Layer
  2. Example of LSTM with Single Input Sample
  3. Example of LSTM with Multiple Input Features
  4. Tips for LSTM Input

LSTM Input Layer

The LSTM input layer is specified by the “input_shape” argument on the first hidden layer of the network.

This can make things confusing for beginners.

For example, below is an example of a network with one hidden LSTM layer and one Dense output layer.

model = Sequential()
model.add(LSTM(32))
model.add(Dense(1))

In this example, the LSTM() layer must specify the shape of the input.

The input to every LSTM layer must be three-dimensional.

The three dimensions of this input are:

  • Samples. One sequence is one sample. A batch is comprised of one or more samples.
  • Time Steps. One time step is one point of observation in the sample.
  • Features. One feature is one observation at a time step.

This means that the input layer expects a 3D array of data when fitting the model and when making predictions, even if specific dimensions of the array contain a single value, e.g. one sample or one feature.

When defining the input layer of your LSTM network, the network assumes you have 1 or more samples and requires that you specify the number of time steps and the number of features. You can do this by specifying a tuple to the “input_shape” argument.

For example, the model below defines an input layer that expects 1 or more samples, 50 time steps, and 2 features.

model = Sequential()
model.add(LSTM(32, input_shape=(50, 2)))
model.add(Dense(1))

Now that we know how to define an LSTM input layer and the expectations of 3D inputs, let’s look at some examples of how we can prepare our data for the LSTM.

Example of LSTM With Single Input Sample

Consider the case where you have one sequence of multiple time steps and one feature.

For example, this could be a sequence of 10 values:

0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0

We can define this sequence of numbers as a NumPy array.

from numpy import array
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])

We can then use the reshape() function on the NumPy array to reshape this one-dimensional array into a three-dimensional array with 1 sample, 10 time steps, and 1 feature at each time step.

The reshape() function, when called on an array, takes one argument: a tuple defining the new shape of the array. We cannot pass in just any tuple of numbers; the reshape must evenly reorganize the data in the array.

data = data.reshape((1, 10, 1))
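To see why the tuple must evenly reorganize the data, here is a small sketch: a 10-value array reshapes cleanly to (1, 10, 1) because 1 × 10 × 1 = 10 elements, while an incompatible tuple raises a ValueError.

```python
import numpy as np

data = np.arange(10, dtype=float)

# 1 * 10 * 1 == 10 elements, so this reshape is valid.
ok = data.reshape((1, 10, 1))
print(ok.shape)  # (1, 10, 1)

# 1 * 3 * 1 != 10 elements, so NumPy refuses this reshape.
try:
    data.reshape((1, 3, 1))
except ValueError:
    print("reshape must keep the total number of elements")
```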

Once reshaped, we can print the new shape of the array.

print(data.shape)

Putting all of this together, the complete example is listed below.

from numpy import array
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
data = data.reshape((1, 10, 1))
print(data.shape)

Running the example prints the new 3D shape of the single sample.

(1, 10, 1)

This data is now ready to be used as input (X) to the LSTM with an input_shape of (10, 1).

model = Sequential()
model.add(LSTM(32, input_shape=(10, 1)))
model.add(Dense(1))

Example of LSTM with Multiple Input Features

Consider the case where you have multiple parallel series as input for your model.

For example, this could be two parallel series of 10 values:

series 1: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
series 2: 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1

We can define these data as a matrix of 2 columns with 10 rows:

from numpy import array
data = array([
    [0.1, 1.0],
    [0.2, 0.9],
    [0.3, 0.8],
    [0.4, 0.7],
    [0.5, 0.6],
    [0.6, 0.5],
    [0.7, 0.4],
    [0.8, 0.3],
    [0.9, 0.2],
    [1.0, 0.1]])

This data can be framed as 1 sample with 10 time steps and 2 features.

It can be reshaped as a 3D array as follows:

data = data.reshape(1, 10, 2)

Putting all of this together, the complete example is listed below.

from numpy import array
data = array([
    [0.1, 1.0],
    [0.2, 0.9],
    [0.3, 0.8],
    [0.4, 0.7],
    [0.5, 0.6],
    [0.6, 0.5],
    [0.7, 0.4],
    [0.8, 0.3],
    [0.9, 0.2],
    [1.0, 0.1]])
data = data.reshape(1, 10, 2)
print(data.shape)

Running the example prints the new 3D shape of the single sample.

(1, 10, 2)

This data is now ready to be used as input (X) to the LSTM with an input_shape of (10, 2).

model = Sequential()
model.add(LSTM(32, input_shape=(10, 2)))
model.add(Dense(1))

Longer Worked Example

For a complete end-to-end worked example of preparing data, see this post:

Tips for LSTM Input

This section lists some tips to help you when preparing your input data for LSTMs.

  • The LSTM input layer must be 3D.
  • The meaning of the 3 input dimensions are: samples, time steps, and features.
  • The LSTM input layer is defined by the input_shape argument on the first hidden layer.
  • The input_shape argument takes a tuple of two values that define the number of time steps and features.
  • The number of samples is assumed to be 1 or more.
  • The reshape() function on NumPy arrays can be used to reshape your 1D or 2D data to be 3D.
  • The reshape() function takes a tuple as an argument that defines the new shape.
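As a sketch of the last two tips, suppose you have loaded a 2D array where rows are time steps and columns are features (the sizes below are hypothetical, not from the original tutorial); you can either treat it as one long sample or split it into several shorter samples:

```python
import numpy as np

# Hypothetical loaded data: 200 time steps of 3 parallel features.
data = np.random.rand(200, 3)

# One sample covering the whole series: (1, 200, 3).
one_sample = data.reshape((1, 200, 3))

# Twenty samples of 10 time steps each: (20, 10, 3).
# Valid because 20 * 10 * 3 == 200 * 3 elements.
windows = data.reshape((20, 10, 3))

print(one_sample.shape, windows.shape)
```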

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Summary

In this tutorial, you discovered how to define the input layer for LSTMs and how to reshape your sequence data for input to LSTMs.

Specifically, you learned:

  • How to define an LSTM input layer.
  • How to reshape one-dimensional sequence data for an LSTM model and define the input layer.
  • How to reshape multiple parallel series data for an LSTM model and define the input layer.

https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/

https://medium.com/@shivajbd/understanding-input-and-output-shape-in-lstm-keras-c501ee95c65e

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Amir Masoud Sefidian
Data Scientist, Researcher, Software Developer