Multivariate multi-step time series forecasting using sequence models (4/4)
Models -
Option — 3
In this approach, we make use of stacked LSTMs as a simpler baseline against which we can compare the performance of the encoder-decoder based multi-step output models.
- A
In this approach, we make use of stacked LSTMs to predict just a single-step output.
The model architecture looks like -
Although it doesn’t make much sense to initialize the second LSTM’s states with the first LSTM’s last states, doing so gave me marginally better results. You can try it both ways.
Train code -
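The original training gist isn’t reproduced here, but here is a minimal sketch of what this model could look like in Keras. The unit count, variable names (train_x, train_y), input window length and training settings are my own assumptions, not the article’s exact values -

import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_in, n_features, n_units = 5, 24, 64              # assumed input window, feature count, hidden size
train_x = np.random.rand(100, n_in, n_features)     # stand-in data, shape (samples, n_in, n_features)
train_y = np.random.rand(100, n_features)           # single-step target, shape (samples, n_features)

inputs = Input(shape=(n_in, n_features))
# first LSTM returns its full output sequence plus its last hidden/cell states
lstm1_out, state_h, state_c = LSTM(n_units, return_sequences=True, return_state=True)(inputs)
# second LSTM optionally starts from the first LSTM's last states
lstm2_out = LSTM(n_units)(lstm1_out, initial_state=[state_h, state_c])
# one dense layer predicts all features for the next single time step
outputs = Dense(n_features)(lstm2_out)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
model.fit(train_x, train_y, epochs=5, batch_size=32)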
Inference code -
During inference, we feed the model’s single-step prediction back in as input and repeat this for the same number of steps as the earlier models, for a fair comparison.
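As a rough sketch of that iterative inference, reusing the model and shapes from the training sketch above (test_x and the window-sliding logic are my assumptions) -

n_out = 5                                            # number of future steps, as in the article
test_x = np.random.rand(10, n_in, n_features)        # stand-in test windows
cur_input = test_x.copy()
pred_list = []
for step in range(n_out):
    yhat = model.predict(cur_input)                  # (samples, n_features) single-step prediction
    pred_list.append(yhat)
    # slide the window: drop the oldest step, append the freshly predicted one
    cur_input = np.concatenate([cur_input[:, 1:, :], yhat[:, np.newaxis, :]], axis=1)
# pred_list now holds n_out arrays of shape (samples, n_features), one per forecast step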
- B
In this approach, we make use of stacked LSTMs to predict a multi-step output.
The model architecture looks like -
Although it doesn’t make much sense to initialize the second LSTM’s states with the first LSTM’s last states, doing so gave me marginally better results. You can try it both ways.
Also, in this model the number of input and output steps needs to be equal, because the second LSTM has return_sequences=True and therefore emits exactly one output per input step.
Train code -
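Again, a minimal sketch of what this multi-step variant might look like; n_steps = 5 matches the article’s output horizon, while the unit count, variable names and training settings are assumptions -

import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense, TimeDistributed
from tensorflow.keras.models import Model

n_steps, n_features, n_units = 5, 24, 64             # input steps == output steps in this model
train_x = np.random.rand(100, n_steps, n_features)    # stand-in data
train_y = np.random.rand(100, n_steps, n_features)    # multi-step target, one feature vector per step

inputs = Input(shape=(n_steps, n_features))
lstm1_out, state_h, state_c = LSTM(n_units, return_sequences=True, return_state=True)(inputs)
# return_sequences=True makes the second LSTM emit one hidden vector per input step
lstm2_out = LSTM(n_units, return_sequences=True)(lstm1_out, initial_state=[state_h, state_c])
# TimeDistributed dense maps each step's hidden vector to a full feature vector
outputs = TimeDistributed(Dense(n_features))(lstm2_out)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
model.fit(train_x, train_y, epochs=5, batch_size=32)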
Inference code -
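Inference here is a single forward pass, since the model emits all steps at once. Splitting the output per decoder step, for the comparison discussed later, might look like this (reusing names from the training sketch above) -

test_x = np.random.rand(10, n_steps, n_features)      # stand-in test windows
yhat = model.predict(test_x)                           # (samples, n_steps, n_features)
pred_list = [yhat[:, i, :] for i in range(n_steps)]    # one (samples, n_features) array per output step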
Option — 4
These models are a hybrid of the previous ones.
Although these models are slow to converge and inefficient, they give us another point of comparison for the encoder-decoder based multi-step output models.
- With teacher forcing
In this approach, we make use of stacked LSTMs along with teacher forcing to predict a multi-step output.
Train code -
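The exact hybrid architecture lives in the notebook, but one plausible sketch is a stacked-LSTM encoder whose final states initialise an LSTM decoder that is fed the ground-truth previous step during training (teacher forcing). All names, sizes and the decoder-input layout below are my assumptions, not the article’s exact code -

import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

n_in, n_out, n_features, n_units = 5, 5, 24, 64       # assumed hyperparameters

# encoder: stacked LSTMs, keeping only the second layer's final states
enc_inputs = Input(shape=(n_in, n_features))
enc_seq, h1, c1 = LSTM(n_units, return_sequences=True, return_state=True)(enc_inputs)
_, enc_h, enc_c = LSTM(n_units, return_state=True)(enc_seq, initial_state=[h1, c1])

# decoder: during training it sees the ground-truth previous step (teacher forcing)
dec_inputs = Input(shape=(n_out, n_features))          # target sequence shifted right by one step
dec_lstm = LSTM(n_units, return_sequences=True, return_state=True)
dec_seq, _, _ = dec_lstm(dec_inputs, initial_state=[enc_h, enc_c])
dec_dense = Dense(n_features)
outputs = dec_dense(dec_seq)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer='adam', loss='mse')

# stand-in data: decoder input is the target shifted by one step, seeded with the last observed step
train_x = np.random.rand(100, n_in, n_features)
train_y = np.random.rand(100, n_out, n_features)
dec_in = np.concatenate([train_x[:, -1:, :], train_y[:, :-1, :]], axis=1)
model.fit([train_x, dec_in], train_y, epochs=5, batch_size=32)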
Inference code -
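At inference time there is no ground truth to feed, so a typical approach is to rebuild encoder/decoder inference models from the trained layers and decode one step at a time, feeding each prediction back in. This sketch reuses the assumed names from the training sketch above -

import numpy as np
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

enc_model = Model(enc_inputs, [enc_h, enc_c])

state_h_in = Input(shape=(n_units,))
state_c_in = Input(shape=(n_units,))
step_in = Input(shape=(1, n_features))
step_seq, dh, dc = dec_lstm(step_in, initial_state=[state_h_in, state_c_in])
step_pred = dec_dense(step_seq)
dec_model = Model([step_in, state_h_in, state_c_in], [step_pred, dh, dc])

def predict_sequence(x):
    # x: (1, n_in, n_features); start decoding from the last observed step
    h, c = enc_model.predict(x)
    target = x[:, -1:, :]
    preds = []
    for _ in range(n_out):
        yhat, h, c = dec_model.predict([target, h, c])
        preds.append(yhat[0, 0, :])
        target = yhat                                  # feed the prediction back as the next decoder input
    return np.array(preds)                             # (n_out, n_features)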
- Without teacher forcing
In this approach, we make use of stacked LSTMs without teacher forcing to predict a multi-step output.
Train code -
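One plausible reading of the no-teacher-forcing hybrid is to unroll the decoder inside the graph and feed each step’s prediction back in as the next decoder input even during training, which would also explain the slow convergence. The sketch below follows that reading; every name and size is an assumption -

import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense, Lambda, Reshape, Concatenate
from tensorflow.keras.models import Model

n_in, n_out, n_features, n_units = 5, 5, 24, 64       # assumed hyperparameters

enc_inputs = Input(shape=(n_in, n_features))
enc_seq, h1, c1 = LSTM(n_units, return_sequences=True, return_state=True)(enc_inputs)
_, h, c = LSTM(n_units, return_state=True)(enc_seq, initial_state=[h1, c1])

dec_lstm = LSTM(n_units, return_state=True)
dec_dense = Dense(n_features)
to_step = Reshape((1, n_features))

# start from the last observed step and unroll the decoder, feeding predictions back in
step_in = Lambda(lambda t: t[:, -1:, :])(enc_inputs)
step_preds = []
for _ in range(n_out):
    dec_out, h, c = dec_lstm(step_in, initial_state=[h, c])
    step_in = to_step(dec_dense(dec_out))              # the prediction becomes the next decoder input
    step_preds.append(step_in)
outputs = Concatenate(axis=1)(step_preds)              # (samples, n_out, n_features)

model = Model(enc_inputs, outputs)
model.compile(optimizer='adam', loss='mse')

train_x = np.random.rand(100, n_in, n_features)        # stand-in data
train_y = np.random.rand(100, n_out, n_features)
model.fit(train_x, train_y, epochs=5, batch_size=32)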
Inference code -
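Since the feedback loop lives inside the graph under this reading, inference is just a direct forward pass (names reused from the training sketch above) -

test_x = np.random.rand(10, n_in, n_features)          # stand-in test windows
yhat = model.predict(test_x)                            # (samples, n_out, n_features)
pred_list = [yhat[:, i, :] for i in range(n_out)]       # bucket per decoder time step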
- Per feature per time step performance
I’ve left the per-feature, per-time-step performance analysis out of the notebooks to keep them concise, but here’s how it can be added to the existing code -
Here, gt_list and pred_list are two lists of length n_out = 5, where each element is an array of shape (x, features); x and features depend on our data, which in this case gives (5954, 24).
Thus, we have divided the expected output and the models’ predictions into separate buckets, one per decoder-side time step, for easy comparison.
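The notebook code isn’t reproduced here, but assuming the ground truth is available as test_y of shape (samples, n_out, features) and the multi-step predictions as yhat of the same shape (both names are my assumptions), the bucketing could look like -

n_out = 5
gt_list = [test_y[:, i, :] for i in range(n_out)]       # ground truth per decoder time step
pred_list = [yhat[:, i, :] for i in range(n_out)]       # predictions per decoder time step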
In this code snippet —
plt.plot(gt_list[i][:,j], c='blue', label="gt")
plt.plot(pred_list[i][:,j],c='red', label="pred")
i represents the time step, which goes from 0–4, and j stands for the feature index, which goes from 0–23. Thus, we will have 5 * 24 = 120 such comparisons per model.
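Putting it together, the full comparison loop might look like this (the figure size and titles are my additions) -

import matplotlib.pyplot as plt

for i in range(5):              # decoder time step
    for j in range(24):         # feature index
        plt.figure(figsize=(10, 3))
        plt.plot(gt_list[i][:, j], c='blue', label="gt")
        plt.plot(pred_list[i][:, j], c='red', label="pred")
        plt.title("time step %d, feature %d" % (i, j))
        plt.legend()
        plt.show()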
- Future work
Apart from teacher forcing and the normal free-running approach, there is another method that combines the two, called scheduled sampling: during training, we feed the model ground-truth values from the previous time steps in the initial epochs, and gradually more of its own predicted outputs in the later epochs.
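A minimal, purely illustrative sketch of that sampling decision (not code from the article; the linear decay schedule is just one common choice) -

import numpy as np

def decoder_input_for_step(ground_truth_step, previous_prediction, epoch, n_epochs):
    # probability of feeding the ground truth decays from 1.0 towards 0.0 over training
    p_teacher = max(0.0, 1.0 - epoch / n_epochs)
    if np.random.rand() < p_teacher:
        return ground_truth_step         # mostly teacher forcing in the early epochs
    return previous_prediction           # mostly the model's own output in the later epochs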
- References
https://machinelearningmastery.com/teacher-forcing-for-recurrent-neural-networks/
https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/
https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/