Recognizing Temporal Patterns using an Echo State Network

Echo State Network#

An Echo State Network (ESN) is a type of recurrent neural network used to recognize temporal patterns, such as K-Step Ahead Forecasting. Unlike most recurrent networks, an ESN is not trained with backpropagation: its input and recurrent (reservoir) weights are fixed at random, and only the output weights are learned, here via ridge regression.
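As a concrete reference, here is a minimal sketch of the reservoir update just described, assuming a sigmoid activation and NumPy. The weight ranges, sizes, and the spectral-radius rescaling are common ESN conventions rather than details taken from this project.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_inputs, n_hidden = 1, 50          # univariate series, 50 reservoir neurons

# In an ESN the input and recurrent weights are random and never trained.
W_in = rng.uniform(-0.5, 0.5, size=(n_hidden, n_inputs))
W = rng.uniform(-0.5, 0.5, size=(n_hidden, n_hidden))

# Common ESN convention (an assumption here, not stated in the write-up):
# rescale W so its spectral radius is below 1, which encourages the
# "echo state" property of fading memory.
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def update_state(x_prev, u_t):
    """One reservoir step: mix the current input into the previous state."""
    return sigmoid(W_in @ u_t + W @ x_prev)
```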

K-Step Ahead Forecasting#

K-Step Ahead Forecasting predicts the value of the timeseries data k timesteps ahead of the current time t. In this implementation, mean-squared error (MSE) is used to measure prediction accuracy. K-Step Ahead Forecasting is used in both the validation and testing stages of model evaluation.
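The forecasting loop itself is simple: each prediction is fed back in as the next input. Below is a hedged sketch of how k-step-ahead prediction and the MSE score might look; `update_state` is the reservoir step from the earlier sketch, and `W_out` is a trained readout matrix of shape (n_hidden, n_outputs). The names are illustrative, not taken from the project code.

```python
import numpy as np

def k_step_ahead(update_state, W_out, x0, u0, k):
    """Forecast k steps past time t by feeding predictions back as inputs."""
    x, u = x0, u0
    for _ in range(k):
        x = update_state(x, u)      # advance the reservoir state
        u = x @ W_out               # the prediction becomes the next input
    return u                        # value forecast k steps ahead

def mse(predictions, targets):
    """Mean-squared error used to score the forecasts."""
    predictions, targets = np.asarray(predictions), np.asarray(targets)
    return float(np.mean((predictions - targets) ** 2))
```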

Process#

For a detailed description of the process of model training and evaluation, view the project.

Essentially, batch training is done on a subset of the original dataset: the hidden states are computed from the input weights, the recurrent weights, a sigmoid activation function, and the chosen number of hidden-layer neurons, and ridge regression is then used to calculate the optimized output weights. Cross-validation is done on a separate subset of the data, where the hyperparameters are optimized and the model's error is calculated via mean-squared error. K-Step Ahead Forecasting is done in both the validation and testing stages: 1-step-ahead predictions are used to evaluate hyperparameters, and k > 1 steps are used to calculate the final model predictions. A sketch of the training step follows.
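As a rough illustration of that pipeline, the sketch below collects the reservoir states over a training subset and solves the ridge-regression problem in closed form. The function names and the exact regression formulation are assumptions, not the project's actual code.

```python
import numpy as np

def collect_states(update_state, inputs, n_hidden):
    """Drive the reservoir over a training subset and stack the states."""
    x = np.zeros(n_hidden)
    states = []
    for u_t in inputs:              # inputs has shape (T, n_inputs)
        x = update_state(x, u_t)
        states.append(x)
    return np.stack(states)         # X, shape (T, n_hidden)

def ridge_readout(states, targets, lam):
    """Closed-form ridge regression for the readout weights:
    W_out = (X^T X + lam I)^(-1) X^T Y,
    the minimizer of ||X W - Y||^2 + lam ||W||^2."""
    n_hidden = states.shape[1]
    A = states.T @ states + lam * np.eye(n_hidden)
    b = states.T @ targets          # targets has shape (T, n_outputs)
    return np.linalg.solve(A, b)    # W_out, shape (n_hidden, n_outputs)
```

Cross-validation would then repeat this fit over a grid of hidden-layer sizes and regularization parameters, keeping the combination with the lowest 1-step-ahead MSE on the validation split, before the final k-step evaluation on the test split.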

Results#

Evaluation of 2Sine Timeseries Data#

[Figure: Original 2Sine Timeseries Data]

The model is initially trained with 50 hidden neurons and a regularization parameter of 0.5. Cross-validation then yields the optimized hyperparameters:

  • The optimized number of hidden layer neurons was found to be 10
  • The optimized regularization parameter was found to be 5.0
  • The optimized data split (i.e., the split with the lowest MSE) was found to be 40/30/30 (training/validation/testing)

1-Step Ahead

  • 1-Step Ahead Prediction had an MSE of 0.142, indicating that the model generalizes well

2-Step Ahead

  • 2-Step Ahead Prediction had an MSE of 1.24, demonstrating that error compounds over time

3-Step Ahead

  • Predictions with k > 2 had higher MSEs and grew less accurate with each additional step ahead

Evaluation of Lorenz Timeseries Data#

[Figure: Original Lorenz Timeseries Data]

The model is initially trained using 20 hidden neurons and a regularization parameter of 0.1. Cross-validation then yields the optimized hyperparameters:

  • The optimized number of hidden layer neurons was found to be 10
  • The optimized regularization parameter was found to be 0.01
  • The optimized data split (i.e., the split with the lowest MSE) was found to be 80/10/10 (training/validation/testing)

1-Step Ahead

  • 1-Step Ahead Prediction had an MSE of 7.626, likely because the Lorenz data varies more than the 2Sine data, even though the predictions and targets look visually similar

2-Step Ahead

  • 2-Step Ahead Prediction had an MSE of 111.611, indicating that error compounds over time

3-Step Ahead

  • Predictions with k > 2 had higher MSEs and grew less accurate with each additional step ahead

Conclusions#

  • The Lorenz model demonstrates that increased model complexity leads to overfitting and a failure to generalize: the Lorenz data points had higher variance and the underlying function is more complex than the 2Sine dataset. Additionally, the complexity of the Lorenz function led to a smaller optimal regularization parameter, which allows outliers to significantly affect the weight optimization.
  • K-Step Ahead prediction produces less accurate results the more steps ahead you attempt to predict, because predictions are fed back into the model and error compounds over time.