Singular Spectrum Analysis — A Hidden Treasure of Time Series Forecasting

Valeriy Manokhin, PhD, MBA, CQF
May 10, 2023

Unlock powerful SSA methods to generate highly accurate forecasts.

As a machine learning researcher and data science practitioner, I am always keen to learn about and evaluate new time series forecasting methods early.

Follow me on LinkedIn and Twitter for regular updates on some of the most innovative technologies in machine learning and AI, including time series forecasting and blazing-hot Conformal Prediction, and check out my book ‘Practical Guide to Applied Conformal Prediction: Learn and apply the best uncertainty frameworks to your industry applications.’

At NeurIPS 2019, MIT researchers published an interesting paper, ‘On Multivariate Singular Spectrum Analysis and its Variants’, which prompted me to explore Singular Spectrum Analysis methods for time series forecasting.

It turns out that Singular Spectrum Analysis (SSA) methods have been around for some time; the technique has been used successfully in meteorology, hydrology, geophysics, climatology, economics, biology, physics, medicine and other sciences, yet it remains less familiar to the machine learning mainstream.

Singular Spectrum Analysis (SSA), also known as the ‘Caterpillar’ method (a name that alludes to the overlapping lagged segments produced in the embedding step described below), is a model-free technique for time series analysis. The ideas behind the Caterpillar method were developed independently in the USSR and the USA.

Singular Spectrum Analysis (SSA) is a non-parametric technique for analysing and forecasting time series data. It is a powerful method that combines elements of classical time series analysis, multivariate statistics, and signal processing, and it decomposes a time series into a set of interpretable components.

Singular Spectrum Analysis (SSA) is a rather general time series method used for decomposition, trend extraction, periodicity detection and extraction, signal extraction, denoising, filtering, forecasting, missing data imputation, change point detection, and spectral analysis. The method is model-free and nonparametric, making it well suited for exploratory analysis of time series.

The main steps in Singular Spectrum Analysis (SSA) are embedding, singular value decomposition (SVD), grouping and reconstruction.

In the embedding step, the time series is mapped into a higher-dimensional space by forming a trajectory matrix X. The series is divided into overlapping segments of equal length L (the window length), called lagged vectors, which are then arranged as the columns of the trajectory matrix.

The embedding step maps a time series of length N into a Hankel matrix of dimension L x K, where K = N − L + 1.

The resulting matrix is a Hankel matrix, named after Hermann Hankel, in which each ascending skew-diagonal from left to right is constant.
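The embedding step is easy to sketch in a few lines of NumPy (a minimal toy illustration, not the implementation used by any particular SSA package):

```python
import numpy as np

def trajectory_matrix(series, L):
    """Embed a 1-D series into an L x K Hankel trajectory matrix."""
    series = np.asarray(series, dtype=float)
    K = len(series) - L + 1  # number of lagged vectors
    # Column j is the lagged vector series[j : j + L]
    return np.column_stack([series[j:j + L] for j in range(K)])

ts = np.arange(10.0)            # toy series of length N = 10
X = trajectory_matrix(ts, L=4)
print(X.shape)                  # (4, 7): L x K with K = N - L + 1
```

Because entry (i, j) equals series[i + j], every anti-diagonal of X is constant, which is exactly the Hankel structure.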

Singular Value Decomposition (SVD): Next, Singular Value Decomposition is applied to the trajectory matrix. SVD is a technique that decomposes a matrix into the product of three other matrices: an orthogonal matrix (U), a diagonal matrix (Σ), and the transpose of another orthogonal matrix (V). The diagonal elements of the Σ matrix are called singular values and are arranged in descending order. The columns of the U and V matrices are called left and right singular vectors, respectively.

Each singular value σi, together with its left singular vector Ui and right singular vector Vi, forms an eigentriple. The i-th elementary matrix is Xi = σi Ui Vi^T, and the vectors σi Vi (equivalently, the projections X^T Ui of the lagged vectors onto the left singular vectors) are known as the principal components of the decomposition.

The result is an expansion of the trajectory matrix into a sum of rank-one matrix components: X = X1 + … + Xd, where d is the rank of X.
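The decomposition into elementary matrices can be sketched with NumPy's SVD (a toy illustration; the series and window length are arbitrary choices):

```python
import numpy as np

# Trajectory matrix of a toy series (L = 4, K = 7), as in the embedding step
ts = np.sin(np.arange(10.0))
X = np.column_stack([ts[j:j + 4] for j in range(7)])

# SVD: X = U @ diag(s) @ Vt, with singular values s in descending order
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Elementary (rank-one) matrices X_i = s_i * U_i * V_i^T
elementary = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]

# Their sum recovers the trajectory matrix exactly: X = X_1 + ... + X_d
print(np.allclose(sum(elementary), X))   # True
```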

Grouping: once the expansion X = X1 + … + Xd has been obtained, the grouping procedure partitions the set of indices {1, …, d} into m disjoint subsets I1, …, Im; for each subset Ii, the matrix X_Ii is defined simply as the sum of the elementary matrices whose indices belong to the group.

The result of grouping is X = X_I1 + … + X_Im.

The grouping of the components can be guided by the features they represent, such as trend or periodic behaviour, and performed using criteria such as variance contribution (the share of the squared singular values) or visual inspection of the elementary series.
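Grouping by variance contribution can be sketched as follows; the 95% threshold and the toy series are illustrative choices, not part of the SSA method itself:

```python
import numpy as np

# Toy series (sine plus noise), embedded and decomposed as before
rng = np.random.default_rng(0)
ts = np.sin(0.5 * np.arange(50)) + 0.1 * rng.standard_normal(50)
L = 10
X = np.column_stack([ts[j:j + L] for j in range(len(ts) - L + 1)])
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Variance contribution of component i: s_i^2 / sum_j s_j^2
contrib = s**2 / np.sum(s**2)

# Illustrative rule: the smallest leading group of components that
# explains 95% of the variance is treated as "signal", the rest as "noise"
k = int(np.searchsorted(np.cumsum(contrib), 0.95)) + 1
X_signal = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))
X_noise = X - X_signal
```

In practice a pure sine contributes a pair of nearly equal singular values, so visual inspection of the singular value spectrum is often used alongside such automatic rules.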

Reconstruction: in this final step, each of the matrices X_I1, …, X_Im in the decomposition is transformed back into the form of the input object X, i.e. a time series. This is done by diagonal averaging (also called Hankelization): the entries along each anti-diagonal of the matrix are averaged to produce one value of the reconstructed series. The resulting time series components can be used for further analysis, such as trend extraction, noise reduction, or forecasting.
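Diagonal averaging can be sketched as follows (a straightforward, unoptimized NumPy illustration):

```python
import numpy as np

def diagonal_average(Xi):
    """Average each anti-diagonal of a matrix to recover a 1-D series."""
    L, K = Xi.shape
    N = L + K - 1
    series = np.zeros(N)
    counts = np.zeros(N)
    for i in range(L):
        for j in range(K):
            series[i + j] += Xi[i, j]
            counts[i + j] += 1
    return series / counts

# A true Hankel matrix averages back to exactly the original series
ts = np.arange(6.0)
X = np.column_stack([ts[j:j + 3] for j in range(4)])
rec = diagonal_average(X)
print(np.allclose(rec, ts))   # True
```

For a grouped matrix X_Ii that is not exactly Hankel, the averaging produces its closest time series representation.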

To recap, the whole process looks like this.
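The four steps can be put together on a synthetic series (a compact sketch; the window length and the number of retained components are illustrative choices):

```python
import numpy as np

# Synthetic series: linear trend + sine + noise
rng = np.random.default_rng(1)
t = np.arange(200)
clean = 0.02 * t + np.sin(0.2 * t)
ts = clean + 0.3 * rng.standard_normal(200)

# 1. Embedding: build the L x K trajectory matrix
L = 40
X = np.column_stack([ts[j:j + L] for j in range(len(ts) - L + 1)])

# 2. SVD of the trajectory matrix
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# 3. Grouping: keep the four leading components
#    (a linear trend and a sine each contribute roughly a rank-2 pair)
X_hat = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(4))

# 4. Reconstruction: average each anti-diagonal back into a 1-D series
recon = np.array([X_hat[::-1].diagonal(k - L + 1).mean()
                  for k in range(len(ts))])
```

The reconstructed series closely tracks the underlying trend-plus-sine signal while most of the noise is left in the discarded components.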

Back to the NeurIPS 2019 paper ‘On Multivariate Singular Spectrum Analysis and its Variants’: the paper was accompanied by the Time Series Prediction Database (tspDB), which enables predictive functionality for collections of time series directly in SQL, and, more importantly, by the release of the mSSA open-source Python package.

The package is very easy to run (see the mSSA Jupyter notebook example) and can produce both point and probabilistic forecasts. One caveat: the probabilistic forecasts rely on a normality assumption, so if you need better-calibrated prediction intervals, you may want to recalibrate them using conformal prediction.
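As a sketch of such post-hoc calibration, here is a split-conformal interval built from absolute residuals on a calibration window. The arrays are hypothetical placeholders for forecasts from any model (including mSSA); none of the names below come from the mSSA API:

```python
import numpy as np

# Hypothetical arrays standing in for point forecasts and actual values
# on a held-out calibration window (they could come from any forecaster)
rng = np.random.default_rng(2)
y_cal = rng.standard_normal(100)
yhat_cal = y_cal + 0.5 * rng.standard_normal(100)

# Split-conformal interval: a finite-sample-corrected quantile of the
# absolute calibration residuals
alpha = 0.1                      # target 90% coverage
scores = np.abs(y_cal - yhat_cal)
n = len(scores)
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

# Symmetric interval around a new point forecast
yhat_new = 0.3
interval = (yhat_new - q, yhat_new + q)
```

Unlike Gaussian intervals, this interval's coverage guarantee does not depend on the residuals being normally distributed.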



Valeriy Manokhin, PhD, MBA, CQF

Principal Data Scientist, PhD in Machine Learning, creator of Awesome Conformal Prediction