Unlocking Predictive Power: Harnessing Permutation Entropy for Superior Time-Series Forecasting

Valeriy Manokhin, PhD, MBA, CQF
8 min read · Oct 16, 2023

The complexity of time series data is a crucial aspect of forecasting, as it often determines the forecasting methods to apply. Different measures have been employed to quantify the complexity of time series, and their relationship with forecasting performance has been a subject of study.

Various complexity measures have been devised to analyse and compare time series, distinguishing between regular (periodic), chaotic, and random behaviours. The primary categories of complexity metrics include entropies, fractal dimensions, and Lyapunov exponents. Significant connections exist among these measures, enabling the computation of one from another.

In this article, we will dive deep into permutation entropy, a robust measure that excels in capturing the complexity and underlying dynamics of time series data. The beauty of permutation entropy lies in its ability to unravel intricate temporal patterns, making it a potent tool for a myriad of applications, including forecasting tasks.

By dissecting the mathematical essence and computational nuances of permutation entropy, we aim to shed light on its efficacy in enhancing the precision of time-series forecasting models. We will illustrate how permutation entropy quantifies the disorder and unveils pivotal insights that empower predictive models to forecast with heightened accuracy and reliability.

Let’s dive in.

Permutation Entropy (PE) is a robust tool for analysing time series data, offering a means to quantify the complexity inherent in dynamic systems.

PE works by unveiling the ordinal relationships within a time series and deriving a probability distribution from these ordinal patterns, thereby encapsulating the system’s intricacy.

Among the hallmark attributes of the Permutation Entropy (PE) approach are:

- Its non-parametric nature liberates it from the confines of restrictive parametric model assumptions.
- Robustness to noise, computational efficiency, flexibility, and invariance to non-linear monotonic transformations of the data, showcasing its resilience and adaptability.
- Foundation on the principles of entropy and symbolic dynamics, anchoring its methodology in established theories.
- Recognition of the temporal ordering structure (time causality) inherent in a given real-valued time series, enabling a nuanced analysis of time-sequenced data.
- Empowering users to delve into and decode the complex dynamic content of nonlinear time series, facilitating a deeper understanding of underlying patterns and dynamics.

Before we illustrate how to calculate permutation entropy, we need to understand the concepts of embedding dimension and embedding delay.

The embedding dimension d is crucial in analysing time series data, especially when delving into techniques like permutation entropy or reconstructing phase space. It is the parameter that specifies how many values from the time series are grouped together to form each vector.

Each of these vectors represents a point in a multi-dimensional phase space. The embedding dimension helps unfold the underlying system dynamics that generated the time series.

Arranging the data in a multi-dimensional space makes it possible to study the structure and dynamics of the data from a geometric perspective. Choosing an appropriate embedding dimension is crucial as it can significantly impact the analysis results.

- A too-low embedding dimension might not capture the underlying dynamics fully, whereas a too-high dimension can add noise and complicate the analysis.
- Various methods, such as False Nearest Neighbors (FNN) or Cao’s method, exist to help determine a suitable embedding dimension for a given time series (a simplified FNN sketch follows below).
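
As a simplified sketch of the FNN idea, the snippet below implements only Kennel’s distance-ratio test (rtol = 15 is a conventional threshold, and my_series is a placeholder for your data); it is an illustration, not a full implementation of either method:

import numpy as np

def fnn_fraction(x, d, tau=1, rtol=15.0):
    # Fraction of false nearest neighbours for embedding dimension d
    x = np.asarray(x, dtype=float)
    n = len(x) - d * tau          # leave room for the (d+1)-th coordinate
    emb = np.array([x[i : i + d * tau : tau] for i in range(n)])
    false_count = 0
    for i in range(n):
        dist = np.linalg.norm(emb - emb[i], axis=1)
        dist[i] = np.inf          # exclude the point itself
        j = int(np.argmin(dist))
        # neighbour is "false" if adding one more coordinate blows the distance up
        if dist[j] > 0 and abs(x[i + d * tau] - x[j + d * tau]) / dist[j] > rtol:
            false_count += 1
    return false_count / n

# Pick the smallest d whose FNN fraction is near zero, e.g.:
# for d in range(1, 8): print(d, fnn_fraction(my_series, d))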

In the context of permutation entropy (PE), the embedding dimension dictates the length of the ordinal patterns or permutations used in the analysis. By examining the ordinal relationships among the values within these vectors, Permutation Entropy measures the complexity and predictability of the time series.

Illustration:
For instance, with an embedding dimension of 3, triples of consecutive values are formed, and their ordinal patterns are analysed. If the time series is {4, 7, 9, 10, 6, 11, 3}, the vectors would be (4, 7, 9), (7, 9, 10), (9, 10, 6), and so on.
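
To make the vector construction concrete, here is a minimal numpy sketch (the helper name embed_series is our own, introduced purely for illustration):

import numpy as np

def embed_series(x, d=3, tau=1):
    # Each row holds d values spaced tau steps apart, sliding the window by one
    x = np.asarray(x)
    n = len(x) - (d - 1) * tau
    return np.array([x[i : i + d * tau : tau] for i in range(n)])

print(embed_series([4, 7, 9, 10, 6, 11, 3], d=3, tau=1))
# [[ 4  7  9]
#  [ 7  9 10]
#  [ 9 10  6]
#  [10  6 11]
#  [ 6 11  3]]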

The embedding time delay, often denoted as tau, is a fundamental parameter in time series analysis, especially when exploring techniques like permutation entropy or reconstructing the phase space of a dynamical system.

The embedding time delay is the step size or interval used when selecting data points from the time series to construct vectors in a multi-dimensional phase space.

Instead of using consecutive data points, the time delay allows selecting data points spaced tau time steps apart. Implementing a time delay helps unveil the underlying dynamics of the system that generated the time series. It assists in avoiding redundancy and reveals more about the system’s structure by considering data points that are not immediately adjacent.

Choosing an appropriate time delay is critical as it impacts the quality of the reconstructed phase space and the subsequent analysis. Various methods, such as the Average Mutual Information (AMI) method or autocorrelation function, can determine a suitable time delay.
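
As a rough sketch of the autocorrelation route, one common heuristic (a convention rather than a rule) picks tau as the first lag at which the autocorrelation drops below 1/e:

import numpy as np

def delay_from_autocorrelation(x, max_lag=50):
    # Heuristic: first lag at which the autocorrelation falls below 1/e
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode='full')[len(x) - 1:]
    acf = acf / acf[0]
    below = np.where(acf[:max_lag + 1] < 1 / np.e)[0]
    return int(below[0]) if below.size else max_lag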

In Permutation Entropy (PE) context, the embedding time delay helps form ordinal patterns or permutations from the time series data, which are central to computing the permutation entropy. The time delay influences the spacing of values within each vector, affecting the ordinal patterns derived and the resulting entropy value.

Illustration

For example, with an embedding time delay of tau = 2 and an embedding dimension of d = 3, if we have a time series {4, 7, 9, 10, 6, 11, 3}, the vectors would be formed as (4, 9, 6), (7, 10, 11), and (9, 6, 3).
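
Reusing the illustrative embed_series helper from above, a delay of tau = 2 picks every second value:

print(embed_series([4, 7, 9, 10, 6, 11, 3], d=3, tau=2))
# [[ 4  9  6]
#  [ 7 10 11]
#  [ 9  6  3]]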

The embedding time delay is an important parameter allowing for a more meaningful reconstruction of the phase space and a deeper understanding of the underlying dynamics of the time series data. By appropriately selecting the embedding time delay, one can obtain more accurate and insightful complex time series data analyses.

We are now ready to look at Permutation Entropy in detail.

Permutation Entropy (PE) is a method used to quantify the complexity of time series data.

The computation involves several steps, including creating a matrix of ordinal patterns based on embedding dimensions and time delays.

Here, following the literature, we’ll use an embedding dimension of 3 and an embedding time delay of 1.

Suppose we have a time series {4, 7, 9, 10, 6, 11, 3}.

1. Defining Parameters:
We define the embedding dimension d = 3 and the embedding time delay tau = 1.

2. Constructing the Time Series Embedding Matrix:

Rearrange the time series data into a matrix using the embedding dimension and time delay values. Each row in the matrix will be a vector of length d, and the columns are constructed with a delay of tau.

Time series rearranged into the matrix:

(4, 7, 9)
(7, 9, 10)
(9, 10, 6)
(10, 6, 11)
(6, 11, 3)

3. Computing Ordinal Patterns:
For each row, rank the values from smallest to largest and note the order of these ranks.

Transforming the embedding matrix into a matrix of ordinal patterns, where each value is replaced by its rank within the row:

(4, 7, 9) → (0, 1, 2)
(7, 9, 10) → (0, 1, 2)
(9, 10, 6) → (1, 2, 0)
(10, 6, 11) → (1, 0, 2)
(6, 11, 3) → (1, 2, 0)

4. Calculating Frequencies of Ordinal Patterns:

With d = 3, there are 3! = 6 possible rank patterns in total:

(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)

Now we can calculate the frequencies of the patterns observed across the 5 rows:

p(0, 1, 2) = 2/5, p(1, 2, 0) = 2/5, p(1, 0, 2) = 1/5; the other three patterns never occur.

5. Computing Permutation Entropy:

The formula for Permutation Entropy is the Shannon entropy of the ordinal pattern distribution:

PE = -Σ p(π) · log2 p(π),

where the sum runs over all d! possible patterns π. Substituting the computed frequencies into the PE formula:

PE = -(2/5 · log2(2/5) + 2/5 · log2(2/5) + 1/5 · log2(1/5)) ≈ 1.522
This can also be normalised by dividing by log2(d!) = log2(3!) = log2(6) ≈ 2.585.

Normalised Permutation Entropy: 1.522 / 2.585 ≈ 0.589.

Normalised entropy lies between 0 and 1; at roughly 0.589, our time series is of moderate complexity.
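
To check the arithmetic, here is a from-scratch sketch of the whole walkthrough (our own illustrative implementation, not a library API):

import math
from collections import Counter
import numpy as np

def permutation_entropy_manual(x, d=3, tau=1, normalized=True):
    x = np.asarray(x)
    n = len(x) - (d - 1) * tau
    # Steps 2-3: embed the series and replace each window by its rank pattern
    # (a double argsort turns a window into the ranks of its values)
    patterns = [tuple(np.argsort(np.argsort(x[i : i + d * tau : tau]))) for i in range(n)]
    # Step 4: relative frequencies of the observed patterns
    probs = np.array([c / n for c in Counter(patterns).values()])
    # Step 5: Shannon entropy, optionally normalised by log2(d!)
    h = -np.sum(probs * np.log2(probs))
    return h / math.log2(math.factorial(d)) if normalized else h

print(permutation_entropy_manual([4, 7, 9, 10, 6, 11, 3]))  # ≈ 0.589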

Here is how to compute normalised Permutation Entropy in Python with a few lines of code:

!pip install ordpy

import numpy as np
from ordpy import permutation_entropy

time_series = np.array([4, 7, 9, 10, 6, 11, 3])
# ordpy defaults: embedding dimension dx=3, delay taux=1, normalized=True
pe_library = permutation_entropy(time_series)
print(f'Library Permutation Entropy (ordpy): {pe_library}')

Using ordpy, we obtain the same value of the normalised Permutation Entropy (≈ 0.589) that we computed by hand.

Now that we understand permutation entropy and how it is calculated, let’s consider some of its applications:

  • As an input feature: The permutation entropy of a time window can be fed directly into a forecasting model to help it assess the degree of noise vs. structure in the data. You can compute the permutation entropy of a sliding window of the time series and use it as an additional feature for forecasting models. This can be especially useful for models like decision trees, random forests, and gradient boosting machines that can utilise this feature to detect certain patterns or regime shifts (see the sketch after this list).
  • Model selection: Entropy can help determine if a linear vs nonlinear model is more appropriate for a given time series. High entropy favors nonlinear models. Different states of the system (e.g., chaotic vs. orderly) might have distinct permutation entropies. By monitoring the permutation entropy, one might decide to employ different forecasting models or tune the parameters differently depending on the detected state of the system.
  • Anomaly detection: Sudden major shifts in entropy may signal an upcoming change point or anomaly in the data. Anomalies or regime changes in time series data can be detected by monitoring sudden shifts in permutation entropy. These detected anomalies can help adjust forecasting models or act as a warning that the current forecasting model might be inadequate.
  • Confidence estimation: Permutation entropy provides a useful measure of unpredictability and randomness in time series data. By calculating this nonlinear statistic on trailing windows, forecasters can assess when entropy has been elevated above normal levels, indicating the recent historical data has been more erratic and unpredictable. Periods of higher permutation entropy suggest greater uncertainty and noise in the underlying data generating process. Forecasters can adapt to this information by widening prediction intervals and adjusting confidence bounds to be more conservative when permutation entropy is higher.
  • Hybrid models: Permutation entropy can be combined with other forecasting models to create hybrid models. For instance, it might be used to weigh predictions from multiple models: models that perform well in high-entropy states might be given more weight during such states, and vice versa.
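
As a hedged sketch of the input-feature idea from the first bullet (the window size, the helper name rolling_permutation_entropy, and the synthetic series are all illustrative choices, not recommendations):

import numpy as np
from ordpy import permutation_entropy

def rolling_permutation_entropy(x, window=50, dx=3):
    # Normalised PE over a trailing window; the first window-1 entries stay NaN
    x = np.asarray(x, dtype=float)
    pe = np.full(len(x), np.nan)
    for t in range(window, len(x) + 1):
        pe[t - 1] = permutation_entropy(x[t - window:t], dx=dx)
    return pe

# Illustrative usage: stack the entropy next to the raw series as model features
y = np.random.default_rng(0).standard_normal(300).cumsum()
features = np.column_stack([y, rolling_permutation_entropy(y)])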

So in summary, permutation entropy captures useful information about predictability, irregularity, and complexity in time series data that can help improve the forecasting process. It provides an accessible nonlinear statistic for gaining insights into the nature of a dataset.

References:

  1. Bandt, C. and Pompe, B., 2002, “Permutation Entropy: A Natural Complexity Measure for Time Series,” Physical Review Letters, 88, 174102.
  2. Ordpy — A Python package for data analysis with permutation entropy and ordinal network methods.

Valeriy Manokhin, PhD, MBA, CQF

Principal Data Scientist, PhD in Machine Learning, creator of Awesome Conformal Prediction