The Makridakis Forecasting Competitions: Four Decades of Overhyped Simplicity and Stagnation

For over four decades, the Makridakis forecasting competitions (M-competitions) have been held up as the “standard” for evaluating forecasting methods.

Yet, despite their longevity, these competitions have become a monument to intellectual stagnation, a celebration of mediocrity, and a glaring example of how a small group can dominate a field without contributing meaningful innovation.

Let’s be clear: the M-competitions have consistently preached the gospel of “simple methods are better,” while failing to produce a single groundbreaking forecasting method of their own.

All the real progress in forecasting — methods like N-BEATS, Transformer-based models, and other machine learning innovations — has come from outside their insular circle, from the machine learning community that they’ve largely ignored or dismissed.

The Cult of Simplicity: A Self-Fulfilling Prophecy

The M-competitions have long championed the idea that simple statistical methods — like exponential smoothing or naive forecasts — outperform more complex approaches.

This mantra has been repeated ad nauseam, often with a tone of superiority. But let’s examine this claim critically: the competitions were designed in a way that inherently favored simplicity.

They relied on small, low-dimensional datasets that didn’t reflect the complexity of real-world forecasting problems. By focusing on these narrow, artificial benchmarks, the M-competitions created an echo chamber where simplicity was rewarded, and innovation was stifled.
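To be concrete about what "simple" means here, the sketch below implements two of the baselines these competitions lean on, a naive forecast and simple exponential smoothing, in plain NumPy. The series and the smoothing parameter alpha=0.3 are illustrative choices, not values taken from any M-competition.

```python
import numpy as np

def naive_forecast(y, horizon):
    """Naive method: repeat the last observed value for every future step."""
    return np.repeat(y[-1], horizon)

def simple_exponential_smoothing(y, horizon, alpha=0.3):
    """Simple exponential smoothing: level_t = alpha * y_t + (1 - alpha) * level_{t-1}.
    The flat forecast repeats the final smoothed level over the horizon."""
    level = y[0]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return np.repeat(level, horizon)

# Toy example: a short series and a 6-step-ahead forecast (illustrative numbers only).
series = np.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0])
print(naive_forecast(series, horizon=6))
print(simple_exponential_smoothing(series, horizon=6, alpha=0.3))
```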

For example, in the M4 competition paper (Makridakis et al., 2018), the authors conclude that “statistically significant differences do not exist among the top-performing methods,” implying that simple methods are just as good as more sophisticated ones.

However, this conclusion is misleading. The competition’s design — limited datasets, short forecasting horizons, and a narrow set of evaluation metrics — skewed the results in favor of simplicity. Real-world forecasting often involves high-dimensional data, external variables, and complex dependencies, none of which were adequately addressed in the M4 competition.
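For reference, the two headline M4 accuracy metrics were sMAPE and MASE, combined into an overall weighted average (OWA) against a Naive 2 benchmark. The sketch below is a rough illustration of those two metrics, not the official evaluation code, and the toy numbers are made up.

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric MAPE as used in M4: 200/h * sum(|y - f| / (|y| + |f|))."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 200.0 * np.mean(np.abs(actual - forecast) / (np.abs(actual) + np.abs(forecast)))

def mase(train, actual, forecast, season=1):
    """Mean Absolute Scaled Error: out-of-sample MAE scaled by the in-sample MAE
    of a (seasonal) naive forecast on the training data."""
    train = np.asarray(train, float)
    scale = np.mean(np.abs(train[season:] - train[:-season]))
    return np.mean(np.abs(np.asarray(actual, float) - np.asarray(forecast, float))) / scale

# Toy example (hypothetical numbers, not M4 data).
train = [10.0, 12.0, 13.0, 12.0, 15.0, 16.0]
actual = [17.0, 18.0]
forecast = [16.0, 16.0]
print(round(smape(actual, forecast), 2), round(mase(train, actual, forecast), 2))
```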

Where Are the Groundbreaking Contributions?

Here’s the most damning indictment of the M-competitions: after four decades, what groundbreaking forecasting methods have Makridakis and his collaborators actually invented?

The answer is simple: none. They didn’t invent exponential smoothing, ARIMA, or any of the other “simple methods” they so fervently champion.

These methods were developed by others, often decades before the M-competitions even began. Makridakis and his followers didn’t innovate; they merely repackaged and promoted existing ideas, all while dismissing more sophisticated approaches as “overkill.”

In their M3 competition paper (Makridakis & Hibon, 2000), the authors claim that “simple statistical methods are often more accurate than more complex ones.” Yet, this conclusion is based on a narrow set of datasets and metrics that fail to capture the complexity of modern forecasting problems.

The paper’s insistence on simplicity ignores the fact that industries like finance, retail, and energy demand more accurate, scalable, and adaptive forecasting methods — methods that can handle high-dimensional data, incorporate external variables, and adapt to changing conditions. The M-competitions, with their myopic focus on simplicity, were utterly unequipped to address these challenges.

A Failure to Adapt

Perhaps the most frustrating aspect of the M-competitions is their refusal to adapt to the changing landscape of forecasting. While the machine learning community has embraced large-scale datasets, deep learning, and automated model selection, the M-competitions have remained stuck in the past.

Their datasets are often small and outdated, their evaluation metrics are simplistic, and their insistence on “simple methods” has become a crutch to avoid engaging with more powerful, albeit complex, methods.

For instance, in the M5 competition paper (Makridakis et al., 2022), the authors continue to emphasize the virtues of simple methods even though the competition itself finally moved to a larger, messier dataset: hierarchical daily retail sales with prices, promotions, and calendar effects. The results were still interpreted through the lens of simplicity, underplaying how decisively machine learning methods, gradient-boosted trees in particular, beat the statistical benchmarks. This failure to adapt has rendered the M-competitions increasingly irrelevant in a world where businesses already use machine learning to forecast demand, optimize supply chains, and predict market trends.
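For contrast, the recipe behind the machine-learning entries that dominated the M5 Accuracy leaderboard was essentially lag features fed to gradient-boosted trees, with most top entries built on LightGBM. The sketch below is a deliberately stripped-down, single-series toy version of that idea on a synthetic random-walk series; the real entries trained global models across tens of thousands of series with price, promotion, and calendar features.

```python
import numpy as np
import lightgbm as lgb

def make_lag_features(y, n_lags):
    """Turn a series into a supervised table: row t holds the n_lags values preceding y[t]."""
    X = np.array([y[i:i + n_lags] for i in range(len(y) - n_lags)])
    target = y[n_lags:]
    return X, target

# Synthetic random-walk series standing in for a single sales history.
rng = np.random.default_rng(42)
y = np.cumsum(rng.normal(0.5, 1.0, size=300))

X, target = make_lag_features(y, n_lags=7)
model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05)
model.fit(X, target)

# One-step-ahead forecast from the most recent 7 observations.
print(model.predict(y[-7:].reshape(1, -1)))
```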

The Real Legacy of the M-Competitions

The real legacy of the M-competitions isn’t one of innovation or progress; it’s one of missed opportunities and intellectual stagnation. For four decades, they’ve dominated the forecasting conversation, yet they’ve contributed almost nothing to the advancement of the field.

Instead, they’ve served as a cautionary tale: a reminder of what happens when a small group becomes so enamored with their own ideas that they lose sight of the bigger picture.

The future of forecasting belongs to those willing to embrace complexity, leverage new technologies, and push the boundaries of what's possible. It belongs to the machine learning community, which has consistently delivered groundbreaking innovations while the M-competitions were busy congratulating themselves for championing methods that predate the competitions themselves. The M-competitions had their chance, and they squandered it. It's time to move on.

Written by Valeriy Manokhin, PhD, MBA, CQF
Principal Data Scientist, PhD in Machine Learning, creator of Awesome Conformal Prediction