
How to Prepare for the M6 Financial Forecasting Competition

Published on October 22, 2021

The details of the M6 Forecasting competition have been announced. This time it is the hardest possible test and, as discussed in The Future of Forecasting Competitions, it’s good to see that we’re going real-time. But why, you ask, would we need a forecasting competition for the equity markets when they are already a forecasting competition? I’ll let the organizers discuss that and you can read the details here.

The efficient market hypothesis (EMH) posits that share prices reflect all relevant information, which implies that consistent outperformance of the market is not feasible. The EMH is supported by empirical evidence, including the yearly “Active/Passive Barometer” Morningstar study which regularly finds that active, professional investment managers do not beat, on average, random stock selections. On the other hand, legendary investors like Warren Buffett, Peter Lynch and George Soros, among others, as well as celebrated firms including Blackstone, Bridgewater Associates, Renaissance Technologies, DE Shaw and many others have achieved phenomenal results over long periods of time, amassing returns impossible to justify by mere chance, and casting doubts about the validity of the EMH. It is the express purpose of the M6 competition to empirically investigate this paradox.

If you plan on entering this contest, I offer six ideas.

1. Practice making distributional predictions

If you’re coming from Kaggle or somewhere where you’ve been making point estimates, I’d encourage you to move on from that. It is extremely unlikely, in my view, that alpha generation is going to be the deciding factor.

Take a close look at the rules. You are being asked to provide probabilistic predictions of the *ranks* of return of each asset within its class (a number between 1 and 5). The benchmark is the uniform distribution. The metric is the ranked probability score. Clearly, you’re going to need to predict the distribution of the returns of all assets, and probably their joint distribution.
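To get a feel for the metric, here is a minimal sketch of the ranked probability score for a single asset. I’m assuming the convention where the squared differences between cumulative forecast and cumulative outcome are averaged over the five categories; check the official rules before relying on the normalisation. The function name `rps` is my own.

```python
import numpy as np

def rps(probs, outcome):
    """Ranked probability score for one asset (lower is better).

    probs   : forecast probabilities for ordered categories 1..K
    outcome : realized category, an integer in 1..K
    Assumes the squared cumulative differences are averaged over K.
    """
    probs = np.asarray(probs, dtype=float)
    observed = np.zeros(len(probs))
    observed[outcome - 1] = 1.0
    cum_diff = np.cumsum(probs) - np.cumsum(observed)
    return float(np.mean(cum_diff ** 2))

# The uniform benchmark, averaged over the five equally likely outcomes
uniform = np.full(5, 0.2)
benchmark = np.mean([rps(uniform, k) for k in range(1, 6)])
print(benchmark)  # approximately 0.16

# A forecast leaning towards the top quintile: rewarded if right, punished if wrong
leaning_high = [0.05, 0.10, 0.15, 0.30, 0.40]
print(rps(leaning_high, 5), rps(leaning_high, 1))
```

Notice how asymmetric the payoff is: an opinionated forecast that lands in the wrong tail scores far worse than the uniform benchmark does.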

Perhaps it is too much to hope for, but the M6-Contest might spur the development of better open-source packages for distributional and quantile estimation. I’ll try to add more to the listing of Python packages here and let’s see if we can’t come up with a decent new sub-section.

If you’re in Julia and use Flux, take a peek at Rusty Conover’s distributional NYISO Electricity models built with Flux (and exposed with Tensorflow.js — see this repo and some of his others).

Predicting distributions and quantiles will also get you interested, I hope, in the subtleties of scoring rules for quantiles and expectiles.

2. Predict volatility, covariances, or other ancillary quantities, not the mean

Even if a good distributional forecast is the goal, you may still benefit from point estimates of related quantities. Competitors in live microprediction contests have found this to be the case, and so it is certainly worthwhile to collate whatever point estimates can be found.
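As a rough illustration of how point estimates of volatility and correlation can feed a rank forecast, the sketch below simulates joint returns and counts how often each asset lands in each quintile. The zero-mean Gaussian model, the made-up volatilities, the common correlation of 0.3 and the function name `quintile_probabilities` are all assumptions for illustration, not anything prescribed by the contest.

```python
import numpy as np

def quintile_probabilities(mean, cov, n_sims=20000, seed=0):
    """Monte Carlo estimate of P(asset i finishes in quintile q).

    Assumes jointly Gaussian returns, purely for illustration.
    Returns an array of shape (n_assets, 5); each row sums to one.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    n_assets = len(mean)
    sims = rng.multivariate_normal(mean, cov, size=n_sims)
    ranks = sims.argsort(axis=1).argsort(axis=1)   # 0 = lowest return in each scenario
    quintiles = (ranks * 5) // n_assets            # 0..4
    return np.stack([(quintiles == q).mean(axis=0) for q in range(5)], axis=1)

# Ten hypothetical assets: no view on the mean, only on volatility and correlation
vols = np.linspace(0.02, 0.10, 10)
corr = np.full((10, 10), 0.3)
np.fill_diagonal(corr, 1.0)
cov = np.outer(vols, vols) * corr

probs = quintile_probabilities(np.zeros(10), cov)
print(np.round(probs[0], 3))    # lowest-volatility asset
print(np.round(probs[-1], 3))   # highest-volatility asset
```

Even with no opinion about expected returns, the rank probabilities move away from uniform: the most volatile asset should carry more mass in the top and bottom quintiles than the least volatile one.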

Since volatility can be related to volume and other things observed in markets as well, it might pay to practice predicting volume directly. Here’s the volume of crude oil trading (Brent, if you care).


You can see the live version here. You can also just search ‘volume’ in the stream listing. You could also create a crawler that only goes after volume (or volatility) streams. Instructions for telling your crawler where to go are provided here.

3. Practice using shrinkage and residual prediction

Shrinkage is almost always worth a look. In the case of equity markets, that is true in spades. Indeed the only way Warren Buffett would have a chance of winning M6 is if he shrank just about everything towards zero except for one or two of his key opinions.

If you look at the best performing time-series algorithms in the Elo ratings for the fastest algorithms (here) you’ll see that there are a bunch of ensembles on top — more on that in a moment — but then the algorithm called thinking_slow_and_fast comes in next. Then, if you dig into the code for that algorithm (here) you’ll notice two things. First, it is merely using one moving average to predict the errors of a different moving average (but shrinking).

Second, you can do the same for your own model with one line of code. That line of code is a call to a function like quickly_moving_hypocratic_residual_factory, which I admit isn’t the catchiest name. But it is easy. Keep in mind that when you predict the market you are predicting the residuals of an existing prediction. I hope the M6 contest leads to new, nice open-source code for martingale hypothesis testing, if nothing else.
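The idea can be sketched in a few lines of plain Python. To be clear, this is not the timemachines implementation: the function name, the smoothing constants and the shrinkage factor below are all my own illustrative choices. A slow exponential moving average makes the base forecast, a faster one tracks its errors, and only a fraction of the predicted error is added back.

```python
def shrunk_residual_forecasts(ys, alpha_base=0.05, alpha_resid=0.3, shrink=0.5):
    """One-step-ahead forecasts; the i-th forecast is for ys[i + 1].

    Base model : slow exponential moving average of the series.
    Residuals  : faster moving average of the base model's errors.
    Forecast   : base + shrink * predicted error (shrink=0 gives the base model).
    """
    level = float(ys[0])
    resid_ema = 0.0
    preds = []
    for y in ys[1:]:
        preds.append(level + shrink * resid_ema)   # made before y is seen
        base_error = y - level
        resid_ema += alpha_resid * (base_error - resid_ema)
        level += alpha_base * base_error
    return preds

def mse(preds, ys):
    """Mean squared error of forecasts against ys[1:]."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys[1:])) / len(preds)

# A steadily trending series, where a slow moving average lags badly
ys = [float(t) for t in range(200)]
print(mse(shrunk_residual_forecasts(ys, shrink=0.0), ys))   # base model alone
print(mse(shrunk_residual_forecasts(ys, shrink=0.5), ys))   # with shrunk residual correction
```

On a pure trend you would want no shrinkage at all, of course. The point of shrinking is that market residuals are mostly noise, so you only want to believe a fraction of what the residual model tells you.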

There’s a little more discussion here.

4. Practice online stacking

Stacking can be quite simple, or more elaborate. Some references and ideas are found in a recent paper by Yao, Pirs, Vehtari and Gelman (arXiv). A simple approach tracks the recent empirical errors and weights models accordingly.

There is the matter of book-keeping to keep track of the accuracy, and one must be careful to avoid data leakage. It’s usually simplest to do it in an online fashion, avoiding that danger completely. As I mentioned in this post, I’ve provided some utilities to make this convenient.
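Here is a minimal sketch of that online book-keeping, independent of the utilities mentioned above. The class name `OnlineInverseErrorStacker`, the decay rate and the inverse-squared-error weighting are my assumptions; many other weighting schemes are possible. Because weights are only ever updated from forecasts that were issued before the observation arrived, there is no opportunity for leakage.

```python
import math
import numpy as np

class OnlineInverseErrorStacker:
    """Weights models by the inverse of their running squared error."""

    def __init__(self, n_models, decay=0.05, eps=1e-8):
        self.mse = np.ones(n_models)   # neutral starting point: equal weights
        self.decay = decay
        self.eps = eps
        self.last_preds = None

    def weights(self):
        inv = 1.0 / (self.mse + self.eps)
        return inv / inv.sum()

    def combine(self, preds):
        """Blend forecasts for the next observation using past errors only."""
        self.last_preds = np.asarray(preds, dtype=float)
        return float(self.weights() @ self.last_preds)

    def update(self, y):
        """Score the stored forecasts once the observation arrives."""
        if self.last_preds is not None:
            sq_err = (self.last_preds - y) ** 2
            self.mse += self.decay * (sq_err - self.mse)
            self.last_preds = None

# Stand-in models: one is off by 0.1, the other by 1.0 (hypothetical, for illustration)
stacker = OnlineInverseErrorStacker(n_models=2)
for t in range(300):
    y = math.sin(t / 10)
    combined = stacker.combine([y + 0.1, y + 1.0])
    stacker.update(y)

print(np.round(stacker.weights(), 3))   # most of the weight ends up on the first model
```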

The parade is simple but handy, in this respect (it tracks k-step ahead predictions and their accuracy against incoming data). Also, we need a few minor conveniences to track running estimates of bias, squared error and if necessary, higher moments (momentum functions suffice). More tools can be found in skatertools/ensembling modules.

5. Collaborate and contribute to open-source

I suspect this contest will be more competitive than most. If you want to chat with a bunch of people with an interest in time-series prediction and open-source development I imagine you can find them in various venues like Kaggle. Another is the microprediction Slack.

Perhaps by the end of this competition, we’ll have lots of pull requests for open-source packages. Model-search tools like tslearn, autots and pycaret might get you headed in the right direction, and hopefully they will benefit from your contributions too. I’m not trying to play favourites. There’s a long list here.

But chances are there are ideas in feature generation, anomaly detection, or motif discovery libraries like luminol, tsfresh, or the less well-known saxpy, stationarizer, or luminaire (from Zillow) that are complementary to the better-known libraries. Maybe you can split the task by cases, using classification packages like rocket.

I think it might come down to how you combine tools. I’ve written a short article on using timemachines with pycaret, for example. But what would you do with deepecho or timesynth? Well, you could train your models not to overfit using synthetic data, for one thing.

Some tools like magi might help you do that in parallel. And tigramite, causalnex or lingam might help you discover connections between time series — such as one volatility versus another.

You are welcome to collaborate on a Python package that helps people enter the M6 specifically. You can find it on pypi/m6. Perhaps by the time you read this, there will be a few benchmark entries to modify. If you have ideas, file an issue or a pull request.

6. Form a team of algorithms

When you use model search tools like autogluon, autokeras TimeSeriesForecaster, autots, or autosklearn (used here effectively) you are, in a sense, forming a team of algorithms.

However, there’s also the possibility of forming a team of live autonomous algorithms. This approach is a little bit like entering a contest with a contest. You can use a free API where you publish data points in real-time. When you do that, you get predictions because some algorithms swarm around your data and try to predict it.

The difference between this and using open-source software packages is that you may never know, and might not care, what the algorithms on the other side are doing. That said, the leaderboards are often full of entries with CODE badges you can click on (see examples).

I’ll definitely be using the microprediction API to enter the M6 contest, but it is all about how you do it. The microprediction API is intended for shorter horizon prediction (up to one hour) but if you are clever, that won’t stop you from improving month-ahead forecasts, or at least creating plausible approaches to the same.

Do it for science

I hope you enjoy the M6-Contest and are grateful for the longitudinal efforts by Spyros Makridakis and helpers to make this happen. Competitions are a catalyst for advancing research, and an antidote to fashion and hype. That’s been true since the first one in 1982.

Also, you might not even appreciate what you might be contributing to when you enter the M6 and contests like it. Time will tell if the automated assessment of algorithms is just getting started and if it might lead to something extraordinary one day. More speculations here.


Image by John Arano


I’m very pleased to say that my firm will be a sponsor for the M6-Competition. We feel this is very strongly aligned with our open-source time-series efforts and we’re proud to be associated with M6. We’ll be doing what we can to ensure the M6 fosters new open-source work. Expect to see more connector code at m6 and timemachines that makes it easier to use open-source packages in M6, and vice-versa.