Model misfires raise questions over training data

Quants wrestle with how far into the past their machine learning models should peer

Wisdom accumulated over many decades is highly prized in most cultures. Less so for machine learning in investing, it would seem.

Algorithms that use long histories of data to build their understanding of markets flopped during the Covid-19 pandemic. Such models failed “pretty spectacularly” in the extreme events of 2020, says Michael Heldmann, head of multi-factor equity investing for North America at Allianz Global Investors, citing research conducted by the firm.

“They have been hammered during Covid, especially in the beginning, when they were betting on things that have been very successful in the long term, like betting on a big drawdown not being followed by another big drawdown. That has led to massive underperformance in those models,” he says.

The failure is forcing investors that use machine learning to reassess the datasets they use and, crucially, how far back the information should stretch.

Quants routinely train models on real-world data to evaluate how effective they are. Machine learning investment systems can rapidly filter oceans of information, looking for patterns that humans would miss, but they have a tendency to overfit to the numbers. This means they may read significance into data that turns out to be just noise.

Using a longer period of data to train the models is one way to address the overfitting problem. But when something like Covid occurs, such an approach can go badly wrong.

“Quants train models on lots of long-term data so we have a good representation of the major relationships in the market,” says Aric Whitewood, chief executive of artificial intelligence-powered hedge fund XAI Asset Management. The difficulty is to build enough adaptability into the system so that when the situation changes – when history no longer provides a good guide to the future – the models can keep up. “How you do that is a key question in financial markets predictions,” Whitewood adds.

The conclusions that different machine learning users have reached on this question – some preferring models that learn from decades of data, others choosing far shorter horizons – may explain why some stumbled during the pandemic but others didn’t.

The Eurekahedge AI Hedge Fund Index is up 4.65% year-to-date. This unremarkable figure masks a wide dispersion in performance, though. When practitioners were interviewed in April, their comments ranged from the optimistic (“the bots are winning”) to the less sanguine (“machine learning models are not heroes”). The worst performers “lost as much as anyone” in the fund industry in March, Marcos Lopez de Prado, global head of quantitative research at Abu Dhabi Investment Authority, said at the time. The best were up 5% to 10% for the month.

Some experts suggest a conceptual flaw in the training process is partly to blame for this hit-and-miss performance. Using historical data to help better understand the future may work when fundamental forces of markets hold firm. But when those forces twist and snap, as happened during the Covid pandemic, backward-looking data is of limited use.

“The recent struggles of some models underscores the static nature of the way that many quants view the world: what worked decades ago should work decades from now. But the reality of a changing world seems to repeatedly interfere with the theory,” says Andrew Beer, whose firm Dynamic Beta Investments replicates leading hedge fund portfolios.

Training regimes

There are three broad approaches to training machine learning models, XAI’s Whitewood says. In the first, models are retrained repeatedly over one- to five-year windows: each is optimised for a particular set of circumstances, then run out of sample or traded for some period before being retrained.

At the other extreme, models can be trained on a longer history, so the models react to new things based on knowledge of the past.

Then there is a middle way, in which models are trained on a long history and retrained continually. The idea is that they will change behaviour without forgetting what they learned about markets from years before. “It’s a fine balancing act,” Whitewood says.
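The three regimes Whitewood describes can be sketched as different ways of slicing a history into training and out-of-sample periods. The Python below is a minimal illustration only, not any firm’s actual method: the function names, window lengths and the use of simple integer indices (one per period, say a year) are all assumptions made for the example.

```python
from typing import Iterator, Tuple

# A split is (training indices, out-of-sample indices); both are hypothetical
# integer positions in a time-ordered dataset.
Window = Tuple[range, range]


def rolling_windows(n_obs: int, train_len: int, test_len: int) -> Iterator[Window]:
    """Short fixed-length window: train, run out of sample, then retrain."""
    start = 0
    while start + train_len + test_len <= n_obs:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len  # slide forward; the oldest data drops out


def static_long_history(n_obs: int, train_len: int) -> Iterator[Window]:
    """Train once on a long history and run on everything that follows."""
    if train_len < n_obs:
        yield range(0, train_len), range(train_len, n_obs)


def expanding_windows(n_obs: int, min_train: int, test_len: int) -> Iterator[Window]:
    """Middle way: retrain every period, but never discard old data."""
    end = min_train
    while end + test_len <= n_obs:
        yield range(0, end), range(end, end + test_len)
        end += test_len  # the training set grows instead of sliding
```

With ten periods of data, `rolling_windows(10, 3, 2)` always trains on the latest three periods, while `expanding_windows(10, 3, 2)` starts from the same three but keeps every earlier period as it moves forward. That difference is the “forgetting” trade-off described above.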

Heldmann favours faster learners. One of Allianz’s best-performing signals over the past six months, for example, uses natural language processing (NLP) analysis of earnings call transcripts and news flow to train neural networks to predict future returns. Heldmann attributes its success to the speed at which the strategy picks up on change.

Because NLP algos are often trained on more abundant data from outside finance, quants can afford to train them on a much shorter time period, Heldmann says. “For these NLP models, you need just a couple of years and not 10 years. And you can train them in a pretty reasonable way.”

Conversely, to illustrate the things that might trip up a slower machine-learner, Heldmann points to the US regulation on fair disclosure of financial information, which was enacted in 2000. Reg FD forced firms to publish all material information at the same time, visible to everybody. It speeded up the dissemination of information through the market.

If a machine learning algorithm had been trained on 10 years of data prior to 2000, it would continue to use those data values into the future, Heldmann says. “If it learned only on the past 12 months, it would adapt more quickly to the new situation,” he adds.
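Heldmann’s point about adaptation speed can be illustrated with a deliberately simple toy calculation. The sketch below assumes a made-up signal whose monthly payoff is +1.0 for ten years and then flips to -1.0 after a rule change; the numbers, the `trailing_estimate` helper and the use of a plain trailing average in place of a real model are all hypothetical.

```python
from statistics import fmean


def trailing_estimate(series, window):
    """Average of the most recent `window` observations."""
    return fmean(series[-window:])


# Hypothetical toy data, not market data: ten years of +1.0 payoffs,
# followed by twelve months of -1.0 after a structural break.
pre_break = [1.0] * 120
post_break = [-1.0] * 12
history = pre_break + post_break

# A 10-year window still contains 108 pre-break months, so its estimate
# stays positive; a 12-month window contains only post-break months.
long_memory = trailing_estimate(history, 120)
short_memory = trailing_estimate(history, 12)

print(f"10-year window estimate:  {long_memory:+.2f}")
print(f"12-month window estimate: {short_memory:+.2f}")
```

A year after the break, the long-window estimate is still dominated by the old regime, while the short window has fully switched. Real return data is noisy, so a short window would also pick up far more noise, which is the overfitting risk raised elsewhere in the article.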

Richard Craib, CEO at machine learning-driven fund Numerai, says systems that learn from all of history are likely to come up with more linear rules. “You might say, value investing has worked well over the last 200 years that we have data for, so we’re going to be long value all the time, no matter what’s going on in the world,” he says.

Machine learners with a shorter span of attention have their own drawbacks, though. This means the difficulty for investors is in knowing which parts of history to set aside. Raul Leote de Carvalho, deputy head of BNP Paribas Asset Management’s quantitative research group, agrees that models with a longer investment horizon are more likely to underperform in a crisis like Covid-19. But there’s a proviso.

“If you use higher frequency models you stand more chance of adapting quickly enough. But if you use high frequency models you typically trade so much that capacity is very limited,” he says. “The implementation [of the strategy] is key because you have to worry a lot about avoiding market impact.” In other words, there is a trade-off. “You cannot have very high turnover and very high capacity.”

XAI’s Whitewood gives another example. Take a model that uses NLP with a one- or five-year window. The model will pick up on hot topics such as Brexit or Covid, which are likely to affect markets for a short period.

But it can be difficult to be certain which narratives are likely to be dominant at any one time. Covid was the driving force in markets during the early part of the year, while Brexit is becoming a stronger factor as the December 31 transition deadline approaches. Equally, individual narratives affect markets in different ways, possibly driving change in certain geographies or influencing markets differently over time.

“How do you build trust in models that are short term and for which you can’t test out of sample?” Whitewood asks. “There are a lot of assumptions here and lots of potential for overfitting. While I think some faster bias of models could be useful, you need to take care in how it’s implemented.”

The more, the merrier

Some quants make the case for using all the data at hand. Sheedsa Ali, head of quantitative equity research at $100 billion asset manager PineBridge, says slow-learning parts of the firm’s model that were built for diversification did as expected earlier this year, even outperforming at aggregate level.

“Of course, there were pockets of the investment universe – sectors and sub-segments – where everything struggled,” she says. “But you have to look at things in aggregate and [consider] your own investment time horizon.”

Andrew Chin, chief risk officer and head of quant at AllianceBernstein, is also a proponent of longer-term models. He concedes that short-term models have done better recently because markets are moving quickly, but Chin’s view is that models cannot hope to respond well to events like Covid if they use only one, two or three years of history.

“I like to use as much history as possible,” he says. “I don’t know what the next crisis is going to look like. I want to use as much experience as possible to help me. If you go back far enough, even though we never saw Covid before, maybe we went through other pandemics, maybe we went through other experiences that would have been similar. If you look at the short horizon, you will never pick that up.”

Update, November 25, 2020: This article has been updated with the latest Eurekahedge index return data.

Editing by Alex Krohn
