Random matrix theory provides a clue to correlation dynamics

A growing field of mathematical research could help us understand correlation fluctuations, says quant expert

Harry Markowitz famously quipped that diversification is the only free lunch in investing. What he did not say is that this is only true if correlations are known and stable over time.

Markowitz’s optimal portfolio offers the best risk-reward trade-off – for a given set of predictors – but requires the covariance matrix of a potentially large pool of assets to be known and representative of future realised correlations.

The empirical determination of large covariance matrices is, however, fraught with difficulties and biases. But the vibrant field of random matrix theory (RMT) has provided original solutions to this big data problem – and has many possible applications in econometrics, machine learning and other large-dimensional models.

Correlation is not stationary, however. Even for the simplest, two-asset bond/equity allocation problem, being able to model forward-looking correlation has momentous implications. Will this correlation remain negative in years to come, as it has been since late 1997 – or will it revert to positive territory?

Compared to our understanding of volatility, our grasp of correlation dynamics is remarkably poor. And, surprisingly, the hedging instruments that can mitigate the risk of bond/equity correlation swings are nowhere near as liquid as those linked to the Vix volatility index.

There are in effect two distinct problems in estimating correlation matrices: one is a lack of data; the other is the non-stationarity of correlations over time.

Consider a pool of N assets, where N is large. We have at our disposal T observations – daily returns, say – for each of the N time series. The paradoxical situation is this: even though every individual, off-diagonal covariance is accurately determined when T is large, the covariance matrix as a whole is strongly biased – unless T is much larger than N. For large portfolios, where N is a few thousand, the number of days in the sample should be in the tens of thousands – say, 50 years of data.

But this is absurd: Amazon and Tesla, to name but two, did not exist 25 years ago. So, perhaps we should use five-minute returns, say, and increase the number of data points by a factor of 100. Except that five-minute correlations are not necessarily representative of the risk of much lower-frequency strategies, and other biases can creep into the resulting portfolios.

So, in what sense are covariance matrices biased when T is not very large compared to N? The best way to describe such biases is in terms of eigenvalues. Empirically, the smallest eigenvalues are found to be much too small and the largest are too large. In the Markowitz optimisation programme, this results in a substantial over-allocation to combinations of assets that happened to have a small volatility in the past – with no guarantee that this will continue to be the case. The Markowitz construction can therefore lead to a considerable underestimation of the realised risk in the next period.
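A small simulation gives a feel for these biases. The sketch below is illustrative only: it assumes Gaussian returns whose true covariance is the identity matrix, picks N = 200 and T = 400 arbitrarily, and uses helper names (sample_covariance, min_variance_weights) invented for this example. It compares the range of the sample eigenvalues with the Marchenko-Pastur edges for pure noise, then compares the risk a minimum-variance portfolio predicts in sample with the risk it actually carries.

```python
import numpy as np


def sample_covariance(returns):
    """Sample covariance of a (T, N) array of zero-mean returns."""
    n_obs = returns.shape[0]
    return returns.T @ returns / n_obs


def min_variance_weights(cov):
    """Fully invested minimum-variance weights, proportional to inv(cov) @ 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


rng = np.random.default_rng(0)
N, T = 200, 400          # arbitrary sizes; q = N / T = 0.5, far from T >> N
q = N / T

# Assumption: the true covariance is the identity, so every true eigenvalue is 1.
in_sample_returns = rng.standard_normal((T, N))
cov_hat = sample_covariance(in_sample_returns)
eigenvalues = np.linalg.eigvalsh(cov_hat)

# Marchenko-Pastur edges of the sample spectrum in the pure-noise case.
mp_low = (1 - np.sqrt(q)) ** 2
mp_high = (1 + np.sqrt(q)) ** 2
print(f"sample eigenvalue range: [{eigenvalues.min():.2f}, {eigenvalues.max():.2f}]")
print(f"Marchenko-Pastur edges:  [{mp_low:.2f}, {mp_high:.2f}]")

# Risk predicted by the noisy matrix versus risk under the true (identity) matrix.
weights = min_variance_weights(cov_hat)
predicted_var = weights @ cov_hat @ weights
realised_var = weights @ weights
print(f"predicted variance: {predicted_var:.5f}")
print(f"realised variance:  {realised_var:.5f}")
```

Since all true eigenvalues equal one in this setup, any spread in the sample eigenvalues is pure estimation noise, and the optimiser loads up on directions that merely looked quiet in the sample.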

Out-of-sample results are, of course, always worse than expected, but RMT offers a guide to at least partially correcting these biases when N is large. In fact, RMT gives an optimal, mathematically rigorous recipe to tweak the value of the eigenvalues so that the resulting, cleaned covariance matrix is as close as possible to the true but unknown one, in the absence of any prior information.

Such a result, first derived by Ledoit and Péché in 2011 [1], is already a classic and has been extended in many directions. The underlying mathematics, initially based on abstract free probabilities, are now in a ready-to-use format – much like Fourier transforms or Ito calculus [2]. One of the more exciting and relatively unexplored directions is to add some financially motivated prior, such as industrial sectors or groups, to improve upon the default agnostic recipe.
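The optimal recipe itself is not reproduced here. As a rough illustration of what eigenvalue cleaning looks like in practice, the sketch below uses a simpler and cruder technique, eigenvalue clipping: sample eigenvalues below the Marchenko-Pastur upper edge are treated as noise and replaced by their average, while the eigenvectors are left untouched. The function name clip_eigenvalues, the one-factor synthetic data and the parameter values are all assumptions made for the example.

```python
import numpy as np


def clip_eigenvalues(corr, q):
    """Eigenvalue clipping of an (N, N) sample correlation matrix, q = N / T.

    Eigenvalues below the Marchenko-Pastur upper edge are replaced by their
    average; eigenvectors are kept. This is a simple stand-in, not the
    optimal estimator discussed in the text.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    upper_edge = (1 + np.sqrt(q)) ** 2
    noise = eigenvalues < upper_edge
    cleaned = eigenvalues.copy()
    if noise.any():
        cleaned[noise] = eigenvalues[noise].mean()
    cleaned_corr = (eigenvectors * cleaned) @ eigenvectors.T
    # Rescale so that the diagonal is exactly one again.
    scale = np.sqrt(np.diag(cleaned_corr))
    return cleaned_corr / np.outer(scale, scale)


rng = np.random.default_rng(0)
N, T, rho = 200, 400, 0.3   # arbitrary illustrative values

# Assumption: one common "market" factor, so every pair has correlation rho.
true_corr = np.full((N, N), rho)
np.fill_diagonal(true_corr, 1.0)
market = rng.standard_normal((T, 1))
specific = rng.standard_normal((T, N))
returns = np.sqrt(rho) * market + np.sqrt(1 - rho) * specific

sample_corr = np.corrcoef(returns, rowvar=False)
cleaned_corr = clip_eigenvalues(sample_corr, q=N / T)

# Frobenius distance to the true matrix, before and after cleaning.
raw_error = np.linalg.norm(sample_corr - true_corr)
cleaned_error = np.linalg.norm(cleaned_corr - true_corr)
print(f"distance to truth, raw:     {raw_error:.2f}")
print(f"distance to truth, cleaned: {cleaned_error:.2f}")
```

In this synthetic setting the cleaned matrix should sit closer to the true one than the raw sample matrix does. With real returns the true matrix is unknown, so the comparison would have to be made out of sample.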

Stationarity, still

Now that we’ve addressed the data problem, the stationarity problem pops up. Correlations – like volatility – are not set in stone but evolve over time. Even the sign of correlations can suddenly flip, as was the case with the S&P 500 and Treasuries during the 1997 Asian crisis, after 30 years of correlations being staunchly positive. Ever since this trigger event, bonds and equities have been in so-called flight-to-quality mode.

More subtle but significant changes of correlation can also be observed between single stocks and/or between sectors in the stock market. For example, a downward move of the S&P 500 leads to an increased average correlation between stocks. Here again, RMT provides powerful tools to describe the time evolution of the full covariance matrix [3].
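One simple way to look for this kind of effect in data is to measure the average pairwise correlation separately on days that follow a down move and days that follow an up move of the index. The sketch below is a hypothetical illustration, not the method of the cited paper: the function names are invented, the index is taken to be the equal-weighted mean of the stocks, and the synthetic returns have the asymmetry built in by construction, so the output only shows that the measurement picks it up.

```python
import numpy as np


def average_pairwise_correlation(returns):
    """Mean off-diagonal entry of the correlation matrix of a (T, N) array."""
    corr = np.corrcoef(returns, rowvar=False)
    n = corr.shape[0]
    return (corr.sum() - np.trace(corr)) / (n * (n - 1))


def correlation_by_previous_index_move(returns):
    """Average correlation on days after a down move and after an up move.

    Assumption: the index is the equal-weighted mean of the columns.
    """
    index = returns.mean(axis=1)
    previous = index[:-1]
    today = returns[1:]
    after_down = average_pairwise_correlation(today[previous < 0])
    after_up = average_pairwise_correlation(today[previous >= 0])
    return after_down, after_up


rng = np.random.default_rng(1)
T, N = 5000, 50             # arbitrary illustrative sizes

# Synthetic data: the common factor is made more volatile right after a down
# day, which raises the correlation between stocks on those days.
factor = np.zeros(T)
for t in range(T):
    vol = 1.5 if t > 0 and factor[t - 1] < 0 else 1.0
    factor[t] = vol * rng.standard_normal()
returns = 0.5 * factor[:, None] + rng.standard_normal((T, N))

after_down, after_up = correlation_by_previous_index_move(returns)
print(f"average correlation after a down day: {after_down:.3f}")
print(f"average correlation after an up day:  {after_up:.3f}")
```

Applied to real stock returns, the same two numbers give a first, crude check of whether average correlation rises after index declines.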

As I discussed in my previous column, stochastic volatility models have made significant progress recently and now encode feedback loops that originate at the microstructural level. Unfortunately, we are very far from having a similar theoretical handle to understand correlation fluctuations – although in 2007, Matthieu Wyart and I proposed a self-reflexive mechanism to account for correlation jumps like the one that took place in 1997 [4].

Parallel to the development of descriptive and predictive models, the introduction of standardised instruments that hedge against such correlation jumps would clearly serve a purpose. This is especially true in the current environment, where inflation fears could trigger another inversion of the equity/bond correlation structure, which would be devastating for many strategies that – implicitly or explicitly – rely on persistent negative correlations.

As it turns out, Markowitz’s free lunch could be quite pricy.

Jean-Philippe Bouchaud is chairman of Capital Fund Management and a member of the Académie des Sciences.


1. Ledoit, O, & Péché, S (2011). Eigenvectors of some large sample covariance matrix ensembles. Probability Theory and Related Fields, 151(1–2), 233–264.

2. Potters, M, & Bouchaud, JP (2020). A first course in random matrix theory: for physicists, engineers and data scientists. Cambridge: Cambridge University Press. doi:10.1017/9781108768900

3. Reigneron, PA, Allez, R, & Bouchaud, JP (2011). Principal regression analysis and the index leverage effect. Physica A: Statistical Mechanics and its Applications, 390(17), 3026–3035.

4. Wyart, M, & Bouchaud, JP (2007). Self-referential behaviour, overreaction and conventions in financial markets. Journal of Economic Behavior & Organization, 63(1), 1–24.

Editing by Louise Marshall
