Capturing fat tails

Financial institutions are more aware of the risks posed by high-impact events since the crisis, but the question is how to encapsulate these in models. Zari Rachev, Boryana Racheva-Iotova and Stoyan Stoyanov discuss three approaches for capturing fat tails

In this post-crisis era, there is universal agreement that financial assets are indeed fat-tailed and that investment managers must take extreme events into account as part of their everyday risk management processes. But deliberation continues at a high level on how risk management approaches and practices should change. While academic research has provided a vast offering of modern risk methods and analytic techniques, the race to integrate them into standard risk platforms has just begun.

This work has shown there is more to the production of accurate risk estimates than simply acknowledging asset returns have a higher probability of extreme events than had been thought. There are numerous phenomena that, if left out of the equation, will render risk measures such as value-at-risk virtually useless in accurately estimating levels of risk and the probability of extreme price movements. Skewness, auto-regression and volatility clustering are recognised phenomena that must be considered. However, the issue of varying tail-thickness from asset to asset and across time is widely ignored.

Focusing on daily returns across multi-asset class portfolios, the purpose of this article is to compare and contrast the more popular fat-tailed methodologies currently being discussed. These approaches include the classical Student’s t model, extreme value theory (EVT) and stable Paretian distributions. All these methods have been widely researched, with long academic histories outside the financial arena, and now provide the foundations for a range of commercial applications by leading risk management service providers.

 

Real-world models

When constructing realistic models, it is necessary to assume a distributional hypothesis capable of describing both fat tails and asymmetry. Several classes of distributions have been used to capture fat tails, both in academia and by practitioners. Perhaps the most popular is the classical Student’s t distribution. Other examples include extreme value distributions, stable distributions, operator stable distributions, the class of tempered stable distributions that include stable distributions as a limiting case, and the class of infinitely divisible distributions that include all previous classes except extreme value distributions.

All these classes of models, except extreme value distributions, share one feature – they include the normal distribution as a special (limiting) case. In effect, if the data is Gaussian, the fitted distribution would be close to, or would coincide with, the normal distribution. Therefore, these families of models can be regarded as an extension to the classical Gaussian framework and not an alternative to it.

As far as modelling asymmetry is concerned, neither the Gaussian distribution nor the classical Student’s t can account for skewness. Instead, we have to turn to models such as the stable Paretian distribution, which look at the respective left and right tails. One way to capture the difference between the upside and the downside potential is by calculating expected tail loss and expected tail return.
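As a minimal sketch of that last point, the two statistics can be estimated empirically as the average of the worst and best returns in a sample. The function names, the tail probability and the sample returns below are illustrative assumptions, not taken from any vendor implementation.

```python
def expected_tail_loss(returns, tail_prob=0.01):
    """Average of the worst `tail_prob` share of returns, as a positive loss."""
    ordered = sorted(returns)                      # worst returns first
    k = max(1, int(len(ordered) * tail_prob))      # number of tail observations
    return -sum(ordered[:k]) / k


def expected_tail_return(returns, tail_prob=0.01):
    """Average of the best `tail_prob` share of returns."""
    ordered = sorted(returns, reverse=True)        # best returns first
    k = max(1, int(len(ordered) * tail_prob))
    return sum(ordered[:k]) / k


# Made-up daily returns with a heavier left tail than right tail.
sample = [-0.08, -0.05, -0.02, -0.01, 0.0, 0.005, 0.01, 0.01, 0.02, 0.03]
etl = expected_tail_loss(sample, tail_prob=0.2)     # mean of the two worst days
etr = expected_tail_return(sample, tail_prob=0.2)   # mean of the two best days
```

An expected tail loss that is clearly larger than the expected tail return at the same tail probability, as in this made-up sample, points to a downside that is heavier than the upside.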

It is important to note that the degree of tail thickness varies across assets and asset classes. We carried out an empirical study that included the stocks in the S&P 500 universe during the 12-year period from January 1, 1992 to December 12, 2003. We fitted a Garch model to clean the volatility clustering effect and then fitted the classical Student’s t model on the residual. The degrees of freedom (DOF) parameter, which governs the tail behaviour, is shown in figure 1.

The plot illustrates that tail behaviour can be quite diverse – from very fat-tailed, with a DOF below five, to less fat-tailed, with a DOF above 15. In fact, the study found 21% of the S&P 500 stocks are very fat-tailed, with a DOF below four and just 35% have a DOF above seven. Since the DOF parameter governs the potential for extreme events, realistic estimates are essential in ensuring that portfolio tail risk contributors and diversifiers are properly identified.
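To illustrate how a DOF estimate of this kind might be obtained, the sketch below fits the DOF of a unit-variance Student's t by a simple grid search over the log-likelihood. It assumes the input has already been filtered for volatility clustering and has roughly unit variance. The grid, the simulated data and all function names are assumptions made for illustration, and the grid search stands in for whatever estimator the study actually used.

```python
import math
import random


def t_loglik(squares, dof):
    """Log-likelihood of a unit-variance Student's t with `dof` > 2."""
    n = len(squares)
    const = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(math.pi * (dof - 2)))
    body = sum(math.log1p(s / (dof - 2)) for s in squares)
    return n * const - 0.5 * (dof + 1) * body


def fit_dof(residuals, grid=None):
    """Grid-search maximum-likelihood estimate of the DOF parameter."""
    if grid is None:
        grid = [2.25 + 0.25 * i for i in range(152)]   # 2.25, 2.5, ..., 40.0
    squares = [r * r for r in residuals]
    return max(grid, key=lambda dof: t_loglik(squares, dof))


def simulate_unit_t(dof, n, rng):
    """Simulated Student's t draws rescaled to unit variance."""
    scale = math.sqrt((dof - 2) / dof)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(dof / 2, 2.0)          # chi-square with `dof` DOF
        draws.append(scale * z / math.sqrt(chi2 / dof))
    return draws


rng = random.Random(42)
fat_sample = simulate_unit_t(4.0, 3000, rng)            # very fat-tailed data
calm_sample = [rng.gauss(0.0, 1.0) for _ in range(3000)]  # Gaussian data
fat_dof = fit_dof(fat_sample)
calm_dof = fit_dof(calm_sample)
```

On the simulated fat-tailed sample the estimate should land near four, while on the Gaussian sample it should drift towards the top of the grid. Applying `fit_dof` to successive windows of residuals is one way to trace tail behaviour through time.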

Not only does tail behaviour vary across assets, it also varies through time. In relatively calm periods, asset returns are almost Gaussian, while in turbulent periods, the tails become fatter. Figure 2 illustrates this behaviour in the Dow Jones Industrial Average (DJIA) index returns from October 1997 to October 2009. The top and middle plots show the value and return of the DJIA, respectively. The bottom plot shows the fitted DOF parameter of the residuals of a Garch model fitted on a 500-day rolling window. Clearly, the tail behaviour changes through time. In the period from October 2003 to about January 2006, the tail behaviour is almost Gaussian as the fitted DOF is above 30.

It is crucial for the model to take into account the differences in tail behaviour both across assets and through time so they can be reflected in the risk statistics on both a marginal and aggregate level.


Applications of fat-tailed models

Risk management software vendors are offering approaches based on fat-tailed models, including the classical Student’s t distribution, EVT and stable Paretian distributions.

The Student’s t distribution.

A typical approach in building a framework based on the classical Student’s t distribution is:

  • An auto-regressive component to capture auto-regressive behaviour.
  • Volatility clustering by means of Garch or alternative short- or long-memory Arch-type processes.
  • The classical symmetric Student’s t distribution with the DOF parameter fixed for all variables (typically a DOF of four or five) to capture fat tails.
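The first two ingredients in the list above can be sketched as a filter that strips an AR(1) mean and a Garch(1,1) variance from a return series, leaving standardised residuals to which a fat-tailed distribution is then fitted. The parameter values are hypothetical and taken as given here. In practice they would be estimated for each risk driver, and the model orders could differ.

```python
import math


def ar_garch_filter(returns, phi, omega, alpha, beta):
    """Standardised residuals from an AR(1)-Garch(1,1) filter.

    All four parameters are assumed known; alpha + beta must be below one.
    """
    mean = sum(returns) / len(returns)
    var = omega / (1.0 - alpha - beta)      # start at the unconditional variance
    prev_ret = mean
    standardised = []
    for r in returns:
        cond_mean = mean + phi * (prev_ret - mean)     # AR(1) forecast
        eps = r - cond_mean                            # innovation
        standardised.append(eps / math.sqrt(var))      # scale by forecast volatility
        var = omega + alpha * eps * eps + beta * var   # Garch(1,1) variance update
        prev_ret = r
    return standardised


# Hypothetical parameters and returns, for illustration only.
z = ar_garch_filter([0.01, -0.03, 0.02], phi=0.0,
                    omega=0.00002, alpha=0.1, beta=0.8)
```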

First introduced in 1908, Student’s t distribution is probably the most commonly used alternative to the normal distribution as a model for asset returns. Like the normal distribution, classical Student’s t densities are symmetric and have a single peak. Unlike the normal distribution, Student’s t densities are more peaked around the centre and have fatter tails.

While these two properties make them acceptable for asset returns modelling, the real reason behind the widespread use of Student’s t is its ease of use – numerical methods are easily implementable and are widely available.

The Garch component could be replaced with alternative models using an exponential or logarithmic decay of the observation weights when calculating the volatility based on a pre-defined parameter (such as 0.94 for the exponentially weighted moving average decay parameter (see Zumbach, 2006)). This forces the relative importance of the observations in the past to be the same for all risk drivers and across time. While this universal parameter makes these models simpler and easier to grasp, there is an important trade-off between simplicity and precision: these models are less accurate and only work ‘on average’ in a universe of risk drivers.
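A minimal sketch of the exponentially weighted alternative, assuming the 0.94 decay parameter mentioned above and a zero mean return, is shown below. Seeding the variance with the first squared return is an arbitrary choice made for illustration.

```python
import math


def ewma_volatility(returns, decay=0.94):
    """One-step-ahead volatility forecasts with a fixed decay parameter."""
    var = returns[0] ** 2                  # seed with the first squared return
    vols = []
    for r in returns:
        vols.append(math.sqrt(var))                   # forecast made before seeing r
        var = decay * var + (1.0 - decay) * r * r     # exponentially weighted update
    return vols


vols = ewma_volatility([0.01, 0.03, 0.0])
```

Under this recursion an observation k days in the past carries a weight proportional to 0.94 raised to the power k, whatever the risk driver and whatever the market regime, which is the source of the rigidity described above.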

The most significant limitation in a classical Student’s t distribution-based framework, however, is that the residual in the time-series model is assumed to have a Student’s t distribution with the DOF parameter fixed (typically to four or five). This value is assumed to be one and the same for all risk drivers, irrespective of their type and the time period under consideration. This assumption is not realistic as empirical analyses indicate that tail behaviour varies across different risk drivers. Fixing the DOF parameter does not allow for a smooth transition between Gaussian data and fat-tailed data. As a result, the risk will be significantly overestimated for assets with returns that are close to being normally distributed.

Finally, the classical Student’s t model is symmetric. In cases where there is a significant asymmetry in the data, it will not be reflected in the risk estimate. By forcing the tails to be identical, it becomes impossible to reveal which assets are true tail risk contributors and diversifiers.

EVT – generalised Pareto distribution.

The key characteristics of the suggested approaches include:

  • Volatility clustering by means of a Garch model.
  • EVT to explain the fat tails of the residuals from the Garch model.
  • Skewness captured by using a generalised Pareto distribution (GPD) to fit the two tails separately.

EVT has been applied for a long time when modelling the frequency of extreme events, including extreme temperatures, floods, winds and other natural phenomena. From a general perspective, extreme value distributions represent distributional limits for properly normalised maxima of random independent quantities with equal distributions, and therefore can be applied in finance as well.

A common approach to EVT-type modelling is the peaks-over-threshold method that follows GPD and models those events in the data that exceed a high threshold. This is presented in a non-normal framework by Embrechts, Klüppelberg & Mikosch (1997).

GPD is the limiting distribution of the exceedances of a given return distribution over a certain threshold when the threshold (which marks where the right tail of the original distribution begins) goes to infinity. It is a model for the tail only, left or right. GPD does not model the body of the corresponding distribution.

As a consequence, the parameters of GPD can be fitted using only information from the respective tail. There are two big challenges stemming from this restriction:

  • An extremely large sample is needed to get a sufficient number of observations from the tail.
  • We need to know where the body of the distribution ends and where the tail begins.

The large sample size is a significant challenge. Academic publications indicate that minimum requirements are in the range of 5,000 to 10,000 observations. However, a sample this large dilutes the influence of any current fat-tailed market behaviour and would only be suitable for long-term projections. For financial time series, where the standard time window for risk estimation is two years of daily data, a sample of just 500 observations is far too small to ensure an accurate GPD fit. Goldberg, Miller & Weinstein (2008) suggest 1,000 days, with approaches to generate synthetic data where enough observations are not available.

The second challenge – separating the body of the distribution from the tail – may seem easy to surmount by resorting to statistical methods that would indicate where the tail begins. Unfortunately, no reliable methods exist.

Typically, this high threshold is chosen by looking at certain plots, such as the Hill plot or the mean excess plot, which are standard in EVT. Identifying the threshold between the body and the tail is therefore a matter of subjective choice based on visual inspection, which cannot be achieved on a large scale. Kuiper’s test has been proposed as a numerical method for selecting the optimal threshold (Goldberg, Miller & Weinstein, 2008). However, this test is very difficult to automate for large universes, because the resulting optimisation problem does not have good optimality properties, with the global minimum being hard to find.
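For readers unfamiliar with the mean excess plot, the sketch below computes the quantity it displays: the average amount by which losses above a candidate threshold exceed that threshold. The function names and the sample losses are illustrative assumptions.

```python
def mean_excess(losses, threshold):
    """Average excess over `threshold`, or None if no loss exceeds it."""
    excesses = [x - threshold for x in losses if x > threshold]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


def mean_excess_curve(losses, thresholds):
    """Pairs of (threshold, mean excess) that would be plotted and inspected."""
    return [(u, mean_excess(losses, u)) for u in thresholds]


# Made-up losses, expressed as positive numbers.
losses = [1.0, 2.0, 3.0, 4.0, 5.0]
curve = mean_excess_curve(losses, [0.5, 3.0, 10.0])
```

Where a GPD describes the tail well, the curve is roughly linear in the threshold. Judging where that linear stretch begins is the visual step that resists automation.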

As a result, unreliable threshold selections will be made in the absence of thorough visual inspection. An automated approach remains elusive. The choice of this threshold has a great impact on the parameter estimates of GPD and, therefore, on the final risk estimates. This will be especially acute when the sample is relatively small. This deficiency is acknowledged in Goldberg, Miller & Weinstein (2008) and del Castillo & Daoudi (2008).

In using GPD, one is always faced with the classical tail-estimation trade-off problem. For the GPD estimates to be unbiased, they must be fitted with the largest possible threshold; strictly, unbiased estimators are obtained only when the threshold is infinity. This implies that one should use a very small number of extreme observations from the original sample. On the other hand, using such a small number of observations drastically increases the variance of the estimators. Fitting a GPD therefore becomes more of an art than a science, as the modeller must balance low bias against low variance in the estimators.
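The dependence of the risk estimate on the threshold can be made concrete with a small peaks-over-threshold sketch. It fits the GPD to the excesses by the method of moments, which is chosen here only because it needs no optimiser and is valid only for a shape parameter below 0.5; it is not the estimator used in the papers cited. It then applies the standard peaks-over-threshold quantile formula. The simulated losses, thresholds and function names are assumptions.

```python
import math
import random


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit; returns (shape xi, scale beta)."""
    n = len(excesses)
    m = sum(excesses) / n
    v = sum((y - m) ** 2 for y in excesses) / (n - 1)
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)


def pot_var(losses, threshold, confidence=0.99):
    """VAR from a GPD fitted to the losses that exceed `threshold`."""
    excesses = [x - threshold for x in losses if x > threshold]
    xi, beta = fit_gpd_moments(excesses)
    tail_share = len(excesses) / len(losses)     # share of sample in the tail
    p = (1.0 - confidence) / tail_share          # must be below one
    if abs(xi) < 1e-9:                           # exponential-tail limit
        return threshold - beta * math.log(p)
    return threshold + beta / xi * (p ** (-xi) - 1.0)


# Simulated exponential losses, whose true 99% quantile is log(100).
rng = random.Random(7)
losses = [rng.expovariate(1.0) for _ in range(20000)]
var_by_threshold = {u: pot_var(losses, u) for u in (0.5, 1.0, 2.0)}
```

With 20,000 well-behaved simulated losses the three estimates stay close to one another. With a few hundred real observations, raising the threshold leaves only a handful of excesses, and the estimates can diverge considerably.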

Stable Paretian distributions.

The key characteristics of a stable Paretian distribution implementation include:

  • An auto-regressive component to capture auto-regressive behaviour.
  • Volatility clustering captured by means of a Garch model.
  • Stable Paretian distributions to explain the fat tails and the skewness of the residual from the Garch model with asset-specific parameters.
  • Temporal behaviour of tail thickness captured by fitting the full distribution from historical data.

Applications of stable distributions in the field of finance have a long history. In 1963, the mathematician Benoit Mandelbrot first used the stable distribution to model empirical distributions that have skewness and fat tails. To distinguish between Gaussian and non-Gaussian stable distributions, the latter are commonly referred to as stable Paretian or Lévy stable distributions.

Stable Paretian tails decay more slowly than the tails of the normal distribution and therefore better describe the extreme events present in the data. Since they represent a model for the entire distribution and not just the tails, reliable model parameter estimation methods exist. The instability of parameter estimation, inherent in EVT, is not present for stable distributions because the entire sample is taken into account, rather than just the tail of the distribution. The operator stable version of the stable Paretian distribution allows for varying tail-fatness from asset to asset. A tail-tempering process ensures a finite second moment (variance). Tail tempering is achieved by imposing an additional exponential, or faster than exponential, decay in the tail very far away from the centre of the distribution.

A detailed description of the stable methodology is available in Rachev et al (2009).

The tail-tempering process is detailed in Kim et al (2008), Young et al (2010) and Bianchi et al (2010).

Like the Student’s t distribution, stable Paretian distributions have a parameter responsible for the tail behaviour, which is called the tail index or index of stability. In contrast to the DOF parameter, the index of stability is between zero and two. The closer it is to two, the more Gaussian-like the distribution is. Therefore, small values of the index of stability imply a fatter tail.
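The role of the index of stability can be illustrated by simulation. The sketch below draws from a symmetric stable law using the Chambers-Mallows-Stuck method and counts how often a draw lands far from the centre. It covers the symmetric case only, so it ignores the skewness parameter that a full stable Paretian model would also fit, and the sample size, cut-off and function name are assumptions.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a standard symmetric stable law, alpha in (0, 2]."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)                       # Cauchy case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


rng = random.Random(1)
n = 20000
gauss_like = [symmetric_stable(2.0, rng) for _ in range(n)]   # index of two
fat = [symmetric_stable(1.5, rng) for _ in range(n)]          # index of 1.5

# Number of draws more than six units from the centre.
extreme_gauss = sum(1 for x in gauss_like if abs(x) > 6.0)
extreme_fat = sum(1 for x in fat if abs(x) > 6.0)
```

With an index of two the draws are Gaussian with variance two and almost none fall beyond the cut-off, whereas with an index of 1.5 extreme draws occur in the hundreds.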

The fact the tail index changes through time is demonstrated in figure 3 with DJIA returns in the period from October 1997 to October 2009. The tail index is very close to two in the upward market from 2003 to 2005, but then starts decreasing right before the market crash and is smallest at the crash itself. This implies the tail thickness is smallest in the bullish market from 2003 to 2005 and is largest during the crisis periods.


Comparing models

Using the DJIA, a back-testing study was conducted to compare the three fat-tailed models: stable Paretian, Student’s t with a DOF of five and EVT, alongside the normal distribution model. All the models have filters for auto-regression and volatility clustering based on Arma-Garch, with the Student’s t model using the particular method described in Zumbach (2006). For each of the four models, exceedances – the number of times the real loss is larger than the calculated VAR – are tracked.

The back-testing was run with the following settings:

  • Back-test period: July 8, 2005 to December 31, 2009.
  • VAR confidence level: 99%.
  • Time window: 500 rolling days for normal, classical Student’s t and stable Paretian, and 3,000 rolling days for EVT.
  • EVT threshold: 1.02% (as suggested by Goldberg, Miller & Weinstein, 2008).

Figure 4 shows the DJIA performance for the back-test period. Figure 5 shows the daily forecast for the four models versus the daily returns of the DJIA across the full back-test period. The number of exceedances for the four models is reported in figure 6 and compared against a 95% confidence interval. The results show the normal-Garch model is too optimistic, with its daily 99% VAR forecasts being too low. In contrast, the Student’s t and EVT approaches are overly pessimistic, with their forecasts being too high.
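The exceedance count and its confidence interval can be sketched as follows. The interval here is a central binomial interval, which is one common choice and an assumption on our part, as are the 1,000-day sample length and the function names. VAR is expressed as a positive loss.

```python
import math


def count_exceedances(returns, var_forecasts):
    """Days on which the realised loss was larger than the VAR forecast."""
    return sum(1 for r, v in zip(returns, var_forecasts) if -r > v)


def binom_cdf(k, n, p):
    """Probability of at most k exceedances in n days."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))


def exceedance_interval(n_days, var_confidence=0.99, test_level=0.95):
    """Range of exceedance counts consistent with a correct VAR model."""
    p = 1.0 - var_confidence                 # expected exceedance frequency
    tail = (1.0 - test_level) / 2
    lo = next(k for k in range(n_days + 1)
              if binom_cdf(k, n_days, p) >= tail)
    hi = next(k for k in range(n_days + 1)
              if binom_cdf(k, n_days, p) >= 1.0 - tail)
    return lo, hi


lo, hi = exceedance_interval(1000)   # hypothetical 1,000-day back-test
```

A count above the upper bound suggests the VAR forecasts are too low, as reported for the normal model. A count below the lower bound suggests they are too high.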

Figure 7 zooms in on the period between July 1, 2006 and June 30, 2007. This 12-month history has relatively low volatility, but includes a drop of 3.4% on February 27, 2007. Neither the Student’s t nor the EVT model can distinguish between normal and fat-tailed markets. As observed in figure 7, the Student’s t and EVT models simply overreact to extreme events when they occur. The EVT model, the most pessimistic, is unable to adjust to current market conditions because of the very large sample size that is required to ensure stable estimates. Garch alone does not help since there are changes in the tail behaviour of the markets as well.

The stable-Garch model appears to be the only one with realistic VAR forecasts, with exceedances within the confidence interval. Additionally, the daily forecasts of the stable-Garch model react to extreme events. They change according to new market information, sometimes with estimates lower than the normal model, making the spread between the normal and stable risk estimates indicative of the market’s probability of extreme events.

Figure 8 shows the minimum and maximum VAR for the EVT model based on selecting different thresholds (1–15%). It shows minimum EVT and Student’s t with a DOF of five almost coincide for calm periods and are still too conservative. The percentage spread between minimum EVT and normal does not change, indicating indifference to market conditions. The maximum EVT is constantly one-and-a-half to two times the minimum EVT, which points to the dramatic influence of the tail threshold selection on the risk estimates.

Conclusion

Suddenly, it seems everyone agrees fat tails exist and are crucial to properly estimate and proactively manage risk. Vendors and practitioners are now trying harder to incorporate fat-tailed risk assessment into their risk management systems. Models based on normal distributions cannot capture them. To accurately measure risk, we need to focus on how the behaviour of assets relates and contributes to fat tails.

Empirical properties including auto-regressive behaviour, clustering of volatility, skewness and fat tails, including differences in thickness across assets and through time, can all be pronounced in daily asset returns. Modelling these phenomena is a two-step process: fit a time-series model to explain the clustering of volatility and auto-correlation, and employ a fat-tailed model for the residual.

We considered three approaches that all start with capturing the fat tails. Taking into account these results and the preceding model descriptions, a final model comparison is as follows:

  • EVT approach. This is overly pessimistic at all times. There is a slow relative response when a market becomes more likely to have large losses. Automated methods to fit the tails are impractical for daily risk management use because of the pessimism penalty, data requirements and tail-fitting implementation issues. The approach is potentially capable of reviewing downside risk from time to time, but not over time. It will require manual set-up and intervention to produce reasonable results.
  • Classical Student’s t approach. While not as pessimistic as EVT, it still requires taking a large penalty and has little ability to discriminate between assets and how they change over time. This is further compounded by ignoring skewness. It is easily implemented for day-to-day risk management use but with overly conservative results and limited insight into the key contributors to specific risk.
  • Generalised stable Paretian approach. The risk estimates do not impose a risk penalty during normal market times and are responsive when fat-tailed behaviour becomes more probable. There is strong differentiation between assets and over time. It is a practical approach for day-to-day risk management with high discrimination of key risk drivers in the tail.

Additional new commercial approaches are likely to appear in 2010. It seems clear many of these will provide alternative lenses through which users may view and analyse risk, but will not be replacements or upgrades to their core models. How and when users should fit this into their process is unclear. Are market participants expecting a separate model to evaluate tail risk, for example?

The debate will continue around what is best to implement and how. We suggest that if you have a secondary model specifically for use in times of crisis, by the time you implement it, the chances are it will be too late.

Zari Rachev is chair-professor of statistics, econometrics and finance at KIT in Germany and FinAnalytica chief scientist. Boryana Racheva-Iotova is president of FinAnalytica. Stoyan Stoyanov is a professor of finance at Edhec Business School and the scientific director for Edhec-Risk Institute in Asia.
