The ABC of PCA

Often, the costs associated with implementing advanced statistical models can outweigh the potential benefits. Brett Humphreys shows how to smooth and speed up choppy simulations using principal components analysis

Risk managers face the constant problem of determining when the benefits of new or more advanced statistical techniques outweigh the costs. Benefits of these techniques include greater accuracy in risk measures, faster calculation, or greater model flexibility. On the cost side, the risk manager must consider the calculation time required as well as the complexity of the proposed solution. Unfortunately, as solutions become more complex they become more difficult to explain, and the process itself may be viewed as a black box.

There is a long list of concepts that, while statistically beneficial, may not justify the additional costs, including Garch models of volatility, extreme value theory and copulas. In most situations, companies will find that the costs associated with implementing more advanced statistical models outweigh the potential benefits. However, most of these concepts attempt to address a specific issue. Therefore, if a risk manager faces that specific issue then these tools may generate significantly better results, justifying their cost.

It is always better for a risk manager to know the benefits of a potential solution and the issue it addresses and then decide against it than to never even consider the change. The process of considering a change itself has value, as it highlights weaknesses that may exist in current processes and keeps risk managers alert for situations where these weaknesses could be exploited.

Principal components analysis (PCA) is one method that may be an exception. PCA is a statistical method for modelling changes in forward curves or sets of forward curves. The basic theory is that it may be possible to explain how the entire forward curve could shift using only a few variables.

This is a significant difference from standard approaches to movements of forward curves, whether for simulation or risk calculations. Most models are based upon a multi-variate normal distribution of the forward curve. In other words, we model each point on the forward curve as if it follows a random walk process that is correlated with every other point along the forward curve. If we have monthly points for a four-year forward curve, this means that we model 48 different points and use a 48x48 correlation matrix. The correlation matrix is necessary to guarantee that the joint distribution of movements along the entire forward curve is correct.
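
The standard approach described above can be sketched roughly as follows. This is a minimal illustration, not the author's model: the function name, the volatility term structure and the correlation decay are all assumptions invented for the example.

```python
import numpy as np

def simulate_curve_moves_mvn(vols, corr, n_sims, seed=0):
    """Draw correlated normal moves for every point on the forward curve."""
    rng = np.random.default_rng(seed)
    cov = np.outer(vols, vols) * corr                   # covariance from vols and correlations
    chol = np.linalg.cholesky(cov)                      # cov = chol @ chol.T
    shocks = rng.standard_normal((n_sims, len(vols)))   # one independent shock per curve point
    return shocks @ chol.T                              # correlated moves, shape (n_sims, n_points)

# Hypothetical inputs: 48 monthly points, volatility falling with maturity,
# correlation falling with the distance between contract months.
n_points = 48
months = np.arange(n_points)
vols = 0.10 * np.exp(-0.03 * months)
corr = np.exp(-0.05 * np.abs(months[:, None] - months[None, :]))

moves = simulate_curve_moves_mvn(vols, corr, n_sims=10_000)
```

Every simulated scenario here needs 48 random draws and the full 48x48 matrix, which is the cost that PCA tries to reduce.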

The PCA point of view is somewhat different. While the specifics of calculating a PCA are more involved (see box), the underlying theory is not complex. Instead of modelling all the related variables, the PCA attempts to identify the unique factors that, taken together, could explain the movements of all points on the forward curve in the most efficient way. Effectively, the goal of a PCA is to reduce the dimensionality of the problem. Instead of having a forward curve with 48 dimensions, we now have a forward curve with only two or three factors that can explain most forward curve movements.

Reducing dimensionality can be a difficult concept to grasp intuitively. However, it is something we deal with every day. For example, a photograph reduces a three-dimensional space to two-dimensional space. We listen to how major stock indexes perform and use that as an estimate for changes in our stock holdings. Even using risk proxies in a value-at-risk calculation is an example of reducing the dimensionality of a problem. In each of these cases, we understand that we are losing some information by focusing on the reduced dimensions. However, the result still conveys a large portion of the original information, and in a more efficient manner.

For a single forward curve, a principal components analysis will generally identify two or three factors that drive the changes across the entire forward curve. These factors can usually be described as ‘directional shift’, ‘twisting’ and ‘bowing’. The directional shift occurs when the entire forward curve shifts up or down in a generally parallel fashion. For most commodities, we see that the forward contracts that are closest to expiration move more dramatically than contracts with more time to expiration. The directional shift in commodities tends to create a ‘tail-wagging’ effect in the forward curve. Twisting occurs when the front of the forward curve moves in one direction and the back of the forward curve moves in another direction. Finally, bowing occurs when the front and back of a forward curve move in one direction and the middle moves in a different direction.

Once we have identified the factors we can also calculate factor impact – the impact of a factor on any point in the forward curve. Figure 1 shows how the three factors explain movements in the Nymex natural gas forward curve.

It is important to understand that a positive factor impact does not mean that it will only explain increases for that point in the forward curve. Instead, what matters more are relative factor impacts. For example, looking at the figure we see that the factor impact of a directional shift for the first nearby contract is approximately 0.45 while the factor impact for the twelfth nearby is approximately 0.15. We interpret this information as saying that for a directional shift (positive or negative) the first nearby will move approximately three times as much as the twelfth nearby.
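
The arithmetic behind that reading can be checked in a few lines. The two impacts are the approximate values quoted above; the unit shock size is an assumption chosen only for illustration.

```python
# Approximate directional-shift factor impacts quoted in the text.
impact_first_nearby = 0.45
impact_twelfth_nearby = 0.15

# Ratio of the two impacts: how much more the first nearby moves.
relative_move = impact_first_nearby / impact_twelfth_nearby  # roughly 3

# A shock of either sign is scaled by the same impacts, so the ratio is unchanged.
for shock in (+1.0, -1.0):  # hypothetical unit shocks
    move_first = impact_first_nearby * shock
    move_twelfth = impact_twelfth_nearby * shock
    print(shock, move_first, move_twelfth)
```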

A beneficial property of the identified factors is that they are uncorrelated with one another. In other words, because PCA focuses on explaining forward-curve movements in the most efficient way possible, there is no correlation between any pair of these factors. This property makes a PCA model useful for forward-curve simulation.

To perform a Monte Carlo simulation, we create random shocks for the identified factors. We then combine these shocks with the factor impacts to estimate the movement of any point in the forward curve. This represents a significant gain in efficiency, as instead of simulating all points in the forward curve and correlating them, we can now describe all movements with only two or three uncorrelated variables.
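
A minimal sketch of this simulation step is shown below. The function name and the three factor shapes are assumptions made up for the example; they are not estimated from market data, and impacts produced by a real PCA would come from the eigenvector calculation rather than from formulas like these.

```python
import numpy as np

def simulate_curve_moves_pca(factor_impacts, n_sims, seed=0):
    """Simulate curve moves from uncorrelated factor shocks.

    factor_impacts has shape (n_points, n_factors); column k holds the
    impact of factor k on every point of the forward curve.
    """
    rng = np.random.default_rng(seed)
    n_factors = factor_impacts.shape[1]
    shocks = rng.standard_normal((n_sims, n_factors))   # independent unit shocks per factor
    return shocks @ factor_impacts.T                    # shape (n_sims, n_points)

# Hypothetical factor impacts for a 48-point curve.
t = np.linspace(0.0, 1.0, 48)
factor_impacts = np.column_stack([
    0.45 * np.exp(-1.5 * t),               # directional shift, larger at the front
    0.10 * (1.0 - 2.0 * t),                # twist: front and back move in opposite directions
    0.05 * (1.0 - 8.0 * t * (1.0 - t)),    # bow: both ends move against the middle
])

moves = simulate_curve_moves_pca(factor_impacts, n_sims=10_000)
```

Each scenario now needs three random draws instead of 48, and no correlation matrix has to be factorised during the simulation.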

The resulting movements of the forward curve are frequently more intuitively appealing, and avoid odd forward-curve shapes that can arise from multivariate normal simulations. Figure 2 shows an example of changes in forward curves created using a multivariate normal model and using a PCA model. As we can see, the PCA generates smoother shifts in forward curves that more closely match what we see in reality.

While PCA has some advantages, it is not without problems. Some data is always lost by reducing the dimensionality of the problem. If this information is simply extraneous noise to begin with, then ignoring it is a good thing. But in some cases it may represent meaningful information about shifts in the forward curves. Ignoring this information may bias the results. Seasonal markets present another challenge to this technique.

Principal components analysis is therefore not the perfect solution for every problem. However, under the right set of conditions, a principal components analysis provides an efficient method for simulating realistic forward curves. In this case, the benefits will frequently outweigh the costs.


Brett Humphreys is a managing director at Risk Capital, a consulting firm in New York. Email: bhumphreys@riskcapital.com

Calculating PCA

Calculating a principal components analysis is relatively simple, and depends on some characteristics associated with matrices: eigenvalues and eigenvectors. To calculate a PCA we first estimate the correlation matrix or covariance matrix associated with the forward curve or curves of interest. The next step is to calculate the eigenvalues of the matrix. Each eigenvalue can be interpreted as the variance associated with a single factor.
The next step is to calculate the eigenvectors associated with each eigenvalue. Each eigenvector represents the factor loading associated with a specific eigenvalue. By multiplying the eigenvector by the square root of the eigenvalue, we get the factor impact. This is all the information we need to begin to apply PCA.*
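
The two steps above might look like this with NumPy. This is a sketch under stated assumptions: the function name and the small four-point covariance matrix are hypothetical, and `np.linalg.eigh` is used because a covariance or correlation matrix is symmetric.

```python
import numpy as np

def pca_factor_impacts(cov):
    """Return eigenvalues (largest first) and factor impacts of a covariance matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(cov)       # eigh handles symmetric matrices
    order = np.argsort(eigenvalues)[::-1]                 # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]                 # column k pairs with eigenvalue k
    eigenvalues = np.clip(eigenvalues, 0.0, None)         # guard against tiny negative round-off
    factor_impacts = eigenvectors * np.sqrt(eigenvalues)  # scale each eigenvector column
    return eigenvalues, factor_impacts

# Hypothetical covariance for a four-point curve, for illustration only.
vols = np.array([0.10, 0.08, 0.07, 0.06])
idx = np.arange(4)
corr = 0.9 ** np.abs(idx[:, None] - idx[None, :])
cov = np.outer(vols, vols) * corr

eigenvalues, factor_impacts = pca_factor_impacts(cov)
```

With all factors kept, `factor_impacts @ factor_impacts.T` reproduces the original covariance matrix; dropping the smaller factors gives an approximation of it.
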
Finally, we need to select the number of factors needed to explain the majority of changes in the forward curve. We know that the sum of the eigenvalues equals the total variance of the data. This allows us to easily determine what percentage of the changes in the forward curves can be explained by any given number of factors. We simply select the largest eigenvalues until we explain a sufficient amount of the variation. For example, examining the Nymex natural gas data that generated the factors in figure 1, we find that the first three factors explain over 98% of the variation, and the vast majority (85%) is explained by parallel shifts.
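
The selection rule can be sketched as a cumulative share of the eigenvalues. The function name, the 98% target and the eigenvalues are assumptions: the numbers are invented so that the first factor explains 85% and the first three explain 98.5%, and they are not the Nymex estimates.

```python
import numpy as np

def factors_needed(eigenvalues, target=0.98):
    """Smallest number of factors whose eigenvalues explain `target` of total variance."""
    ordered = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    explained = np.cumsum(ordered) / ordered.sum()                 # cumulative share of variance
    count = int(np.searchsorted(explained, target)) + 1            # first index reaching the target
    return min(count, ordered.size), explained

# Hypothetical eigenvalues, for illustration only.
count, explained = factors_needed([8.5, 1.0, 0.35, 0.1, 0.05], target=0.98)
```
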
In some cases, PCA may not be able to reduce the dimensionality of a problem significantly. But in most cases, two or three factors will be sufficient to model changes in the entire forward curve.

* Note that while Microsoft Excel does not contain functions to calculate the eigenvalues and eigenvectors of matrices, many other software programs do, and it is easy to download add-ins for Excel that will allow the user to perform this calculation. A demonstration file is available at www.riskcapital.com, which contains a simple PCA example along with eigenvalue and eigenvector VBA code.
