 Cutting Edge Introducing the factor Heath-Jarrow-Morton term structure modelling framework A framework for rates that links real-world and risk-neutral measures is presented

Andrei Lyashenko and Yevgeny Goncharov introduce a risk-neutral interest rate term structure modelling framework based on the factor modelling approach widely used to model yield curves in real-world applications. The new modelling framework is very attractive as it combines the simplicity, intuitiveness and computational efficiency of the factor modelling approach with the no-arbitrage rigour of pricing term structure models. This makes it a convenient practical tool for model development and brings clarity and intuition to the yield curve modelling process

In real-world ($\mathbb{P}$-measure) applications, yield curves are usually represented using observable rates with fixed tenors (times to maturity) $\tau=T-t$. Modelling is done either by directly modelling a vector of key term points or, more typically, by first projecting the curve onto a functional basis composed of a small number of intuitive basis functions $a_{i}(\tau)$ so that the time-$t$ yield curve $y_{t}(\tau)$ is given as (approximately) a sum of product terms $x_{t}^{i}a_{i}(\tau)$, where $x_{t}^{i}$ are time-dependent factors modelled using some simple stochastic process. The Nelson & Siegel (1987) basis and its Svensson (1995) extension are especially popular among practitioners and academics because of their simplicity and intuitiveness.

The factor modelling approach is very intuitive since it decomposes any shock to the yield curve as the sum of standard shocks (eg, level, slope and curvature) given by the basis functions $a_{i}(\tau)$. In this respect, the approach is similar to the principal component decomposition, with the basis functions $a_{i}(\tau)$ often referred to as the synthetic (or functional) principal components. By construction, the factor modelling approach is very efficient as it requires the simulation of only a small number of factors $x_{t}^{i}$, with the whole yield curve being given analytically as the sum of the factor-scaled analytic functions $a_{i}(\tau)$.

Despite its attractiveness, the factor modelling approach cannot be directly used with the pricing ($\mathbb{Q}$-measure) applications because of the no-arbitrage requirement, which proves to be a very severe constraint on both the functional basis and the factor dynamics. In this paper, we derive the conditions required for the factor approach to admit no-arbitrage dynamics. Unlike the approach typically used to derive the conditions for which a model admits a separable (affine) solution, we start directly with the separable yield curve representation $y_{t}(\tau)=\smash{\sum x_{t}^{i}a_{i}(\tau)}$ and derive the conditions required for it to satisfy the no-arbitrage condition given by the Musiela (1993) formulation of the Heath-Jarrow-Morton (HJM) equations (Heath et al 1992).

The main contribution of the paper is the new risk-neutral term structure modelling framework, where the yield curve has an intuitive Cheyette-like representation as the sum of a separable stochastic term, a separable locally deterministic convexity adjustment term and a deterministic implied-forward term. Unlike in Cheyette (1992), this decomposition comes naturally in our derivation. Our modelling framework, which we call the factor HJM (FHJM), is, by construction, very practical and combines the simplicity, intuitiveness and computational efficiency of the factor modelling approach with the no-arbitrage rigour of term structure pricing models. By formulating the framework in terms of the tenor $\tau=T-t$ instead of the customary maturity $T$ dimension, we both simplify the formulas and naturally allow for time homogeneity. Using the tenor dimension is more natural when modelling the evolution of a yield curve as a whole since that is how yield curves are observed and quoted in the market.

The modelling framework we propose is closely related to the popular affine yield curve modelling approach widely used on the buy side and the quasi-Gaussian (Cheyette) modelling framework quite popular on the sell side. Our results are closely related to the academic research by Björk and others (see, for example, Björk & Christensen (1999) and the references therein). As a special case, we replicate the results on the arbitrage-free property of the Nelson-Siegel term structure model presented in Christensen et al (2009, 2011) and, in particular, explain the mystery behind a very peculiar form of the mean reversion matrixes they used.

## Yield curve factor modelling in real-world applications

Consider a yield curve expressed in the form of (continuous) spot rates $r_{t}(\tau)$ defined by the zero-coupon bond price formula:

 $P(t,t+\tau)=\exp(-\tau r_{t}(\tau))$

where $t$ is the current time and $\tau$ is the rate tenor (time to maturity).

Consider a set of $K$ linearly independent basis functions $a_{1}(\tau),\dots,a_{K}(\tau)$, and define the spot rate basis as a row vector:

 $A(\tau)=(a_{1}(\tau),\dots,a_{K}(\tau))$

We can decompose the spot rate curve into a part spanned by the basis and a remainder:

 $r_{t}(\tau)=A(\tau)X_{t}+\bar{r}_{t}(\tau)$ (1)

where $X_{t}$ is a $K$-dimensional column-vector process.

Using decomposition (1), we can model the yield curve by modelling the factor vector $X_{t}$ and the error term $\bar{r}_{t}$. In real-world applications, the factor vector is usually modelled using a simple econometric model such as vector autoregression of order $1$, while the error term is typically modelled deterministically by, for example, assuming it remains unchanged or decays to zero with time $t$. Once the factor vector and the error term are simulated, the spot rate curve evolution is given by (1).

Assuming both the spot rate curve $r_{t}(\tau)$ and the basis functions $a_{k}(\tau)$ are continuously differentiable, we can rewrite (1) in the forward rate form:

 $f_{t}(\tau)=B(\tau)X_{t}+\bar{f}_{t}(\tau)$ (2)

where $f_{t}(\tau)$ is the time-$t$ instantaneous forward rate for tenor $\tau$, defined by the equation:

 $\tau r_{t}(\tau)=\int_{0}^{\tau}f_{t}(u)\mathrm{d}u$

and:

 $B(\tau)=(b_{1}(\tau),\dots,b_{K}(\tau))$

is the forward rate curve basis row vector, where $b_{k}(\tau)=(\tau a_{k}(\tau))^{\prime}$.

Among the functional bases often used in practice to model yield curve evolution, the Nelson-Siegel basis and its Svensson (1995) extension play a prominent role as they are widely used by both financial market practitioners and central banks around the world. Their forward rate bases are of a particularly simple form, composed of terms $\tau^{i}\mathrm{e}^{-\lambda_{j}\tau}$:

 Nelson-Siegel: $\displaystyle\quad B(\tau)=(1,\mathrm{e}^{-\lambda\tau},\tau\mathrm{e}^{-% \lambda\tau})$ Svensson: $\displaystyle\quad B(\tau)=(1,\mathrm{e}^{-\lambda_{1}\tau},\tau\mathrm{e}^{-% \lambda_{1}\tau},\tau\mathrm{e}^{-\lambda_{2}\tau})$

where the lambda parameters, often called the shape parameters, are positive. One reason for the popularity of the Nelson-Siegel and Svensson bases, apart from their simplicity, is that the basis functions $1$, $\mathrm{e}^{-\lambda\tau}$ and $\tau\mathrm{e}^{-\lambda\tau}$ represent level, slope and curvature terms, respectively, with $\lambda$ determining the position of the convexity term hump. The Svensson basis has two curvature terms, which allows for better fits to both short-term and medium- to long-term parts of the forward curve.

## Yield curve factor modelling in pricing applications

In pricing applications, the factor modelling approach described above cannot be directly used since the $\mathbb{Q}$-measure evolution of the short rate $r_{t}(0)$ fully determines the evolution of all other points on the yield curve through the risk-neutral pricing equation:

 $\exp(-\tau r_{t}(\tau))=\mathbb{E}_{t}^{Q}\bigg{[}\exp\bigg{(}-\int_{t}^{t+% \tau}r_{u}(0)\mathrm{d}u\bigg{)}\bigg{]}$

where $\mathbb{E}_{t}^{Q}$ is the $\mathbb{Q}$-measure expectation conditional on the information available at time $t$.

The no-arbitrage condition for $f_{t}(\tau)$ is given by the Musiela (1993) parameterization of the HJM equation:

 $\mathrm{d}f_{t}(\tau)=\bigg{(}f_{t}^{\prime}(\tau)+\varSigma_{t}(\tau)^{% \mathrm{T}}\int_{0}^{\tau}\varSigma_{t}(u)\mathrm{d}u\bigg{)}\mathrm{d}t+% \varSigma_{t}(\tau)^{\mathrm{T}}\mathrm{d}W_{t}$ (3)

where $f_{t}^{\prime}(\tau)$ is the derivative of $f_{t}(\tau)$ with respect to tenor $\tau$, $\varSigma_{t}(\tau)$ is a family of $N$-dimensional adapted processes and $W_{t}$ is the standard $N$-dimensional Brownian motion under the $\mathbb{Q}$-measure.

Assume that the forward rate curve $f_{t}(\tau)$ is given by the factor decomposition (2), with the factor vector $X_{t}$ following the dynamics:

 $\mathrm{d}X_{t}=O(\mathrm{d}t)+\bm{\varSigma}_{t}\mathrm{d}W_{t}$ (4)

where $\bm{\varSigma}_{t}$ is an adapted $K\times N$-dimensional volatility process. Then the forward rate volatility process $\varSigma_{t}(\tau)$ is given by:

 $\varSigma_{t}(\tau)^{\mathrm{T}}=B(\tau)\bm{\varSigma}_{t}$ (5)

Plugging the forward rate representation (2) into the Musiela HJM equation (3) and using (5) we get:

 $\displaystyle B(\tau)\mathrm{d}X_{t}+\mathrm{d}\bar{f}_{t}(\tau)$ $\displaystyle\qquad{}=\bigg{(}B^{\prime}(\tau)X_{t}+\bar{f}_{t}^{\prime}(\tau)% +B(\tau)\bm{\varSigma}_{t}\bm{\varSigma}_{t}^{\mathrm{T}}\int_{0}^{\tau}B(u)^{% \mathrm{T}}\mathrm{d}u\bigg{)}\mathrm{d}t$ $\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad{}+B(% \tau)\bm{\varSigma}_{t}\mathrm{d}W_{t}$ (6)

We eliminate the deterministic error term $\smash{\bar{f}_{t}(\tau)}$ from both sides of this equation by setting $\smash{\mathrm{d}\bar{f}_{t}(\tau)=\bar{f}_{t}^{\prime}(\tau)\mathrm{d}t}$, which corresponds to $\smash{\bar{f}_{t}(\tau)}$ following the implied-forward evolution:

 $\bar{f}_{t}(\tau)=\bar{f}_{0}(t+\tau)$ (7)

Then (6) becomes:

 $B(\tau)\mathrm{d}X_{t}=\bigg{(}B^{\prime}(\tau)X_{t}+B(\tau)\bm{\varSigma}_{t}% \bm{\varSigma}_{t}^{\mathrm{T}}\int_{0}^{\tau}B(u)^{\mathrm{T}}\mathrm{d}u% \bigg{)}\mathrm{d}t+B(\tau)\bm{\varSigma}_{t}\mathrm{d}W_{t}$ (8)

Therefore:

 $f_{t}(\tau):=B(\tau)X_{t}+\bar{f}_{0}(t+\tau)$ (9)

defines an arbitrage-free forward rate curve evolution if $X_{t}$ satisfies (8).

The no-arbitrage condition (8) is nontrivial. In particular, it imposes an admissibility constraint on the basis $B(\tau)$ discussed in the next section. In the section following the next one, we show how to extract the dynamics of the factor vector $X_{t}$ from (8). To incorporate the HJM convexity term, we enlarge both the factor vector $X_{t}$ and the basis $B(\tau)$ while preserving the factor structure (9) of the solution.

## Class of admissible bases $B(\tau)$

For the factor approach outlined in the previous section to have practical value, it must work for a broad range of volatility processes without requiring a change to the functional basis. In particular, the approach should work in the implied-forward case of zero volatility, which is the common limit case for all volatility processes. We can arrive at the zero-volatility case requirement more formally by requiring that, for a given $B(\tau)$, the approach should work for all constant volatility matrixes, or alternatively, that scaling the volatility term down by a scalar $\alpha\in[0,1)$ should not negatively affect the solvability of (8).

Setting the volatility term to zero in (8), we get:

 $B(\tau)\mathrm{d}X_{t}=B^{\prime}(\tau)X_{t}\mathrm{d}t$

For this equation to admit a solution for any initial vector $X_{0}$, all entries of vector $B^{\prime}(\tau)$ should be spanned by the basis $B(\tau)$, which implies that there exists a square matrix $\bm{D}$ such that:

 $B^{\prime}(\tau)=B(\tau)\bm{D}$ (10)

Therefore, the basis $B(\tau)$ should satisfy:

 $B(\tau)=B_{0}\mathrm{e}^{\tau\bm{D}}$ (11)

where $B_{0}=B(0)$ is a constant row vector.

Note that (11) implies that all elements of basis $B(\tau)$ can be written as linear combinations of basic functions of the form $\tau^{i}\mathrm{e}^{\lambda_{j}\tau}$, where $\lambda_{j}$ are eigenvalues of the matrix $\bm{D}$.111 Note that the polynomial factors $\tau^{i}$ appear only in the case of the non-diagonalizable matrix $\bm{D}$. Following Björk & Christensen (1999), we call the functions that can be represented in this form the exponential polynomial (EP) family. Similar to polynomial and exponential functions, the EP family is closed with respect to differentiation, integration, multiplication and addition.

Following the notation we used with the Nelson-Siegel and Svensson bases, we will write the basic components of each EP function with a negative sign in front of the lambda parameters: $\tau^{i}\mathrm{e}^{-\lambda_{j}\tau}$.

Both the Nelson-Siegel and Svensson bases are composed of EP functions, but while the Nelson-Siegel basis admits representation (11), the Svensson basis does not, since the derivative of its last basis function, $\tau\mathrm{e}^{-\lambda_{2}\tau}$, is not spanned by the basis. Thus, condition (11) is stronger than just being composed of EP functions. We will call any basis $B(\tau)$ admitting the representation (11) a complete EP basis and call the corresponding matrix $\bm{D}$ the generating matrix of basis $B(\tau)$.

We next derive a standardized form for the complete EP bases. Let $\bm{D}$ be the generating matrix of a complete EP basis $B(\tau)$, and let $-\lambda_{1},\dots,-\lambda_{L}$ and $\pi_{1},\dots,\pi_{L}$ be, respectively, its (distinct) eigenvalues and their (algebraic) multiplicities.222 For $B(\tau)$ given by (11) to be a proper functional basis (ie, consisting of linearly independent functions), the geometric multiplicities should equal 1. Define $\varLambda=\{\lambda_{1},\dots,\lambda_{L}\}$ and $\varPi=\{\pi_{1},\dots,\pi_{L}\}$, which we call the spectral and the multiplicity sets of basis $B(\tau)$, respectively. By replacing the generating matrix $\bm{D}$ in representation (11) with its Jordan normal form, we get the following equivalent standardized form for basis $B(\tau)$ (see Lyashenko & Goncharov (2021) for details):

 $\mathrm{EP}_{\varLambda,\varPi}(\tau):=(\mathrm{EP}_{\lambda_{1},\pi_{1}}(\tau% ),\dots,\mathrm{EP}_{\lambda_{L},\pi_{L}}(\tau))$

where for any real number $\lambda$ and positive integer $n$:

 $\mathrm{EP}_{\lambda,n}(\tau):=\bigg{(}\mathrm{e}^{-\lambda\tau},\frac{\tau}{1% !}\mathrm{e}^{-\lambda\tau},\dots,\frac{\tau^{n-1}}{(n-1)!}\mathrm{e}^{-% \lambda\tau}\bigg{)}$

Thus, any complete EP basis is uniquely determined by its spectral and multiplicity sets. The Nelson-Siegel basis is a simple example of the standardized complete EP basis with $\varLambda=\{0,\lambda\}$ and $\varPi=\{1,2\}$.

For the forward rate curve to have intuitive features such as relaxation at longer tenors, we will limit our consideration to the case of non-negative real-valued parameters $\lambda_{j}$. Therefore, without loss of generality, we can assume the lambda values are ordered as follows: $0\leq\lambda_{1}<\cdots<\lambda_{L}$.

By limiting our consideration to spectral sets $\varLambda$ composed of non-negative values $\lambda_{i}$, we prevent forward rates from growing exponentially with tenor $\tau$. There are two distinct cases in terms of whether the first spectral value $\lambda_{1}$ is positive or zero. In the $\lambda_{1}>0$ case, the generating matrix $\bm{D}$ is non-singular and the finite-dimensional space spanned by $B(\tau)$ is invariant with respect to both differentiation and integration. The $\lambda_{1}=0$ case, which is the case of the Nelson-Siegel basis, requires special care since the generating matrix $\bm{D}$ is singular. In this case, we require $\pi_{1}=1$ to prevent the forward rates from growing unboundedly with tenor $\tau$. The presence of a constant as the first basis function makes this case degenerate, in the sense that differentiation reduces the dimension of the space spanned by $B(\tau)$, and integration makes this space incomplete, in the sense that it is no longer spanned by a complete EP basis.

## Factor HJM modelling framework

Let the forward rate basis $B(\tau)=(b_{1}(\tau),\dots,b_{K}(\tau))$ be a complete EP basis with a generating matrix $\bm{D}$ and the spectral set $\varLambda=\{\lambda_{1},\dots,\lambda_{L}\}$ composed of non-negative values. Using (10), we can rewrite the no-arbitrage equation (8) as follows:

 $\displaystyle B(\tau)\mathrm{d}X_{t}$ $\displaystyle=B(\tau)\bm{D}X_{t}\mathrm{d}t+B(\tau)\bm{\varSigma}_{t}\mathrm{d% }W_{t}$ $\displaystyle\qquad\qquad\qquad{}+B(\tau)\bm{\varSigma}_{t}\bm{\varSigma}_{t}^% {\mathrm{T}}\bigg{(}\int_{0}^{\tau}B(u)\mathrm{d}u\bigg{)}^{\!\mathrm{T}}% \mathrm{d}t$ (12)

Note that the first three terms in the above equation are spanned by the basis $B(\tau)$ while the HJM convexity term $B(\tau)\bm{\varSigma}_{t}\bm{\varSigma}_{t}^{\mathrm{T}}\smash{(\int_{0}^{\tau% }B(u)\mathrm{d}u)^{\mathrm{T}}}$ is spanned by functions of the form $\smash{b_{i}(\tau)\int_{0}^{\tau}b_{j}(u)\mathrm{d}u}$ that also belong to the EP family but generally lie outside the space spanned by $B(\tau)$.

Let $\smash{\tilde{B}(\tau)}$ be the smallest standardized complete EP basis that spans functions $\smash{b_{i}(\tau)\int_{0}^{\tau}b_{j}(u)\mathrm{d}u}$, and let $\smash{\tilde{\bm{D}}}$ be its generating matrix. It is easy to check that $\smash{\tilde{B}(\tau)}$ is an extension of the basis $B(\tau)$ and that its spectrum set is given by $\smash{\tilde{\varLambda}}=\varLambda\cup\{\lambda_{i}+\lambda_{j}\mid i,j=1,% \dots,L\}$. For instance, in the case of the Nelson-Siegel basis we have:

 $\displaystyle B(\tau)$ $\displaystyle=(1;\mathrm{e}^{-\lambda\tau},\tau\mathrm{e}^{-\lambda\tau})$ $\displaystyle\tilde{B}(\tau)$ $\displaystyle=(1,\tau;\mathrm{e}^{-\lambda\tau},\tau\mathrm{e}^{-\lambda\tau},% \tfrac{1}{2}\tau^{2}\mathrm{e}^{-\lambda\tau};\mathrm{e}^{-2\lambda\tau},\tau% \mathrm{e}^{-2\lambda\tau},\tfrac{1}{2}\tau^{2}\mathrm{e}^{-2\lambda\tau})$

Define a column vector process $\varOmega_{t}$ using the equation:

 $\displaystyle\tilde{B}(\tau)\varOmega_{t}$ $\displaystyle=B(\tau)\bm{\varSigma}_{t}\bm{\varSigma}_{t}^{\mathrm{T}}\bigg{(}% \int_{0}^{\tau}B(u)\mathrm{d}u\bigg{)}^{\!\mathrm{T}}$ $\displaystyle=\sum_{i,j}(\bm{\varSigma}_{t}\bm{\varSigma}_{t}^{\mathrm{T}})_{i% ,j}b_{i}(\tau)\int_{0}^{\tau}b_{j}(u)\mathrm{d}u$ (13)

Solving for $\varOmega_{t}$ is straightforward since the right-hand side of the above equation is spanned by $\smash{\tilde{B}(\tau)}$, and thus all entries of vector $\varOmega_{t}$ are just fixed linear combinations of entries of matrix $\smash{\bm{\varSigma}_{t}\bm{\varSigma}_{t}^{\mathrm{T}}}$, so we can compute $\varOmega_{t}$ by a simple rearrangement of terms.

Using (13), we can rewrite (12) as follows:

 $B(\tau)\mathrm{d}X_{t}=B(\tau)\bm{D}X_{t}\mathrm{d}t+B(\tau)\bm{\varSigma}_{t}% \mathrm{d}W_{t}+\tilde{B}(\tau)\varOmega_{t}\mathrm{d}t$ (14)

We clearly have a space mismatch, where the first three terms belong to the finite-dimensional space spanned by $B(\tau)$ while the last term belongs to the space spanned by a larger basis $\smash{\tilde{B}(\tau)}$.

To resolve this seemingly unsolvable situation, we borrow the key idea from the seminal paper of Cheyette (1992): we add to the forward curve decomposition (9) a locally deterministic term that incorporates the convexity term $\smash{\tilde{B}(\tau)\varOmega_{t}\mathrm{d}t}$ in (14) without adding a new convexity term. More specifically, we enlarge the forward curve decomposition (9) by adding a locally deterministic term spanned by $\smash{\tilde{B}(\tau)}$ as follows:

 $f_{t}(\tau):=B(\tau)X_{t}+\tilde{B}(\tau)Y_{t}+\bar{f}_{0}(t+\tau)$ (15)

where $Y_{t}$ is an auxiliary factor vector that follows a locally deterministic process.333 Unlike in Cheyette (1992), where $K(K+3)/2$ auxiliary factors are added to incorporate the convexity terms $\smash{b_{i}(\tau)\int_{0}^{\tau}b_{j}(u)\mathrm{d}u}$, the size of our auxiliary factor $Y_{t}$ is the smallest possible. In the section entitled ‘Selecting basis $B(\tau)$’, we show that the size of $Y_{t}$ can be made as small as $3K-1$ by selecting the main basis $B(\tau)$ in an optimal way. Plugging this extended decomposition into the Musiela HJM equation (3), we get:

 $\displaystyle B(\tau)\mathrm{d}X_{t}+\tilde{B}(\tau)\mathrm{d}Y_{t}$ $\displaystyle\qquad{}=(B(\tau)\bm{D}X_{t}+\tilde{B}(\tau)\tilde{\bm{D}}Y_{t}+% \tilde{B}(\tau)\varOmega_{t})\mathrm{d}t+B(\tau)\bm{\varSigma}_{t}\mathrm{d}W_% {t}$ (16)

Grouping together terms with the same basis, we can write this equation as follows:

 $B(\tau)(\mathrm{d}X_{t}-\bm{D}X_{t}\mathrm{d}t-\bm{\varSigma}_{t}\mathrm{d}W_{% t})+\tilde{B}(\tau)(\mathrm{d}Y_{t}-\tilde{\bm{D}}Y_{t}\mathrm{d}t-\varOmega_{% t}\mathrm{d}t)=0$ (17)

We satisfy this equation by setting the terms in parentheses to zero, which leads to our main result.444 Note that decomposition (17) is not unique, because the main basis $B(\tau)$ is part of the extended basis $\smash{\tilde{B}(\tau)}$. For instance, we could incorporate all of the terms into the second term or, alternatively, incorporate all of the terms spanned by $B(\tau)$ into the first term. Using a different decomposition would generally lead to different equations for factors $X_{t}$ and $Y_{t}$. However, the forward rate curve evolution given by (15) will be the same as long as the Musiela HJM equation (3), with a volatility process of the form (5), admits a unique solution.

###### Proposition 1 (FHJM approach).

Let the factor vectors $X_{t}$ and $Y_{t}$ satisfy:

 $\displaystyle\mathrm{d}X_{t}$ $\displaystyle=\bm{D}X_{t}\mathrm{d}t+\bm{\varSigma}_{t}\mathrm{d}W_{t}$ (18) $\displaystyle\mathrm{d}Y_{t}$ $\displaystyle=(\tilde{\bm{D}}Y_{t}+\varOmega_{t})\mathrm{d}t$ (19)

where $\varOmega_{t}$ is defined by (13). Then, for any differentiable function $\smash{\bar{f}_{0}(\tau)}$, (15) defines a no-arbitrage evolution of the forward rate curve.

###### Remark 1.

Equation (15) represents a Cheyette-style decomposition of the forward rate curve into an implied-forward part $\smash{\bar{f}_{0}(t+\tau)}$, the main stochastic term $B(\tau)X_{t}$ and an auxiliary locally deterministic term $\smash{\tilde{B}(\tau)Y_{t}}$, regarded as a convexity term needed to keep the dynamics arbitrage-free. Since $\smash{\tilde{B}(\tau)}$ is an extension of the basis $B(\tau)$, we can incorporate $X_{t}$ into $Y_{t}$ to obtain the solution (15) written in the standard factor form (9), where $B(\tau)$ is replaced with $\smash{\tilde{B}(\tau)}$. Therefore, the extended basis $\smash{\tilde{B}(\tau)}$ can be regarded as the solution basis, with the original basis $B(\tau)$ treated as the shock basis.

###### Remark 2.

Setting the initial values for both the main and auxiliary factors ($X_{0}$ and $Y_{0}$, respectively) to zero leads to the remainder-term $\smash{\bar{f}_{0}}$ being equal to the initial forward curve: $\smash{\bar{f}_{0}(\tau)=f_{0}(\tau)}$. This is consistent with the original Cheyette approach, where factors $X_{t}$ and $Y_{t}$ describe the deviation of the curve from the implied-forward evolution, and that is how Cheyette-type models are typically set. Using non-zero initial factor values $X_{0}$ and $Y_{0}$, which are determined by projecting the initial curve to the main and auxiliary bases, can be beneficial in the local volatility case with the process $\bm{\varSigma}_{t}$ modelled as a function of the factor values.

###### Remark 3.

Both factors $X_{t}$ and $Y_{t}$ mean revert if the spectral set $\varLambda=\{\lambda_{1},\dots,\lambda_{L}\}$ of the main EP basis $B(\tau)$ consists of strictly positive values. Indeed, in this case the spectral set $\smash{\tilde{\varLambda}}=\varLambda\cup\{\lambda_{i}+\lambda_{j}\mid i,j=1,% \dots,L\}$ of the auxiliary EP basis $\smash{\tilde{B}(\tau)}$ also consists of strictly positive values, and thus the generating matrixes $\bm{D}$ and $\smash{\tilde{\bm{D}}}$ have negative eigenvalues.

###### Remark 4.

In the case of the deterministic volatility process $\bm{\varSigma}_{t}$, the auxiliary factor $Y_{t}$ is deterministic and we can rewrite (15) in the form:

 $f_{t}(\tau)=B(\tau)X_{t}+G(t,\tau)$ (20)

where $G(t,\tau)$ is a deterministic function that can be computed analytically from the no-arbitrage condition. Many well-known analytically tractable models have solutions of the form (20). For instance, the Ho-Lee and Hull-White models correspond to the $B(\tau)=(\mathrm{e}^{-\lambda\tau})$ case, where $\lambda$ is the mean reversion parameter (set to zero under the Ho-Lee model). The two-factor Hull-White model corresponds to the $B(\tau)=(\mathrm{e}^{-\lambda_{1}\tau},\mathrm{e}^{-\lambda_{2}\tau})$ case, where $\lambda_{1}$ and $\lambda_{2}$ are mean reversion parameters. This, in particular, provides some intuition as to why we need $\lambda_{1}\neq\lambda_{2}$ for the model to be genuinely two dimensional. Generally, any affine model with a deterministic volatility matrix has a solution of the form (20), where $B(\tau)$ is a complete EP basis.

###### Remark 5.

Even though we formulate the FHJM modelling framework using instantaneous forward rates, we can easily derive the corresponding equations for any other continuous rates. For instance, consider a grid of maturities $0 and define a set of spot rates:

 $R_{t}^{m}=\frac{1}{T_{m}}\int_{0}^{T_{m}}f_{t}(\tau)\mathrm{d}\tau,\quad m=1,% \dots,M$

From the forward rate representation (15), we find that the spot rate column vector $R_{t}=(R_{t}^{1},\dots,R_{t}^{M})^{\mathrm{T}}$ is given by:

 $R_{t}=\bm{B}_{R}X_{t}+\tilde{\bm{B}}_{R}Y_{t}+\bar{R}(t)$

where $\bm{B}_{R}$ and $\smash{\tilde{\bm{B}}_{R}}$ are matrixes of full rank, as shown in Lyashenko & Goncharov (2021), and $\smash{\bar{R}(t)}$ is a deterministic vector function.

We can use the property that $\mathrm{d}R_{t}=O(\mathrm{d}t)+\bm{B}_{R}\bm{\varSigma}_{t}\mathrm{d}W_{t}$ to parameterize $\bm{\varSigma}_{t}$ and to estimate its correlation structure from historical data.

###### Remark 6.

The FHJM modelling approach described above is closely related to the quasi-Gaussian (Cheyette) modelling framework proposed in Cheyette (1992) and developed further by Andersen and Andreasen (see Andersen & Piterbarg (2010, chapter 13) for details). In fact, our approach formally belongs to the multi-factor quasi-Gaussian model family, and thus the analytic machinery developed and discussed in the above references can be directly applied. This becomes especially useful when calibrating the model using approximate pricing formulas and when dealing with the stochastic volatility case.

## Nelson-Siegel and Svensson bases

In this section we consider the case of the Nelson-Siegel basis (Nelson & Siegel 1987) and its Svensson (1995) extension. Because of their popularity in the econometric modelling of yield curves, the Nelson-Siegel and Svensson bases have also been studied in terms of their ability to sustain an arbitrage-free dynamics of a forward curve. For instance, Filipović (1999) showed that, for the Nelson-Siegel basis:

 $B_{\mathrm{NS}}(\tau)=(1,\mathrm{e}^{-\lambda\tau},\tau\mathrm{e}^{-\lambda% \tau})$

the forward curve evolution given by $f_{t}(\tau)=B_{\mathrm{NS}}(\tau)X_{t}$ is arbitrage-free only if $X_{t}$ follows a deterministic process. This is fully consistent with (12), which shows that in the stochastic case the basis should be enlarged to include the non-spanned part of the HJM convexity term. In the non-degenerate case of $\lambda>0$, the Nelson-Siegel basis can completely incorporate the HJM convexity term only if all entries of the volatility matrix $\bm{\varSigma}_{t}$ are zero.

In the degenerate case of $\lambda=0$, where $B_{\mathrm{NS}}(\tau)=(1,\tau)$, we can make the forward curve evolution given by $f_{t}(\tau)=B_{\mathrm{NS}}(\tau)X_{t}$ arbitrage-free if we make only the first factor stochastic by setting all entries of $\bm{\varSigma}_{t}$ except for $\sigma_{1,1}$ to zero and incorporating the resulting HJM convexity term $\sigma_{1,1}^{2}\tau$ with $b_{2}(\tau)=\tau$. This approach is equivalent to the Ho-Lee model, as noted in Björk & Christensen (1999).

Christensen et al (2011) showed that the forward rate process of the form $f_{t}(\tau)=B_{\mathrm{NS}}(\tau)X_{t}+G(t,\tau)$ with a deterministic $G(t,\tau)$ can be made arbitrage-free if the factor vector $X_{t}$ follows (18), where the mean reversion matrix $\bm{D}$ has the form:

 $\bm{D}=\begin{pmatrix}0&0&0\\ 0&-\lambda&1\\ 0&0&-\lambda\end{pmatrix}$

Christensen et al (2011) did not derive this form of matrix $\bm{D}$ but rather postulated it from the very beginning and proved that the resulting dynamics is arbitrage-free under an appropriate selection of locally deterministic term $G(t,\tau)$. We immediately see that $\bm{D}$ is the generating matrix for basis $B_{\mathrm{NS}}(\tau)$, which explains why $\bm{D}$ has this peculiar form.

The Svensson (1995) extension of the Nelson-Siegel basis:

 $B_{\mathrm{S}}(\tau)=(1,\mathrm{e}^{-\lambda_{1}\tau},\tau\mathrm{e}^{-\lambda% _{1}\tau},\tau\mathrm{e}^{-\lambda_{2}\tau})$

is incomplete, and thus it cannot sustain a non-trivial arbitrage-free evolution. Christensen et al (2009) considered the following complete extension of the Svensson basis, which they called the generalized Nelson-Siegel (GNS):

 $B_{\mathrm{GNS}}(\tau)=(1,\mathrm{e}^{-\lambda_{1}\tau},\tau\mathrm{e}^{-% \lambda_{1}\tau},\mathrm{e}^{-\lambda_{2}\tau},\tau\mathrm{e}^{-\lambda_{2}% \tau})$

They showed that the forward rate process of the form:

 $f_{t}(\tau)=B_{\mathrm{GNS}}(\tau)X_{t}+G(t,\tau)$

with a deterministic $G(t,\tau)$ could be made arbitrage-free if the factor vector $X_{t}$ followed (18) with the mean reversion matrix $\bm{D}$ of a specific form that happened to be identical to the generating matrix of basis $B_{\mathrm{GNS}}(\tau)$. Similar to their Nelson-Siegel result, they postulated this form of the mean reversion matrix $\bm{D}$ from the very beginning rather than deriving it.

Note that since both $B_{\mathrm{NS}}(\tau)$ and $B_{\mathrm{GNS}}(\tau)$ have a constant as their first basis function, $b_{1}(\tau)=1$, their auxiliary bases $\smash{\tilde{B}_{\mathrm{NS}}(\tau)}$ and $\smash{\tilde{B}_{\mathrm{GNS}}(\tau)}$ have the second basis function $\smash{\tilde{b}_{2}(\tau)=\tau}$, which grows indefinitely with the tenor $\tau$. This results in a rate distribution where the mean and variance can grow indefinitely, which is attributable to the absence of mean reversion for the first factor.

## Selecting basis $B(\tau)$

The FHJM modelling framework is fully specified by selecting the basis $B(\tau)$ and specifying the volatility process $\bm{\varSigma}_{t}$. While specification and calibration of the volatility process are beyond the scope of this paper, in this section we discuss possible approaches for selecting the basis $B(\tau)$, which is the first step in specifying the model. Since $B(\tau)$ spans the space of shocks to the instantaneous forward curve $f_{t}(\tau)$, a natural approach is to select $B(\tau)$ empirically by fitting it to the historical yield curve shock data.

Since $B(\tau)$ is a complete EP basis, it is fully specified by selecting its spectral values $\lambda_{1},\dots,\lambda_{L}$ and their multiplicities $\pi_{1},\dots,\pi_{L}$. Fitting a parametric family of functions to empirical shock data is sometimes referred to as the synthetic, or functional, principal component approach.

In the pricing model literature, the spectral parameters $\lambda_{i}$ are usually treated as mean reversion speed parameters, to be calibrated to the liquid market option data while setting their multiplicities to 1, which amounts to using a pure exponential (PE) basis consisting of functions $\mathrm{e}^{-\lambda_{i}\tau}$. Figure 1: Covariance matrix Frobenious norm fit error for US dollar Libor/swap curve data between July 7, 2000 and November 8, 2021 using the basis Bλ1,λ2⁢(τ). Horizontal coordinates are λ1 (left axis) and λ2-λ1 (right axis). The green and red curves correspond to the multiplicity cases λ1=0 and λ2=λ1, respectively. Figure 2: Covariance matrix Frobenious norm fit errors for US dollar Libor/swap curve data between July 7, 2000 and November 8, 2021 across the number of basis functions K=2,…,10, with the spectral values selected using global (red) and uniform (green) approaches.

Note that since:

 $\tau^{n}\mathrm{e}^{-\lambda\tau}=-\frac{\partial}{\partial\lambda}(\tau^{n-1}% \mathrm{e}^{-\lambda\tau})$

we can always approximate the space spanned by a single-lambda EP basis:

 $\mathrm{EP}_{\lambda,n}(\tau):=\bigg{(}\mathrm{e}^{-\lambda\tau},\frac{\tau}{1% !}\mathrm{e}^{-\lambda\tau},\dots,\frac{\tau^{n-1}}{(n-1)!}\mathrm{e}^{-% \lambda\tau}\bigg{)}$

with the space spanned by a PE basis of the same size:

 $\mathrm{PE}_{\lambda_{1},\dots,\lambda_{n}}(\tau):=(\mathrm{e}^{-\lambda_{1}% \tau},\dots,\mathrm{e}^{-\lambda_{n}\tau})$

where $\lambda_{1},\dots,\lambda_{n}$ are distinct values close to $\lambda$. In other words, the space spanned by the single-lambda EP basis $\mathrm{EP}_{\lambda,n}(\tau)$ is a limit case of the space spanned by the PE basis $\mathrm{PE}_{\lambda_{1},\dots,\lambda_{n}}(\tau)$ when $\lambda_{1},\dots,\lambda_{n}$ converge to $\lambda$. We illustrate this point in figure 1, which shows the covariance matrix fitting error surface for the US dollar London Interbank Offered Rate (Libor)/swap curve between July 7, 2000 and November 8, 2021 when using the following three-dimensional complete EP basis parameterized by two non-negative parameters $0\leq\lambda_{1}\leq\lambda_{2}$:

 $B_{\lambda_{1},\lambda_{2}}(\tau)=\begin{cases}(1,\tau,\tau^{2}),&0=\lambda_{1% }=\lambda_{2}\\ (1,\tau,\mathrm{e}^{-\lambda_{2}\tau}),&0=\lambda_{1}<\lambda_{2}\\ (1,\mathrm{e}^{-\lambda_{1}\tau},\tau\mathrm{e}^{-\lambda_{1}\tau}),&0<\lambda% _{1}=\lambda_{2}\\ (1,\mathrm{e}^{-\lambda_{1}\tau},\mathrm{e}^{-\lambda_{2}\tau}),&0<\lambda_{1}% <\lambda_{2}\end{cases}$

The error surface is smooth everywhere, including the multiplicity points of $0=\lambda_{1}<\lambda_{2}$ (green curve), $0<\lambda_{1}=\lambda_{2}$ (red curve) and $0=\lambda_{1}=\lambda_{2}$ (intersection of green and red curves).

This observation allows us to approach the task of selecting the EP basis $B(\tau)$ of size $K$ as an optimization problem of selecting $K$ non-negative, but not necessarily distinct, values $0\leq\lambda_{1}\leq\cdots\leq\lambda_{K}$, with the understanding that each value $\lambda_{i}$ repeated $\pi_{i}$ times is represented by the $\smash{\mathrm{EP}_{\lambda_{i},\pi_{i}}(\tau)}$ sub-basis in the main basis $B(\tau)$.

In our experience, the specific values of parameters $\lambda_{1},\dots,\lambda_{K}$ are not that important as long as their number, $K$, is not too small and they cover a wide enough value range.555 This assertion is consistent with the recommendations regarding the selection of mean reversion values found in Andreasen (2005) and Andersen & Piterbarg (2010). This means that we can stick to the PE case of distinct spectral parameter values $\lambda_{i}$ because of its simplicity and computational convenience.

The simplest way to specify the spectral values $\lambda_{1}<\cdots<\lambda_{K}$ is to spread them uniformly using the following simple formula:

 $\lambda_{i}^{\mathrm{Uniform}}=\alpha+(i-1)\beta,\quad i=1,\dots,K$

where $0<\alpha<\beta$. One advantage of using the uniform spectrum $\varLambda=\{\alpha+(i-1)\beta\mid i=1,\dots,K\}$ is that it results in a simple extended spectrum $\smash{\tilde{\varLambda}}=\varLambda\cup\{\lambda_{i}+\lambda_{j}\mid i,j=1,% \dots,K\}$. Since $0<\alpha<\beta$, the extended basis $\smash{\tilde{B}(\tau)}$ is also PE, with the extended spectrum given by:

 $\tilde{\varLambda}=\{\alpha+(i-1)\beta\mid i=1,\dots,K\}\cup\{2\alpha+(k-1)% \beta\mid k=1,\dots,2K-1\}$

The overall dimensionality of the solution, given by the dimensionality of $\smash{\tilde{B}(\tau)}$, is $3K-1$. This means that in the case of a uniform spectrum the solution’s dimensionality grows linearly with the dimensionality of the main basis $B(\tau)$, while in the general case it grows quadratically with $K$.

To compare the performance of different EP bases in fitting historical data, we computed covariance matrix Frobenious norm fit errors for US dollar Libor/swap curve data between July 7, 2000 and November 8, 2021 across dimension values $K=2,\dots,10$. For each $K$, we selected the spectral values $\lambda_{1}\leq\cdots\leq\lambda_{K}$ using two different approaches:

• the global approach, where we selected the spectral values through a multi-dimensional search to minimize the fit error; and

• the uniform approach, where we selected the spectral values using the uniform formula, with $\alpha=0.0001$ and $\beta$ determined through a one-dimensional search to minimize the fit error.

Figure 2 shows the error curves for each approach across the number of basis functions. For $K\geq 5$ there is practically no difference in the fitting error for the two approaches. This suggests that we can use the most convenient approach as long as the number of basis functions is not too small.

## Conclusions

In this paper, we proposed a new risk-neutral yield curve modelling framework, the FHJM, which features the intuitiveness, transparency and efficiency of the factor modeling approach on which it is based.

Andrei Lyashenko is the head of market pricing and risk models and Yevgeny Goncharov is a leading research analyst. They both work at Quantitative Risk Management, Inc. (QRM) in Chicago. The authors thank Yutian Nie, Vladimir Piterbarg and the anonymous referees for valuable feedback.

Email: andrei.lyashenko@qrm.com, yevgeny.goncharov@qrm.com.

## References

• Andersen L and V Piterbarg, 2010
Interest Rate Modeling
Atlantic Financial Press
• Andreasen J, 2005
Back to the future
Risk September, pages 104–109
• Björk T and B Christensen, 1999
Interest rate dynamics and consistent forward rate curves
Mathematical Finance 9(4), pages 323–348
• Cheyette O, 1992
Markov representation of the Heath-Jarrow-Morton model
Working Paper, BARRA
• Christensen J, F Diebold and G Rudebusch, 2009
An arbitrage-free generalized Nelson-Siegel term structure model
Econometrics Journal 12, pages C33–C64
• Christensen J, F Diebold and G Rudebusch, 2011
The affine arbitrage-free class of Nelson-Siegel term structure models
Journal of Econometrics 164, pages 4–20
• Filipović D, 1999
A note on the Nelson-Siegel family
Mathematical Finance 9(4), pages 349–359
• Heath D, R Jarrow and A Morton, 1992
Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation
Econometrica 60, pages 77–105
• Lyashenko A and Y Goncharov, 2021
Bridging P-Q modeling divide with factor HJM modeling framework
Preprint, SSRN 3995533
• Musiela M, 1993
Stochastic PDEs and term structure models
Preprint, University of New South Wales
• Nelson C and A Siegel, 1987
Parsimonious modeling of yield curves
Journal of Business 60, pages 473–489
• Svensson L, 1995
Estimating forward interest rates with the extended Nelson–Siegel method
Sveriges Riksbank Quarterly Review 3, pages 13–26

Only users who have a paid subscription or are part of a corporate subscription are able to print or copy content.

To access these options, along with all other subscription benefits, please contact info@risk.net or view our subscription options here: http://subscriptions.risk.net/subscribe

#### More on Cutting Edge

##### Getting more for less: better A / B testing via causal regularisation

A causal machine learning algorithm is used to estimate trades’ price impact

##### Analytic risk-free rates option pricing with smile and skew

An arbitrage-free short-rate model for backward-looking compounded rates is presented

An analytic approximation for the implied volatility surface of basket options is introduced

##### Podcast: Artur Sepp on rates volatility and decentralised finance

Quant says high volatility requires pricing and risk management models to be revisited

##### A robust stochastic volatility model for interest rates

A swaption pricing model based on a single-factor Cheyette model is shown to fit accurately

##### Podcast: Julien Guyon on volatility modelling and World Cup draws

​​​​​​​Academic discusses option pricing, path-dependent volatility and tackling FIFA’s statistical bias

##### Does the term structure of the at-the-money skew really follow a power law?

A power law can fit the ATM skew, but struggles with short maturities

##### Obtaining arbitrage-free FX implied volatility by variational inference

An ML-based algorithm that provides implied volatilities from bid-ask prices is proposed