# A FAVAR modeling approach to credit risk stress testing and its application to the Hong Kong banking industry

## Zhifeng Wang and Fangying Wei

#### Need to know

• An empirical credit risk stress testing model is proposed that utilizes the information of a large number of macroeconomic variables as model input without suffering the curse of dimensionality.
• The dynamic interrelationship among macroeconomic variables and credit risk loss measures is studied without exogeneity assumptions.
• A multiperiod projection of credit risk loss in response to macroeconomic shocks is constructed using the impulse response function.
• The proposed model is applied to explore the credit risk loss of the Hong Kong banking industry over periods of historical financial crisis.

#### Abstract

In October 2018, the Basel Committee on Banking Supervision (BCBS) published its stress testing principles. One of these principles is about stress testing model validation, aided by business interpretation, benchmark comparison and backtesting. In this paper, a credit risk stress testing model based on the factor-augmented vector autoregressive (FAVAR) approach is proposed to project credit risk loss under stressed scenarios. Inherited from both factor analysis (FA) and the vector autoregressive (VAR) model, the FAVAR approach ensures that the proposed model has many appealing features. First, a large number of model input variables can be reduced to a handful of latent common factors to avoid the curse of dimensionality. Second, the dynamic interrelationship among macroeconomic variables and credit risk loss measures can be studied without exogeneity assumptions. Moreover, the application of the impulse response function facilitates the multiperiod projection of credit risk loss in response to macroeconomic shocks. All of these features make the proposed modeling framework a potentially handy solution to fulfilling the BCBS requirement of quantitative adequacy assessment of banks’ internal stress testing results with a benchmark model. The scope of its application can also extend to impairment modeling for International Financial Reporting Standard 9, which requires the projection of credit risk losses over consecutive periods under different macroeconomic scenarios.

## 1 Introduction

Stress testing is a common practice in the banking industry; it is used to evaluate the potential vulnerability of individual financial institutions or the banking industry as a whole under extreme but plausible macroeconomic scenarios. It plays a critical role in risk management, providing banks and regulatory authorities with a quantitative assessment of the unexpected adverse outcomes that other risk management tools fail to foresee. There are numerous factors affecting the quality of stress testing results, including reliability and granularity of input data, design of macroeconomic scenarios, mapping of scenarios to relevant risk factors, and integration of risks of various types. There is no single best industry practice to address all of these factors when performing stress testing.

Having seen the rapid evolution of stress testing in recent years, especially following the severe distresses of the global financial crisis (GFC) in 2008, the Basel Committee on Banking Supervision (BCBS) published its stress testing principles in October 2018. Their purpose was to guide banks and authorities on the core elements of stress testing frameworks (Basel Committee on Banking Supervision 2018). One of these principles is about stress testing model validation based on business interpretation, benchmark comparison and backtesting. It states that: “Stress testing models, results and frameworks should be subject to challenge and regular review. It is expected that this review would include an assessment of the overall adequacy of the exercise, eg, backtesting or other benchmark comparison.”

Backtesting, a generic form of model validation, usually compares actual historical results with model predictions for the same portfolio in the same scenarios. Backtesting stress testing models in the usual way may not work because hypothetical macroeconomic scenarios, by definition, have never occurred in the past. One idea is to replace these hypothetical scenarios with actual historical financial crisis scenarios of a similar magnitude and compare the model output with the actual historical losses. This idea may not work for two reasons. First, the actual historical loss data during the period of financial crisis may not be available to the bank. Second, the actual historical loss data may be irrelevant if the risk profile of the bank has changed over time. Another idea is to derive the stress testing model output under the prevailing market situation and to compare the model output with the current actual losses. There are several studies in the literature on this topic. Gersl and Seidler (2012), for instance, proposed a backtesting scheme based on actual ex post values of macroeconomic variables and suggested that a conservative buffer be imposed on stress testing results. Camara et al (2017) suggested estimating sensitivities of stress testing models with respect to macroeconomic shocks, explaining that these sensitivities can be applied to predict losses under actual economic situations; the predicted losses are then compared with the actual losses. These studies competently addressed the concerns regarding backtesting stress testing models. Nevertheless, questions remain as to whether a conservative buffer in a prevailing normal market situation is also conservative under extremely stressed market conditions, and whether sensitivities remain the same under normal and stressed conditions.

Considering the challenges of backtesting stress testing models, benchmark comparison can be an alternative method of stress testing model validation. There are various studies in the literature on stress testing methodologies and modeling approaches. Most of these studies use a relatively small number of macroeconomic variables, which may fail to give a full picture of the economic situation. However, including a large number of variables may induce the curse of dimensionality, especially when the number of samples is limited. To overcome this challenge, various dimension-reduction techniques have been developed, including principal component analysis (PCA), factor analysis (FA) and linear discriminant analysis (LDA). In particular, FA can reduce a large number of correlated observed variables to a small set of theoretical latent common factors, thereby preserving most of the information while avoiding the curse of dimensionality. Further, the vector autoregressive (VAR) model has been widely used in econometric research to study the dynamic interrelationship of variables without exogeneity assumptions. Taking advantage of both FA and VAR, Bernanke et al (2005) proposed the factor-augmented vector autoregressive (FAVAR) approach to study the effect of monetary policy innovations on the economy. The application of the FAVAR approach has been addressed widely in the literature, where the impact of an observable variable (usually the short-term monetary interest rate) on other latent factors (usually the macroeconomic variables) is analyzed.

In this paper, the FAVAR approach is applied to a credit risk stress testing model in order to project credit risk loss under stressed scenarios. The credit risk loss in this study is measured using nonperforming loan ratios. (A nonperforming loan ratio (hereafter NPL) is defined as the percentage of classified loans out of the total balance of loans. In the five-tier loan classification system, classified loans come from the substandard, doubtful and loss categories.) Unlike the existing applications of the FAVAR approach in the literature, this paper studies the impact of latent factors (macroeconomic variables) on the observable variable (NPL). Moreover, an impulse response function (IRF) is used to facilitate the multiperiod projection of NPL in response to macroeconomic shocks. The appealing features of FAVAR and IRF make the proposed modeling framework a potentially handy solution to fulfilling the BCBS requirement of quantitative adequacy assessment of banks’ internal stress testing results through a benchmark comparison.

The structure of this paper is as follows. Section 2 reviews stress testing methodologies and modeling approaches in the literature. Section 3 discusses the proposed FAVAR model in terms of fitting and diagnostics. Section 4 performs out-of-sample validation. Section 5 concludes.

## 2 Literature review

A thorough overview of stress testing methodologies and applications has already been compiled by Quagliariello (2009). Foglia (2009) reviewed the quantitative macro stress testing methods developed by authorities, revealing that macro stress testing is a three-step process:

• first, generating consistent macroeconomic stress scenarios;

• second, mapping the macroeconomic scenarios to risk factors; and

• third, calculating losses and assessing the results.

Cihak (2007) differentiated two approaches to credit risk stress testing: one based on portfolio-level loan performance measures, such as NPL, loan loss provision and historical default rate; the other based on micro-level data related to the default risk of the household and corporate sectors. Foglia (2009) observed that credit risk stress testing models usually include two to five macroeconomic factors as explanatory variables and noticed that the IRF is used to measure the impact of various macroeconomic variables on the default rate.

Gross and Poblacion (2019) grouped the models that map macroeconomic variables to credit risk measures into three categories:

• first, a single equation model with macro and micro measures as exogenous predictors;

• second, an endogenous VAR-type model for macroeconomic variables only, attached to a separate model for credit risk measures without considering feedback from credit risk to macroeconomic dynamics; and

• third, a VAR-type model with credit risk measures included, thus allowing for two-way feedback between risk measures and macroeconomic variables.

Camara et al (2017) proposed that the portfolio loss rate be contemporaneously regressed on three types of macroeconomic variables: gross domestic product (GDP), inflation and unemployment rate. Wong et al (2008) suggested applying the seemingly unrelated regression (SUR) method to the portfolio loss rate with respect to three types of macroeconomic and market variables: GDP, interest rate and property price. They also conducted a Monte Carlo simulation to build up the portfolio loss rate distribution. Fong and Wong (2008) applied a mixture vector autoregressive (MVAR) model to the same variables. The MVAR model is a mixture of two separate VAR models representing normal and abnormal market conditions.

In order to address the model uncertainty of a single equation model, Gross and Poblacion (2019) proposed the Bayesian model averaging (BMA) method. Papadopoulos (2017) applied the model combination method to cope with structural breaks and forecasters’ bias. Melecky and Podpiera (2012) claimed that a relative interpretation of stress testing results is preferable to an absolute treatment, since model limitations make it impossible to construct highly precise scenarios or to capture the relevant risks and their interrelationships. They suggested that peer-group analysis, comparing bank-specific results with the average of the peer group, is an appropriate approach.

The literature has also observed a trend in industry practice: a move from single-shock stress scenarios to actual historical stress scenarios in order to integrate risks of various types, including credit, market, liquidity and contagion, and to assess the aggregated impact in a meaningful way. Fiori and Iannotti (2010) summarized two risk integration approaches: one is the top-down approach, which derives marginal distributions of individual risks separately, and then aggregates these through a variance–covariance or copula approach; the other is the bottom-up approach, which constructs integrated models based on common risk drivers, where the risk interactions are embedded in the models.

Foglia (2009) suggested following model goodness-of-fit and the consistency of economic theory to select explanatory variables. Kalirai and Scheicher (2002) applied linear regression to describe the relation between loan loss provision and potential explanatory factors. Beck et al (2013) studied the macroeconomic determinants of NPL across seventy-five countries based on dynamic panel data analysis, finding that several macro financial variables significantly affect NPL, including real GDP growth, share price, exchange rate and lending interest rate.

Most of the proposed stress testing models and approaches in the literature use a relatively small number of macroeconomic variables, which may fail to reflect the complete economic situation. To address this issue, a dimension-reduction technique is applied to substitute the large number of variables with a handful of estimated factors for modeling. Stock and Watson (2002) pointed out that this idea has a long tradition in econometrics, and they proposed an approximate dynamic factor model for this purpose in their paper. Bernanke et al (2005) extended the dynamic factor model to FAVAR by including observable variables together with unobservable factors in order to study the effect of monetary policy innovations on the economy. Two estimation approaches are proposed for FAVAR in their paper: a two-step estimation method, in which the factors are estimated by PCA prior to the estimation of the VAR model; and a one-step method, which applies Bayesian likelihood methods and Gibbs sampling to estimate the factors and the dynamics simultaneously. The FAVAR approach inherits appealing features from both FA and the VAR model.

The application of a dimension-reduction technique to stress testing models is occasionally seen in the literature. Boss et al (2009) proposed a regression model with factors derived from PCA as explanatory variables, although the model results are less satisfactory from the perspective of economic intuition. Fiori and Iannotti (2010) proposed a VAR framework via the FAVAR approach to analyze the interactions of credit risk and market risk in response to monetary policy shocks. A latent factor analysis was done by Jimenez and Mencia (2007), where unobservable common factors were used to capture contagion effects between sectors in credit loss distribution estimation for a banking system.

In the literature, the FAVAR approach (Bernanke et al 2005; Fiori and Iannotti 2010) is usually applied to analyze the impact of an observable variable (eg, interest rate) on other latent factors (eg, macroeconomic variables). On the contrary, this paper studies the impact of latent factors (macroeconomic variables) on an observable variable (NPL). More importantly, in the modeling framework, IRF is used to facilitate the multiperiod projection of NPL in response to macroeconomic shocks.

## 3 FAVAR modeling

### 3.1 Methodology overview

An empirical model is developed in this paper to study the dynamics of a credit risk loss measure (NPL) and a set of macroeconomic variables. We use the following eight macroeconomic variables for Hong Kong: GDP, unemployment rate, household income, consumer price index, residential property price index, retail property price index, stock index and interbank interest rate. We also use one for mainland China: GDP. The proposed modeling framework has no restriction on the number of macroeconomic variables that can be used. The Hong Kong banking industry’s gross NPL (before netting specific provisions and individual impairment allowances) is used as the industrial average credit risk loss rate to measure the asset quality of banks. The quarterly industry NPL is provided by the Hong Kong Monetary Authority (HKMA) starting from 1997 Q1, covering the Asian financial crisis (AFC) in 1997 and the 2008 GFC. The proposed modeling framework can be applied to the internal credit risk loss data of a bank as long as the data covers historical stressed events; otherwise, the bank may use industry data as a proxy, but it must apply proper adjustments that consider the sensitivity of its own credit risk loss with respect to that of the industry.

The FAVAR modeling framework consists of the following steps.

1. All the macroeconomic variables and NPL are properly transformed, including taking the first-order difference, the natural logarithm change and the first-order difference of the natural logarithm change, to induce stationarity of the time series.

2. After transformation, lead–lag analysis is conducted on the macroeconomic variables with respect to NPL. Cross-correlation indicates that macroeconomic variables have contemporaneous or lagged relationships with NPL. More importantly, the signs of correlation coefficients are confirmed to be economically consistent. Based on this lead–lag analysis, the macroeconomic variables are grouped for PCA study in the next step. This is analogous to the “slow-moving variables” and “fast-moving variables” classification proposed by Bernanke et al (2005), where the classification is based on subjective judgment.

3. PCA is conducted on the grouped variables separately. The principal components and NPL are then fitted to a VAR model. The principal components used in the model are selected based on the proportion of total explained variance and their cross-correlations with respect to NPL. The number of lags used in the model is empirically determined based on information criteria, including the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Hannan–Quinn information criterion (HQIC); model goodness-of-fit, stability and residual diagnostics are also taken into account to determine the model specification. The FAVAR model is fitted onto a training set covering a historical stress period. The fitted model is then applied to a validation set of historical data for out-of-sample validation. As a powerful tool to investigate the dynamic interactions of variables in a VAR system, IRF is used in this process to estimate the expected responses of NPL over future time periods with respect to changes in macroeconomic variables.

4. The FAVAR model fitted in this way is used to project portfolio credit risk loss under given stressed scenarios. A testing set of hypothetical macroeconomic scenarios is created to illustrate the projection process.

### 3.2 Data description

The results of this paper are based on publicly available data and the free statistical software R. The sources of data for this study are the HKMA, the Hong Kong Census and Statistics Department, the National Bureau of Statistics of China, and the Stock Exchange of Hong Kong. The R package vars by Pfaff (2008) is used for VAR analysis to generate the statistics shown in this paper.

The following quarterly data (in percent) is used for modeling. The specified transformation is applied to induce stationarity in the time series.

• HKGDP: $\Delta_{\mathrm{QoQ}}(\Delta_{\mathrm{YoY}}(\cdot))$, first difference (quarter-on-quarter) of year-on-year growth rate of Hong Kong GDP.

• CNGDP: $\Delta_{\mathrm{QoQ}}(\Delta_{\mathrm{YoY}}(\cdot))$, first difference (quarter-on-quarter) of year-on-year growth rate of mainland China GDP.

• UNEMP: $\Delta_{\mathrm{QoQ}}(\Delta_{\mathrm{YoY}}(\cdot))$, first difference (quarter-on-quarter) of year-on-year change of Hong Kong unemployment rate.

• INCOME: $\Delta_{\mathrm{QoQ}}(\Delta_{\mathrm{YoY}}\ln(\cdot))$, first difference (quarter-on-quarter) of year-on-year natural logarithm change of Hong Kong median monthly household income.

• CPI: $\Delta_{\mathrm{QoQ}}(\Delta_{\mathrm{YoY}}\ln(\cdot))$, first difference (quarter-on-quarter) of year-on-year natural logarithm change of Hong Kong consumer price index.

• RES: $\Delta_{\mathrm{QoQ}}\ln(\cdot)$, natural logarithm change (quarter-on-quarter) of Hong Kong residential property price.

• RTL: $\Delta_{\mathrm{QoQ}}\ln(\cdot)$, natural logarithm change (quarter-on-quarter) of Hong Kong retail property price.

• HSI: $\Delta_{\mathrm{QoQ}}\ln(\cdot)$, natural logarithm change (quarter-on-quarter) of Hang Seng index.

• H3M: $\Delta_{\mathrm{QoQ}}(\cdot)$, first difference (quarter-on-quarter) of three-month Hong Kong interbank offer rate (HIBOR).

• NPL_D: $\Delta_{\mathrm{QoQ}}\ln(\cdot)$, natural logarithm change (quarter-on-quarter) of Hong Kong banking industry gross NPL ratio. The NPL ratio is denoted as NPL.
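
To make the transformation rules concrete, the following is a minimal Python sketch of the three transformation types used above (the paper's own computations rely on R); the quarterly series here are synthetic stand-ins, not the actual Hong Kong data.

```python
import numpy as np
import pandas as pd

def yoy_growth(s: pd.Series) -> pd.Series:
    """Year-on-year growth rate (in percent) of a quarterly level series."""
    return 100 * (s / s.shift(4) - 1)

def d_qoq(s: pd.Series) -> pd.Series:
    """First difference, quarter on quarter."""
    return s.diff(1)

def d_qoq_ln(s: pd.Series) -> pd.Series:
    """Quarter-on-quarter natural logarithm change."""
    return np.log(s).diff(1)

# Synthetic quarterly levels standing in for the actual series
idx = pd.period_range("1997Q1", periods=12, freq="Q")
gdp = pd.Series(np.linspace(100.0, 130.0, 12), index=idx)  # a GDP-like level
npl = pd.Series(np.linspace(5.0, 3.0, 12), index=idx)      # an NPL-like ratio

hkgdp = d_qoq(yoy_growth(gdp))   # Delta_QoQ(Delta_YoY(.)), as for HKGDP
npl_d = d_qoq_ln(npl)            # Delta_QoQ ln(.), as for NPL_D
```

Each chained transformation loses observations at the start of the sample (four quarters for the year-on-year step, one more for each difference), which is why the usable training window starts later than the raw data.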

Data from 1999 Q3 to 2019 Q1 (seventy-nine samples), including the relatively severe stressed scenario of the 2008 GFC, are used as a training set for model fitting. Data from 1997 Q2 to 1999 Q2, covering the 1997 AFC, is used as a validation set to show the model goodness-of-fit. Hypothetical scenarios from 2019 Q2 to 2021 Q2 are used to form a testing set to illustrate how the model can be applied to project stressed losses in stress testing. These hypothetical scenarios are designed based on V-shaped economic downturns and recovery, of a magnitude comparable to the historical maximum change. The series are plotted in Figure 1.

The augmented Dickey–Fuller (ADF) unit root test shows that these ten transformed series from 1997 Q2 (the earliest available date for NPL_D) to 2019 Q1 are stationary. Stationarity is also confirmed for the training set from 1999 Q3 to 2019 Q1 (including the 2008 GFC scenario), while the transformed series from 1997 Q2 to 2007 Q4 (including the 1997 AFC scenario) does not pass the ADF stationarity test.

### 3.3 Model fitting

Next, the training set of 1999 Q3–2019 Q1 is used for FAVAR model fitting.

We broadly follow the two-step approach proposed by Bernanke et al (2005), where latent factors are estimated by PCA prior to the estimation of the factor-augmented VAR. However, the factor estimation method is different because the model specification is different. Instead of analyzing the impact of an observable variable (interest rate) on other latent factors (macroeconomic variables), this paper studies the impact of latent factors (macroeconomic variables) on an observable variable (NPL). The overall effect of the macroeconomic scenario is of interest, rather than the individual effect of each macroeconomic variable on NPL. Thus, instead of relying on the asymptotic approximation of common factors with principal components, we treat our first step as applying PCA to extract important information and to reduce the correlated, observed macroeconomic variables to a smaller set of independent, unobservable components. (Bernanke et al (2005) conclude that common factors are asymptotically approximated by principal components when the number of macroeconomic variables is large and the number of principal components is large relative to the true number of factors.) Moreover, we propose an empirical method based on cross-correlation analysis to group the macroeconomic variables for PCA, instead of the subjective classification of variables into “slow-moving” and “fast-moving” categories by Bernanke et al (2005).

The cross-correlations of macroeconomic variables with the NPL_D of the training set are shown in Figure 2. We can see that the maximum absolute correlations are obtained around the contemporaneous position. The signs of the correlation coefficients are economically consistent. For example, NPL_D is contemporaneously negatively correlated with HKGDP, CNGDP, INCOME, CPI, RES, RTL, HSI and H3M, and contemporaneously positively correlated with UNEMP. The correlation structure of the nine variables with NPL_D suggests that it makes economic sense to classify the nine variables into one group for the following PCA study. (If, instead, five of the nine macroeconomic variables were correlated with NPL_D at the three-lag position while the remaining four were correlated with NPL_D contemporaneously (at the zero-lag position), it would make sense to group the five into one set and the remaining four into another; PCA would then be performed on the two groups separately in order to retrieve the most relevant information for NPL_D.) This process is analogous to the “slow-moving” and “fast-moving” variables classification in Bernanke et al (2005), except that the cross-correlation serves as an objective criterion for the classification.
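
The cross-correlation grouping criterion can be sketched as follows, assuming the transformed series are held as NumPy arrays; `cross_corr`, `best_lag` and the synthetic series are illustrative names, not the authors' code.

```python
import numpy as np

def cross_corr(x: np.ndarray, y: np.ndarray, lag: int) -> float:
    """Correlation of x shifted by `lag` against y (positive lag: x leads y)."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return float(np.corrcoef(x, y)[0, 1])

def best_lag(x: np.ndarray, y: np.ndarray, max_lag: int = 4) -> int:
    """Lag at which the absolute cross-correlation is maximized."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: abs(cross_corr(x, y, k)))

# Synthetic series: a macro variable contemporaneously negatively
# correlated with NPL_D, mimicking, say, HKGDP
rng = np.random.default_rng(1)
npl_d = rng.normal(size=100)
macro = -npl_d + 0.1 * rng.normal(size=100)
```

Variables sharing the same best lag with respect to NPL_D would be grouped together, and PCA run on each group separately.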

PCA is applied to the macroeconomic variables after standardization transformation. The variance of the principal components is shown in Figure 3. The ADF unit root test shows that the nine principal components are stationary. Cross-correlations of the principal components with NPL_D are studied in Figure 4, which shows that the correlations between NPL_D and principal components other than the first one are not very significant. Therefore, the first principal component (PC1), which accounts for 32% of data variation, is chosen for FAVAR modeling. Figure 5 intuitively shows the similarity between PC1 and NPL_D, which have a correlation coefficient of 0.53. The factor loadings are shown in Table 1.
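
A PCA step of this kind, with the standardization the text requires, might look like the following sketch in Python with scikit-learn (the paper uses R); the data is a synthetic stand-in with one dominant common factor, not the actual nine macro series.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# 79 quarters of 9 synthetic macro series driven by one common factor
common = rng.normal(size=(79, 1))
X = common @ rng.normal(size=(1, 9)) + 0.5 * rng.normal(size=(79, 9))

Z = StandardScaler().fit_transform(X)   # standardization required by PCA
pca = PCA().fit(Z)
pc1 = pca.transform(Z)[:, 0]            # first principal component scores
explained = pca.explained_variance_ratio_[0]
loadings = pca.components_[0]           # PC1 loadings on the nine variables
```

`explained` corresponds to the proportion of data variation attributed to PC1 (32% in the paper), and `loadings` to the factor loadings of Table 1.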

We fit the following FAVAR model with the chosen principal component (PC1) and the log-change of NPL (NPL_D) in VAR form:

 $\begin{bmatrix}P_{t}\\ N_{t}\end{bmatrix}=\varPhi(L)\begin{bmatrix}P_{t-1}\\ N_{t-1}\end{bmatrix}+\varepsilon_{t},$ (3.1)

where $P_{t}$ is PC1 at time $t$, $N_{t}$ is NPL_D at time $t$, $\varepsilon_{t}$ is the residual vector with zero mean and covariance matrix $\varSigma$, and $\varPhi(L)$ is a lag polynomial of order $d$.

Model selection gives $d=3$ based on AIC, $d=1$ based on BIC, and $d=2$ based on HQIC. We empirically choose $d=2$, taking into consideration model goodness-of-fit, parsimony, stability and residual diagnostics.

The fitted model is

 $\begin{bmatrix}P_{t}\\ N_{t}\end{bmatrix}=\begin{bmatrix}0.5995&3.0510\\ 0.0161&0.5076\end{bmatrix}\begin{bmatrix}P_{t-1}\\ N_{t-1}\end{bmatrix}+\begin{bmatrix}-0.2190&-1.5871\\ -0.0139&0.2431\end{bmatrix}\begin{bmatrix}P_{t-2}\\ N_{t-2}\end{bmatrix}+\varepsilon_{t},$

with the residual vector covariance matrix

 $\begin{bmatrix}2.0234&0.0314\\ 0.0314&0.0040\end{bmatrix}.$

The fit and residuals of NPL_D, including the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the residuals, are shown in Figure 6. The adjusted $R$-squared of the fitted model is 0.5409. A small-sample-adjusted Portmanteau test shows that residual autocorrelation is not significant. A multivariate Jarque–Bera test cannot reject the assumption that the residuals follow a multivariate normal distribution. The Lagrange multiplier (LM) test for autoregressive conditional heteroscedasticity (ARCH) shows that residual ARCH is not significant. The ordinary least squares-based cumulative sums (OLS–CUSUM) empirical fluctuation process in Figure 7 shows that the boundary is only marginally crossed; thus, a structural break in this model is not strongly supported.

IRF is a powerful tool for investigating the dynamic interactions of variables in a VAR system. Based on the vector moving average (VMA) representation, it gives us the expected responses of one variable over future time periods with respect to the unit change (impulse) of another variable. Since there is usually contemporaneous correlation among residuals, shown as nonzero off-diagonal elements of the residual covariance matrix, the orthogonal impulse response is used to isolate the impacts of variables with the aid of a Cholesky decomposition of the covariance matrix. The results of the orthogonal impulse response are affected by the order of the variables in the VAR model. NPL_D comes last in the VAR model for this study, which implicitly assumes that a shock to PC1 has a contemporaneous effect on NPL_D, but not vice versa; ie, a change in macroeconomic conditions instantaneously affects asset credit quality, while asset credit quality feeds back into macroeconomic conditions only in subsequent periods.

Figure 8 shows the responses of NPL_D with respect to the orthogonal impulse of PC1, together with a bootstrapped one-standard-deviation confidence interval, generated by the vars package of R (Pfaff 2008). The IRF is used for NPL projection in the following sections.

## 4 Out-of-sample validation and projection

Historical data from the 1997 AFC covering 1997 Q2 to 1999 Q2 (validation set) is used for out-of-sample (out-of-time) validation on the FAVAR model fitted in Section 3. A hypothetical scenario of macroeconomic variables from 2019 Q2 to 2021 Q2 (testing set) is applied to the fitted FAVAR model to project the NPL under a hypothetical stressed scenario over consecutive periods.

Table 2 lists the PC1 for the validation set and the testing set (after standardization transformation with respect to the mean and standard deviation of the training set). The PC1 score range of the training set ($[-3.45,6.26]$) shown in Figure 5 marginally covers the range for the testing set ($[-3.99,6.33]$) and the validation set ($[-2.19,6.09]$), which alleviates concerns about overextrapolation of the fitted model. It is worth noting that the PC1s for the validation set and the testing set are listed side by side to show the similarity of the two scenarios. However, the two scenarios are not exactly the same. As explained in Section 3.2, the hypothetical scenario of the testing set is designed based on a V-shaped economic downturn and recovery scenario; the size of the downturn and the recovery is of a magnitude comparable to the historical maximum change.

The PC1s in Table 2 are regarded as impulses of PC1 over nine consecutive quarters. These impulses are applied to the IRF to accumulate the NPL_D responses over the projection period. Since NPL_D is the log-change of the original NPL, an adjustment using the residual standard error is needed to derive an unbiased prediction of NPL (for more details, see Wooldridge (2012)). NPL and NPL_D projections on the validation set are listed in Table 3 and plotted in Figure 9. We can see that the projections closely follow the historical trend. The root mean square errors (RMSEs) for NPL and NPL_D are 0.71% and 0.07%, respectively.
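
The projection mechanics (superposing IRF responses to a sequence of impulses, cumulating the log-changes and applying a lognormal bias adjustment in the spirit of Wooldridge (2012)) can be sketched as follows; `project_npl` and all input numbers are hypothetical illustrations, not the paper's estimates.

```python
import numpy as np

def project_npl(npl0, impulses, irf_resp, sigma):
    """Project the NPL level from a starting value, a sequence of PC1
    impulses, and the IRF of NPL_D to a unit PC1 impulse.

    npl0     : last observed NPL level (percent)
    impulses : PC1 impulses for each future quarter
    irf_resp : NPL_D responses at horizons 0, 1, ... to a unit PC1 impulse
    sigma    : residual standard error of the NPL_D equation
    """
    h = len(impulses)
    npl_d = np.zeros(h)
    # superpose the IRF responses of all impulses received so far
    for t in range(h):
        for s in range(t + 1):
            lag = t - s
            if lag < len(irf_resp):
                npl_d[t] += impulses[s] * irf_resp[lag]
    # cumulate the log-changes and apply a simple lognormal bias adjustment
    log_npl = np.log(npl0) + np.cumsum(npl_d)
    return np.exp(log_npl + 0.5 * sigma**2)

# Hypothetical inputs for illustration only
irf_resp = np.array([0.016, 0.018, 0.012, 0.006, 0.002])
impulses = np.array([2.0, 3.0, 1.0, -1.0, -2.0, -1.0, 0.0, 0.0, 0.0])
proj = project_npl(npl0=0.55, impulses=impulses, irf_resp=irf_resp, sigma=0.06)
```

The one-step adjustment `exp(0.5 * sigma**2)` is a simplification: over a multiperiod horizon the prediction variance accumulates, which is one of the extra uncertainties the paper notes are not captured in its confidence interval.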

NPL and NPL_D projections on the testing set of a hypothetical stressed scenario (from 2019 Q2 to 2021 Q2) are plotted in Figure 10. This gives the mean and one standard deviation of a multiperiod projection for NPL and the log-change of NPL under a stressed scenario.

It is worth noting that the one-standard-deviation confidence interval given in this study corresponds to a 68% confidence interval for the impulse responses of the VAR model; it does not account for the extra uncertainties introduced by scenario factor score calculation and multiperiod projection estimation. With those extra uncertainties included, the actual confidence interval would be wider, and a comprehensive simulation study would be needed to construct a precise interval. Given the focus of this study, the confidence interval given here should be regarded as an approximation.

## 5 Conclusions

This paper applies the FAVAR approach to develop an empirical model showing the dynamic relationship between a credit risk loss measure and a set of macroeconomic variables.

The proposed FAVAR approach has the following advantages. First, it allows utilization of a large number of macroeconomic variables to obtain rich information while avoiding the curse of dimensionality. Second, because of the orthogonal characteristics of PCA, the multicollinearity concern in a regression model is alleviated. Third, all variables are treated as endogenous; thus, subjective assumptions on exogenous explanatory variables in a regression model are avoided. Fourth, multiperiod prediction is achieved using IRF. Last, a bootstrapped confidence interval of IRF provides an interval estimation of the stress results, which is more informative than a point estimation.

Nevertheless, all models have their limitations. First, as with any other empirical model, extra care should be taken regarding extrapolation when applying the model to hypothetical scenarios that differ greatly in magnitude from the scenario used for model fitting. Second, differencing and detrending variables to induce stationarity may sacrifice critical information; moreover, the standardization required by PCA further reduces the information that can be recovered from the original variables. Relaxing the stationarity constraint by utilizing a vector error correction model (VECM) may be an alternative solution. Third, no economic theory can be applied to determine the order of the variables in the FAVAR model (especially when more than one component is chosen in the VAR); this may affect the orthogonalized IRF and thus the prediction results. Empirical judgment and economic intuition may help to define the model specification. Finally, precise construction of the IRF confidence interval should take into account all of the uncertainties in the model.

Despite the limitations, the proposed modeling approach based on FAVAR is potentially useful for practitioners looking to develop a benchmark model in order to validate their existing credit risk stress testing models. Its scope of application can also be extended to impairment modeling for International Financial Reporting Standard 9, which requires the projection of credit risk losses over consecutive periods under different macroeconomic scenarios.

## Declaration of interest

The views and opinions expressed in this paper are solely those of the authors and do not reflect the policy or business practice of their employers in any way. All remaining errors are the sole responsibility of the authors.

## Acknowledgements

Zhifeng Wang is Head of the Risk Analytics Department at Shanghai Commercial Bank and was previously Senior Model Validation Manager at Bank of China (Hong Kong), where most of the work on this study was done. Fangying Wei is Model Validation Manager at Bank of China (Hong Kong). The authors would like to express deep gratitude to the anonymous reviewers for their insightful comments and suggestions, which were critical in improving the clarity and quality of the paper. The authors also sincerely thank the copyeditors for the valuable suggestions made at proof stage, which greatly improved the grammar and sense of the sentences in this paper.

## References

• Basel Committee on Banking Supervision (2018). Stress testing principles. Report, October 17, Bank for International Settlements.
• Beck, R., Jakubik, P., and Piloiu, A. (2013). Non-performing loans: what matters in addition to the economic cycle? Working Paper 1515, February, European Central Bank.
• Bernanke, B. S., Boivin, J., and Eliasz, P. (2005). Measuring the effects of monetary policy: a factor-augmented vector autoregressive (FAVAR) approach. Quarterly Journal of Economics 120, 387–422 (https://doi.org/10.1162/qjec.2005.120.1.387).
• Boss, M., Fenz, G., Pann, J., Puhr, C., Schneider, M., and Ubl, E. (2009). Modeling credit risk through the Austrian business cycle: an update of the OeNB model. Financial Stability Report 17, pp. 85–101, Austrian Central Bank.
• Camara, B., Pessarossi, P., and Philippon, T. (2017). Backtesting European stress tests. Discussion Paper DP11805, Centre for Economic Policy Research.
• Cihak, M. (2007). Introduction to applied stress testing. Working Paper 59, International Monetary Fund (https://doi.org/10.5089/9781451866230.001).
• Fiori, R., and Iannotti, S. (2010). On the interaction between market and credit risk: a factor-augmented vector autoregressive (FAVAR) approach. Working Paper 779, Bank of Italy (https://doi.org/10.2139/ssrn.1792562).
• Foglia, A. (2009). Stress testing credit risk: a survey of authorities’ approaches. International Journal of Central Banking 5(3), 9–45.
• Fong, P. W., and Wong, C. S. (2008). Stress testing banks’ credit risk using mixture vector autoregressive models. Working Paper 13/2008, Hong Kong Monetary Authority (https://doi.org/10.2139/ssrn.1326833).
• Gersl, A., and Seidler, J. (2012). How to improve the quality of stress tests through backtesting. Czech Journal of Economics and Finance 62(4), 325–346.
• Gross, M., and Poblacion, J. (2019). Implications of model uncertainty for bank stress testing. Journal of Financial Services Research 55(1), 31–58 (https://doi.org/10.1007/s10693-017-0275-4).
• Jimenez, G., and Mencia, J. (2007). Modelling the distribution of credit losses with observable and latent factors. Working Paper 0709, Bank of Spain.
• Kalirai, H., and Scheicher, M. (2002). Macroeconomic stress testing: preliminary evidence for Austria. Financial Stability Report 3, pp. 58–74, Austrian Central Bank.
• Melecky, M., and Podpiera, A. M. (2012). Macroprudential stress-testing practices of central banks in central and southeastern Europe: comparison and challenges ahead. Emerging Markets Finance and Trade 48(4), 118–134 (https://doi.org/10.2753/REE1540-496X480407).
• Papadopoulos, G. (2017). A model combination approach to developing robust models for credit risk stress testing: an application to a stressed economy. The Journal of Risk Model Validation 11(1), 49–72 (https://doi.org/10.21314/JRMV.2017.168).
• Pfaff, B. (2008). VAR, SVAR and SVEC models: implementation within R package vars. Journal of Statistical Software 27(4) (https://doi.org/10.18637/jss.v027.i04).
• Quagliariello, M. (2009). Stress-Testing the Banking System: Methodologies and Applications. Cambridge University Press (https://doi.org/10.1017/CBO9780511635618).
• Stock, J., and Watson, M. (2002). Macroeconomic forecasting using diffusion indexes. Journal of Business Economics and Statistics 20(2), 147–162 (https://doi.org/10.1198/073500102317351921).
• Wong, H. Y., Choi, K. F., and Fong, P. W. (2008). A framework for stress-testing banks’ credit risk. The Journal of Risk Model Validation 2(1), 3–23 (https://doi.org/10.21314/JRMV.2008.018).
• Wooldridge, J. M. (2012). Introductory Econometrics: A Modern Approach. South-Western Cengage Learning.
