Journal of Risk Model Validation
ISSN: 1753-9587 (online)
Editor-in-chief: Steve Satchell
Volume 20, Number 1 (March 2026)
Editor's Letter
Steve Satchell
Trinity College, University of Cambridge
It is difficult to find any gathering at which artificial intelligence (AI) and its likely impacts on employment and activity are not being discussed. Machine learning, a large and important subcomponent of AI, is its most frequently employed quantitative technique. This issue of The Journal of Risk Model Validation features two papers that directly address validation using machine learning. Whether their findings imply we will all (including the editor) become unemployed remains to be seen, but it is a pleasure to be able to present such material.
The first paper on this topic is “Interpretable machine learning for default risk prediction in stress testing” by Junqi Zhao, Nan Zhou and Zailong Wan. Stress testing is essential for testing the resilience of banks’ portfolios against possible future economic conditions, and The Journal of Risk Model Validation has always regarded stress testing as an essential part of model validation. In the context of credit risk modeling, a key area of interest in stress testing is predicting the probability of default. In model validation, a benchmark model can serve as a means to challenge the limitations of an existing model. Zhao et al’s study develops such a benchmark model: an inherently interpretable machine learning model that uses the explainable boosting machine (EBM) structure in a discrete-time survival analysis setting to predict the forward-looking probability of default of a real-world credit card portfolio. This model is compared against a standard alternative model, and the authors’ model achieves comparable predictive accuracy much more simply and with less data. This study will be of particular interest to financial institutions seeking to adopt machine-learning-based methods for optimizing stress testing model development or to enhance benchmarking practices in risk model validation while maintaining compliance with regulatory requirements.
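For readers less familiar with the discrete-time survival setting mentioned above, the following is a minimal Python sketch of its two standard ingredients: expanding each account into one row per observed period, and turning per-period hazards into a forward-looking cumulative probability of default. It is an illustration under stated assumptions, not Zhao et al's implementation. The function names and the toy inputs are hypothetical, and the hazards would in practice come from a fitted classifier (an EBM in the authors' case) rather than being typed in by hand.

```python
def expand_person_period(accounts):
    """Expand accounts into person-period rows for discrete-time survival.

    accounts: list of (account_id, periods_observed, defaulted) tuples
              (a hypothetical input layout chosen for this sketch).
    Returns one (account_id, period, event) row per observed period, where
    event is 1 only in the final period of an account that defaulted.
    """
    rows = []
    for account_id, periods_observed, defaulted in accounts:
        for t in range(1, periods_observed + 1):
            event = 1 if (defaulted and t == periods_observed) else 0
            rows.append((account_id, t, event))
    return rows


def cumulative_pd(hazards):
    """Convert per-period hazards h_1..h_T into cumulative default probabilities.

    The cumulative PD up to period t is 1 - prod_{s<=t} (1 - h_s).
    """
    survival = 1.0
    result = []
    for h in hazards:
        survival *= 1.0 - h          # probability of surviving through this period
        result.append(1.0 - survival)
    return result


if __name__ == "__main__":
    # Toy data: account A defaults in its third period, account B is censored after two.
    print(expand_person_period([("A", 3, True), ("B", 2, False)]))
    # Hypothetical per-period hazards, as a binary classifier might predict them.
    print(cumulative_pd([0.01, 0.02, 0.015]))
```

A binary classifier trained on the expanded rows estimates the per-period hazard, which is why any interpretable classifier can be slotted into this framework.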
Accurately predicting default risk among small businesses is critical for lenders and policy makers. However, traditional credit risk models often rely on extensive financial statements that many small enterprises lack. The second machine learning paper in this issue, “The role of personal credit in small business risk assessment: a machine learning approach” by Zilong Liu and Hongyan Liang, explores the value of integrating the personal credit bureau data of small business owners, along with business-level and trade line variables, within a machine learning framework to improve default prediction. Using a large data set from the Gies Consumer and Small Business Credit Panel, the authors find that their baseline models, which rely on fundamental business attributes and also incorporate business trade line information (such as active accounts and delinquency patterns), perform rather poorly. However, when the authors add variables that capture personal credit scores, outstanding balances and recent inquiries, these rank among the strongest predictors, alongside measures of business debt (eg, Uniform Commercial Code filings and open trade line balances). Liu and Liang’s findings reveal that personal credit factors can fill critical information gaps when formal business records are scant, thereby strengthening credit risk assessments and enhancing lending decisions in the small business sector. In addition, their results highlight the importance of validating risk models using alternative data sources, to ensure greater robustness and reliability in predicting small business defaults. Readers will find this paper very helpful in that it details at a granular level the steps required to apply machine learning to such a problem.
In the issue’s third paper, “Crises, combined crises and their implications for firm profitability”, Alexandre Siqueira and Sylvia Gottschalk investigate how the interaction between distinct types of crises impacts firm profitability. Building on a taxonomy of combined crises, where up to four concomitant crises (banking, currency, debt and recession) are considered, the authors estimate panel regressions of the main determinants of firm profitability for emerging and mature countries. Their results show that gross margin has a positive impact on firm profitability, especially in tranquil times. Leverage has a consistently negative impact on profitability, while size is a relevant determinant only in non-crisis periods. Siqueira and Gottschalk find that the impact of other determinants, such as liquidity, external dependence, ownership and age, varies with the type of crisis and by country. Their study highlights the necessity of using more than one model to understand firm characteristics when explaining the impacts of crises on firm profitability. Existing profitability models are also indirectly validated, evidencing potential errors in model specification due to data selection. This paper contributes to a growing literature showing that the economic impact of combinations of crises affects firm profitability differently from separate recessions or currency, banking or debt crises. While readers may find this conclusion unsurprising, it is good to see it modeled and quantified.
Our last paper covers “Statistically distinguishable rating scales”. Its author, Mikhail Pomazanov, proposes a method of designing a risk rating scale that is not excessive in relation to existing calculations and is based on a mix of ranking categories and optimization. This allows for more stable validation with a fixed maximum number of violations of Wald criteria compared with an excess scale, which is usually used by banks. The excess scale Pomazanov discusses refers to the multiple categories or features the risk evaluator may wish to test. His proposed method – which loosely requires that higher ratings should have lower default rates – should reduce the estimated probability of default, which will provide savings in capital requirements under the advanced internal ratings-based approach. Theoretical justifications for this effect are presented, and numerical calculations from closed data are performed for three rating scales (two of which are based on the statistics for rating agencies, while the third is the rating scale of a particular bank). The proposed method is most relevant for the corporate segment of the loan portfolio, where there appears to be less guidance on the nature and quantity of rating categories.
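To make the idea of statistical distinguishability concrete, the sketch below applies a one-sided two-proportion Wald test to adjacent rating grades and greedily merges neighbors whose default rates are not significantly increasing. This is an illustrative heuristic only and should not be read as Pomazanov's optimization method. The function names, the critical value of 1.645 and the toy default counts are all assumptions made for this example.

```python
import math


def wald_z(defaults_a, n_a, defaults_b, n_b):
    """One-sided Wald z-statistic for default rate of grade b exceeding grade a."""
    p_a = defaults_a / n_a
    p_b = defaults_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if se == 0:
        return 0.0  # no variation observed, so no evidence of a difference
    return (p_b - p_a) / se


def merge_indistinguishable(grades, z_crit=1.645):
    """Greedily merge adjacent grades that fail the Wald test.

    grades: list of (defaults, obligors) ordered from best to worst rating.
    z_crit: assumed one-sided critical value (about the 5% level).
    Returns a coarser scale in which each grade has a significantly
    higher default rate than the grade before it.
    """
    merged = [grades[0]]
    for defaults, n in grades[1:]:
        prev_defaults, prev_n = merged[-1]
        if wald_z(prev_defaults, prev_n, defaults, n) < z_crit:
            # Not distinguishable from the previous grade: pool the two.
            merged[-1] = (prev_defaults + defaults, prev_n + n)
        else:
            merged.append((defaults, n))
    return merged


if __name__ == "__main__":
    # Hypothetical five-grade scale with 1000 obligors per grade.
    scale = [(1, 1000), (2, 1000), (30, 1000), (35, 1000), (120, 1000)]
    print(merge_indistinguishable(scale))
```

In this toy example, the first two grades and the next two grades are pooled, leaving a three-grade scale whose adjacent default rates are significantly ordered.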
Papers in this issue
Interpretable machine learning for default risk prediction in stress testing
This paper proposes a benchmark model which can be used to predict the forward-looking probability of default of a real-world credit card portfolio.
The role of personal credit in small business risk assessment: a machine learning approach
The authors investigate how personal credit data can be combined with business-level and trade line variables in a machine learning framework to enhance default prediction.
Crises, combined crises and their implications for firm profitability
The authors put forward a taxonomy of combined crises, in which up to four concomitant crises are considered, and investigate how the interactions between these crises might impact firm profitability.
Statistically distinguishable rating scales
The author suggests a means to design a statistically distinguishable rating scale that is not excessive in relation to the existing observation statistics, allowing for more stable validation.