The effect of variant sample sizes and default rates on validation metrics for probability of default models

Dawn Hunter

Abstract

This paper surveys common model-validation metrics for scoring models with a binary response variable. These metrics include the Hosmer-Lemeshow statistic, the accuracy ratio, the standardized residual sum of squares and the conditional information entropy ratio. We restrict our attention to models of the probability of obligor credit default and investigate the effects of varying the sample size and the default rate in the population. We show that no single validation metric gives accurate evaluations across varying conditions, and we document the strengths and weaknesses of each metric using simulated and empirical data. We recommend that decision makers draw on multiple sources of information and understand how much weight to place on each source given the specifics of the situation at hand.
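As a rough illustration of how sample size and default rate can affect one of these metrics, the sketch below estimates the accuracy ratio on simulated portfolios. It is not the paper's experimental design. The binormal score model, the `separation` parameter, the function names and the chosen sample sizes and default rates are all assumptions made for this example. The accuracy ratio is computed through the standard identity AR = 2 * AUC - 1.

```python
import numpy as np


def accuracy_ratio(defaults, scores):
    """Accuracy ratio (Gini coefficient) via AR = 2 * AUC - 1.

    `scores` are assumed to be oriented so that higher means riskier.
    Returns NaN when the sample contains only one class.
    """
    defaults = np.asarray(defaults, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_def = int(defaults.sum())
    n_non = defaults.size - n_def
    if n_def == 0 or n_non == 0:
        return float("nan")
    # Average (tie-aware) 1-based ranks of the scores.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    # Mann-Whitney rank-sum identity for the AUC.
    auc = (ranks[defaults].sum() - n_def * (n_def + 1) / 2.0) / (n_def * n_non)
    return 2.0 * auc - 1.0


def simulate_ar_spread(n_obligors, default_rate, separation=1.0, n_trials=200, seed=0):
    """Mean and standard deviation of the estimated AR over simulated samples.

    Hypothetical setup: defaults are Bernoulli(default_rate); scores are
    N(separation, 1) for defaulters and N(0, 1) for non-defaulters, so the
    population AR is the same for every sample size and default rate.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_trials)
    for i in range(n_trials):
        defaults = rng.random(n_obligors) < default_rate
        scores = rng.normal(loc=separation * defaults, scale=1.0)
        estimates[i] = accuracy_ratio(defaults, scores)
    return float(np.nanmean(estimates)), float(np.nanstd(estimates))


if __name__ == "__main__":
    for n in (500, 5000):
        for rate in (0.01, 0.10):
            mean_ar, sd_ar = simulate_ar_spread(n, rate)
            print(f"n={n:5d}  default rate={rate:.2f}  mean AR={mean_ar:.3f}  sd={sd_ar:.3f}")
```

Because the discriminatory power of the simulated scores is held fixed, any difference in the spread of the estimates across rows comes from the sample size and the number of defaults alone, which is the kind of sensitivity the abstract refers to.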
