Journal of Risk Model Validation

The effect of variant sample sizes and default rates on validation metrics for probability of default models

David Li, Ruchi Bhariok, Radu Neagu


This paper surveys common model-validation metrics for scoring models in which the response is a binary variable. These metrics include the Hosmer-Lemeshow statistic, the accuracy ratio, the standardized residual sum of squares and the conditional information entropy ratio. We restrict ourselves to models of the probability of obligor credit default, and investigate the effects of varying the sample size and the default rate in the population. We show that no single validation metric gives accurate evaluations across the full range of conditions, and we document the strengths and weaknesses of each metric using simulated and empirical data. We recommend that decision makers draw on several metrics when making decisions, and that they understand how much weight to place on each one given the specifics of the situation at hand.
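To make two of the metrics named above concrete, the following is a minimal Python sketch, not the authors' code. It computes the accuracy ratio (as 2 × AUC − 1) and the Hosmer-Lemeshow statistic on a simulated portfolio. The function names, the lognormal PD distribution, the default of ten groups and the sample sizes and default rates in the loop are all illustrative assumptions; the abstract does not describe the paper's simulation design.

```python
import random


def simulate_portfolio(n, base_pd, seed=0):
    """Draw n obligors: a PD for each and a Bernoulli default flag.

    Assumption: PDs are base_pd scaled by a lognormal factor, capped at 0.99.
    """
    rng = random.Random(seed)
    pds, defaults = [], []
    for _ in range(n):
        pd_i = min(base_pd * rng.lognormvariate(0.0, 1.0), 0.99)
        pds.append(pd_i)
        defaults.append(1 if rng.random() < pd_i else 0)
    return pds, defaults


def accuracy_ratio(pds, defaults):
    """Accuracy ratio as 2*AUC - 1, with AUC from midranks (Mann-Whitney)."""
    n_def = sum(defaults)
    n_non = len(defaults) - n_def
    if n_def == 0 or n_non == 0:
        return float("nan")  # undefined without both classes
    order = sorted(range(len(pds)), key=lambda i: pds[i])
    ranks = [0.0] * len(pds)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and pds[order[j + 1]] == pds[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average 1-based rank over the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    rank_sum = sum(r for r, d in zip(ranks, defaults) if d == 1)
    auc = (rank_sum - n_def * (n_def + 1) / 2) / (n_def * n_non)
    return 2 * auc - 1


def hosmer_lemeshow(pds, defaults, groups=10):
    """Hosmer-Lemeshow statistic over equal-count groups sorted by PD."""
    pairs = sorted(zip(pds, defaults))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(d for _, d in chunk)
        mean_p = expected / len(chunk)
        denom = len(chunk) * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    # Illustrative grid only; these sizes and rates are not from the paper.
    for n in (200, 2000, 20000):
        for base_pd in (0.005, 0.02, 0.10):
            pds, defaults = simulate_portfolio(n, base_pd, seed=42)
            print(n, base_pd, sum(defaults),
                  accuracy_ratio(pds, defaults),
                  hosmer_lemeshow(pds, defaults))
```

Because the simulated defaults are drawn from the same PDs that are being scored, the model here is perfectly calibrated by construction; any instability in the printed metrics across the grid therefore reflects sampling noise from small samples or few defaults rather than model error.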
