Actionable data breach insights from op risk modelling

Thomas Lee, chief executive at VivoSecurity, and Martin Liljeblad, operational risk manager at MUFG Americas, examine how a data breach cost model can replace an advanced measurement approach in a structured scenario

Thomas Lee – VivoSecurity

Things change. For just over a decade, most large banks measured their operational risk capital using advanced measurement approach (AMA) models.1 Their reign was seen as troubled, their workings as a black box – leaving risk management standards between banks potentially uneven, conflicted and inadequate. With the demise of the AMA now looming, it is time to herald a new kind of model – one that is transparent, intuitive and partnered with expert judgement.

AMA models are essentially loss distribution models trained on a confused array of events that include external fraud and small personally identifiable information (PII) data breaches.2 It is standard practice to estimate value-at-risk from these models at a 99.9% confidence level for regulatory capital, or a 95% confidence level for internal estimates. The problem with this approach is that the impact of tail events is significant, but forecasts of them are uncertain because extreme event data is scarce.
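
To illustrate why scarce data makes tail forecasts uncertain, the following minimal sketch repeatedly draws a small loss history from a lognormal distribution (the form mentioned in footnote 1), refits the distribution, and compares how much the fitted median and the fitted 99.9% quantile move from one history to the next. Every number in it (`TRUE_MU`, `TRUE_SIGMA`, a history of 50 events) is an illustrative assumption and not taken from any bank's AMA model.

```python
import math
import random
import statistics
from statistics import NormalDist

# Hypothetical lognormal loss distribution (assumed values, for illustration only).
TRUE_MU = 10.0       # mean of log-losses
TRUE_SIGMA = 2.0     # standard deviation of log-losses
SAMPLE_SIZE = 50     # loss events available in one simulated history
REPLICATIONS = 500   # number of simulated loss histories


def fitted_quantile(log_losses, level):
    """Fit a lognormal to one history and return its quantile at `level`."""
    mu_hat = statistics.fmean(log_losses)
    sigma_hat = statistics.stdev(log_losses)
    z = NormalDist().inv_cdf(level)
    return math.exp(mu_hat + z * sigma_hat)


def spread(estimates):
    """Ratio of the 95th to the 5th percentile of the estimates."""
    ordered = sorted(estimates)
    low = ordered[int(0.05 * len(ordered))]
    high = ordered[int(0.95 * len(ordered))]
    return high / low


def simulate(seed=0):
    """Return the spread of the fitted median and of the fitted 99.9% quantile."""
    rng = random.Random(seed)
    medians, tails = [], []
    for _ in range(REPLICATIONS):
        history = [rng.gauss(TRUE_MU, TRUE_SIGMA) for _ in range(SAMPLE_SIZE)]
        medians.append(fitted_quantile(history, 0.5))
        tails.append(fitted_quantile(history, 0.999))
    return spread(medians), spread(tails)


median_spread, tail_spread = simulate()
print(f"Spread of fitted median:         x{median_spread:.1f}")
print(f"Spread of fitted 99.9% quantile: x{tail_spread:.1f}")
```

Under these assumptions the 99.9% estimate varies far more between histories than the median does, which is the instability the paragraph above describes.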

It is not surprising that regulators no longer trust these loss distribution models, and institutions are encouraged to base operational risk capital on the standardised measurement approach,3 which uses an institution’s internal loss history and size as a proxy for risk. This approach results in large regulator-pleasing capital reserves.

Regulators may also want these Basel Committee on Banking Supervision estimates supplemented with structured scenario analysis4,5 for idiosyncratic and rare, but impactful, events. These scenarios can be based on expert judgement. However, research on the inaccuracy of expert judgement with regard to rare events is robust,6 and large data breaches fall into this category. Not surprisingly, regulators are keen to see that firms’ expert judgement-based cyber loss estimates are supported by historical data where available. While loss distribution models are to be phased out, a new type of model exists that can aid experts.


The PII data breach model 

Martin Liljeblad – MUFG Americas

VivoSecurity has developed a model for PII data breaches that characterises the cost of a breach affecting a specific number of people from a specific cause – independent of the probability that the event might occur. Such a model is a simple linear regression characterisation of historical events, which reveals the factors influencing cost. The model also addresses another issue: paucity of data. Very large data breaches are rare, and no single industry has enough data to develop a strong model. So the model explains the cost of a data breach across industries. The model therefore addresses the concerns that drive operational risk reserve estimates away from models: 

  • A lack of transparency with internally developed models 
  • A lack of data for rare events 
  • A lack of a uniform approach across institutions. 

At least in the case of a PII data breach, the model also fosters better business decisions, risk awareness and understanding – requirements championed elsewhere.7

Modelling a PII data breach begins with a set of historical data breaches (observations) and a large set of variables (explanatory variables) that could predict data breach costs. Model development involves determining which variables correlate with costs and which do not. When model development is complete, a small set of variables will be found to be predictive and a large set will have been eliminated because they correlated with cost no better than random chance would. Both sets of variables provide insights to experts, dispelling or supporting assumptions. Such a model bolsters expert judgement by anchoring assumptions in a strong, straightforward model that is based on relevant historical events and compliant with SR 11-7.8
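
The screening step can be sketched as follows. The source does not say which significance test VivoSecurity uses, so a permutation test stands in here as one simple way to ask whether a variable correlates with cost better than random chance. The data is synthetic, and all names (`log_people`, `unrelated`, `log_cost`) are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, permutations=2000, seed=0):
    """Share of random shufflings whose |correlation| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (permutations + 1)


# Synthetic "historical breaches": cost depends on size only (assumed for the demo).
rng = random.Random(42)
n_events = 40
log_people = [rng.uniform(3, 7) for _ in range(n_events)]   # log10 of people affected
unrelated = [rng.gauss(0, 1) for _ in range(n_events)]      # a candidate with no real effect
log_cost = [0.5 * x + rng.gauss(0, 0.3) for x in log_people]

p_related = permutation_p_value(log_people, log_cost)
p_unrelated = permutation_p_value(unrelated, log_cost)
print(f"p-value, people affected:    {p_related:.4f}")
print(f"p-value, unrelated variable: {p_unrelated:.4f}")
```

A variable with a small p-value would be kept as a candidate predictor. A variable whose p-value is not small has not been shown to beat random chance and would be eliminated.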

For example, VivoSecurity’s cost model reveals the most important factor in predicting cost is the square root of the number of people affected by a data breach, while the type of PII exposed was not found to be an important predictor of cost. Because the type of PII does not drive cost, customer and employee records can be pooled: if a financial institution has 19 million customers and 60,000 employees, an expert could justify using a data breach affecting 19.06 million people. Scenario analysis should not focus on the type of PII data breach – each one is costly, including employee protected health information and customer PII, personal financial information and cardholder data. Because the cost of a data breach is related to the square root of the number of people affected, the common approach of multiplying the number of people affected by a dollar constant greatly overestimates the impact of a large data breach and greatly underestimates the impact of a small one.
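
The gap between the two scaling rules can be made concrete with a small sketch. It assumes both rules are calibrated to agree at one reference breach. The reference size and cost (`REFERENCE_PEOPLE`, `REFERENCE_COST`) are hypothetical placeholders, not outputs of VivoSecurity's model, so only the ratio between the two rules is meaningful.

```python
import math

# Hypothetical calibration point at which both rules agree (assumed values).
REFERENCE_PEOPLE = 100_000
REFERENCE_COST = 1_000_000.0  # illustrative dollars, not a model output

COST_PER_PERSON = REFERENCE_COST / REFERENCE_PEOPLE              # per-person constant
SQRT_COEFFICIENT = REFERENCE_COST / math.sqrt(REFERENCE_PEOPLE)  # square-root scaling


def linear_cost(people):
    """Common approach: people affected times a dollar constant."""
    return COST_PER_PERSON * people


def sqrt_cost(people):
    """Cost proportional to the square root of people affected."""
    return SQRT_COEFFICIENT * math.sqrt(people)


for people in (1_000, 100_000, 19_060_000):
    ratio = linear_cost(people) / sqrt_cost(people)
    print(f"{people:>12,} people: linear / square-root = {ratio:.2f}")
```

Under this calibration the per-person rule gives about a tenth of the square-root figure for a breach of 1,000 people and about 13.8 times the square-root figure for 19.06 million people.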

The second most important factor is how the breach occurred, with breaches caused by a malicious outsider around four to five times costlier than any other cause. Scenario analysis should therefore focus on malicious outsiders as the worst-case data breach and the costs specifically related to a malicious outsider attack.

Regulators are paying close attention to the legal costs of a data breach, and the model reveals that a single lawsuit can double the cost of a data breach. Costs continue to increase with the 0.7 power of the number of lawsuits. To address this, VivoSecurity modelled the probability of lawsuits separately and found that the probability also increases with the number of people affected by the data breach. For a data breach affecting 19 million people, there is a 30% chance of one or more lawsuits.
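
One possible reading of the lawsuit effect is sketched below. The functional form is an assumption: no lawsuits leaves the cost unchanged, and n lawsuits multiply it by 2 × n^0.7, which doubles the cost for a single lawsuit. The article does not publish the exact formula or the distribution of lawsuit counts, so the function names and the "exactly one lawsuit" simplification are hypothetical.

```python
LAWSUIT_EXPONENT = 0.7            # power quoted in the article
SINGLE_LAWSUIT_MULTIPLIER = 2.0   # "a single lawsuit can double the cost"


def lawsuit_multiplier(lawsuits):
    """Assumed form: 1 for no lawsuits, 2 * n**0.7 for n >= 1 lawsuits."""
    if lawsuits < 0:
        raise ValueError("lawsuits must be non-negative")
    if lawsuits == 0:
        return 1.0
    return SINGLE_LAWSUIT_MULTIPLIER * lawsuits ** LAWSUIT_EXPONENT


def expected_multiplier_floor(prob_any_lawsuit):
    """Lower bound on the expected multiplier: assumes exactly one lawsuit
    whenever any lawsuit is filed (a simplifying assumption)."""
    no_suit = (1 - prob_any_lawsuit) * lawsuit_multiplier(0)
    one_suit = prob_any_lawsuit * lawsuit_multiplier(1)
    return no_suit + one_suit


for n in range(5):
    print(f"{n} lawsuit(s): cost multiplier {lawsuit_multiplier(n):.2f}")

# 30% chance of one or more lawsuits for a breach affecting 19 million people.
print(f"Expected multiplier (floor): {expected_multiplier_floor(0.30):.2f}")
```

With the article's 30% probability, the floor works out to 0.7 × 1 + 0.3 × 2 = 1.3. Any realistic chance of multiple lawsuits would push the expected multiplier higher.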


Leveraging incident response

A model-based cost forecast is a cost distribution (figure 1), and experts must decide which confidence level to use. While a 99.9% confidence level would be appropriate when using a loss distribution model, which also considers probability, it would not make sense for a cost model that simply characterises the cost for a specific data breach – the probability that the incident could occur is not part of the model. For a large data breach, the occurrence of the event is itself already the 99.9% case, so the most likely cost of this rare event should lie closer to the median of the cost model.
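
The effect of that choice can be sketched by reading several percentiles off one cost distribution. The sketch assumes the forecast for a single scenario is lognormal, with a predicted centre and a residual scatter. Both values (`PREDICTED_LOG_COST`, `RESIDUAL_SIGMA`) are hypothetical and do not come from VivoSecurity's model.

```python
import math
from statistics import NormalDist

# Hypothetical forecast for one specific breach scenario (assumed values).
PREDICTED_LOG_COST = math.log(50_000_000)  # centre of the forecast, natural-log dollars
RESIDUAL_SIGMA = 0.9                       # assumed scatter of log-costs around the fit


def cost_at_percentile(percentile):
    """Cost at the given percentile of the assumed lognormal forecast."""
    z = NormalDist().inv_cdf(percentile)
    return math.exp(PREDICTED_LOG_COST + z * RESIDUAL_SIGMA)


for percentile in (0.50, 0.75, 0.95, 0.999):
    cost = cost_at_percentile(percentile)
    print(f"{percentile:.1%} percentile: ${cost:,.0f}")
```

Reading the 99.9% percentile of this conditional distribution stacks a second extreme on top of an event that is already rare, which is why a value nearer the median is argued for above.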


Therefore, experts must choose a confidence level that makes sense. VivoSecurity’s cost model identifies investigation costs as a substantial component of the cost of a data breach caused by malicious outsiders. Investigation is part of the incident response plan, and expert judgement can therefore focus on evaluating readiness with regard to industry best practice and peers. The cost model does not find industry to be an important predictor of costs. This justifies drawing lessons learned from across industries in judging incident response readiness.

The cost model even suggests qualifications for experts who evaluate incident response: familiarity with data breach investigations, experience with multiple enterprises and education in industry best practice. Best practice includes turning on all server and application access logging and saving access logs in a read-only manner, in a standard format wherever possible. There should be written procedures and evidence that logs remain turned on. Some chief information security officers even invest in tools that speed up investigation, such as end‑point detection and response technology. If qualified expert judgement is not available internally, then it might be prudent to engage a third party to evaluate investigation‑readiness.


So, things change and at least one kind of model can once again help forecast operational risk. These days, cyber risk is a top priority for both financial institutions and regulators (the US Federal Reserve Board and the Office of the Comptroller of the Currency, for example), and regulators are keen to see that firms’ expert judgement-based cyber loss estimates are supported by appropriate historical data where available. Many Matters Requiring Attention relating to inappropriate loss estimate forecasting have been issued to Comprehensive Capital Analysis and Review/Dodd‑Frank Act Stress Test banks over the past several cycles, and addressing them has cost these institutions tens of millions of dollars.


1. AMA models for operational risk are developed by characterising historical events with, for example, a lognormal distribution.
2. PII includes protected health information, personally identifiable information with financial information and cardholder data. 
3. Basel Committee on Banking Supervision consultative document, Standardized Measurement Approach for operational risk, March 2016.
4. Board of Governors of the Federal Reserve System, SR 11-8, Supervisory guidance on implementation issues related to the advanced measurement approaches for operational risk, section on Scenario analysis, June 2011.
5. Australian Prudential Regulation Authority (APRA) information paper, Applying a structured approach to operational risk scenario analysis in Australia, September 2007. 
6. For example, research by Nobel laureate Daniel Kahneman, Princeton University. 
7. Oliver Wyman, Beyond AMA: Putting operational risk models to good use, 2016. 
8. Board of Governors of the Federal Reserve System, SR 11-7, Guidance on model risk management, April 2011.


Thanks to all who contributed to this article: Srinivas Peri of Princeton Strategy Group, Dr Jon Schachter of VivoSecurity for current AMA approaches at different banks, Philip Agcaoili, chief information security officer at Elavon and senior vice president, US Bank, for best practice regarding data breach investigation.

The authors

Thomas Lee 

Thomas Lee, chief executive at VivoSecurity, has decades of experience pioneering methods in statistical analysis, image processing and digital signal processing for science, industry and cyber risk. He has multiple patents and papers published in peer‑reviewed journals and is an expert in software, operating systems and hardware vulnerabilities, and enterprise operations.


Martin Liljeblad

Martin Liljeblad, operational risk manager at MUFG Americas, was previously head of operational risk reporting at Nordea and a legislator for the Covered Bonds Act at the Swedish Financial Supervisory Authority. He is an industry authority on operational risk, with more than a decade of experience internationally and within the US. Liljeblad has chaired and spoken at Marcus Evans conferences in London and Amsterdam. The views in this article are Martin’s own and do not necessarily reflect the views of MUFG.

To learn more

VivoSecurity invites you to explore its Executive Education programme, including how data breach cost models can inform expert judgement, improve incident response planning and integrate into Model Risk Management frameworks.

Contact Bo Stevens to learn more
