
The data puddle challenge
The loss event taxonomies currently in use are inadequate. The worst problem is the lack of clarity with regard to the boundary conditions between risk event categories. Tara McLenaghen explores the issues.
Developing an operational risk taxonomy is far from an easy task. As noted by my colleague Rick Cech in his article Event Horizon (OpRisk & Compliance February 2007, pages 35–39), the challenges and ambiguities of the Basel II loss event hierarchy are now widely acknowledged, yet banks and industry associations have generally not strayed too far in establishing their own classification structures. This is in part due to the benefits of using an industry standard, but also reflects the difficulty in developing a classification structure that is intuitive for all users within a bank and across the industry, has objective classification criteria requiring minimal interpretation, and has categories that are both mutually exclusive and exhaustive. It is not a task for the faint of heart.
Even the authors of the Basel II hierarchy never really intended for their work to be 'carved in stone'. The hierarchy was fleshed out by a special task force of the Working Group on Operational Risk, of the Institute of International Finance (IIF), under pressure of a deadline. It was proposed to Basel architects as part of the IIF's overall recommendations for a regulatory model for assessment of operational risk capital. Somewhat to our surprise, the hierarchy was immediately adopted by Basel, with almost no changes and minimal dialogue. While the participating banks were reasonably pleased with their proposals as a starting point for discussion, I believe we all expected there would be further fine-tuning both as a result of short-term discussion and longer-term experience. Unfortunately, this has not been the case. And while other groups have made some adjustments to their own subsequent adaptations of Basel II – the Operational Riskdata eXchange Association (ORX) framework, and the KRI Framework created by RiskBusiness International and the Risk Management Association – all frameworks remain fairly closely aligned.
In fact, without disparaging the significant efforts that have been made by many in this area to date, it is unfortunately true that all three of these loss event taxonomies suffer from major weaknesses in terms of the clarity of the boundary conditions between risk event categories. This lack of clarity has a direct impact on the accuracy and consistency of loss classification – a data integrity issue that is fundamental both for loss modelling and risk management.
The extent of these weaknesses was revealed in a research project conducted by RiskBusiness, during which we tested a new tool designed to enable firms to easily classify and reclassify loss data into different formats for reporting and management purposes. In the course of this research, we prepared a loss event database, which we used to test the efficacy of the new tool for classifying loss data into each of the Basel II, ORX and KRI Framework hierarchies. What we discovered, completely independent of the tool used for classification, was that many loss events in our test pool could be correctly classified into more than one risk category. We call these occurrences 'data puddles'. These puddles exist because taxonomy definitions are missing, incomplete, imprecise or overlapping, and often not very intuitive.
Now is the time to resolve these issues, particularly in light of our increasing reliance on these frameworks. While the original taxonomies were conceived as a structure to classify operational risk losses, they are now used to classify all kinds of operational risk data, including data from risk and control self-assessments, scenarios, key risk indicators and capital assessments. A common classification framework is the linchpin for relating data from different sources, allowing banks to compare, interpret and learn from the information they have gone to such pains to collect. Unfortunately, it is this very aspect that compounds the impact of flaws in these structures.
Accuracy and consistency in data classification are important for the full range of analysis, including:
• Risk modelling, which depends heavily on the assumption that the data being modelled has all been generated from the same underlying system or process. When data is misclassified and erroneously put into the wrong risk 'bucket', it creates anomalies in frequency and severity distributions, resulting in skewed capital estimates for the business. This is a serious issue for banks aspiring to advanced measurement approach models.
• Loss data benchmarking through consortiums or by obtaining external loss data from other sources. Any distortions in loss classification would be exacerbated in an industry pooling situation, making attempts to benchmark loss frequency and severity performance even more challenging than they already are.
• Data analysis for risk management. Business people need to be able to rely on the data to understand the size and scope of various problems, and to build business cases to address the underlying issues. The volumes of loss data alone may prevent a full review and analysis of descriptions for any but the largest losses. In fact, most business people are already spending too much time sorting through loss data, trying to organise it from the broad Basel categories into more business-specific and actionable issues – the presence of incorrectly classified data just makes this job harder.
The weaknesses we have found in the three main loss hierarchies also highlight another concern. The process of properly classifying a loss must be quick, easy and intuitive for users of any loss collection system, or they will be discouraged from capturing the data into the system at all. Loss data collection is often highly decentralised, and system users are not operational risk experts, nor are they necessarily highly motivated to ensure robust and accurate collection. Users faced with ambiguities in the classification process or highly technical classification labels will be quickly frustrated, more likely to make an incorrect choice of bucket, and less likely to report an event the next time.
Clearly, we need to make some changes, both as individual banks and as an industry, to address these problems in our classification approach. That said, there is no need to throw the baby out with the bath water. Most users today believe that an event-based taxonomy was the right choice, and the risk categories in the first level of the hierarchy (Level 1) are still the right 'buckets'. Most of the work needs to be done in the second tier of the hierarchy (Level 2), where it makes sense to try to enhance and improve the categories in a more systematic fashion.
Finding the puddles
To test our taxonomy translator tool, we assembled a set of loss events. Our goal was to collect a minimum of five loss events for each risk category in the three taxonomies (Basel II, ORX and KRI Framework), so as to ensure that our tool was robust for every possible type of loss event. The intent was to use the new tool to classify the events, and then compare the results with the risk categories selected based on a more subjective classification process that relied on our collective expertise about the various taxonomies. This was to be done on an iterative basis, so that issues identified in the first phase could be corrected and then the tool retested against the same loss events by a different tester. In this way, we would refine the tool until classification challenges were eliminated.
Assembling a robust set of test data proved to be challenging. As is commonly acknowledged, public data is heavily biased towards only a few categories of losses, and this was borne out by our own experience. We had no shortage of data for fraud, improper practices and fiduciary breaches leading to lawsuits, security breaches leading to stolen data, or for unauthorised market activity. However, we did have gaps in a number of categories. In some cases this is because the category covers a unique kind of loss, and it has proven difficult to come up with more than one or two distinct variations. Examples of this problem include 'terrorism', as well as 'system security/wilful damage, not for profit'. In other cases, issues in the taxonomy itself make it hard to come up with appropriate events because the category definitions are confusing or overlapping.
Of course, it was the latter issue of confusing and overlapping risk categories that actually led to our most important findings regarding the existence of data puddles. What we discovered is that 36% of the events in our data set puddled in one or more of the taxonomies; that is, they could be correctly classified into more than one risk category, based on our collective understanding of the authoritative definitions. While we were certainly aware that all the taxonomies had some issues in terms of lack of clarity, it was the extent of the problem that came as a significant surprise. The overall findings are summarised in table A.
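To make the puddle measure concrete, here is a minimal Python sketch of how such a share could be computed. It assumes each test event has been reviewed against each taxonomy and tagged with every category a reasonable reviewer could defend. The data structure, the function names and the sample events are all hypothetical and purely illustrative; they are not the RiskBusiness tool or its data.

```python
from typing import Dict, Set

# Hypothetical structure: taxonomy name -> set of categories judged defensible
# for one event. Category labels are placeholders, not real taxonomy entries.
Candidates = Dict[str, Set[str]]


def puddles_in(event: Candidates) -> Set[str]:
    """Return the taxonomies in which the event fits more than one category."""
    return {taxonomy for taxonomy, cats in event.items() if len(cats) > 1}


def puddle_share(events: Dict[str, Candidates]) -> float:
    """Fraction of events that puddle in at least one taxonomy."""
    if not events:
        return 0.0
    puddled = sum(1 for event in events.values() if puddles_in(event))
    return puddled / len(events)


# Made-up sample: four events, two of which puddle somewhere.
sample_events: Dict[str, Candidates] = {
    "E1": {"Basel II": {"cat_a", "cat_b"}, "ORX": {"cat_b"}, "KRI Framework": {"cat_b"}},
    "E2": {"Basel II": {"cat_c"}, "ORX": {"cat_c"}, "KRI Framework": {"cat_c"}},
    "E3": {"Basel II": {"cat_c", "cat_d"}, "ORX": {"cat_c", "cat_d"}, "KRI Framework": {"cat_c"}},
    "E4": {"Basel II": {"cat_a"}, "ORX": {"cat_a"}, "KRI Framework": {"cat_a"}},
}

print(puddle_share(sample_events))  # 0.5
```

An event counts once however many taxonomies it puddles in, which matches the "one or more of the taxonomies" wording above.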
It would be reasonable to ask whether the puddle challenges we experienced were a direct result of the classification methodology we were testing. We can say with assurance that this is not the case. Every puddle was investigated, and carefully compared with the risk definitions available for each taxonomy. In each instance, based on the facts available, it was determined that the risk categories were sufficiently ambiguous that a reasonable person could correctly classify the event in more than one category.
On the other hand, there may be some bias introduced into our study as a result of some test loss events lacking sufficient detail to support correct classification. In most cases, additional details were added to the loss story to rectify this problem. Of course, lack of detail regarding losses is a real-life challenge for those who are responsible for collecting and reporting loss data, and one that needs to be considered in the design of any taxonomy.
It is important to reiterate that our test data was not a random sample from a typical financial institution's complete loss data set. Clearly a random sample would not have as many puddles, as most losses can be captured within only a handful of categories, as already noted. In a similar vein, the test data set aimed to have at least five losses per category, implying, for example, that there would be five examples of terrorism – clearly over-representing the incidence of terrorism in most countries. Nevertheless, the key point is that the identified puddles cross a wide range of risk categories, and are not confined to the more obscure. Moreover, our testing is not complete, and it is quite possible that more puddles may exist.
Eliminating the puddles
Closer examination of the 24 data puddle 'types' revealed that many of these puddles are related, and can be loosely organised to bring focus to key issues. Table B illustrates some of the most prevalent issues.
Where puddles exist within the operational risk taxonomies, there are a few options available to the industry:
• Add some new overriding rules that clarify which risks take priority for classification purposes. For example: unauthorised activity always takes precedence over internal fraud.
• Add some new rules that add a new distinction to aid in classification. For example: internal fraud is an action taken by an individual for his own benefit at the expense of the firm.
• Change the existing definitions of the risk categories.
• Make more material changes to the taxonomies.
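The first of these options, an overriding rule, can be sketched as a small tie-breaker applied whenever an event fits more than one category. The Python below is only an illustration: the `OVERRIDES` set and the `resolve` function are hypothetical names, and the single rule encoded is the example given in the list above.

```python
from typing import Iterable, Optional, Set, Tuple

# Each pair reads (winner, loser): when both categories fit, the winner is used.
# Only the example rule from the text is encoded here.
OVERRIDES: Set[Tuple[str, str]] = {
    ("unauthorised activity", "internal fraud"),
}


def resolve(candidates: Iterable[str],
            overrides: Set[Tuple[str, str]] = OVERRIDES) -> Optional[str]:
    """Apply override rules; return the single surviving category, or None
    if the event still fits several categories (the puddle remains)."""
    original = set(candidates)
    remaining = set(original)
    for winner, loser in overrides:
        if winner in original and loser in original:
            remaining.discard(loser)
    return next(iter(remaining)) if len(remaining) == 1 else None


print(resolve({"unauthorised activity", "internal fraud"}))  # unauthorised activity
print(resolve({"internal fraud", "improper practices"}))     # None
```

Returning `None` when no rule applies keeps unresolved puddles visible rather than silently picking a bucket.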
Based on our own analysis to date, we are proposing some rules and definition changes for discussion by the industry, as outlined in table B. To further illustrate the issues and approach, let us examine the third of these puddle groups in detail.
One of the more subtle but relevant puddles is the challenge of classifying illegal activities conducted by the firm. These activities can rightfully be slotted into both 'internal fraud', and either 'improper practices' or 'suitability, disclosure and fiduciary' (both of 'clients, products and business practices'). Table C gives the exact definitions provided by each of the taxonomies for these risk categories.
The most obvious problems with the Basel category descriptions (which for the most part have been adopted verbatim by ORX in order to ensure alignment) are:
• There are no definitions provided for the Level 2 categories, only examples.
• Some category descriptions are overlapping. Many of the examples given under 'improper practices' and also under 'suitability, disclosure and fiduciary' are against the law, which makes them easy to classify under 'internal fraud', given the Level 1 definition. ORX tries to clarify this issue by noting at the end of its description of the Level 2 category of 'theft and fraud', that "acts in this category violate public laws of general conduct, applicable to all individuals, which usually carry criminal penalties". This is obviously an improvement, although the point may be lost on many people doing the classifying.
• Many definitions are imprecise, and fail to create clear boundaries between categories. ORX's addendum to the description of the Level 2 category of 'theft and fraud' tries to separate acts that 'violate public laws of conduct, applicable to all individuals, which usually carry criminal penalties' from business malpractice, but this criterion will fail for events such as money laundering.
The KRI Framework tries to resolve some of these problems by providing definitions, but again, a lack of precision and clear boundary criteria results in similar problems. For example, the exclusion of 'fraud' from the KRI Framework's definition of 'improper practices' may result in some things such as antitrust activities or market manipulation getting classified as internal fraud by some users and improper practices by others, depending on the understanding of the individual doing the classifying. The problem lies in the interpretation of the word 'fraud'.
Operational risk practitioners, especially those with a bit of a history with the taxonomies, have tended to classify losses based on their own understanding of the common intent behind these category descriptions. The challenge is that those individuals entering the data often do not have this background. Further, our discussions among ourselves and with practitioners suggest that the common understandings among the 'experts' are often not actually that common. In other words, when you consider the issues in light of specific events, the experts often have varying interpretations of the category descriptions.
Table D provides a sample of three test events that would puddle under two or three of the above risk category definitions for Basel and ORX.
Clearly these examples are not unknown to most operational risk practitioners. We believe that most operational risk 'experts' would tend to classify all these losses as improper practices or suitability/fiduciary failures, but in a decentralised loss collection system, there is bound to be some confusion and different classification decisions.
Accordingly, we have proposed the following changes or rules to try to resolve the problem:
• 'Internal fraud' is an illegal action taken by an employee for his own direct or indirect financial benefit at the expense of the firm.
• If the event is an act that: results in an undue advantage for the firm, or violates a regulation or law of business practice, or results in the firm's failure to meet a fiduciary duty of care, then it is either 'improper practices' or 'suitability, disclosure and fiduciary', but not 'execution, delivery and process management'.
• 'Suitability, disclosure and fiduciary' is a failure to meet a fiduciary responsibility to the client as defined by specific legal standards in your jurisdiction.
• If an event meets the above standard, then the 'suitability, disclosure and fiduciary' classification overrides all others.
With these new guidelines, we can definitively classify the test events as either improper practice or suitability, disclosure and fiduciary failures, as noted in the third column of table D.
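One way to read the proposed rules is as an ordered decision procedure. The Python sketch below is a simplified illustration under stated assumptions, not a definitive implementation. `EventFacts` and `classify` are hypothetical names, the boolean flags are a reduction of the facts a reviewer would establish, a single `fiduciary_breach` flag stands in for both the fiduciary duty of care and the jurisdiction-specific legal standard, and checking the firm-level rules before 'internal fraud' is an interpretation of the rules rather than something they state outright.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventFacts:
    """Hypothetical, simplified facts about a loss event."""
    illegal: bool = False
    employee_benefit_at_firm_expense: bool = False
    undue_advantage_for_firm: bool = False
    violates_business_practice_law: bool = False
    fiduciary_breach: bool = False  # per the legal standard in the jurisdiction


def classify(facts: EventFacts) -> Optional[str]:
    """Apply the proposed rules in priority order (ordering is an assumption)."""
    # A breach of fiduciary responsibility overrides all other categories.
    if facts.fiduciary_breach:
        return "suitability, disclosure and fiduciary"
    # Acts that benefit the firm or breach business-practice law.
    if facts.undue_advantage_for_firm or facts.violates_business_practice_law:
        return "improper practices"
    # Illegal acts by an employee for own benefit at the firm's expense.
    if facts.illegal and facts.employee_benefit_at_firm_expense:
        return "internal fraud"
    return None  # not covered by these rules; classify by other means


print(classify(EventFacts(illegal=True, undue_advantage_for_firm=True)))
# improper practices
```

An illegal act carried out for the firm's advantage is routed to 'improper practices' rather than 'internal fraud', which is the puddle these rules aim to drain.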
Moving forward
While we have proposed a number of new rules to try to resolve some of the most common and frequent loss data classification puddles, this is still a work in progress, and obviously industry dialogue is required. There are still more puddles, including between 'transaction capture, execution and maintenance' and 'monitoring and reporting' (for example, accidental misreporting of financial data), and between 'improper practices', 'monitoring and reporting', and 'external fraud' (for example, failure to report on suspected money laundering activities), and between 'external fraud' and 'external systems security' (for example, thefts/intrusions that hinge on technology-related security failures). And more rules, or taxonomy changes, are required to achieve consistently accurate data classification.
Clearly, there are no 'right' or 'wrong' answers to these challenges – just the need for a complete set of risks with clearly drawn lines between each category. My colleague, Rick Cech, has proposed a number of general principles for the creation of an event hierarchy in his article Event Horizon. These include the use of intuitive labels for risk categories, the need for categories to be defined such that an event can fall into only one 'bucket', the need for categories to be defined clearly and used consistently, and the need for category definitions to be based on event characteristics, rather than on impact types, causes or controls.
RiskBusiness has adopted these principles, and developed one solution to the taxonomy challenge based on our research and experience. Building on some of the rules proposed in this article, we have created a comprehensive set of more than 100 detailed risk categories, which can be aligned to the Basel II, ORX and KRI Frameworks to form a Level 3 and 4 to these structures. Each detailed risk category is positively defined with boundary conditions established. But just as importantly, they represent the language of common operational risk issues as seen from a layman's perspective, and therefore resonate with most business people within the organisation. Such an approach should not only improve the accuracy of classification, but has the added benefit of allowing users to easily map losses back to different taxonomies.
Regardless of the approach used, the industry should acknowledge and resolve the issues associated with the current loss data taxonomies now. We recognise that adopting new rules and definitions will lead to a final structure that contradicts the Basel II taxonomy in certain areas. This underscores the importance of impressing upon regulators that this is a natural evolution in our understanding of the classification challenges, and changes are absolutely necessary to improve data integrity, in service of our mutual goal of better operational risk management and measurement. Without these changes, we undermine all our efforts around data modelling for capital purposes, benchmarking loss experience across the industry, and perhaps most importantly, analysing the data to achieve insights for better risk management.
Tara McLenaghen is executive director, North America at RiskBusiness International, an international operational risk advisory firm.
E-mail: tara.mclenaghen@riskbusiness.com tel: (416) 427 8007
Copyright Infopro Digital Limited. All rights reserved.