This article was paid for by a contributing third party.
Industrialising the challenge process: AI in operational risk scenario analysis
Banks increasingly rely on scenario analysis to inform resilience and capital planning, but defending the assumptions behind extreme loss events remains difficult. Elseware CEO Patrick Naim and Nedim Baruh, managing director and head of operational risk measurement and analytics at JPMorganChase, explain how structured modelling and artificial intelligence could help industrialise the challenge process
Operational risk scenario analysis has existed for a long time. Why is it under more scrutiny now?
Patrick Naim, Elseware: I wouldn’t describe it as scrutiny so much as more ambitious objectives for scenario analysis. When scenario analysis was part of the advanced measurement approach (AMA) framework, it mainly served capital modelling. Scenarios added synthetic points to the loss distribution used for regulatory capital.
Now that AMA is gone, scenarios no longer feed a regulatory formula. Instead, they have become central to forward-looking risk management, including the Internal Capital Adequacy Assessment Process, operational resilience and recovery planning.
That shift changes the nature of the exercise. Scenarios can’t remain narrative descriptions of plausible events. They need to behave like parameterised mechanisms that generate losses. In practice, that means the assumptions inside a scenario must be explicit and sensitive to change. If adjusting them doesn’t affect the outcome, the scenario is unlikely to be useful for decision‑making.
Nedim Baruh, JPMorganChase: I like Patrick’s use of “ambition” here. As practitioners, we’ve raised our own expectations for scenario analysis because we see the potential in the outputs, particularly as a forward-looking assessment of risk.
At the same time, translating that ambition into full acceptance has been harder than many of us expected. When scenarios were used for regulatory purposes, there was an inherent level of acceptance because regulators required them. Today, there is interest from management and risk owners, but there is still some scepticism about whether the results are robust enough to support critical decisions.
So there is a natural tension between the ambition we see for the tool and the level of confidence organisations are willing to place in it.
You were early advocates of structured scenario analysis. What problem were you trying to solve?
Patrick Naim: When we started working on this more than 15 years ago, we noticed a paradox. Under the AMA framework, institutions were expected to identify and quantify combinations of events with probabilities above 0.1%. But the dominant modelling approach, the loss distribution approach, was entirely backward-looking.
If you tried to model credit risk simply by fitting a distribution to past write-offs, it would make little sense. In credit and market risk, we model the mechanisms that generate losses: probabilities of default, exposures and sensitivities to market movements.
Operational risk lacked that structure. That is why we developed the XOI approach. Exposure represents the resources that can be affected: people, systems, transactions or products. Occurrence is the probability of the adverse event. Impact is the cost generated when that event affects the resource. In that sense, exposure plays a role similar to loans in credit risk or positions in market risk – it anchors the loss‑generation mechanism.
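To make the XOI mechanism concrete, here is a minimal Monte Carlo sketch of how exposure, occurrence and impact combine to generate an annual loss distribution. This is an illustration of the general idea only, not Elseware's implementation; the parameters (50 exposed products, a 2% per-unit occurrence probability, a lognormal impact around $1 million) are hypothetical.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def simulate_xoi_losses(n_units, p_occurrence, impact_sampler, n_trials=10_000):
    """Monte Carlo sketch of an XOI (eXposure, Occurrence, Impact) scenario.

    n_units: number of independently exposed resources (the Exposure)
    p_occurrence: probability the adverse event hits one unit in a year
    impact_sampler: callable returning the cost when a unit is hit (the Impact)
    Returns a list of simulated annual scenario losses.
    """
    losses = []
    for _ in range(n_trials):
        annual_loss = 0.0
        for _ in range(n_units):
            if random.random() < p_occurrence:  # Occurrence per exposed unit
                annual_loss += impact_sampler()
        losses.append(annual_loss)
    return losses

# Hypothetical parameters: 50 exposed products, 2% occurrence per product
# per year, lognormal impact with median around $1m.
sim = simulate_xoi_losses(50, 0.02, lambda: random.lognormvariate(13.8, 1.0))
tail = sorted(sim)[int(0.999 * len(sim))]  # 99.9th percentile annual loss
```

The point of the structure is that each parameter is explicit and challengeable: changing the exposure count, the occurrence probability or the impact distribution visibly moves the tail, which is exactly the sensitivity the interview argues a scenario needs.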
Nedim Baruh: From a practitioner’s perspective, the earlier approach to scenarios was very simple. You would sit in a room with subject matter experts and focus mainly on estimating the loss severity of an event. It was largely based on the opinions of the people in the room.
As governance and validation processes in banks evolved, that approach was always going to become problematic. Validation teams increasingly started asking for the explanations and evidence behind those expert opinions. If someone says a scenario could generate a billion-dollar loss, the obvious question is: what factors actually drive that outcome?
You can’t answer that without introducing structure. You need to identify the risk factors and describe the loss-generating mechanism – how those factors interact to produce the loss.
How have structured scenarios improved practice, and where do limitations remain?
Nedim Baruh: Structured scenarios have clearly improved the execution of scenario analysis, particularly the ability to challenge assumptions. Discussions with validation teams are much stronger than under earlier, less structured approaches.
Where limitations remain is in evidencing the assumptions behind the risk factors. In some cases, they are data-driven, but often you still rely on subject matter expertise. Demonstrating the robustness of those judgements is difficult because you are ultimately trying to measure something that is rarely observed – the materialisation of an extreme but plausible loss event.
Patrick Naim: The main improvement has been thinking about operational risk in terms of exposures. In financial risk, the exposure is naturally the dollar amount of a loan or position, but in non-financial risk there is no obvious maximum loss. Instead, exposure is defined as the resources exposed to the risk.
A key challenge is granularity. Exposures must be defined as independently exposed units, and in practice that can be difficult. For example, cyber risk cannot be assessed at the workstation level, and market manipulation does not make sense at the desk level – it needs to be considered at a more appropriate level, such as product or product-country combinations.
Another issue is the interaction with controls, which are assessed bottom-up at very granular levels, whereas structured scenarios are more top-down. Finally, there is the risk of false precision. Scenario parameters combine data and expert judgement, so the focus should be on reasonable orders of magnitude rather than excessive accuracy.
Even with structured models, scenario assumptions still rely heavily on expert judgement. Why is the challenge process so hard to systematise?
Nedim Baruh: From an execution perspective, we haven’t really focused on systematising it. In practice, the challenge usually comes through model validation. The process is quite robust, but it’s also periodic – typically once a year or every couple of years.
So, the challenge tends to be point-in-time. When validation raises questions, we respond to those questions, go back to experts and try to support or refine the assumptions.
Patrick Naim: The challenge is unavoidable because expert judgement remains everywhere in the process. Even when structured scenarios use objective business data, those numbers still need to be projected forward, and deciding whether they remain stable or change is already a form of judgement.
Take a mis-selling scenario: you might estimate expected revenue for a product based on real data, but you must still assess the fraction of clients who would complain. That might draw on past or external cases, but ultimately you still have to judge what is plausible for your own institution.
Most institutions concentrate challenge within model validation, which tends to focus on technical aspects, such as data completeness or modelling techniques. In a few institutions we’ve seen independent expert panels – subject matter experts who were not involved in the original design. In those cases, the challenge comes from another expert perspective on the risk itself.
Why do you think AI is now interesting in this context, when it wasn’t realistically feasible five or 10 years ago?
Patrick Naim: There’s a lot of excitement about large language models (LLMs) in risk management because they can leverage large volumes of knowledge and appear able to help identify risks.
But there are two important limitations. First, LLMs interpolate rather than extrapolate. By design, they build on existing knowledge rather than exploring completely new territory. This matters for risk management because the models are partly backward-looking by construction.
On a more practical level, LLM outputs are non-deterministic. In a conversational setting, that's manageable, as you can refine the question through interaction. But if you try to use them in a more automated way, the variability of the answers can easily create confusion unless safeguards are in place.
Nedim Baruh: I think the interesting part is the potential to automate things we currently do manually. The ability to analyse large amounts of data could help with processes that are still quite manual and open to interpretation, particularly around structuring challenge and validation.
What are you exploring with TrustAgent?
Patrick Naim: What we’re really trying to do is automate the kind of challenge you would normally get from an expert panel, rather than relying purely on model validation of a structured scenario.
Using AI to generate or quantify a scenario directly would be unstable. Instead, we start with a human assessment, say a 5% probability of a distributed denial of service (DDoS) attack disrupting services for more than four hours, and then ask AI to find evidence that could reasonably challenge that assumption.
The sources AI retrieves are directly linked to the initial assessment. The goal is not to have AI build or quantify the scenario, but to strengthen the challenge process and make the assumptions more robust. At the moment, this is more of a research framework than a product.
Nedim Baruh: Ideally you would want challenge much earlier in the process, but assembling independent expert panels is difficult in practice. Workshop facilitators often focus on getting the exercise done, and it’s not easy to challenge experts directly in that setting.
What this approach could do is introduce a structured, potentially less biased mechanism to challenge assumptions as the scenario is being built. It may require more effort up front, but it could make the process far more transparent and provide much stronger documentation for model validation teams.
What would it mean in practice to “industrialise the challenge process” and why is it important?
Patrick Naim: It’s important because the challenge process is often painful and sometimes only minimally implemented, even though it is essential for the quality of the model.
In practice, we think of it in three steps. First, a search agent identifies potential sources that could challenge an assumption. For example, if we are assessing the probability of a cyber attack, the system might find a survey or dataset with new information about attacks in the financial sector.
Second, we compare that source with the original assumption. A report might mention hundreds of DDoS attacks in the sector, but that does not necessarily correspond to the specific scenario we are trying to assess, such as attacks that disrupt services for more than four hours.
Third, we decide whether the information supports, contradicts or is irrelevant to the assumption. This kind of comparison still requires background knowledge and judgement, which is why AI can help structure the process.
Nedim Baruh: The need to industrialise the challenge process has always been there; we just didn’t have the tools.
What we’re really talking about is making that challenge more structured and systematic using modern technology. The need already exists, but today it’s handled in ways that are inefficient and resource-intensive.
If this work is successful, what would change for banks in how they design and govern operational risk scenarios?
Patrick Naim: Operational risk governance would become less about defending numbers and more about documenting and challenging the reasoning behind them. The idea is to provide automated support for the challenge process, so discussions focus less on debating the numbers and more on the logic and evidence behind the assumptions.
Nedim Baruh: The biggest change would be reducing the effort required to defend scenario analysis. Answering validation questions, demonstrating challenge and performing ongoing monitoring all make the overall process extremely time-consuming.
It may require a bit more structure up front, thinking more carefully about parameters and how the challenge process is organised, but that’s a small investment for the time it could save later.
What are some of the biggest unanswered questions or risks in applying AI to something as sensitive as operational risk?
Patrick Naim: AI is being discussed everywhere at the moment, so it helps to distinguish between different types of applications. First, there is conversational use – for example, analysing or summarising documents. Second, there are automated mapping exercises, such as linking risks to controls or regulations to processes. The third category is decision systems, such as fraud detection, anti-money laundering or credit processing. Those are not new; earlier generations of neural networks have been used for these purposes for years.
Many of the risks associated with AI are already familiar: non-deterministic outputs, opaque reasoning and the illusion of objectivity created by structured output. But these issues are not unique to AI. Human decision-making is already subject to bias and overconfidence.
The key is how AI is embedded in the process. The idea behind TrustAgent is for AI to operate within a structured workflow, searching for evidence, comparing it with scenario assumptions and documenting the reasoning. By organising the information used to challenge assumptions, the process becomes more transparent and auditable.
Nedim Baruh: The goal isn’t to automate the challenge process. What we’re really doing is structuring that process up front and making it more disciplined and better documented.
AI helps create a more robust and repeatable way of evidencing the challenge, rather than leaving it as something that happens afterwards during validation.
Patrick Naim: We showed that introducing structure improves scenario quantification, moving from narrative descriptions to risk-factor models. The next step is to bring the same discipline to the challenge process, shifting from ad hoc reviews to a consistent, auditable, evidence-based workflow.
Structured modelling is also a prerequisite for using AI meaningfully. In that sense, TrustAgent creates a virtuous circle: better structure improves quantification, and that structure makes it possible to challenge assumptions in a more systematic way.
To learn more about structured operational risk modelling and TrustAgent, visit Elseware.
The interviewees were speaking in a personal capacity. The views expressed in this Q&A do not necessarily reflect or represent the views of their respective institutions.
Sponsored content
Copyright Infopro Digital Limited. All rights reserved.
As outlined in our terms and conditions, https://www.infopro-digital.com/terms-and-conditions/subscriptions/ (point 2.4), printing is limited to a single copy.
If you would like to purchase additional rights please email info@risk.net