Do banks still need to validate GenAI models?

- Kris Devasabai
- 7 May 2026

Most banks skipped SR 11-7 validation for Copilot and other GenAI tools by classifying them as productivity software, enabling rapid deployment.
On April 17, regulators replaced SR 11-7 with new guidance, SR 26-2, and excluded GenAI from its scope.
At least two large banks, Bank of America and Goldman Sachs, were validating GenAI models including Copilot before SR 11-7 was withdrawn.
Experts concede foundation models are hard to validate, but argue that validation, reinforced by controls such as human-in-the-loop review and approved prompts, is still needed. Regulators may yet issue dedicated GenAI guidance for model risk management.

These summaries are generated by an LLM but have been edited by a human.

Banks got Microsoft Copilot into the hands of employees at record speed.

Too fast, if you ask some.

Long-standing US model risk guidance, known as SR 11-7, required banks to validate quantitative models used in decision making before deployment.

As Risk.net reported in April, most banks skipped this step with Copilot and other generative AI applications by classifying them as productivity tools rather than models. This allowed them to quickly deploy the latest AI tools without having to confront thorny questions about how they actually work.

Most in the industry are convinced this was the right thing to do. GenAI is a transformative technology, they argue, and falling behind the adoption curve is simply not an option. Model validation was seen as a roadblock to progress.

That argument is not without merit. SR 11-7 required banks to validate models by evaluating their conceptual soundness and running backtests to see how they perform in different scenarios. That’s a reasonable ask for the traditional quantitative models banks use to calculate everything from value-at-risk to default probabilities. But it’s a tall order for GenAI models, which never seem to give the same answer twice.

An AI head at a large global bank scoffs at the idea that banks would even attempt to validate Copilot, which is powered by OpenAI’s GPT large language models.

“I don’t think Copilot should be validated,” this person says. “You cannot validate a foundation model. If other banks want to try to do that, they can. We don’t.”

US banking regulators have given banks a pass on GenAI validation – for now at least. On April 17, they replaced SR 11-7 with new model risk guidance, known as SR 26-2, and excluded GenAI models from its scope.

But not everyone in the industry agrees with the proposition that GenAI models are beyond validation. As Risk.net reported in April, at least two large US banks – Bank of America and Goldman Sachs – were subjecting GenAI models, including AI assistants such as Copilot, to validation by default before SR 11-7 was withdrawn by regulators.

Others feel the same way. In a paper published in the Journal of Operational Risk on March 27, Krishan Kumar Sharma, a model risk leader and senior vice-president at Citi, proposed a six-pillar model risk governance framework for GenAI, which he believes could serve as a foundation for new supervisory guidance tailored specifically for generative models.

One of Sharma’s recommendations is that “the GenAI system itself and the specific applications built upon it should be formally classified as a model within the bank’s model risk management framework” and undergo “formal model validation process before deployment”.

This recommendation “is based on real-time testing”, Sharma writes in the paper, though the results of the testing were not published “for reasons of confidentiality”.


Validating GenAI models is not easy, Sharma concedes, but he argues this step is necessary to ensure the accuracy and robustness of their outputs.

“I fully agree that foundation models themselves are difficult to validate in the traditional sense – that is not in dispute,” Sharma tells Risk.net. “But the goal is to ensure that the risks specific to GenAI, particularly hallucination and opacity, are surfaced and managed through the model risk management lifecycle rather than left outside of it.”

Sharma does not claim validation is a magic bullet for GenAI risks. On the contrary, he cautions that validation of GenAI applications cannot be as exhaustive as for deterministic models and needs to be reinforced with other controls. These include a human-in-the-loop mandate – which Sharma calls “the single most important control for mitigating risks such as hallucination” – and adherence to a library of approved and validated prompts.

The benefit of validation, he argues, is that it supports other parts of the model risk management framework by clearly identifying the risks and weaknesses of GenAI systems.

Banks, then, may want to think twice about using SR 26-2 as a reason to abandon model risk management for GenAI. The expectation among seasoned model risk managers – including Sharma – is that regulators will in due course publish separate guidance specifically for GenAI and agentic AI use cases. This seems like the most likely outcome. Regulators often allow banks to develop their own frameworks for emerging risks and then use them as a starting point for formalising supervisory requirements. This was the case with SR 11-7 – supervisors let industry practice mature, then codified it.

With at least two large banks voluntarily subjecting GenAI models to validation, this could easily become a supervisory expectation in the near future. Banks that jettison validation to get ahead in the AI race could find themselves falling behind supervisory expectations.

Editing by Alex Krohn

