Back to Blog
·12 min read

FINMA AI Compliance: What Swiss Financial Institutions Need to Know About AI Testing Requirements

FINMA Guidance 08/2024 sets new AI governance expectations. Learn how Swiss financial institutions should test AI systems and prepare evidence for FINMA supervision.

finmaai-complianceswiss-financeai-security

Switzerland doesn't have an AI Act. It doesn't plan to have one anytime soon. And that's precisely why FINMA's approach to AI risk management deserves close attention — because without a comprehensive AI law, the Swiss Financial Market Supervisory Authority is becoming the de facto AI regulator for Switzerland's financial sector.

In December 2024, FINMA published Guidance Note 08/2024 on governance and risk management when using artificial intelligence. It's not a prescriptive regulation with numbered articles and specific technical requirements. It's a principles-based supervisory communication that tells financial institutions: we expect you to manage AI risks comprehensively, and we'll be checking.

That distinction matters. Under the EU AI Act, you know exactly what Article 15 requires and can build a testing programme to satisfy it. Under FINMA's approach, the expectations are broader and more flexible — but they're also harder to pin down. When a FINMA supervisor asks "how do you test your AI systems?", there's no checklist to hand them. You need to demonstrate that your approach is proportionate, thorough, and documented.

For Swiss financial institutions already using AI — and FINMA's own survey shows roughly 50% have AI applications in production or development — the question isn't whether to start testing. It's how to build a testing programme that satisfies a principles-based regulator who can define "adequate" at their discretion.

FINMA's Regulatory Framework for AI

FINMA doesn't regulate AI through a single, dedicated regulation. Instead, AI governance falls under the existing supervisory framework — primarily through Circular 2023/1 on Operational Risks and Resilience, supplemented by Guidance 08/2024 specifically addressing AI.

Circular 2023/1: Operational Risks and Resilience

FINMA Circular 2023/1 came into force on January 1, 2024, with a two-year transition period requiring full compliance by the start of 2026. It covers operational risk management comprehensively — including technology risk, cyber risk, and model risk.

AI systems fall squarely within this circular's scope. An AI model that makes credit decisions, detects fraud, prices risk, or automates customer interactions is an ICT system that carries operational risk. The circular requires financial institutions to identify, assess, manage, and monitor these risks — which means model validation, testing, and ongoing performance monitoring.

The key connection: Circular 2023/1 treats AI as an operational risk requiring the same rigor as any other ICT system. If you wouldn't deploy a trading system without testing it, you shouldn't deploy an AI credit scoring model without testing it either. The circular doesn't distinguish between traditional software and AI. The expectations are technology-neutral.

Guidance 08/2024: AI-Specific Expectations

Where Circular 2023/1 provides the general operational risk framework, Guidance 08/2024 fills in the AI-specific expectations. FINMA identified four areas where AI use poses particular challenges for financial institutions:

Governance and Responsibility. Financial institutions must create an inventory of AI systems used across the organization and establish clear accountability structures. Responsibilities must be defined for development, implementation, monitoring, and use of AI tools. Critically, FINMA states that responsibility for decisions cannot be delegated to AI or third parties. A human is always accountable.

Robustness and Reliability. Institutions must assess the performance and accuracy of AI outputs through routine quality and reliability checks. FINMA specifically flags that historical data used for training can carry hidden biases that compromise model reliability. The implication: testing can't stop at accuracy metrics on clean validation data. You need adversarial testing that probes for the failure modes that standard validation misses.

Transparency and Accountability. AI systems must be documented comprehensively. Independent reviews should verify that systems operate as intended. Documentation should be easily accessible to support employees in responsible AI use. When a FINMA supervisor asks how your AI chatbot handles edge cases, you need more than "we trained it on good data."

Non-Discrimination. FINMA highlights the risk of biased outputs from AI applications and explicitly references the risk of discrimination in credit assessment — an area where the EU AI Act's Annex III classifies AI use as high-risk. Even without a Swiss AI Act, FINMA expects institutions to test for and mitigate discriminatory patterns in AI outputs.

How FINMA Treats AI as an Operational Risk

FINMA doesn't use the term "model risk" casually. In their framework, AI-related operational risks break down into specific categories that map directly to testing requirements:

Model risk. The AI model itself can fail — through lack of robustness (it produces incorrect outputs under adversarial conditions), lack of correctness (it makes systematic errors), lack of explainability (nobody can explain why it made a specific decision), or bias (it systematically disadvantages certain groups). Each of these failure modes requires different testing approaches.

Data-related risk. AI quality depends on data quality. Risks include poor data security (training data containing sensitive information that could be extracted), poor data quality (garbage in, garbage out), and data availability issues (what happens when data feeds are interrupted). Testing for data leakage — whether an adversarial query can extract training data from the model — is a data risk test, not just a security test.

IT and cyber risk. AI endpoints are attack surfaces. Prompt injection, system prompt extraction, output manipulation — these are cybersecurity risks specific to AI systems that traditional IT security controls don't address. FINMA expects these risks to be managed within the institution's broader IT and cyber risk framework.

Third-party dependencies. FINMA specifically flags the "increasingly concentrated" AI market and the risks of depending on a small number of AI providers. If your critical AI capabilities rely on a third-party API — OpenAI, Anthropic, a specialized model vendor — FINMA expects you to assess and manage the operational risk of that dependency, including testing how your systems behave when the third-party service degrades or changes.

FINMA Risk Category AI-Specific Testing Required Evidence Produced
Model risk — robustness Prompt injection, adversarial manipulation Attack variants tested, bypass rates, guardrail effectiveness
Model risk — correctness Output accuracy testing, edge case handling Accuracy metrics, error rates under various conditions
Model risk — bias Demographic parity testing, fairness metrics Score distributions across protected characteristics
Data risk — security PII leakage, training data extraction Leakage vectors tested, extraction success rates
IT/cyber risk System prompt extraction, output manipulation Attack success rates, control effectiveness
Third-party risk Dependency testing, failover validation Service degradation impact, fallback behavior

FINMA Expectations for Third-Party AI Services

This deserves its own section because it's where many Swiss institutions are most exposed.

FINMA's guidance on third-party dependencies is direct: governance must span the entire AI value chain, not just what happens inside your organization. If you use a third-party AI model to power customer-facing features, you are responsible for the risk management of that model's outputs — even if you didn't build it.

Practically, this means:

Testing the actual deployed model. Vendor assurances about model safety are not testing evidence. If you deploy a third-party AI model in a customer-facing application, FINMA expects you to have tested what that model actually does — not what the vendor says it does. This is adversarial endpoint testing: prompt injection, data leakage, bias, toxicity, robustness under perturbation.

Monitoring for model changes. Third-party AI providers update their models regularly — sometimes without notice. A model that passed your robustness tests in January may behave differently after the provider pushes an update in March. FINMA's emphasis on ongoing monitoring means testing needs to be continuous, not one-time.

Documenting the dependency. Your AI system inventory should include third-party AI components with clear documentation of what they do, what data they process, what risks they carry, and how those risks are managed and tested.

FINMA and EU AI Act: The Dual Compliance Reality

Switzerland is not an EU member state and the EU AI Act does not directly apply to Swiss institutions. But the regulatory reality is more nuanced than that.

Swiss companies serving EU customers. If your Swiss fintech, wealthtech, or banking platform serves customers in EU member states — and many do — the EU AI Act applies to your AI systems that affect those customers. A Zurich-based robo-advisor with clients in Germany is subject to EU AI Act high-risk requirements for its AI-powered investment recommendations.

The Council of Europe AI Convention. In February 2025, the Swiss Federal Council decided to regulate AI primarily by ratifying the Council of Europe's Framework Convention on Artificial Intelligence. While the resulting Swiss legislation won't mirror the EU AI Act's prescriptive requirements, it will establish baseline principles around transparency, accountability, and non-discrimination that align with FINMA's existing expectations.

Regulatory convergence. Even without direct EU AI Act application, Swiss institutions face convergence pressure. EU clients, partners, and counterparties increasingly expect compliance with EU standards. FINMA's own guidance references concerns that align with EU AI Act categories. The practical effect: a Swiss financial institution that tests its AI systems to FINMA standards will cover most of the evidence requirements for EU AI Act Article 15 as well.

The smart approach for dual-jurisdiction institutions: build one AI testing programme that satisfies the more prescriptive standard (EU AI Act Article 15) and map the same evidence to FINMA's principles-based expectations. FINMA won't object to testing evidence that exceeds their minimum expectations — and the EU AI Act gives you a concrete framework to build from.

Testing Requirement FINMA Expectation EU AI Act Requirement Same Evidence?
Adversarial robustness Guidance 08/2024 — robustness & reliability Article 15(3) — resilience against errors Yes
Data leakage Circular 2023/1 — data security risk Article 15(4) — cybersecurity Yes
Bias detection Guidance 08/2024 — non-discrimination Article 15 + Article 10 — data governance Yes
Model monitoring Circular 2023/1 — ongoing risk monitoring Article 15(1) — lifecycle robustness Yes
Explainability Guidance 08/2024 — transparency Article 13 — transparency Partial (different scope)
Human oversight Guidance 08/2024 — governance & responsibility Article 14 — human oversight Partial (different mechanisms)

For four core testing areas, the same evidence serves both frameworks. Transparency and human oversight requirements diverge in specifics, but the testing foundation is shared.

Case Study: A Zurich-Based Wealthtech Facing FINMA Scrutiny

A Zurich-based wealthtech startup has built an AI-powered portfolio recommendation engine. The system analyzes market data, client risk profiles, and investment objectives, then generates personalized portfolio allocations. The company has 200 institutional clients across Switzerland and the EU, manages CHF 2 billion in advised assets, and employs 45 people.

During a routine FINMA supervisory review, the examiner asks about the company's AI governance practices. The wealthtech presents its AI development documentation: model architecture papers, training data specifications, validation accuracy metrics (94% alignment with human advisor recommendations on back-tested portfolios).

Then the examiner asks four questions that the documentation doesn't answer:

"Have you tested what happens when someone submits deliberately misleading client profile data?" The model was validated on clean, representative data. Nobody had tested whether adversarial inputs — a client profile crafted to manipulate the recommendation toward high-risk investments — could bypass the model's risk guardrails. This is a robustness gap.

"Can the model's recommendations be reverse-engineered through repeated queries?" If a sophisticated user queries the system with systematic variations, they might extract the model's decision logic — effectively obtaining proprietary investment strategy intellectual property. The company had never tested for model extraction attacks. This is a cybersecurity gap under Circular 2023/1.

"Have you tested whether the model produces different recommendations for demographically similar clients with identical financial profiles?" The company had accuracy metrics but no bias testing. When tested, the model showed statistically significant variation in risk tolerance estimates based on client age and nationality — proxies for protected characteristics. This is a non-discrimination gap under Guidance 08/2024.

"What happens to your recommendations when your third-party market data provider has a 6-hour outage?" The model relied on real-time market data from a single provider. Failover testing had never been conducted for the AI specifically — only for the traditional infrastructure. During simulated data degradation, the model continued generating recommendations based on stale data without alerting operators. This is a third-party dependency risk gap.

The FINMA examiner didn't issue a formal enforcement action. Instead, they documented the gaps and set a 120-day remediation timeline with required evidence:

  • Adversarial robustness testing results across all client-facing AI endpoints
  • Model extraction resistance testing
  • Bias audit with demographic parity analysis
  • Third-party dependency resilience testing with documented failover behavior
  • Integration of all AI-specific testing into the company's existing operational risk testing programme under Circular 2023/1

The total remediation cost: approximately CHF 180,000 in engineering time, external testing, and compliance documentation — plus the reputational impact of having a FINMA supervisory finding on record. Had the company built AI testing into its compliance programme from the beginning, the incremental cost would have been a fraction of that: a few thousand francs per quarterly testing cycle.

Practical Testing Evidence for FINMA Supervision

FINMA's principles-based approach means there's no prescribed format for AI testing evidence. But based on the four focus areas in Guidance 08/2024, here's what a well-prepared institution should be able to present during supervisory review:

1. AI system inventory. A complete register of every AI system in production, including: what it does, what data it processes, who relies on its outputs, the third-party components it depends on, and its risk classification. FINMA explicitly requires this.

2. Risk assessment per system. For each AI system, a documented risk assessment covering model risk, data risk, IT/cyber risk, and third-party risk. Map each risk to Circular 2023/1 operational risk categories and Guidance 08/2024 focus areas.

3. Testing evidence per risk category. Documented test results for each identified risk: adversarial robustness test results (prompt injection, manipulation), data leakage test results (PII extraction, training data leakage), bias testing results (demographic parity, fairness metrics), and third-party resilience results (failover testing, degradation behavior).

4. Remediation evidence. For any test failures, documented remediation actions and re-test results. The before-and-after comparison demonstrates active risk management — exactly what a principles-based regulator wants to see.

5. Ongoing monitoring programme. Evidence that AI testing is integrated into your operational risk testing programme with a defined cadence. FINMA's emphasis on "routine quality and reliability checks" means one-time testing isn't sufficient.

6. Governance documentation. Clear accountability chains showing who is responsible for AI system performance, who reviews test results, who approves remediation actions, and who signs off on continued deployment. FINMA is explicit: responsibility cannot be delegated to AI.

The Swiss Regulatory Trajectory

Switzerland has chosen a different path than the EU. Rather than a comprehensive AI Act, the Federal Council is pursuing ratification of the Council of Europe's AI Convention with sector-specific implementing legislation. A consultation draft isn't expected until the end of 2026, with final legislation unlikely before 2029.

In the interim, FINMA's guidance and circulars are the operative standard for financial institutions. And FINMA has signaled clearly — through Guidance 08/2024, through their AI survey, and through supervisory practice — that they expect financial institutions to be proactive about AI risk management.

The trajectory mirrors FINMA's historical approach to emerging technology risks. When cloud computing gained adoption in Swiss financial services, FINMA published guidance, conducted targeted supervisory reviews, and gradually increased expectations over time. The institutions that moved early established best practices that later became industry norms. The laggards spent more time and money catching up.

AI risk management is on the same trajectory, but faster. FINMA has already published guidance. Supervisory reviews are already asking about AI governance. The institutions that build comprehensive AI testing programmes now will have a fundamentally different supervisory experience than those that wait for prescriptive requirements that may never come in Switzerland's principles-based system.

For financial institutions also subject to DORA — which applies to EU-licensed subsidiaries and branches of Swiss groups — see our DORA AI Resilience Testing Guide for how the testing requirements compare. For EU AI Act Article 15 requirements that apply to Swiss companies serving EU customers, see our EU AI Act Testing Evidence Guide.