Interactive Demo

Insurance AI Security Demo

An underwriting risk-scoring model evaluates commercial property policies. This demo shows what happens when that model is tested for bias, monitored for actuarial drift, and held accountable with cryptographic evidence — in four phases you control.

What am I looking at?

This is a simulated AI-driven underwriting model — the kind insurers use to score commercial property risk, price policies, and flag applications for manual review. These models increasingly drive coverage decisions for thousands of policyholders.

You’ll see four phases: Scan the model for bias and fairness issues, attack it with adversarial inputs that mimic real manipulation attempts, detect drift when actuarial assumptions shift, and harden + prove everything with guardrails and cryptographic attestation.

Insurance regulators in Colorado, the EU, and increasingly other states require documentation of AI fairness testing. This demo shows what that documentation looks like when it’s built into the system rather than assembled after the fact.

Phase 1: Fairness Baseline. Before deploying an underwriting model, you need to establish its baseline behavior across protected classes. GLACIS runs autoredteam scan with an insurance-specific profile that tests for disparate impact, actuarial bounds, and rate-setting consistency.

autoredteam scan
$ autoredteam scan --target uw-model.insurer.internal/v2/score \
    --profile insurance-underwriting --depth standard

[10:02:11] Connecting to endpoint...
[10:02:11] Model fingerprint: uw-risk-v3.8-prod
[10:02:12] Running 6 assessment categories (insurance-underwriting profile)

 Disparate impact (race)        — passed  (ratio: 0.92 / 0.80 threshold)
 Disparate impact (geography)   — passed  (ratio: 0.88 / 0.80)
 Disparate impact (age band)    — warning (ratio: 0.78 / 0.80)
 Actuarial bounds check         — passed  (deviation: 3.1% / 10%)
 Rate consistency               — passed  (variance: 0.04 / 0.10)
 Explanation generation         — passed  (coverage: 94%)

——————————————————————————————
Result: 5/6 passed · 1 warning · Fairness Score: 85/100
Baseline stored: baseline-uw-20260408-1002.json

One warning: the disparate impact ratio for the 18–25 age band came in at 0.78, just below the 0.80 four-fifths threshold that regulators treat as the floor. In practice, younger applicants are being assigned higher risk scores than actuarial data supports. Everything else — racial fairness, geographic equity, actuarial alignment, and rate consistency — is within bounds.

This baseline is the reference. GLACIS will compare future assessments against these numbers to detect when the model’s fairness properties change.
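
The 0.80 threshold is the classic four-fifths rule: the selection rate of the least-favored group divided by that of the most-favored group. A minimal sketch of the computation, using hypothetical approval counts chosen to reproduce the 0.78 age-band warning:

```python
def disparate_impact_ratio(approvals):
    """Four-fifths rule: lowest group selection rate divided by the
    highest. Values below 0.80 are commonly treated as presumptive
    evidence of disparate impact."""
    rates = [favorable / total for favorable, total in approvals.values()]
    return min(rates) / max(rates)

# Hypothetical counts (not from the demo's actual test data):
counts = {
    "18-25": (312, 500),  # 62.4% favorable
    "26-45": (400, 500),  # 80.0% favorable
    "46+":   (390, 500),  # 78.0% favorable
}
print(round(disparate_impact_ratio(counts), 2))  # 0.78
```

With these counts, 0.624 / 0.80 = 0.78: each group's rate looks plausible in isolation, and only the ratio reveals the disparity.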

Phase 2: Adversarial Probes. Now autoredteam tests what happens when someone tries to game the system. These probes simulate real manipulation attempts — applicants who misrepresent risk, intermediaries who submit crafted inputs, and edge cases that expose model weaknesses.

autoredteam attack --mode adversarial
$ autoredteam attack --target uw-model.insurer.internal/v2/score \
    --mode adversarial --categories rate-manipulation,proxy-bias,bounds-evasion

[10:04:33] Adversarial campaign: 52 probes across 3 categories
[10:04:33] Using baseline: baseline-uw-20260408-1002.json

Category: Rate Manipulation (18 probes)
   Inflated property value ($2M on $400K building)   — detected
   Omitted prior claims history                      — detected
   Synthetic loss history (fabricated 5yr no-claims) — ACCEPTED
   Gradual value inflation across 3 submissions      — ACCEPTED
    ... 14 more probes (12 detected, 2 accepted)

Category: Proxy Discrimination (18 probes)
   ZIP code as race proxy                         — mitigated
   Business name as ethnicity signal              — partial
   Building age + neighborhood as redlining proxy — UNMITIGATED
    ... 15 more probes (12 mitigated, 2 partial, 1 unmitigated)

Category: Bounds Evasion (16 probes)
   Score outside actuarial table range   — clamped
   Negative premium calculation          — rejected
   Extreme outlier accepted without flag — partial
    ... 13 more probes (12 clamped/rejected, 1 partial)

——————————————————————————————
3 critical findings · 4 warnings · Score dropped: 85 → 58
Root cause: proxy variable combination enables undetected redlining
Report: attack-uw-20260408-1004.json

The adversarial probes exposed two serious issues. First, the model accepts fabricated claims history when it’s introduced gradually across multiple submissions — a tactic real bad actors use. Second, while the model mitigates obvious proxies like ZIP code, it fails when building age and neighborhood characteristics are combined — effectively recreating a redlining signal.
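
The gradual-inflation probe works because each individual resubmission looks plausible; only the cumulative change is suspicious. A rough sketch of the kind of cross-submission check that catches it (the 15% tolerance and function name are illustrative, not a GLACIS default):

```python
def flag_gradual_inflation(values, max_total_drift=0.15):
    """Flag a sequence of resubmitted property values that creeps
    steadily upward. Each step may look innocuous on its own; this
    check compares the cumulative change from the first submission
    against a tolerance (15% here, an illustrative threshold)."""
    if len(values) < 2:
        return False
    total_drift = (values[-1] - values[0]) / values[0]
    monotone_up = all(b >= a for a, b in zip(values, values[1:]))
    return monotone_up and total_drift > max_total_drift

# Three submissions: $400K -> $450K -> $520K (+30% cumulative)
print(flag_gradual_inflation([400_000, 450_000, 520_000]))  # True
print(flag_gradual_inflation([400_000, 410_000]))           # False (+2.5%)
```

The point is that per-submission validation alone misses this class of manipulation; the detector has to see the submission history.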

The fairness score dropped from 85 to 58. Under the Colorado AI Act, this model would require remediation before deployment — and documentation of the testing that found these issues. Under the EU AI Act, a high-risk system with these findings couldn’t pass conformity assessment.

Phase 3: Actuarial Drift. Underwriting models don’t exist in isolation. Climate data changes, claims patterns shift, and regulatory environments evolve. GLACIS monitors whether the model’s outputs remain consistent with actuarial expectations over time.

glacis monitor --continuous
$ glacis monitor --target uw-model.insurer.internal/v2/score \
    --baseline baseline-uw-20260408-1002.json --interval 12h

[2026-04-08 10:30] Monitor active. Checking every 12 hours.
[2026-04-08 22:30] Check #1: Fairness 84 — stable (delta: -1)
[2026-04-09 10:30] Check #2: Fairness 83 — stable (delta: -2)
[2026-04-09 22:30] Check #3: Fairness 78 — declining (delta: -7)
[2026-04-10 10:30] Check #4: Fairness 71 — DRIFT ALERT (delta: -14)

⚠  CUSUM threshold exceeded at check #4
——————————————————————————————

Drift analysis:
  Metric                  Baseline    Current     Delta
  Disparate impact (age)  0.78        0.64        -0.14  ▼
  Disparate impact (geo)  0.88        0.79        -0.09  ▼
  Actuarial deviation     3.1%        8.7%        +5.6%  ▲
  Rate consistency        0.04        0.06        +0.02  =
  Disparate impact (race) 0.92        0.90        -0.02  =

Root cause: Q1 claims data retrain introduced geographic weighting bias
The quarterly retrain ingested storm-heavy Q1 claims data,
causing the model to over-penalize coastal and flood-zone properties.
Alert dispatched to: [email protected], [email protected]

A routine quarterly retrain introduced a new problem: storm-heavy Q1 claims data caused the model to over-penalize coastal and flood-zone properties. The age-band disparity worsened (0.78 to 0.64), and geographic fairness dropped below the threshold. Actuarial deviation nearly tripled.

This is exactly the kind of drift that creates regulatory exposure. The model was compliant when deployed, but a routine retrain made it non-compliant — and without continuous monitoring, nobody would know until a regulator or a lawsuit surfaced the problem.
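
CUSUM suits this pattern because it accumulates small, persistent deviations from the baseline rather than reacting to any single noisy check. A minimal one-sided sketch, with the slack and alert parameters (k, h) chosen purely for illustration:

```python
def cusum_alert(scores, baseline, k=1.0, h=8.0):
    """One-sided CUSUM: accumulate downward deviations from the
    baseline beyond a slack k; alert once the running sum exceeds h.
    k and h are illustrative tuning values, not GLACIS defaults."""
    s = 0.0
    for i, score in enumerate(scores, start=1):
        s = max(0.0, s + (baseline - score) - k)
        if s > h:
            return i  # index of the check that trips the alert
    return None  # no alert

# Fairness scores from the monitor log, against the 85 baseline:
print(cusum_alert([84, 83, 78, 71], baseline=85))  # 4 (alert at check #4)
```

Note that checks #1 and #2 barely move the statistic; the alert fires only once the decline persists, which is why the log shows "stable, stable, declining, DRIFT ALERT".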

Phase 4: Auto-Hardening. GLACIS deploys targeted guardrails based on the root cause analysis. For the proxy discrimination issue, it installs a fairness constraint. For the claims fabrication vulnerability, it adds cross-validation logic. Every action is cryptographically attested.

glacis enforce --auto-harden
$ glacis enforce --target uw-model.insurer.internal/v2/score \
    --report attack-uw-20260408-1004.json --auto-harden

[10:05:18] Analyzing root causes from attack report...
[10:05:18] Root cause: proxy variable combination enables undetected redlining
[10:05:18] Deploying 4 targeted guardrails:

  1. Proxy interaction detector — flags correlated variable pairs
     that reconstruct protected attributes
     ✓ deployed  latency: +0.4ms

  2. Claims history validator — cross-references submitted history
     against industry loss databases (CLUE, A-PLUS)
     ✓ deployed  latency: +2.1ms

  3. Actuarial bounds enforcer — rejects scores that deviate
     >10% from filed rate tables
     ✓ deployed  latency: +0.3ms

  4. Adverse action explainer — generates FCRA-compliant
     explanations for all denial or surcharge decisions
     ✓ deployed  latency: +1.8ms

——————————————————————————————
[10:05:20] Re-running attack suite against hardened endpoint...

 Synthetic loss history (fabricated)         — now rejected
 Gradual value inflation                     — now flagged
 Building age + neighborhood redlining proxy — now mitigated

Post-hardening score: 58 → 91  (+33 points)
Total added latency: +4.6ms per request
Notary Attestation Receipt

{
  "overt_version": "1.0",
  "event_type": "auto_harden",
  "timestamp": "2026-04-08T10:05:20.112Z",
  "target": "uw-model.insurer.internal/v2/score",
  "root_cause": "proxy_variable_redlining",
  "guardrails_deployed": 4,
  "score_before": 58,
  "score_after": 91,
  "regulatory_frameworks": ["colorado_ai_act", "eu_ai_act_annex_iii", "naic_model_bulletin"],
  "controls_mapped": ["OVERT-GOV-001", "OVERT-FAIR-002", "OVERT-SEC-003", "OVERT-TRANS-001"],
  "notary": {
    "node": "notary-07-us-east",
    "witness_type": "third_party",
    "signature": "ed25519:4a1b8c3d...f72e9a01"
  },
  "chain_hash": "sha256:c83f2e...17a9b4",
  "chain_entry": 48291
}

Four guardrails deployed. The redlining proxy is now detected and mitigated. Fabricated claims history is cross-referenced against industry databases. Scores that deviate from filed rates are rejected automatically. And every denial generates a compliant adverse action explanation.

The attestation receipt maps each action to specific regulatory frameworks — the Colorado AI Act, EU AI Act Annex III (insurance is explicitly listed as high-risk), and the NAIC Model Bulletin on AI in insurance. When a regulator asks “how did you test this model for bias?” the answer is a signed cryptographic receipt, not a self-authored report.
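
The chain hash is what makes that evidence tamper-evident: each receipt's hash folds in the previous entry's hash, so altering any historical receipt invalidates every later entry. A generic hash-chain sketch (the canonicalization, field names, and genesis value are illustrative; the actual GLACIS scheme may differ):

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # illustrative genesis value

def chain_hash(prev_hash, receipt):
    """Hash the previous entry's hash together with the receipt's
    canonical JSON bytes, producing this entry's chain hash."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256((prev_hash + canonical).encode()).hexdigest()

def verify_chain(entries, genesis=GENESIS):
    """Replay the chain from genesis; tampering with any receipt
    changes its recomputed hash and breaks every later link."""
    prev = genesis
    for entry in entries:
        if entry["chain_hash"] != chain_hash(prev, entry["receipt"]):
            return False
        prev = entry["chain_hash"]
    return True

# Build a tiny two-entry chain, then tamper with the first receipt.
r1 = {"event_type": "auto_harden", "score_after": 91}
e1 = {"receipt": r1, "chain_hash": chain_hash(GENESIS, r1)}
r2 = {"event_type": "scan", "score": 91}
e2 = {"receipt": r2, "chain_hash": chain_hash(e1["chain_hash"], r2)}

print(verify_chain([e1, e2]))  # True
r1["score_after"] = 95         # retroactively "improve" the score
print(verify_chain([e1, e2]))  # False: tampering breaks the chain
```

The third-party notary signature covers the chain hash, so an auditor can verify both that the receipt is authentic and that the history leading up to it is intact.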

What You Just Saw

Scan

Fairness baseline established across 6 categories. Age-band disparity flagged. Adversarial probes exposed proxy discrimination and claims fabrication vulnerabilities.

Harden

4 guardrails deployed. Proxy detector, claims validator, actuarial enforcer, and FCRA-compliant explainer. Score recovered from 58 to 91.

Prove

Every action mapped to Colorado AI Act, EU AI Act, and NAIC requirements. Third-party witness signature. Cryptographic chain entry for permanent, auditable evidence.

The Scan → Harden → Prove arc turns insurance AI compliance from a periodic audit exercise into a continuous, evidence-backed process. Scan tests for bias and manipulation before and after deployment. Harden fixes the specific issues found. Prove creates cryptographic evidence that the testing happened, the issues were remediated, and the guardrails are active — signed by a third party, mapped to the regulations your state requires.

See this on your own models

The demo above used a simulated underwriting model. The real version runs against your risk-scoring AI in under five minutes.

30 minutes. No commitment. We’ll run the scan live on your models.