Incident Response • Updated December 2025

AI Incident Response Playbook

Complete framework for handling AI failures, security incidents, and bias events. Aligned with NIST SP 800-61 and EU AI Act requirements.

26 min read · 7,000+ words
Joe Braidwood, CEO, GLACIS

Executive Summary

AI incidents surged 56.4% from 2023 to 2024, reaching 233 documented cases—yet most organizations lack AI-specific incident response procedures.[1] Traditional IT incident response frameworks don't address the unique challenges of AI failures: model drift, adversarial attacks, data poisoning, bias incidents, and hallucinations. On average, these incidents take 4.5 days to detect.[2]

This playbook provides a complete AI incident response framework aligned with NIST SP 800-61 principles, MITRE ATLAS adversarial tactics, and the EU AI Act's serious incident reporting requirements (Article 73). We cover the full lifecycle from preparation through post-incident review, with specific procedures for model failures, security incidents, and bias events.

Key finding: 67% of AI incidents stem from model errors rather than adversarial attacks, but organizations disproportionately focus security budgets on external threats while neglecting operational resilience.[3]

  • 77% of AI projects fail[4]
  • 4.5 days average detection time[2]
  • $4.24M average breach cost[5]
  • 67% of incidents stem from model errors[3]

What is AI Incident Response?

AI incident response is the structured process of detecting, containing, investigating, and recovering from failures or security incidents in AI and machine learning systems. Unlike traditional IT incident response, which focuses on infrastructure availability and data breaches, AI incident response addresses the unique failure modes of probabilistic systems.

How AI Incident Response Differs from Traditional IR

Traditional IR vs. AI Incident Response

Dimension | Traditional IR | AI Incident Response
Primary Focus | System availability, data confidentiality | Model performance, output quality, fairness
Incident Types | Malware, DDoS, unauthorized access | Model drift, adversarial attacks, bias, hallucinations
Detection Methods | SIEM, IDS/IPS, log analysis | Performance monitoring, drift detection, anomaly detection
Forensics | Disk imaging, memory dumps, network captures | Model interrogation, feature attribution, training data analysis
Recovery | Restore from backup, patch systems | Model rollback, retraining, data cleaning, validation
Skills Required | Security analysts, network engineers | Data scientists, ML engineers, security researchers

The NIST Computer Security Incident Handling Guide (SP 800-61) provides the foundational framework for incident response, but requires AI-specific adaptations. MITRE's ATLAS framework (Adversarial Threat Landscape for Artificial-Intelligence Systems) extends traditional ATT&CK with ML-specific tactics and techniques.[6][7]

Types of AI Incidents

AI incidents fall into several distinct categories, each requiring different detection and response procedures:

1. Model Performance Degradation

Gradual or sudden decline in model accuracy, precision, or recall. Causes include data drift (input distribution changes), concept drift (relationship changes between features and target), training-serving skew, or infrastructure issues.

Example: Amazon's Hiring Algorithm (2018)

Amazon abandoned an AI recruiting tool after discovering it systematically downgraded resumes from women. The model was trained on historical hiring data that reflected gender bias in technical roles, learning to penalize keywords like "women's chess club captain."[8]

2. Adversarial Attacks

Intentional manipulation of inputs to cause misclassification or targeted behavior. Types include evasion attacks (test-time perturbations), poisoning attacks (training data corruption), model extraction, and membership inference attacks.

Example: Tesla Autopilot Phantom Braking (2021-2022)

Researchers demonstrated adversarial attacks against Tesla's vision system using strategically placed stickers that caused phantom object detection and emergency braking. NHTSA investigated over 750,000 vehicles for sudden braking incidents.[9]

3. Data Poisoning

Corruption of training data to degrade model performance or introduce backdoors. Particularly dangerous in systems using continuous learning, federated learning, or third-party datasets.

Example: Microsoft Tay (2016)

Microsoft's Tay chatbot was taken offline within 16 hours after coordinated users exploited its learning mechanism to teach it offensive content. The bot learned from Twitter interactions without adequate filtering, demonstrating the vulnerability of online learning systems to data poisoning.[10]

4. Bias and Discrimination Incidents

Systematic unfair treatment of protected groups. Can result from biased training data, proxy features, or amplification of historical discrimination. Carries legal and reputational risk.

Example: SafeRent Solutions ($2.2M Settlement, 2024)

SafeRent's tenant screening algorithm faced class-action litigation for systematic discrimination against Black and Hispanic renters. The settlement required eliminating automated accept/decline scores and mandatory independent fairness audits.[11]

5. Hallucinations and Output Failures

Generative AI producing false, fabricated, or nonsensical outputs presented as factual. Particularly dangerous in legal, medical, and financial applications where users trust AI-generated content.

Example: Air Canada Chatbot Liability (2024)

Air Canada was held liable for incorrect bereavement fare information provided by its chatbot. The court ruled the airline responsible for its chatbot's statements, establishing precedent that companies cannot disclaim responsibility for AI-generated misinformation.[12]

6. Privacy Breaches and Data Leakage

Unintended exposure of training data through model outputs, membership inference attacks that reveal whether specific individuals were in training data, or model inversion attacks that reconstruct training samples.

Example: Samsung ChatGPT Ban (2023)

Samsung banned employee use of ChatGPT after engineers accidentally leaked proprietary source code and meeting notes by using the tool for code optimization and meeting transcription. Because submitted prompts could be retained and used to train the model, the leaked material was potentially exposed outside the company.[13]

The AI Incident Response Lifecycle

NIST SP 800-61 defines four phases (preparation; detection and analysis; containment, eradication, and recovery; and post-incident activity); this playbook expands them into six AI-specific phases. Unlike traditional IR, AI incidents often require iteration between investigation and containment as root causes emerge through model analysis.

1. Preparation
Build capabilities before incidents occur: monitoring infrastructure, runbooks, team training, stakeholder contacts, rollback procedures.

2. Detection
Identify anomalies through automated monitoring, user reports, or external notifications. Determine if incident requires escalation.

3. Containment
Stop ongoing harm while preserving evidence. Options: model rollback, traffic reduction, feature flags, circuit breakers, full shutdown.

4. Eradication
Remove root cause: clean poisoned data, retrain models, patch vulnerabilities, remove backdoors, address bias sources.

5. Recovery
Restore normal operations: validate corrected model, implement enhanced monitoring, gradual rollout, stakeholder communication.

6. Lessons Learned
Post-incident review: document timeline, identify gaps, update procedures, implement preventive controls, share knowledge.

Building an AI Incident Response Team

AI incident response requires a cross-functional team combining traditional security skills with AI/ML expertise. Larger organizations may maintain dedicated AI security teams; smaller organizations can augment existing IR teams with ML specialists.

Core Roles and Responsibilities

AI Incident Response Team Roles and Responsibilities

Role | Responsibilities | Required Skills
Incident Commander | Coordinate response, stakeholder communication, decision authority | Leadership, communication, technical breadth
ML Engineer | Model forensics, performance analysis, retraining, deployment | MLOps, model debugging, feature engineering
Data Scientist | Statistical analysis, bias detection, data quality assessment | Statistics, fairness metrics, exploratory analysis
Security Analyst | Adversarial attack investigation, forensics, threat intelligence | Security analysis, MITRE ATLAS, adversarial ML
Data Engineer | Data lineage tracing, pipeline investigation, data cleaning | ETL, data governance, pipeline debugging
Legal/Compliance | Regulatory notification, disclosure decisions, liability assessment | AI regulations, privacy law, incident reporting
Communications | Customer notification, public statements, internal updates | Crisis communication, technical translation

Detection and Monitoring

Effective AI incident response begins with robust detection capabilities. The average AI incident takes 4.5 days to detect—compared to 2.3 days for traditional security incidents—because organizations lack AI-specific monitoring.[2]

Detection Methods

Monitoring Infrastructure

Production AI systems should implement comprehensive observability:

Essential Monitoring Capabilities

Model Metrics
  • Prediction accuracy and error rates
  • Confidence distributions
  • Fairness metrics (demographic parity, equalized odds)
  • Drift scores (data and concept)
Infrastructure Metrics
  • Inference latency and throughput
  • Resource utilization (CPU, GPU, memory)
  • Error rates and timeout frequency
  • Model version tracking
Data Quality
  • Feature distribution statistics
  • Missing value rates
  • Out-of-range value detection
  • Schema validation failures
Security Events
  • Adversarial input detection
  • Unusual query patterns
  • API abuse indicators
  • Model extraction attempts
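
The drift scores listed above can be produced with standard statistical tests. Below is a minimal sketch, assuming a tabular feature matrix and using SciPy's two-sample Kolmogorov-Smirnov test; the 0.3 alert threshold mirrors the example trigger in the playbook template later in this guide, and the alerting hook is hypothetical.

# Minimal data-drift check: compare live feature distributions to a training baseline.
# Illustrative only -- threshold, feature names, and the alerting hook are assumptions.
import numpy as np
from scipy.stats import ks_2samp

KS_ALERT_THRESHOLD = 0.3  # example trigger, as in the playbook template below

def drift_report(baseline: np.ndarray, live: np.ndarray, feature_names: list[str]) -> dict:
    """Per-feature KS statistics for two samples of shape (n_rows, n_features)."""
    report = {}
    for i, name in enumerate(feature_names):
        stat, p_value = ks_2samp(baseline[:, i], live[:, i])
        report[name] = {"ks_stat": float(stat), "p_value": float(p_value),
                        "alert": stat > KS_ALERT_THRESHOLD}
    return report

# Example usage: page the on-call engineer if any feature drifts past the threshold.
# report = drift_report(X_train_sample, X_last_24h, ["age", "income", "tenure"])
# if any(v["alert"] for v in report.values()):
#     open_incident("data-drift", details=report)  # hypothetical alerting hook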

Containment Strategies

Containment stops ongoing harm while preserving forensic evidence. AI incidents require model-specific containment tactics beyond traditional infrastructure isolation.

Containment Options (Ordered by Invasiveness)

1. Traffic Throttling
Tactic: Reduce traffic to affected model using rate limiting or load balancer adjustment. Use when: Investigating performance degradation but not confirmed critical failure. Preserves: Full functionality for reduced user base while limiting blast radius.

2. Shadow Mode
Tactic: Route production traffic through model for logging but use fallback for actual decisions. Use when: Suspected bias or accuracy issues requiring investigation without user impact. Preserves: Business continuity while collecting incident data.

3. Feature Flag Disable
Tactic: Disable AI-powered feature while keeping core application functional. Use when: AI feature is non-critical and incident requires immediate mitigation. Preserves: Core service availability with graceful feature degradation.

4. Model Rollback
Tactic: Revert to previous known-good model version. Use when: Incident began after recent deployment and previous version was stable. Preserves: Previous functionality level; loses recent improvements.

5. Full Shutdown
Tactic: Complete service shutdown. Use when: Ongoing harm (privacy breach, safety risk, discriminatory decisions) exceeds business continuity value. Preserves: Organization from liability; eliminates service availability.
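
Several of these options (shadow mode, feature-flag disable, rollback) can be pre-wired into the serving path so that containment is a configuration change rather than an emergency deploy. A minimal sketch of such a guarded inference wrapper follows; the serving config, model-loading, fallback, and audit-log hooks are assumptions rather than any specific product API.

# Guarded inference path supporting shadow mode, feature-flag disable, and rollback.
# All hooks (load_model, fallback_decision, audit_log) are illustrative placeholders.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ServingConfig:
    mode: str = "live"          # "live" | "shadow" | "disabled"
    model_version: str = "v42"  # rollback = point this at the last known-good version

def predict_with_containment(features: dict[str, Any],
                             config: ServingConfig,
                             load_model: Callable[[str], Any],
                             fallback_decision: Callable[[dict], Any],
                             audit_log: Callable[[dict], None]) -> Any:
    if config.mode == "disabled":
        # Feature-flag disable: core application continues without the AI feature.
        return fallback_decision(features)

    model = load_model(config.model_version)   # rollback is just a version change
    prediction = model.predict([list(features.values())])[0]
    audit_log({"version": config.model_version, "features": features,
               "prediction": prediction, "mode": config.mode})

    if config.mode == "shadow":
        # Shadow mode: keep logging model outputs for investigation, act on the fallback.
        return fallback_decision(features)
    return prediction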

Containment Decision Matrix

Containment Strategy by Incident Type

Incident Type | Low Severity | Medium Severity | High Severity
Performance Degradation | Traffic throttling | Shadow mode | Model rollback
Adversarial Attack | Rate limiting | Input filtering | Full shutdown
Data Poisoning | Shadow mode | Model rollback | Full shutdown + retrain
Bias/Discrimination | Shadow mode | Feature flag disable | Full shutdown
Hallucinations | Output filtering | Human-in-loop | Feature flag disable
Privacy Breach | Output filtering | Full shutdown | Full shutdown + legal
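
Teams that automate their runbooks sometimes encode a matrix like this directly, so the on-call responder gets a concrete first recommendation. A small sketch; the action names simply mirror the table above and the lookup is illustrative.

# Containment recommendation lookup mirroring the decision matrix above.
CONTAINMENT_MATRIX = {
    "performance_degradation": {"low": "traffic_throttling", "medium": "shadow_mode", "high": "model_rollback"},
    "adversarial_attack": {"low": "rate_limiting", "medium": "input_filtering", "high": "full_shutdown"},
    "data_poisoning": {"low": "shadow_mode", "medium": "model_rollback", "high": "full_shutdown_and_retrain"},
    "bias_discrimination": {"low": "shadow_mode", "medium": "feature_flag_disable", "high": "full_shutdown"},
    "hallucinations": {"low": "output_filtering", "medium": "human_in_loop", "high": "feature_flag_disable"},
    "privacy_breach": {"low": "output_filtering", "medium": "full_shutdown", "high": "full_shutdown_and_legal"},
}

def recommend_containment(incident_type: str, severity: str) -> str:
    """Return the first containment action to consider for a given incident and severity."""
    return CONTAINMENT_MATRIX[incident_type][severity]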

Investigation and Root Cause Analysis

AI incident investigation requires both traditional forensics and ML-specific analysis techniques. The goal is to determine what happened, why it happened, and what data/models were affected.

Model Forensics Techniques
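
One widely used starting point, referenced in the comparison table earlier (feature attribution), is to compare the model's attribution profile on a known-good baseline window against the incident window: features whose importance shifts sharply are the first place to look. A minimal sketch, assuming a fitted scikit-learn-style estimator; the data windows and ranking are illustrative.

# Compare feature attributions between a baseline window and the incident window.
# Assumes a fitted scikit-learn estimator; windows and usage are illustrative.
import numpy as np
from sklearn.inspection import permutation_importance

def attribution_shift(model, X_baseline, y_baseline, X_incident, y_incident, feature_names):
    base = permutation_importance(model, X_baseline, y_baseline, n_repeats=10, random_state=0)
    inc = permutation_importance(model, X_incident, y_incident, n_repeats=10, random_state=0)
    shifts = inc.importances_mean - base.importances_mean
    # Rank features by how much their importance moved during the incident window.
    order = np.argsort(-np.abs(shifts))
    return [(feature_names[i], float(shifts[i])) for i in order]

# Example usage with hypothetical data windows:
# for name, delta in attribution_shift(model, X_oct, y_oct, X_incident, y_incident, cols)[:5]:
#     print(f"{name}: importance shift {delta:+.3f}")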

MITRE ATLAS Framework

For adversarial incidents, map attacker techniques to MITRE ATLAS (Adversarial Threat Landscape for AI Systems). ATLAS extends ATT&CK with ML-specific tactics:[7]

MITRE ATLAS Tactics

Tactic | Description | Example Techniques
Reconnaissance | Gather information about ML system | Model probing, API exploration, documentation harvesting
Resource Development | Establish resources for attacks | Acquire datasets, develop perturbations, build shadow models
ML Model Access | Obtain model access or information | Model extraction, membership inference, API abuse
ML Attack Staging | Prepare attack components | Craft adversarial examples, poison training data, create backdoors
Evade ML Model | Cause misclassification | Adversarial perturbations, input manipulation, confidence reduction
Impact | Manipulate or disrupt ML capabilities | Model poisoning, availability attacks, integrity compromise

Communication and Disclosure

AI incident communication requires balancing transparency, legal obligations, and reputation management. Regulatory requirements increasingly mandate disclosure of AI failures.

EU AI Act Article 73: Serious Incident Reporting

The EU AI Act requires providers of high-risk AI systems to report serious incidents to market surveillance authorities no later than 15 days after becoming aware of them (Article 73 of the final regulation; Article 62 in earlier drafts):[14]

What Constitutes a “Serious Incident”?

  • Any incident that directly or indirectly leads to death, serious harm to health, or serious disruption of critical infrastructure
  • Serious breach of fundamental rights protected under EU law (discrimination, privacy, due process)
  • 15-day general reporting deadline from when the provider becomes aware of the incident (10 days for incidents involving death; 2 days for widespread infringements or serious disruption of critical infrastructure)
  • Follow-up reports required if additional information becomes available
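
Because the clock starts when the provider becomes aware of the incident, many teams pre-stage a report skeleton so the legal deadline is tracked automatically. A sketch of such a record follows; the field names are illustrative and do not correspond to an official reporting form.

# Skeleton record for a serious-incident report with deadline tracking.
# Field names are illustrative, not an official reporting form.
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class SeriousIncidentReport:
    provider: str
    ai_system: str
    became_aware_on: date
    nature_of_incident: str          # e.g., harm to health, breach of fundamental rights
    affected_persons_estimate: int
    corrective_measures: list[str] = field(default_factory=list)
    follow_up_required: bool = True

    @property
    def reporting_deadline(self) -> date:
        # General case: no later than 15 days after awareness (shorter for some incident types).
        return self.became_aware_on + timedelta(days=15)

# report = SeriousIncidentReport("Acme AI", "credit-scoring-v3", date(2025, 3, 1),
#                                "discriminatory decisions", affected_persons_estimate=1200)
# print(report.reporting_deadline)  # 2025-03-16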

Stakeholder Communication Matrix

Who to Notify and When

Stakeholder | Timing | Content
Internal Leadership | Immediately upon detection | Incident summary, business impact, containment status
Legal/Compliance | Within 1 hour | Full technical details, regulatory exposure, disclosure obligations
Affected Users | 24-72 hours (depending on severity) | What happened, what data/decisions affected, remediation steps
Regulators (EU) | Within 15 days (serious incidents) | EU AI Act Article 73 format: nature, severity, corrective measures
Customers (B2B) | Per contractual SLA | Service impact, timeline, compensatory measures
Public/Media | Only if necessary | Controlled messaging, avoid speculation, focus on remediation

Communication Best Practices

Recovery and Remediation

Recovery restores normal operations with validated fixes. Unlike traditional IR where recovery means "restore from backup," AI recovery often requires retraining, revalidation, and gradual rollout.

Recovery Steps

1. Root Cause Remediation
Actions: Clean poisoned data, retrain with balanced datasets, patch vulnerabilities, implement input validation, add adversarial robustness training.
Validation: Verify fix addresses root cause, not just symptoms. Test on holdout data representing incident conditions.

2. Model Revalidation
Actions: Run full test suite including fairness tests, adversarial robustness tests, edge case coverage, stress testing. Compare performance to pre-incident baseline.
Validation: Achieve statistical significance in improvement. Document test results for regulatory compliance.

3. Enhanced Monitoring
Actions: Add monitoring for incident-specific signals (e.g., if bias incident, add demographic disparity dashboards). Tighten alert thresholds. Implement earlier warning indicators.
Validation: Confirm alerts would have fired during incident timeline (backtesting).

4. Gradual Rollout
Actions: Deploy to canary environment (5% traffic) → staged rollout (25% → 50% → 100%). Monitor each stage for regression. Maintain rollback capability.
Validation: Performance metrics equal or exceed pre-incident baseline across all stages.

5. Stakeholder Notification
Actions: Notify affected users of resolution. Provide recourse for historical decisions (e.g., manual review of rejected applications). Update regulators on corrective measures.
Validation: Confirm all contractual and regulatory notification obligations met.
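
The staged rollout in step 4 is usually driven by a simple gate at each traffic level: promote only if the candidate's live metrics hold up against the pre-incident baseline, otherwise revert. A minimal sketch; the stages match the percentages above, while the evaluation and traffic-control hooks and the tolerance are assumptions.

# Staged rollout gate: promote through 5% -> 25% -> 50% -> 100% only if metrics hold.
# evaluate_at, set_traffic_share, and rollback are hypothetical deployment hooks.
STAGES = [0.05, 0.25, 0.50, 1.00]
TOLERANCE = 0.01  # allow at most a 1-point accuracy drop versus the pre-incident baseline

def staged_rollout(evaluate_at, set_traffic_share, rollback, baseline_accuracy: float) -> bool:
    for share in STAGES:
        set_traffic_share(share)
        live_accuracy = evaluate_at(share)
        if live_accuracy < baseline_accuracy - TOLERANCE:
            rollback()  # regression detected: revert to the previous version and stop
            return False
    return True  # all stages passed; the remediated model is fully promoted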

Post-Incident Review and Lessons Learned

Post-incident reviews convert incidents into organizational learning. Conduct within 1-2 weeks while details are fresh, with blame-free focus on process improvement.

Post-Incident Review Agenda

Essential Review Components

  1. Incident Timeline: Reconstruct complete timeline from initial cause through detection, containment, investigation, and recovery. Identify time gaps and delays.
  2. Detection Analysis: How was incident detected? Could it have been detected earlier? What monitoring gaps existed?
  3. Response Effectiveness: What worked well? What slowed response? Were runbooks accurate and helpful?
  4. Root Cause: Technical root cause, organizational root cause (why did vulnerability exist?), and contributing factors.
  5. Impact Assessment: Users affected, decisions impacted, financial cost, reputational damage, regulatory exposure.
  6. Prevention Measures: What controls would have prevented this? What controls would have detected it earlier?
  7. Action Items: Specific, assigned, time-bound improvements. Track to completion.

Documentation Requirements

Comprehensive incident documentation serves multiple purposes: organizational learning, regulatory compliance, legal protection, and customer transparency.

AI Incident Response Playbook Template

Every organization should maintain incident-specific playbooks. This template provides a starting structure:

Model Performance Degradation Playbook

Detection Triggers

  • Accuracy drops more than 5 percentage points below baseline (e.g., below 90% against a 95% baseline)
  • Demographic disparity exceeds 10 percentage points
  • Data drift score (KS statistic) exceeds 0.3
  • User reports of incorrect predictions exceed 10/day

Immediate Actions (0-30 minutes)

  1. Confirm incident: Check monitoring dashboards for performance metrics
  2. Page on-call ML engineer and incident commander
  3. Open incident channel (#incident-model-[name]-[date])
  4. Assess severity using severity matrix (see below)
  5. Implement initial containment per severity level

Investigation Checklist

  • Compare current vs. baseline performance metrics
  • Analyze input data distribution for drift
  • Review recent deployments and configuration changes
  • Check data pipeline health and data quality metrics
  • Inspect prediction errors by demographic group
  • Review feature importance changes
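
For the demographic-group item in the checklist above, a quick aggregation is often enough to show whether errors concentrate in a protected group; the 10-point disparity warning mirrors the detection trigger earlier in this template, and the column names are illustrative.

# Error rate by demographic group, with a simple disparity warning.
# Column names ("group", "label", "prediction") are illustrative.
import pandas as pd

def error_rate_by_group(df: pd.DataFrame, disparity_threshold: float = 0.10) -> pd.Series:
    errors = (df["prediction"] != df["label"]).groupby(df["group"]).mean()
    disparity = errors.max() - errors.min()
    if disparity > disparity_threshold:
        print(f"Disparity {disparity:.1%} exceeds {disparity_threshold:.0%}: review escalation criteria")
    return errors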

Escalation Criteria

Escalate to legal/communications if any of the following:

  • Protected group disparity exceeds 15 percentage points
  • High-stakes decisions affected (hiring, lending, healthcare)
  • Media inquiries received
  • EU high-risk system under AI Act

Key Contacts

  • Incident Commander: [Name, Slack, Phone]
  • ML Lead: [Name, Slack, Phone]
  • Data Engineer: [Name, Slack, Phone]
  • Legal Contact: [Name, Email, Phone]
  • Communications Lead: [Name, Email, Phone]
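
Where the monitoring stack allows it, the detection triggers and escalation criteria above can be evaluated automatically when an incident ticket is opened. A sketch follows; every threshold is taken from this template, and every field in the metrics snapshot is a hypothetical name.

# Automated check of the template's detection triggers and escalation criteria.
# The "metrics" dict is a hypothetical snapshot produced by the monitoring stack.
def evaluate_playbook(metrics: dict) -> dict:
    triggers = {
        "accuracy_drop": metrics["accuracy"] < 0.90,
        "demographic_disparity": metrics["disparity_pct_points"] > 10,
        "data_drift": metrics["ks_statistic"] > 0.3,
        "user_reports": metrics["incorrect_prediction_reports_per_day"] > 10,
    }
    escalate_to_legal = (
        metrics["disparity_pct_points"] > 15
        or metrics.get("high_stakes_domain", False)   # hiring, lending, healthcare
        or metrics.get("media_inquiries", 0) > 0
        or metrics.get("eu_high_risk_system", False)
    )
    return {"incident": any(triggers.values()), "triggers": triggers,
            "escalate_to_legal": escalate_to_legal}
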
GLACIS Solution

AI Incident Response with Verifiable Evidence

Traditional incident response documentation is backward-looking—assembled after the fact to explain what happened. GLACIS provides forward-looking evidence: cryptographic proof that your AI controls executed correctly in production, with tamper-evident audit trails for every inference.

Faster Detection

Real-time attestation of model behavior, drift metrics, and fairness indicators. Alert when controls fail—not days later when users complain. Average detection time drops from 4.5 days to hours.

Comprehensive Forensics

Every inference logged with model version, input features, output, confidence scores, and policy evaluations. Reconstruct exactly what happened during incident window without guessing from incomplete logs.

Regulatory Compliance

Pre-built templates for EU AI Act Article 73 serious incident reports. Evidence that demonstrates what controls were in place, when they failed, and what corrective action was taken—mapped to regulatory requirements.

Stakeholder Confidence

Share verifiable evidence with customers, regulators, and boards. Not "trust us, we investigated"—actual cryptographic proof that third parties can independently validate.

The challenge: AI incidents are inevitable. The question is whether you can prove your controls worked when they should have, and detect when they don't. Evidence beats documentation.

Frequently Asked Questions

How long does an AI incident investigation typically take?

Simple performance degradation incidents may resolve in hours to days. Complex incidents involving bias, adversarial attacks, or data poisoning can take weeks. The SafeRent investigation spanned months before settlement. Budget 1-4 weeks for thorough root cause analysis including model forensics and data quality review.

Should we notify users about every AI incident?

Not necessarily. Low-severity incidents caught quickly with no user impact may not require notification. However, notify when: (1) decisions affecting users were wrong, (2) protected groups were treated unfairly, (3) privacy was breached, (4) regulatory obligations exist, or (5) media attention likely. When in doubt, consult legal counsel.

Can we use our existing IT incident response team for AI incidents?

Partially. Your IR team brings valuable incident management skills, but needs augmentation with ML specialists. Minimum additions: ML engineer for model forensics and data scientist for statistical analysis. For adversarial incidents, add security researchers with ML expertise. Consider training existing team on AI-specific incident types.

What's the difference between model rollback and model retraining?

Rollback deploys a previous version—fast (minutes) but loses recent improvements. Use for immediate containment. Retraining creates a new model version with incident fixes—slow (hours to weeks) but addresses root cause. Typical sequence: rollback for containment, retrain for permanent fix, gradual rollout of retrained model.

References

  [1] Responsible AI Labs. "AI Safety Incidents of 2024." responsibleailabs.ai
  [2] IBM Security. "Cost of a Data Breach Report 2024." ibm.com/security
  [3] Gartner. "AI Incident Analysis: Operational vs. Adversarial Failures." 2024.
  [4] VentureBeat. "Why do 87% of data science projects never make it into production?" venturebeat.com
  [5] IBM Security. "Cost of a Data Breach Report 2024."
  [6] NIST. "Computer Security Incident Handling Guide (SP 800-61 Rev. 2)." nist.gov
  [7] MITRE. "ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems)." atlas.mitre.org
  [8] Reuters. "Amazon scraps secret AI recruiting tool that showed bias against women." October 2018. reuters.com
  [9] NHTSA. "Tesla Phantom Braking Investigation." nhtsa.gov
  [10] The Verge. "Twitter taught Microsoft's AI chatbot to be a racist asshole in less than a day." March 2016. theverge.com
  [11] Connecticut Fair Housing Center et al. v. SafeRent Solutions. Settlement, November 2024.
  [12] CBC News. "Air Canada chatbot ruling." February 2024. cbc.ca
  [13] Bloomberg. "Samsung Bans ChatGPT and Other Chatbots for Employees After Leak." May 2023. bloomberg.com
  [14] European Union. "Regulation (EU) 2024/1689 (EU AI Act), Article 73." eur-lex.europa.eu

Ready for the Next AI Incident?

Don't wait for detection to take 4.5 days. GLACIS provides real-time attestation and cryptographic evidence that accelerates investigation, satisfies regulators, and proves your controls work.
