What Is an AI Audit?
An AI audit is a systematic examination of an AI system's development, deployment, and operation. Unlike traditional IT audits that focus primarily on security controls, AI audits evaluate a broader set of concerns:
- Technical soundness: Model performance, reliability, robustness
- Ethical alignment: Fairness, bias, transparency, accountability
- Regulatory compliance: Adherence to applicable laws and standards
- Governance effectiveness: Policies, procedures, oversight mechanisms
- Risk management: Identification, assessment, and mitigation of AI-specific risks
The goal is to provide independent assurance that AI systems operate as intended, comply with requirements, and don't create unacceptable risks for the organization or affected individuals.
The Audit Gap
Most organizations can describe their AI governance policies but cannot produce evidence that controls actually executed. Auditors increasingly focus on proof of execution, not just documentation of intent.
Types of AI Audits
AI audits come in several forms, each with different scopes, objectives, and evidence requirements:
| Audit Type | Focus | Typical Duration | Who Conducts |
|---|---|---|---|
| Internal Audit | Policy compliance, control effectiveness, risk identification | 2-4 weeks | Internal audit team |
| Technical Audit | Model performance, security, data handling, MLOps practices | 2-6 weeks | Specialized AI auditors |
| Bias/Fairness Audit | Algorithmic bias, disparate impact, protected class analysis | 4-8 weeks | Specialized firms, academic partners |
| Regulatory Audit | Compliance with specific regulations (EU AI Act, state laws) | 6-12 weeks | External auditors, regulators |
| Certification Audit | ISO 42001, SOC 2 + AI controls, industry certifications | 8-16 weeks | Accredited certification bodies |
| Due Diligence Audit | M&A, vendor selection, investment decisions | 2-6 weeks | Consulting firms, specialized assessors |
Healthcare-Specific Considerations
Healthcare AI systems face additional audit requirements:
- HIPAA compliance: PHI handling, access controls, audit trails
- FDA oversight: Software as a Medical Device (SaMD) requirements
- Clinical validation: Evidence of safety and efficacy
- Integration audits: EHR connectivity, data flow mapping
Regulatory Landscape
The regulatory environment for AI is evolving rapidly. Organizations must prepare for multiple overlapping requirements:
Colorado AI Act
The first comprehensive US state AI law. Requires risk assessments, impact assessments, and consumer disclosures for high-risk AI systems.
EU AI Act (High-Risk)
Most healthcare AI classified as high-risk. Requires conformity assessments, technical documentation, and ongoing monitoring.
California ADMT
Automated Decision-Making Technology regulations. Disclosure requirements and opt-out rights for significant decisions.
ISO 42001
International standard for AI management systems. Voluntary but increasingly expected by enterprise buyers.
These regulations share common themes: risk assessment, documentation, human oversight, and ongoing monitoring. Organizations that build comprehensive governance programs can address multiple requirements simultaneously.
Evidence Requirements
Auditors evaluate evidence at four levels, each providing increasing assurance. The GLACIS Evidence Hierarchy orders them from weakest to strongest:
Level 1: Policy Documentation
Written policies, procedures, guidelines. Demonstrates intent but not execution. Weakest evidence.
Level 2: Operational Records
Logs, dashboards, reports showing activities occurred. Can be incomplete or retroactively modified. Moderate evidence.
Level 3: Execution Traces
Timestamped, per-inference records showing controls executed. Links specific outputs to specific control states. Strong evidence.
Level 4: Cryptographic Attestation
Tamper-evident, independently verifiable proof. Third parties can validate without trusting the vendor. Strongest evidence (see the sketch below).
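To make the upper two levels concrete, here is a minimal sketch of a per-inference execution trace and a hash-chained log that makes it tamper-evident. The field names and the SHA-256 chaining scheme are illustrative assumptions, not a prescribed format.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_trace(inference_id: str, model_version: str, controls: list[dict]) -> dict:
    """Level 3: a timestamped, per-inference record of which controls executed."""
    return {
        "inference_id": inference_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,   # ties the output to a specific model build
        "controls": controls,             # e.g. [{"name": "phi_filter", "result": "pass"}]
    }

def append_attested(log: list[dict], trace: dict) -> dict:
    """Level 4: chain each entry to the previous one so tampering is detectable."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    payload = json.dumps(trace, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    entry = {"trace": trace, "prev_hash": prev_hash, "entry_hash": entry_hash}
    log.append(entry)
    return entry

def verify_chain(log: list[dict]) -> bool:
    """A third party can recompute the chain without trusting whoever wrote it."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["trace"], sort_keys=True)
        if entry["prev_hash"] != prev:
            return False
        if hashlib.sha256((prev + payload).encode()).hexdigest() != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True

log: list[dict] = []
append_attested(log, make_trace("inf-001", "model-1.4.2",
                                [{"name": "phi_filter", "result": "pass"}]))
print(verify_chain(log))  # True; altering any stored entry makes this False
```

Anchoring or signing the chain head outside the operator's control is what lets third parties verify the log without trusting the vendor.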
Evidence by Audit Domain
Different audit areas require different types of evidence:
| Domain | Required Evidence |
|---|---|
| Governance | AI policy, roles & responsibilities, board oversight records, ethics review minutes, training records |
| Risk Management | Risk assessments, impact analyses, risk registers, mitigation plans, residual risk acceptance |
| Development | Requirements specs, data lineage, training documentation, validation reports, model cards |
| Testing | Test plans, test results, bias testing, red team reports, performance benchmarks |
| Operations | Deployment records, monitoring dashboards, incident logs, change management, access logs |
| Continuous Monitoring | Drift detection, performance metrics, feedback loops, retraining triggers, audit trails |
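For the continuous-monitoring row, auditors typically want the drift checks themselves, not just a statement that monitoring exists. A minimal population stability index (PSI) check might look like the sketch below; the bucket count and the 0.2 alert threshold are common rules of thumb, used here as illustrative assumptions.

```python
import math

def psi(expected: list[float], actual: list[float], buckets: int = 10) -> float:
    """Population Stability Index between a baseline distribution and live scores."""
    lo, hi = min(expected), max(expected)

    def fractions(values: list[float]) -> list[float]:
        counts = [0] * buckets
        for v in values:
            idx = int((v - lo) / (hi - lo) * buckets) if hi > lo else 0
            counts[min(max(idx, 0), buckets - 1)] += 1
        # Floor at a small value so the log term below stays defined.
        return [max(c / len(values), 1e-6) for c in counts]

    return sum((a - e) * math.log(a / e)
               for e, a in zip(fractions(expected), fractions(actual)))

baseline = [0.12, 0.25, 0.31, 0.44, 0.52, 0.63, 0.71, 0.85]  # validation-time scores
live     = [0.48, 0.55, 0.61, 0.67, 0.72, 0.79, 0.84, 0.91]  # production scores
drifted = psi(baseline, live) > 0.2   # assumed alert threshold
print(f"Drift alert: {drifted}")      # the check result itself becomes audit evidence
```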
Common Audit Findings
Based on our experience with healthcare AI audits, these are the most frequent findings:
Missing Execution Evidence
Organizations have policies and monitoring dashboards but cannot prove which controls executed for specific outputs. "We have guardrails" is not the same as "here's proof the guardrails ran."
Incomplete Model Documentation
Training data sources, preprocessing steps, and model architecture decisions are poorly documented or not linked to deployed versions.
Inadequate Bias Testing
Testing exists but doesn't cover all protected classes, use cases, or edge conditions. Results aren't tied to deployment decisions.
Weak Change Control
Model updates and configuration changes lack formal approval processes. Auditors can't verify which version was running at a specific time.
Human Oversight Gaps
Policies require human review but no evidence exists that reviews actually occurred, or reviewers lack qualification documentation.
Preparation Timeline
A typical AI audit engagement moves through the phases below, from preparation through post-audit follow-up. Start earlier if you're building governance from scratch.
Phase 1: Audit scoping and preparation
- Define audit scope with auditor
- Identify all AI systems in scope
- Assign audit coordinators and evidence owners
- Conduct gap assessment against requirements
Phase 2: Evidence collection and remediation
- Gather existing documentation
- Identify and remediate gaps
- Implement missing controls
- Create evidence inventory
Phase 3: Documentation and testing
- Complete technical documentation
- Run bias and performance tests
- Validate control effectiveness
- Prepare evidence packages
Phase 4: Final preparation
- Internal readiness review
- Staff briefings and interview prep
- Organize evidence room/portal
- Address any last-minute gaps
Phase 5: Active engagement
- Provide requested evidence promptly
- Facilitate interviews and walkthroughs
- Document auditor questions and requests
- Track and respond to preliminary findings
Phase 6: Remediation and follow-up
- Review draft findings
- Develop remediation plans
- Implement corrective actions
- Schedule follow-up verification
Building Audit-Ready Systems
The best audit preparation starts at design time. Build systems that generate evidence automatically:
Design Principles
- Evidence by default: Every inference should generate an audit record automatically (see the sketch after this list)
- Immutability: Logs should be tamper-evident; once written, they cannot be modified
- Traceability: Every output should link to its inputs, model version, and control states
- Verifiability: Third parties should be able to validate evidence without full system access
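As a minimal sketch of "evidence by default" and traceability (the decorator, field names, and model identifier below are assumptions for illustration, not a specific product API), the inference call itself can be wrapped so that every invocation emits an audit record:

```python
import functools
import hashlib
import json
import uuid
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for an append-only, tamper-evident store

def audited(model_version: str):
    """Decorator: every call to the wrapped inference function emits an audit record."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(prompt: str, **kwargs):
            output = fn(prompt, **kwargs)
            AUDIT_LOG.append({
                "inference_id": str(uuid.uuid4()),
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "model_version": model_version,  # traceability to a pinned version
                "input_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
                "output_sha256": hashlib.sha256(str(output).encode()).hexdigest(),
            })  # evidence by default: the caller cannot forget to log
            return output
        return inner
    return wrap

@audited(model_version="clinical-summarizer-1.4.2")  # hypothetical version label
def summarize(prompt: str) -> str:
    return "summary of: " + prompt  # placeholder for the real model call

summarize("Patient presents with ...")
print(json.dumps(AUDIT_LOG[-1], indent=2))
```

Routing every call through a wrapper like this, rather than relying on individual callers to log, is what keeps the evidence complete enough to survive an audit.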
Key Capabilities
Guardrail Execution Logs
For every inference, record which guardrails ran, in what order, with pass/fail results and timestamps.
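A sketch of such a log, with illustrative guardrail names (real checks would be far more involved):

```python
from datetime import datetime, timezone

def run_guardrails(text: str, guardrails: list) -> list[dict]:
    """Run each guardrail in order; record name, order, result, and timestamp."""
    records = []
    for order, (name, check) in enumerate(guardrails, start=1):
        records.append({
            "guardrail": name,
            "order": order,
            "result": "pass" if check(text) else "fail",
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })
    return records

# Illustrative checks only, used to show the shape of the record.
guardrails = [
    ("max_length", lambda t: len(t) < 10_000),
    ("no_ssn_marker", lambda t: "SSN" not in t),
]
print(run_guardrails("Draft discharge summary ...", guardrails))
```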
Model Version Tracking
Cryptographic hashes linking each inference to the exact model weights, configuration, and prompt templates used.
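One way to realize this, sketched under the assumption that weights, configuration, and prompt templates are available as files (paths and field names are illustrative):

```python
import hashlib

def file_digest(path: str) -> str:
    """SHA-256 fingerprint of an artifact file (weights, config, prompt template)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def version_manifest(weights: str, config: str, prompt_template: str) -> dict:
    """Pin an inference to exact artifacts, not just a human-readable version string."""
    return {
        "weights_sha256": file_digest(weights),
        "config_sha256": file_digest(config),
        "prompt_template_sha256": file_digest(prompt_template),
    }

# Illustrative paths; in practice the deployment pipeline supplies them.
# manifest = version_manifest("model.safetensors", "config.json", "prompt.txt")
```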
Decision Reconstruction
Ability to replay any historical inference with the exact context (inputs, retrieved data, config) that was present.
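A sketch of what replay requires: persist the full context at inference time and feed it back through the same pinned model version later. The store and field names here are illustrative assumptions.

```python
CONTEXT_STORE: dict[str, dict] = {}  # stand-in for durable, write-once storage

def record_context(inference_id: str, prompt: str, retrieved_docs: list[str],
                   model_version: str, config: dict) -> None:
    """Persist everything needed to reproduce this inference later."""
    CONTEXT_STORE[inference_id] = {
        "prompt": prompt,
        "retrieved_docs": retrieved_docs,
        "model_version": model_version,
        "config": config,
    }

def replay(inference_id: str, load_model):
    """Re-run a historical inference with its exact inputs and pinned model version."""
    ctx = CONTEXT_STORE[inference_id]
    model = load_model(ctx["model_version"])  # must resolve the same artifacts
    return model(ctx["prompt"], ctx["retrieved_docs"], **ctx["config"])
```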
Compliance Dashboards
Real-time views mapping operational evidence to specific regulatory requirements and control objectives.
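Behind such a dashboard there is usually a crosswalk from control objectives to the requirements they satisfy and the evidence that proves them. The identifiers and queries below are illustrative placeholders, not citations of specific clauses:

```python
# Illustrative crosswalk: each control objective points at the evidence that proves it.
CONTROL_CROSSWALK = {
    "human-oversight": {
        "requirements": ["EU AI Act high-risk oversight", "internal policy AI-07"],
        "evidence_query": "review_records WHERE reviewer_id IS NOT NULL",
    },
    "bias-testing": {
        "requirements": ["Colorado AI Act impact assessment", "internal policy AI-12"],
        "evidence_query": "bias_test_runs WHERE status = 'complete'",
    },
}

def coverage_report(evidence_counts: dict[str, int]) -> dict[str, str]:
    """Mark each control objective covered or gapped based on collected evidence."""
    return {
        control: "covered" if evidence_counts.get(control, 0) > 0 else "GAP"
        for control in CONTROL_CROSSWALK
    }

print(coverage_report({"human-oversight": 412, "bias-testing": 0}))
```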
The Payoff
Organizations with audit-ready systems spend 60-80% less time on audit preparation, have fewer findings, and can demonstrate compliance to customers and regulators on demand.
Frequently Asked Questions
What is an AI audit?
An AI audit is a systematic examination of an AI system's development, deployment, and operation to assess compliance with regulations, adherence to ethical principles, effectiveness of controls, and alignment with organizational policies.
Who conducts AI audits?
AI audits can be conducted by internal audit teams, external audit firms (Big 4 and specialized AI auditors), regulatory bodies, standards organizations (for certifications like ISO 42001), and third-party assessment organizations.
How long does an AI audit take?
Duration varies by scope: focused technical assessments take 2-4 weeks, comprehensive governance audits take 4-8 weeks, and full regulatory compliance audits can take 3-6 months. Preparation time should be factored in as well.
What happens if we fail an AI audit?
Consequences depend on the audit type. Internal audits lead to remediation plans. Certification audits may result in certification denial or conditions. Regulatory audits could trigger enforcement actions, fines, or operational restrictions.
How often should AI systems be audited?
High-risk systems should have annual comprehensive audits plus continuous monitoring. Lower-risk systems may be audited every 2-3 years. Material changes (new models, expanded use cases) should trigger reassessment regardless of schedule.