What the OWASP LLM Top 10 Is
The OWASP Top 10 for LLM Applications is the standard taxonomy of security risks for systems built on large language models. Published by the Open Worldwide Application Security Project, it gives security teams a shared language for the vulnerabilities unique to LLM-powered products — from prompt injection to model theft.
The list isn’t academic. It’s the framework that auditors, regulators, and red teams reference when evaluating AI systems. The EU AI Act’s technical documentation requirements map directly to these categories. So do NIST AI RMF controls and insurance underwriting questionnaires.
Below, we walk through each risk: what it is, how it shows up in real deployments, the corresponding MITRE ATLAS technique, and which OVERT controls address it through continuous monitoring.
The Ten Risks
Prompt Injection
An attacker crafts input that overrides the system prompt — either directly (user-facing) or indirectly (via poisoned documents, emails, or web pages the model retrieves). The model follows the attacker’s instructions instead of the developer’s.
In production: RAG-based systems that ingest external documents are especially vulnerable. A single poisoned PDF in the retrieval index can redirect every conversation that references it.
Insecure Output Handling
The application passes LLM output directly to downstream systems without validation. The model generates SQL, JavaScript, shell commands, or API calls that the application executes, enabling injection attacks through the model as a proxy.
In production: Code-generation assistants and agentic systems that execute model-produced code are primary targets. The model becomes an unwitting intermediary for traditional injection attacks.
Training Data Poisoning
Malicious data in the training set introduces backdoors, biases, or targeted vulnerabilities. This includes poisoned fine-tuning data, compromised RLHF feedback, and manipulated retrieval indexes.
In production: Organizations that fine-tune models on customer data or maintain dynamic RAG indexes are exposed. Even a small percentage of poisoned training examples can shift model behavior in targeted ways.
Model Denial of Service
Attackers craft inputs that consume disproportionate compute resources — extremely long contexts, recursive reasoning prompts, or inputs that trigger worst-case inference paths. The result is degraded service or complete unavailability.
In production: Multi-tenant AI services are especially vulnerable. A single user’s adversarial input can exhaust GPU capacity that serves hundreds of other customers.
Supply Chain Vulnerabilities
Compromised model weights, tampered training pipelines, poisoned pre-trained models on public hubs, or vulnerable third-party plugins. The LLM supply chain has many points of entry for attackers.
In production: Teams pulling models from Hugging Face Hub or using community plugins without verification inherit unknown risks. Model provenance is rarely tracked.
Sensitive Information Disclosure
The model leaks PII, credentials, system prompts, proprietary data, or other sensitive information through its responses. This can occur through memorization of training data or through context-window content being referenced in unrelated conversations.
In production: Healthcare and financial services AI systems handling patient records or transaction data face the highest exposure. A single leaked SSN or diagnosis is a reportable breach.
Insecure Plugin Design
LLM plugins and tool integrations lack proper input validation, authentication, or access controls. The model can be manipulated into calling plugins with malicious arguments, accessing resources beyond its intended scope.
In production: Agentic systems with multiple tool integrations multiply this risk. Each new tool is a new privilege boundary that must be independently secured. See our agentic AI security guide.
Excessive Agency
The LLM is granted too many permissions, too broad a scope, or too much autonomy. It can take actions — sending emails, modifying databases, executing code — that go beyond what the application requires.
In production: The drift from “chatbot that answers questions” to “agent that takes actions” often happens incrementally, with each new tool integration expanding the blast radius of a compromise. See our AI agent security guide.
Overreliance
Users or downstream systems trust LLM output without verification, treating it as ground truth. The model hallucinates facts, citations, code, or medical/legal/financial advice that appears authoritative but is fabricated.
In production: In healthcare, unverified AI output can lead to misdiagnosis. In legal, fabricated citations have led to court sanctions. Overreliance is a system design problem, not just a user behavior problem.
Model Theft
Unauthorized access to model weights, fine-tuned variants, or proprietary training data. This includes model extraction through API queries (distillation attacks) and direct exfiltration of model artifacts.
In production: Organizations that fine-tune foundation models with proprietary data create high-value targets. The fine-tuned model embeds trade secrets in its weights.
How Continuous Monitoring Addresses the Top 10
Pre-deployment testing catches known attack patterns. Continuous monitoring catches everything else — the novel attacks, the regressions after model updates, the emergent behaviors in production that no test suite predicted.
GLACIS maps each OWASP LLM risk to specific runtime monitoring capabilities:
- autoredteam tests for all ten risk categories continuously, not just at deployment time. New attack patterns from the research community are added to the test suite within days of publication.
- Enforce applies runtime guardrails for the risks that require real-time intervention: prompt injection detection (LLM01), output sanitization (LLM02), PII filtering (LLM06), and tool-call scope enforcement (LLM07, LLM08).
- Notary creates attestation records for every model interaction, building the audit trail that demonstrates ongoing compliance with OWASP-aligned security controls.
OWASP LLM Risk Scan
See autoredteam assess your model against all 10 OWASP LLM risk categories in five minutes.
Book a Live DemoRelated Reading
Test Against the Top 10
Run autoredteam against your model to see how it holds up across all ten OWASP LLM risk categories.