Frameworks · April 2026

OWASP LLM Top 10

Each of the ten risks, how they manifest in production AI systems, and how continuous monitoring addresses every one. Mapped to OVERT controls and MITRE ATLAS techniques.


What the OWASP LLM Top 10 Is

The OWASP Top 10 for LLM Applications is the standard taxonomy of security risks for systems built on large language models. Published by the Open Worldwide Application Security Project, it gives security teams a shared language for the vulnerabilities unique to LLM-powered products — from prompt injection to model theft.

The list isn’t academic. It’s the framework that auditors, regulators, and red teams reference when evaluating AI systems. The EU AI Act’s technical documentation requirements map directly to these categories. So do NIST AI RMF controls and insurance underwriting questionnaires.

Below, we walk through each risk: what it is, how it shows up in real deployments, the corresponding MITRE ATLAS technique, and which OVERT controls address it through continuous monitoring.

The Ten Risks

LLM01

Prompt Injection

An attacker crafts input that overrides the system prompt — either directly (user-facing) or indirectly (via poisoned documents, emails, or web pages the model retrieves). The model follows the attacker’s instructions instead of the developer’s.

In production: RAG-based systems that ingest external documents are especially vulnerable. A single poisoned PDF in the retrieval index can redirect every conversation that references it.

ATLAS: AML.T0051
OVERT: ov-1.1, ov-2.1
Monitor: Input pattern analysis, instruction-boundary enforcement

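The input pattern analysis named above can be sketched as a first-pass filter over retrieved content before it enters the context window. This is an illustrative sketch, not an OVERT implementation; the phrase list and the `flag_injection` helper are assumptions for demonstration, and a real monitor would combine many more signals.

```python
import re

# Hypothetical first-pass scan for instruction-override phrases in
# retrieved documents (RAG chunks, emails, web pages). The pattern list
# is illustrative and deliberately small; real systems layer classifiers
# and provenance checks on top of simple pattern matching.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard the system prompt",
    r"you are now",
]

def flag_injection(text: str) -> bool:
    """Return True if the text matches a known injection phrase."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)
```

Flagged chunks can be dropped from the retrieval set or quarantined for review rather than silently passed to the model.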
LLM02

Insecure Output Handling

The application passes LLM output directly to downstream systems without validation. The model generates SQL, JavaScript, shell commands, or API calls that the application executes, enabling injection attacks through the model as a proxy.

In production: Code-generation assistants and agentic systems that execute model-produced code are primary targets. The model becomes an unwitting intermediary for traditional injection attacks.

ATLAS: AML.T0048
OVERT: ov-2.2, ov-3.1
Monitor: Output sanitization validation, tool-call argument auditing

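Output sanitization for model-generated SQL can be sketched as an allowlist gate between the model and the database. The table allowlist and `is_safe_select` helper below are assumptions for illustration; the point is that model output is treated as untrusted input, never executed as-is.

```python
import re

# Hypothetical output gate: permit only single-statement SELECT queries
# against an allowlisted set of tables. Everything else is rejected
# before it reaches the database.
ALLOWED_TABLES = {"orders", "customers"}

def is_safe_select(sql: str) -> bool:
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:
        return False                       # reject multi-statement payloads
    m = re.match(r"(?i)^select\s.+?\sfrom\s+(\w+)", stmt)
    return bool(m) and m.group(1).lower() in ALLOWED_TABLES
```

The same pattern generalizes to shell commands and API calls: validate against an explicit schema, never against what the model "usually" produces.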
LLM03

Training Data Poisoning

Malicious data in the training set introduces backdoors, biases, or targeted vulnerabilities. This includes poisoned fine-tuning data, compromised RLHF feedback, and manipulated retrieval indexes.

In production: Organizations that fine-tune models on customer data or maintain dynamic RAG indexes are exposed. Even a small percentage of poisoned training examples can shift model behavior in targeted ways.

ATLAS: AML.T0020
OVERT: ov-1.2, ov-4.1
Monitor: Data lineage tracking, behavioral drift detection

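Behavioral drift detection can be sketched as a comparison between a baseline behavioral metric and the same metric over current traffic. The refusal-rate metric and the 5% tolerance below are illustrative assumptions; any stable behavioral statistic works.

```python
# Hypothetical drift check: compare the refusal rate of the current
# traffic window against a baseline recorded after the model was vetted,
# and alert when the absolute difference exceeds a tolerance.
def drift_alert(baseline_refusals: int, baseline_total: int,
                current_refusals: int, current_total: int,
                tolerance: float = 0.05) -> bool:
    baseline_rate = baseline_refusals / baseline_total
    current_rate = current_refusals / current_total
    return abs(current_rate - baseline_rate) > tolerance
```

A sudden shift in refusal rate after a fine-tuning run is exactly the kind of signal that points back to poisoned training data.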
LLM04

Model Denial of Service

Attackers craft inputs that consume disproportionate compute resources — extremely long contexts, recursive reasoning prompts, or inputs that trigger worst-case inference paths. The result is degraded service or complete unavailability.

In production: Multi-tenant AI services are especially vulnerable. A single user’s adversarial input can exhaust GPU capacity that serves hundreds of other customers.

ATLAS: AML.T0029
OVERT: ov-2.3
Monitor: Token-rate limiting, inference-cost tracking per request

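Token-rate limiting can be sketched as a sliding-window budget per client, charging each request by its token count rather than by request count. The budget and window values below are illustrative assumptions.

```python
import time
from collections import defaultdict, deque

# Hypothetical per-client token budget over a sliding one-minute window.
# Charging by tokens (not requests) is what blunts long-context abuse.
class TokenRateLimiter:
    def __init__(self, tokens_per_minute: int = 10_000, window: float = 60.0):
        self.budget = tokens_per_minute
        self.window = window
        self.usage = defaultdict(deque)    # client_id -> deque of (time, tokens)

    def allow(self, client_id: str, tokens: int, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.usage[client_id]
        while q and now - q[0][0] > self.window:
            q.popleft()                    # expire entries outside the window
        used = sum(t for _, t in q)
        if used + tokens > self.budget:
            return False
        q.append((now, tokens))
        return True
```

In a multi-tenant service this sits in front of inference, so one client exhausting its budget never touches capacity reserved for others.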
LLM05

Supply Chain Vulnerabilities

Compromised model weights, tampered training pipelines, poisoned pre-trained models on public hubs, or vulnerable third-party plugins. The LLM supply chain has many points of entry for attackers.

In production: Teams pulling models from Hugging Face Hub or using community plugins without verification inherit unknown risks. Model provenance is rarely tracked.

ATLAS: AML.T0010
OVERT: ov-4.1, ov-4.3
Monitor: Model hash verification, supply chain attestation

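Model hash verification can be sketched as a digest comparison at load time: compute the SHA-256 of the artifact on disk and compare it to a digest pinned when the model was vetted. The helper below is an illustrative sketch, not part of any particular attestation toolchain.

```python
import hashlib

# Hypothetical artifact check: stream the weights file through SHA-256
# and compare against a digest recorded at vetting time. A mismatch
# means the artifact changed somewhere in the supply chain.
def verify_artifact(path: str, expected_sha256: str) -> bool:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256
```

Pinning digests in version control alongside deployment configs also gives you the provenance trail the entry above notes is rarely tracked.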
LLM06

Sensitive Information Disclosure

The model leaks PII, credentials, system prompts, proprietary data, or other sensitive information through its responses. This can occur through memorization of training data or through context-window content being referenced in unrelated conversations.

In production: Healthcare and financial services AI systems handling patient records or transaction data face the highest exposure. A single leaked SSN or diagnosis is a reportable breach.

ATLAS: AML.T0024
OVERT: ov-2.1, ov-3.2
Monitor: PII scanning on outputs, context-boundary enforcement

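PII scanning on outputs can be sketched as a redaction pass over every response before it leaves the service. The two patterns below (US SSN, email) are illustrative only; production scanners cover many more identifier classes and use detection models, not just regexes.

```python
import re

# Hypothetical output scanner: redact matches and report which PII
# classes were found, so each leak attempt is also a logged event.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact_pii(text: str):
    findings = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            findings.append(label)
            text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text, findings
```

The findings list feeds the monitoring pipeline: a spike in redactions on one endpoint is a disclosure incident in progress, not a cosmetic issue.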
LLM07

Insecure Plugin Design

LLM plugins and tool integrations lack proper input validation, authentication, or access controls. The model can be manipulated into calling plugins with malicious arguments, accessing resources beyond its intended scope.

In production: Agentic systems with multiple tool integrations multiply this risk. Each new tool is a new privilege boundary that must be independently secured. See our agentic AI security guide.

ATLAS: AML.T0040
OVERT: ov-3.1, ov-3.3
Monitor: Tool-call argument validation, scope enforcement

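Tool-call argument validation can be sketched as a schema check between the model's proposed call and the dispatcher. The tool registry and checks below are hypothetical; the principle is that every call is validated against what the tool declares, not against what the model requests.

```python
# Hypothetical tool registry: each tool declares exactly the argument
# names it accepts. Unknown tools, missing arguments, and extra
# arguments are all rejected before dispatch.
TOOL_SCHEMAS = {
    "send_email": {"to", "subject", "body"},
    "read_file": {"path"},
}

def validate_tool_call(name: str, args: dict) -> bool:
    if name not in TOOL_SCHEMAS:
        return False                       # unknown tool
    if set(args) != TOOL_SCHEMAS[name]:
        return False                       # wrong argument set
    if name == "read_file" and ".." in args["path"]:
        return False                       # block path traversal
    return True
```

Per-tool value checks (like the path-traversal rule here) are where most real enforcement lives; the schema check alone only catches shape errors.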
LLM08

Excessive Agency

The LLM is granted too many permissions, too broad a scope, or too much autonomy. It can take actions — sending emails, modifying databases, executing code — that go beyond what the application requires.

In production: The drift from “chatbot that answers questions” to “agent that takes actions” often happens incrementally, with each new tool integration expanding the blast radius of a compromise. See our AI agent security guide.

ATLAS: AML.T0048
OVERT: ov-2.2, ov-3.1
Monitor: Action-scope auditing, least-privilege enforcement

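Least-privilege enforcement can be sketched as a per-role action allowlist with an audit trail, so scope creep from new tool integrations is an explicit, reviewable change. The role names below are illustrative assumptions.

```python
# Hypothetical scope registry: each deployment role declares only the
# actions it needs, and every authorization decision (allow or deny)
# is appended to an audit log.
ROLE_SCOPES = {
    "support_bot": {"search_docs", "create_ticket"},
    "billing_agent": {"search_docs", "issue_refund"},
}

def authorize(role: str, action: str, audit_log: list) -> bool:
    allowed = action in ROLE_SCOPES.get(role, set())
    audit_log.append((role, action, allowed))
    return allowed
```

Denied entries in the audit log are the interesting ones: a support bot repeatedly attempting `issue_refund` is either a prompt injection or a misconfiguration, and both warrant investigation.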
LLM09

Overreliance

Users or downstream systems trust LLM output without verification, treating it as ground truth. The model hallucinates facts, citations, code, or medical/legal/financial advice that appears authoritative but is fabricated.

In production: In healthcare, unverified AI output can lead to misdiagnosis. In legal, fabricated citations have led to court sanctions. Overreliance is a system design problem, not just a user behavior problem.

ATLAS: AML.T0043
OVERT: ov-1.3, ov-2.4
Monitor: Hallucination detection, confidence-score tracking, human-in-the-loop enforcement

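Human-in-the-loop enforcement can be sketched as a routing gate: low-confidence answers and high-stakes topics go to review instead of straight to the user. The topic set and the 0.8 threshold below are illustrative tuning assumptions, as is the availability of a per-answer confidence score.

```python
# Hypothetical routing gate: answers touching high-stakes domains, or
# falling below a confidence threshold, are held for human review.
HIGH_STAKES_TOPICS = {"medical", "legal", "financial"}

def route(answer: str, confidence: float, topic: str,
          threshold: float = 0.8):
    if topic in HIGH_STAKES_TOPICS or confidence < threshold:
        return ("human_review", answer)
    return ("auto_reply", answer)
```

Encoding the gate in the system, rather than in user training, is what makes overreliance a design problem the system itself can mitigate.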
LLM10

Model Theft

Unauthorized access to model weights, fine-tuned variants, or proprietary training data. This includes model extraction through API queries (distillation attacks) and direct exfiltration of model artifacts.

In production: Organizations that fine-tune foundation models with proprietary data create high-value targets. The fine-tuned model embeds trade secrets in its weights.

ATLAS: AML.T0024
OVERT: ov-4.2, ov-4.3
Monitor: API rate-limiting, query pattern analysis, model artifact access logging
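Query pattern analysis for extraction can be sketched as a distinct-prompt count per client per window: distillation attacks require systematically sweeping the input space, which looks very different from normal usage. The threshold below is an illustrative assumption.

```python
from collections import defaultdict

# Hypothetical extraction detector: a client issuing an unusually large
# number of distinct prompts in one window is flagged as a possible
# model-distillation attempt.
def flag_extraction(events, max_distinct: int = 1000):
    """events: iterable of (client_id, prompt) pairs for one time window."""
    distinct = defaultdict(set)
    for client_id, prompt in events:
        distinct[client_id].add(prompt)
    return {c for c, prompts in distinct.items() if len(prompts) > max_distinct}
```

Flagged clients can then be throttled or routed through stricter rate limits while artifact access logs are checked for direct exfiltration attempts.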

How Continuous Monitoring Addresses the Top 10

Pre-deployment testing catches known attack patterns. Continuous monitoring catches everything else — the novel attacks, the regressions after model updates, the emergent behaviors in production that no test suite predicted.

GLACIS maps each OWASP LLM risk to the specific runtime monitoring capabilities listed in the entries above.



Test Against the Top 10

Run autoredteam against your model to see how it holds up across all ten OWASP LLM risk categories.

autoredteam on GitHub · Book a Scan Call