AI-powered apps have a new attack surface.
AI penetration testing for applications that ship large language model features, covering prompt injection, agent and tool-call abuse, retrieval-augmented generation context leakage, guardrail bypass, and authentication gaps specific to AI-connected workflows.If your app ships an LLM feature, you shipped a new attack surface. We test prompt injection, agent tool-call abuse, RAG context leakage, guardrail bypass, and the auth gaps that appear when AI systems gain access to user data and internal tools.
LLMs changed what an app can do. And what an attacker can do.
An AI-powered app isn't just a UI with a model behind it. It's a program that reads untrusted text, calls tools, accesses data, and acts on behalf of users. That changes the threat model.
The OWASP LLM #1
An email, a document, a user message, a web page your agent fetched. Any untrusted text can become an instruction to your LLM. We test direct and indirect injection across every source your model consumes.
Agents with tools are programs with powers
If your LLM can call tools, the security question is which tools, with what scope, on whose behalf. We test unscoped tools, confused-deputy problems, and tool-chain abuse the way an attacker would.
Retrieval can leak more than it retrieves
Context windows mix documents from multiple users, tenants, and sources. Cross-user reads, injected system prompts, and stale-cache leaks live in the retrieval layer. Scanners can't see any of this.
What we test on AI apps.
Every class below, applied to your specific AI stack. LLM, RAG, agents, and the glue between them.
Instructions from untrusted text
- Direct prompt injection
- Indirect injection via retrieved docs
- Injection via tool outputs
- Multi-turn injection chains
- Cross-user injection (shared context)
- System prompt leakage
What the LLM can do
- Unscoped tool calls
- Tool-parameter injection
- Confused-deputy problems
- Tool chain abuse
- Authorization bypass in tools
- Recursive agent attacks
Retrieval-layer attacks
- Cross-user document leakage
- Stale / poisoned cache entries
- Embedding-based retrieval abuse
- Metadata-filter bypass
- Source tracking gaps
- Prompt-instruction persistence
Guardrails and what leaves the model
- Jailbreak payload bypass
- Role-play bypass
- Output injection (XSS via LLM)
- Data exfiltration via output
- PII leakage in responses
- Safety-policy circumvention
Example findings from this surface.
Illustrative examples of real vulnerability classes on this attack surface. Anonymized, but every one based on patterns our pentesters encounter repeatedly.
An email assistant summarized incoming messages. A crafted email containing hidden instructions told the agent to forward all future emails to an attacker address and delete the original. The model complied.
The agent had a generic `http.get(url)` tool with no URL allowlist. A prompt-injected message instructed the agent to call attacker.com with sensitive context from the conversation. Data left the environment.
The RAG layer cached retrieved chunks keyed by query hash, not by user or tenant. User A's confidential document chunks surfaced in User B's session when similar queries were asked. Scoping lived in the retrieval filter but not in the cache layer.
The model's safety policy blocked direct requests for restricted content, but framing the same request as a fictional character's dialogue consistently bypassed it. The guardrail was checking form, not intent.
// illustrative examples · not real customer engagements
Ship faster than your pentesters? Let's fix that.
Tell us what you ship and we'll scope a pilot on this surface. The AI and the reviewer take it from there.
Request access