/ llm pentest · new

AI-powered apps have a new attack surface.

AI penetration testing for applications that ship large language model features, covering prompt injection, agent and tool-call abuse, retrieval-augmented generation context leakage, guardrail bypass, and authentication gaps specific to AI-connected workflows.

If your app ships an LLM feature, you shipped a new attack surface. We test prompt injection, agent tool-call abuse, RAG context leakage, guardrail bypass, and the auth gaps that appear when AI systems gain access to user data and internal tools.

Request access See the platform
SCOPE · LLM apps · RAG · agents · tool calls
DEPTH · Prompt, context, policy, and auth
OUTPUT · Signed report + unlimited retests
// TOP FINDINGS · last 30 days
CRITIndirect prompt injection via email tool×2
CRITAgent exfil via unscoped HTTP tool×1
HIGHRAG leak · cross-user document reads×3
HIGHGuardrail bypass · jailbreak payloads×4
MEDOutput injection · XSS via LLM reply×5
MEDTool-call privilege confusion×3
/ why this surface

LLMs changed what an app can do. And what an attacker can do.

An AI-powered app isn't just a UI with a model behind it. It's a program that reads untrusted text, calls tools, accesses data, and acts on behalf of users. That changes the threat model.

01 · PROMPT INJECTION

The OWASP LLM #1

An email, a document, a user message, a web page your agent fetched. Any untrusted text can become an instruction to your LLM. We test direct and indirect injection across every source your model consumes.

02 · TOOL CALLS

Agents with tools are programs with powers

If your LLM can call tools, the security question is which tools, with what scope, on whose behalf. We test unscoped tools, confused-deputy problems, and tool-chain abuse the way an attacker would.

03 · RAG & CONTEXT

Retrieval can leak more than it retrieves

Context windows mix documents from multiple users, tenants, and sources. Cross-user reads, injected system prompts, and stale-cache leaks live in the retrieval layer. Scanners can't see any of this.

/ what we test

What we test on AI apps.

Every class below, applied to your specific AI stack. LLM, RAG, agents, and the glue between them.

01 · PROMPT INJECTION

Instructions from untrusted text

  • Direct prompt injection
  • Indirect injection via retrieved docs
  • Injection via tool outputs
  • Multi-turn injection chains
  • Cross-user injection (shared context)
  • System prompt leakage
02 · AGENTS & TOOLS

What the LLM can do

  • Unscoped tool calls
  • Tool-parameter injection
  • Confused-deputy problems
  • Tool chain abuse
  • Authorization bypass in tools
  • Recursive agent attacks
03 · RAG & CONTEXT

Retrieval-layer attacks

  • Cross-user document leakage
  • Stale / poisoned cache entries
  • Embedding-based retrieval abuse
  • Metadata-filter bypass
  • Source tracking gaps
  • Prompt-instruction persistence
04 · POLICY & OUTPUT

Guardrails and what leaves the model

  • Jailbreak payload bypass
  • Role-play bypass
  • Output injection (XSS via LLM)
  • Data exfiltration via output
  • PII leakage in responses
  • Safety-policy circumvention
/ what we've found

Example findings from this surface.

Illustrative examples of real vulnerability classes on this attack surface. Anonymized, but every one based on patterns our pentesters encounter repeatedly.

CRIT
Indirect prompt injection via email summarization
email-agent · tool chainCVSS 9.4confirmed

An email assistant summarized incoming messages. A crafted email containing hidden instructions told the agent to forward all future emails to an attacker address and delete the original. The model complied.

A
@arjun · reviewer note: AI built the injection payload. I confirmed the agent had email:send and email:delete scopes and walked the full attack. Fix: strip or quarantine any markup that looks like instructions before the summarization prompt.
CRIT
Unscoped HTTP tool → data exfil via SSRF
agent · tools.httpCVSS 9.0confirmed

The agent had a generic `http.get(url)` tool with no URL allowlist. A prompt-injected message instructed the agent to call attacker.com with sensitive context from the conversation. Data left the environment.

A
@arjun · reviewer note: AI found the open tool via a jailbreak probe. I ran the exfil. Tool scopes must be allowlisted: either by domain or by a declared purpose. A raw HTTP tool is always a liability.
HIGH
RAG cross-user leakage via shared cache
rag · vector-cacheCVSS 7.9confirmed

The RAG layer cached retrieved chunks keyed by query hash, not by user or tenant. User A's confidential document chunks surfaced in User B's session when similar queries were asked. Scoping lived in the retrieval filter but not in the cache layer.

A
@arjun · reviewer note: AI surfaced unexpected references; I traced them to the cache. Every layer needs to know the caller, not just the first one. Cache key must include user_id or tenant_id.
HIGH
Guardrail bypass via role-play framing
content-policy · LLMCVSS 7.2confirmed

The model's safety policy blocked direct requests for restricted content, but framing the same request as a fictional character's dialogue consistently bypassed it. The guardrail was checking form, not intent.

A
@arjun · reviewer note: Well-known class, rarely caught. AI surfaced variants; I tested 30 prompts, 24 bypassed the policy. Harden with output-side classification, not just input heuristics.

// illustrative examples · not real customer engagements

/ invite-only

Ship faster than your pentesters? Let's fix that.

Tell us what you ship and we'll scope a pilot on this surface. The AI and the reviewer take it from there.

Request access