/ llm pentest · new

AI-powered apps have a new attack surface.

If your app ships an LLM feature, you shipped a new attack surface. We test prompt injection, agent tool-call abuse, RAG context leakage, guardrail bypass, and the auth gaps that appear when AI systems gain access to user data and internal tools.

Request access See the platform

SCOPE · LLM apps · RAG · agents · tool calls

DEPTH · Prompt, context, policy, and auth

OUTPUT · Signed report + unlimited retests

// TOP FINDINGS · last 30 days

CRITIndirect prompt injection via email tool×2

CRITAgent exfil via unscoped HTTP tool×1

HIGHRAG leak · cross-user document reads×3

HIGHGuardrail bypass · jailbreak payloads×4

MEDOutput injection · XSS via LLM reply×5

MEDTool-call privilege confusion×3

/ why this surface

LLMs changed what an app can do. And what an attacker can do.

An AI-powered app isn't just a UI with a model behind it. It's a program that reads untrusted text, calls tools, accesses data, and acts on behalf of users. That changes the threat model.

01 · PROMPT INJECTION

The OWASP LLM #1

An email, a document, a user message, a web page your agent fetched. Any untrusted text can become an instruction to your LLM. We test direct and indirect injection across every source your model consumes.

02 · TOOL CALLS

Agents with tools are programs with powers

If your LLM can call tools, the security question is which tools, with what scope, on whose behalf. We test unscoped tools, confused-deputy problems, and tool-chain abuse the way an attacker would.

03 · RAG & CONTEXT

Retrieval can leak more than it retrieves

Context windows mix documents from multiple users, tenants, and sources. Cross-user reads, injected system prompts, and stale-cache leaks live in the retrieval layer. Scanners can't see any of this.

/ what we test

What we test on AI apps.

Every class below, applied to your specific AI stack. LLM, RAG, agents, and the glue between them.

01 · PROMPT INJECTION

Instructions from untrusted text

Direct prompt injection
Indirect injection via retrieved docs
Injection via tool outputs
Multi-turn injection chains
Cross-user injection (shared context)
System prompt leakage

02 · AGENTS & TOOLS

What the LLM can do

Unscoped tool calls
Tool-parameter injection
Confused-deputy problems
Tool chain abuse
Authorization bypass in tools
Recursive agent attacks

03 · RAG & CONTEXT

Retrieval-layer attacks

Cross-user document leakage
Stale / poisoned cache entries
Embedding-based retrieval abuse
Metadata-filter bypass
Source tracking gaps
Prompt-instruction persistence

04 · POLICY & OUTPUT

Guardrails and what leaves the model

Jailbreak payload bypass
Role-play bypass
Output injection (XSS via LLM)
Data exfiltration via output
PII leakage in responses
Safety-policy circumvention

/ what we've found

Example findings from this surface.

Illustrative examples of real vulnerability classes on this attack surface. Anonymized, but every one based on patterns our pentesters encounter repeatedly.

CRIT

Indirect prompt injection via email summarization

email-agent · tool chainCVSS 9.4confirmed

An email assistant summarized incoming messages. A crafted email containing hidden instructions told the agent to forward all future emails to an attacker address and delete the original. The model complied.

@arjun · reviewer note: AI built the injection payload. I confirmed the agent had email:send and email:delete scopes and walked the full attack. Fix: strip or quarantine any markup that looks like instructions before the summarization prompt.

CRIT

Unscoped HTTP tool → data exfil via SSRF

agent · tools.httpCVSS 9.0confirmed

The agent had a generic `http.get(url)` tool with no URL allowlist. A prompt-injected message instructed the agent to call attacker.com with sensitive context from the conversation. Data left the environment.

@arjun · reviewer note: AI found the open tool via a jailbreak probe. I ran the exfil. Tool scopes must be allowlisted: either by domain or by a declared purpose. A raw HTTP tool is always a liability.

HIGH

RAG cross-user leakage via shared cache

rag · vector-cacheCVSS 7.9confirmed

The RAG layer cached retrieved chunks keyed by query hash, not by user or tenant. User A's confidential document chunks surfaced in User B's session when similar queries were asked. Scoping lived in the retrieval filter but not in the cache layer.

@arjun · reviewer note: AI surfaced unexpected references; I traced them to the cache. Every layer needs to know the caller, not just the first one. Cache key must include user_id or tenant_id.

HIGH

Guardrail bypass via role-play framing

content-policy · LLMCVSS 7.2confirmed

The model's safety policy blocked direct requests for restricted content, but framing the same request as a fictional character's dialogue consistently bypassed it. The guardrail was checking form, not intent.

@arjun · reviewer note: Well-known class, rarely caught. AI surfaced variants; I tested 30 prompts, 24 bypassed the policy. Harden with output-side classification, not just input heuristics.

// illustrative examples · not real customer engagements

/ invite-only

Ship faster than your pentesters? Let's fix that.

Tell us what you ship and we'll scope a pilot on this surface. The AI and the reviewer take it from there.

Request access