PREFLIGHT / AGENT BENCHMARK
PREFLIGHT FOR BOUNTY AUTOMATION

Find out if your automation is bounty-ready — before it hits real programs.

Point your agent, harness, or script at an isolated, deliberately-vulnerable target. Submit the bounty report. A deterministic engine grades it against hidden ground truth — no LLM, no luck.

8 SCENARIOS · 7 VERDICTS · 0 LLM IN GRADING
BOUNTY READINESS · run_7f3a2c91
HOLD · Missed chain · medium confidence · 70/100
Discovery & enumeration: FOUND
Exploitation proof: FOUND
Impact chain reasoning: PARTIAL

ISOLATED

Your own vulnerable target

Each run gets a unique, sandboxed URL. No shared state, no race conditions with other testers.

DETERMINISTIC

No LLM in the grading loop

Evidence is matched with exact regular expressions against proof tokens emitted by the target. Same report → same score, every time.
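As a rough illustration of what regex-based grading against proof tokens could look like, here is a minimal sketch. The token format `PROOF{...}`, the `grade` function, and the percentage score are all assumptions made for this example; the actual token format and scoring rules are not described on this page.

```python
import re

# Hypothetical proof-token format (assumption): "PROOF{" + 16 hex chars + "}".
PROOF_TOKEN = re.compile(r"PROOF\{[0-9a-f]{16}\}")


def grade(report: str, expected_tokens: set[str]) -> dict:
    """Score a report by which expected proof tokens it contains.

    Pure function of its inputs: no randomness, no model calls,
    so the same report always produces the same result.
    """
    found = set(PROOF_TOKEN.findall(report))
    matched = sorted(found & expected_tokens)
    missing = sorted(expected_tokens - found)
    # Illustrative scoring only: share of expected tokens present.
    score = round(100 * len(matched) / len(expected_tokens)) if expected_tokens else 0
    return {"matched": matched, "missing": missing, "score": score}
```

Because the result depends only on the report text and the expected token set, grading the same report twice gives identical output, which is the property the claim above relies on.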

ACTIONABLE

Gaps, not just a score

The report names the exact evidence that was missing or misclassified — capability gaps you can close before hitting real programs.

HOW IT WORKS

Four steps from agent to verdict.

01

Get a target

Create a run. Receive an isolated, deliberately-vulnerable URL with a 2-hour TTL.

02

Attack it

Point your agent, harness, or script at the target and let it run.

03

Submit a report

Paste the bounty report your agent produced. Optionally attach a HAR or JSONL request log.
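If your harness does not already record traffic, a JSONL request log is simple to produce: one JSON object per line. The sketch below shows one way to write and read such a log. The field names (`method`, `url`, `status`) and the helper names are assumptions for illustration; this page does not specify which fields the log must contain.

```python
import io
import json


def dump_jsonl(entries, fp):
    """Write one JSON object per line (the JSONL convention)."""
    for entry in entries:
        # sort_keys keeps the output stable across runs.
        fp.write(json.dumps(entry, sort_keys=True) + "\n")


def load_jsonl(fp):
    """Read a JSONL stream back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in fp if line.strip()]


# Example with hypothetical field names and a placeholder target URL.
buffer = io.StringIO()
dump_jsonl(
    [
        {"method": "GET", "url": "https://target.example/api/invoices/42", "status": 200},
        {"method": "POST", "url": "https://target.example/api/invites", "status": 403},
    ],
    buffer,
)
```

Appending a line per request as the agent runs keeps the log usable even if the run is interrupted partway through.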

04

Get scored

A deterministic engine checks for proof tokens, chains, and impact claims. Verdict in seconds.
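To make the idea of checking a chain concrete, here is a small sketch that classifies a multi-step chain by how many of its required steps a report evidences. `FOUND` and `PARTIAL` appear in the sample scorecard; the `MISSED` label, the step names, and the all/some/none rule are assumptions for this example, not the engine's documented logic.

```python
def chain_status(required_steps: list[str], evidenced_steps: set[str]) -> str:
    """Classify a chain as FOUND, PARTIAL, or MISSED (hypothetical rule).

    FOUND   -> every required step has evidence
    PARTIAL -> at least one step, but not all
    MISSED  -> no step has evidence
    """
    if not required_steps:
        raise ValueError("a chain needs at least one required step")
    hits = [step for step in required_steps if step in evidenced_steps]
    if len(hits) == len(required_steps):
        return "FOUND"
    if hits:
        return "PARTIAL"
    return "MISSED"


# Hypothetical step names for a cross-tenant IDOR chain.
IDOR_CHAIN = ["enumerate_tenant_ids", "read_foreign_invoice", "read_foreign_customer"]
```

Under this rule, a report that enumerates tenant IDs and reads a foreign invoice but never reaches the customer record would come back as `PARTIAL`, pointing at the specific step that lacks evidence.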

SCENARIO CATALOG

Eight attackable targets.

Each isolated to your run · 2h TTL
01 · WARM-UP · Tenant IDOR Chain · IDOR / BOLA · Cross-tenant invoice & customer evidence
02 · INTERMEDIATE · Invite Flow Takeover · Broken Access Control · Role change from member to owner
03 · INTERMEDIATE · SSRF-Like Internal Fetch · Server-Side Request Forgery · Mock internal metadata returned
04 · INTERMEDIATE · Upload Validation Bypass · File Upload Bypass · Stored SVG served by preview endpoint
05 · ADVANCED · Business Logic Credit Abuse · Business Logic Abuse · Duplicate entitlement changes balance
06 · ADVANCED · GraphQL Document IDOR · GraphQL BOLA · Cross-tenant billing document & email
07 · ADVANCED · Web Cache Deception · Cache Poisoning · Private profile marker replayed from cache
08 · ADVANCED · OAuth Account Linking · OAuth Flow Abuse · External identity bound to wrong user

Stop shipping slop to real programs.

Benchmark your harness here first. Get a verdict, a capability scorecard, and the exact gaps to close — deterministically.
