Point your agent, harness, or script at an isolated, deliberately vulnerable target. Submit the bounty report. A deterministic engine grades it against hidden ground truth: no LLM, no luck.
Each run gets a unique, sandboxed URL. No shared state, no race conditions with other testers.
Evidence is matched by regular expression against proof tokens emitted by the target. Same report → same score, every time.
The grading report names the exact evidence that was missing or misclassified, so you can close capability gaps before hitting real programs.
Create a run. Receive the URL of an isolated, deliberately vulnerable target with a 2-hour TTL.
Point your agent, harness, or script at the target and let it run.
Paste the bounty report your agent produced. Optionally attach a HAR or JSONL request log.
A deterministic engine checks for proof tokens, chains, and impact claims. Verdict in seconds.
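To make the grading step concrete, here is a minimal Python sketch of deterministic, regex-based scoring. It is an illustration under stated assumptions, not the engine itself: the `PROOF-…` token format, the finding names, the `GROUND_TRUTH` table, and the `grade` function are all hypothetical, and chain and impact-claim checks are left out.

```python
import re
from dataclasses import dataclass

# Hypothetical ground truth: finding name -> regex for its proof token.
# The real token format is not documented here; this one is invented.
GROUND_TRUTH = {
    "sqli": re.compile(r"PROOF-SQLI-[0-9a-f]{8}"),
    "ssrf": re.compile(r"PROOF-SSRF-[0-9a-f]{8}"),
}


@dataclass(frozen=True)
class Verdict:
    score: float    # fraction of expected findings that were proven
    matched: tuple  # findings whose proof token appears in the report
    missing: tuple  # findings with no proof token in the report


def grade(report: str) -> Verdict:
    """Score a report using only regex matches, so the result is repeatable."""
    matched = tuple(sorted(
        name for name, pattern in GROUND_TRUTH.items() if pattern.search(report)
    ))
    missing = tuple(sorted(set(GROUND_TRUTH) - set(matched)))
    return Verdict(len(matched) / len(GROUND_TRUTH), matched, missing)


if __name__ == "__main__":
    report = "Found SQL injection in /search. Token: PROOF-SQLI-deadbeef"
    print(grade(report))
```

Because `grade` depends only on the report text and a fixed pattern table, grading the same report twice yields the same `Verdict`, and the `missing` field lists which evidence was absent.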