Introducing Assessment Guidance for AI Pentesting

Assessment Guidance lets teams provide pentest context so XBOW focuses on what matters most while executing assessments autonomously.

When you hire a human pentester, the engagement doesn't start with hacking. It starts with a scoping call. Your team explains the architecture, flags the payment endpoints, mentions the legacy auth system nobody wants to touch, hands over API specs. The pentester launches the assessment with that context.

XBOW has always been autonomous. Point it at a target, it finds exploits. You could upload source code to give it more to work with. We also shipped a free-form text field early on, a single box where you could type anything. We saw an enormous appetite for providing that kind of guidance, and decided to give that context a more direct path into XBOW's architecture.

Assessment Guidance is the result: a new section within Target Context. You've always been able to upload source code. Now you can provide structured guidance that shapes how XBOW discovers, prioritizes, attacks, and validates. You provide context before the assessment runs, and then XBOW operates autonomously. You're not copiloting anything. You're giving XBOW a better starting point. It handles the rest.

Assessment Guidance was built by Javi Gil, a security engineer on the XBOW team who spent years on the other side of scoping calls. Here's why he built it.

Four Assessment Guidance cards, mapped to XBOW’s architecture

If you've seen XBOW's platform architecture (xbow.com/platform), you know there are distinct components working together: attack surface mapping, a coordinator that prioritizes what to test, autonomous agents that execute attacks, and validators that confirm findings. Assessment Guidance gives you a structured input for each of those components.

You can now guide XBOW across four dimensions: Attack Surface, Priorities, Attack Strategy, and Validation, each matching the different stages of a penetration test methodology. None are required. XBOW works without them. During Private Preview, teams across financial services, enterprise SaaS, cloud infrastructure, and security companies requested early access. Here's how practitioners used Assessment Guidance.

Attack Surface

Attack Surface is about giving XBOW a head start. Upload API specifications or technical documentation and XBOW's discovery agent knows exactly what to map. XBOW still does its own discovery on top of what you provide.

One team uploaded their OpenAPI spec, a YAML file covering 148 endpoints across a cloud identity service. XBOW parsed it, mapped the full attack surface, and started the assessment already knowing every route. No crawling required for those endpoints.
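Before uploading a spec, it can help to see what it actually declares. The following is a minimal sketch, not XBOW's parser: it lists every method and path pair in an OpenAPI document using only the standard `paths` object. The function name `list_operations` and the tiny example spec are illustrative assumptions, and the spec is loaded as JSON to keep the example dependency-free (a YAML spec would need a YAML parser).

```python
import json

# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


# Illustrative spec only; a real one would come from your own API.
EXAMPLE_SPEC = json.loads("""
{
  "openapi": "3.0.3",
  "info": {"title": "Example identity service", "version": "1.0.0"},
  "paths": {
    "/users": {"get": {}, "post": {}},
    "/users/{id}": {"get": {}, "delete": {}, "parameters": []},
    "/tokens": {"post": {}}
  }
}
""")

operations = list_operations(EXAMPLE_SPEC)
for method, path in operations:
    print(method, path)
```

Whether an "endpoint" means a path or a method and path pair is a counting choice; this sketch counts operations.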

Another team uploaded two versions of their synthetics monitoring API (v1 and v2, 42 endpoints total) and paired them with Priorities to scope the assessment exclusively to their synthetic monitoring product.

A third team skipped specs entirely. They uploaded five route files, raw endpoint definitions pulled from their codebase. XBOW extracted 81 endpoints from those route definitions and targeted the entire assessment to that one product area.
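To get a feel for what a route file contains before uploading it, a rough local pass can list the endpoints it defines. This is a hedged sketch under a strong assumption: the route files use an Express-style `router.METHOD('/path', handler)` convention, which the post does not state. The regular expression, the sample file, and the function name `extract_routes` are illustrative, and this is not how XBOW extracts endpoints.

```python
import re

# Assumed convention: app.get('/x', ...) or router.post("/x", ...).
ROUTE_PATTERN = re.compile(
    r"""\b(?:app|router)\.(get|post|put|patch|delete)\(\s*['"]([^'"]+)['"]"""
)

# Illustrative route file contents.
ROUTE_FILE = """
router.get('/dashboards', listDashboards);
router.post("/dashboards", createDashboard);
router.get('/dashboards/:id', getDashboard);
router.delete('/dashboards/:id', deleteDashboard);
"""


def extract_routes(source: str) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs in the order they appear in the source."""
    return [(method.upper(), path) for method, path in ROUTE_PATTERN.findall(source)]


routes = extract_routes(ROUTE_FILE)
for method, path in routes:
    print(method, path)
```

A regex pass like this also matches commented-out routes and misses routes built dynamically, so treat its output as a rough inventory only.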

Priorities

This card feeds XBOW's coordinator, the persistent orchestration engine that directs testing. You tell XBOW which features and areas deserve more attention.

A financial services team uploaded a PDF of their transaction flows (collections, receivables, expenses) and wrote one line: "Ensure the flows described in this PDF are covered. Watch for any credit card data or other areas that may be relevant to PCI-DSS." XBOW parsed the document, pulled out payment and credit card operations, and focused on those flows.

The team that uploaded dashboard route files into Attack Surface paired it with a single line in Priorities: "Focus the testing on the Dashboards product." Attack Surface defined the surface. Priorities told XBOW what to care about. Eighty-one endpoints, scoped to one product.

Attack Strategy

This card provides context to XBOW's attack agents, the short-lived, focused workers that execute actual exploits.

XBOW had previously identified a path traversal in a web application. The practitioner wanted to push it further. In Attack Strategy, they described the vulnerable endpoints, pointed XBOW to a server-side DLL that could be decompiled for source analysis, and wrote: "Try all techniques to safely achieve RCE or command execution. Only claim success when RCE is achieved." XBOW chained the path traversal into remote code execution.

Another practitioner had found one API endpoint leaking employee PII and suspected similar endpoints existed. They wrote: "Focus on testing hidden API endpoints for potential PII leakage. Pay particular attention to endpoints similar to the one we've identified." XBOW found variants: hidden endpoints following the same pattern that hadn't been tested yet.

Two use cases for the same card: vulnerability escalation and variant analysis.

Validation

XBOW's validators confirm exploitability before surfacing findings. The Validation card lets you strengthen confidence in findings by providing canaries, unique to your environment, that XBOW can use as proof of exploitation.

One customer, for example, uses canary tokens to validate Local File Inclusion and SQL injection findings by planting flags in disk files and database records that XBOW must retrieve to confirm exploitation.

For setup instructions, see Validate results with canary tokens.
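The planting side of the canary approach described above can be sketched as follows. This is an illustration under stated assumptions, not XBOW's setup procedure: the `CANARY-` token format, the file name `canary_flag.txt`, the table name `canary_flags`, and all function names are hypothetical, and an in-memory SQLite database stands in for a real application database.

```python
import secrets
import sqlite3
import tempfile
from pathlib import Path


def new_canary(label: str) -> str:
    """Build an unguessable marker; the 'CANARY-' format is illustrative."""
    return f"CANARY-{label}-{secrets.token_hex(16)}"


def plant_file_canary(directory: Path, token: str) -> Path:
    """Write the token to a file that only a file-read exploit should reach."""
    path = directory / "canary_flag.txt"
    path.write_text(token + "\n", encoding="utf-8")
    return path


def plant_db_canary(conn: sqlite3.Connection, token: str) -> None:
    """Store the token in a table that the application never queries."""
    conn.execute("CREATE TABLE IF NOT EXISTS canary_flags (token TEXT NOT NULL)")
    conn.execute("INSERT INTO canary_flags (token) VALUES (?)", (token,))
    conn.commit()


def proves_exploitation(evidence: str, token: str) -> bool:
    """A finding is backed by the canary only if the exact token was retrieved."""
    return token in evidence


# Stand-ins for a real server directory and database.
workdir = Path(tempfile.mkdtemp())
conn = sqlite3.connect(":memory:")

lfi_token = new_canary("lfi")
sqli_token = new_canary("sqli")
flag_path = plant_file_canary(workdir, lfi_token)
plant_db_canary(conn, sqli_token)
```

Using a separate random token per vulnerability class means a retrieved value also shows which path the exploit took.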

An unexpected use: surgical retesting

We didn't anticipate this one. Practitioners started using Assessment Guidance to retest vulnerabilities surfaced outside of XBOW, from manual pentests, bug bounty reports, and scanners.

Feed a single endpoint into Attack Surface and toggle "limit testing to the listed endpoints." In Priorities and Attack Strategy, scope to one vulnerability class. One practitioner testing a known SQL injection wrote:

"This application has a known SQL Injection vulnerability in the /catalog endpoint using the category parameter. The goal of this test is to ONLY test that endpoint for SQL Injection."

It's not what we designed Assessment Guidance for. We built it for enriching broad assessments, but practitioners immediately saw the value for focused validation. That tells us something about what the feature actually enables: precise control over XBOW's execution, whatever the use case.

See what Assessment Guidance can do

The four cards accept:

Attack Surface: API specifications (OpenAPI, Swagger, SOAP), technical documentation, free-form endpoint descriptions
Priorities: Priority features or endpoints, areas to deprioritize
Attack Strategy: Payload formats, exploit knowledge, known weaknesses, vulnerability classes
Validation: Canary tokens

File uploads support PDF, Markdown, plain text, DOCX, JSON, YAML, and XML (large codebases and documentation sets are supported). Source code uploads continue to work alongside Assessment Guidance under Target Context; all inputs are additive, giving XBOW a more complete picture of your attack surface.
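A quick local check can sort files by format before you upload them. This is a minimal sketch: the mapping from the formats named above to file extensions (for example `.md` for Markdown, `.yml` alongside `.yaml`) is an assumption, and the function name `split_by_support` is hypothetical. It says nothing about how XBOW itself detects file types.

```python
from pathlib import Path

# Assumed extensions for: PDF, Markdown, plain text, DOCX, JSON, YAML, XML.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt", ".docx", ".json", ".yaml", ".yml", ".xml"}


def split_by_support(filenames):
    """Split filenames into (supported, unsupported) by extension, case-insensitively."""
    supported, unsupported = [], []
    for name in filenames:
        extension = Path(name).suffix.lower()
        if extension in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


supported, unsupported = split_by_support(
    ["openapi.yaml", "flows.PDF", "notes.md", "diagram.png"]
)
print("upload:", supported)
print("convert or skip:", unsupported)
```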

Full documentation goes live on docs.xbow.com with this release.

Getting started

Navigate to the Target Context section when configuring an assessment. Start with what you have: an API spec in Attack Surface and a few items in Priorities will already make a difference. Your context doesn't have to be perfect; if an uploaded spec includes endpoints that no longer exist, XBOW discovers they're unreachable and moves on. Layer in Attack Strategy and Validation as you get comfortable.

If you're not yet on XBOW, request a demo at xbow.com.
