XBOW tops US leaderboard on HackerOne

Boosting offensive security with AI

XBOW autonomously finds and exploits vulnerabilities in 75% of web benchmarks

PortSwigger Labs: 195 / 261 solved

PentesterLab Exercises: 204 / 282 solved

Novel Benchmarks: 88 / 104 solved
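As a quick check of the headline figure, the short sketch below recomputes the solve rates from the three counts reported above. It assumes the 75% figure is the pooled rate across all three suites; the page does not state how the figure was derived, and the names `results` and `solve_rate` are illustrative only.

```python
# Solve counts as reported on this page: (solved, attempted).
results = {
    "PortSwigger Labs": (195, 261),
    "PentesterLab Exercises": (204, 282),
    "Novel Benchmarks": (88, 104),
}


def solve_rate(solved: int, attempted: int) -> float:
    """Return the solve rate as a percentage."""
    return 100.0 * solved / attempted


# Per-suite rates, plus the pooled rate over all attempted benchmarks.
# Assumption: the headline number pools the three suites together.
per_suite = {name: solve_rate(s, a) for name, (s, a) in results.items()}
total_solved = sum(s for s, _ in results.values())
total_attempted = sum(a for _, a in results.values())
overall = solve_rate(total_solved, total_attempted)

for name, rate in per_suite.items():
    print(f"{name}: {rate:.1f}%")
print(f"Overall: {total_solved} / {total_attempted} = {overall:.1f}%")
```

Under that assumption the pooled rate is 487 / 647, or about 75.3%, which is consistent with the 75% quoted above.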

See XBOW at work

XBOW pursues high-level goals by executing commands and reviewing their output, without any human intervention.

These are real examples of XBOW solving benchmarks. The only guidance provided to XBOW, aside from general instructions that are identical for every task, is the benchmark description. If you'd like to see all the data, click here.

Team

Security, AI, and Engineering

Aqeel Siddiqui

Head of Product & Customer Success

Jordan McTaggart

Head of Finance & BizOps

Alex Gatzlaff

Account Executive

Ian Campbell

Research Engineer

Leandro Barragan

Security Researcher

Nicolas Trippar

Security Researcher

Blog

Updates and opinions from the team

June 30, 2025  -  By Diego Jurado

CVE-2025-49493: XML External Entity (XXE) Injection in Akamai CloudTest

When XBOW met Akamai: a walkthrough of discovering and exploiting an XML External Entity vulnerability (CVE-2025-49493) in a widely deployed application.

June 24, 2025  -  By Nico Waisman

The road to Top 1: How XBOW did it

For the first time in bug bounty history, an autonomous penetration tester has reached the top spot on the US leaderboard.

June 24, 2025  -  By Oege de Moor

Taking the Top Hacker in the US to New Heights: XBOW Raises $75M Series B

XBOW has reached a critical milestone: our AI now rivals and surpasses top-tier human hackers.


Frequently asked questions

Benchmarks

What do you consider a “benchmark”?

A benchmark is a realistic exercise in web security with a crisp success criterion, such as capturing a flag. Many challenges in CTF contests do not qualify because they are brainteasers rather than realistic web security scenarios.

Where did XBOW get its collection of benchmarks?

XBOW’s benchmarks have been carefully selected for relevance and breadth by its security experts. Sources include leading vendors of training materials, such as PortSwigger and PentesterLab, and public CTF competitions. Some benchmarks have been authored specifically for XBOW, so we can be sure they do not occur in any training sets.

The original PortSwigger labs do not have flags — why do the traces shown for these benchmarks include a flag?

The PortSwigger labs automatically detect whether you have solved the lab. However, we wanted all benchmarks to share the same crisp success criterion, one that our infrastructure can check. So we introduced a flag and a mechanism for returning it.
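To illustrate what a flag-based success criterion can look like, here is a minimal sketch of a checker that scans captured output for an expected flag. It is not XBOW's infrastructure: the flag format, `FLAG_PATTERN`, `extract_flags`, and `is_solved` are all hypothetical names and choices made for this example.

```python
import re

# Hypothetical flag format (assumption): the page does not specify the real one.
FLAG_PATTERN = re.compile(r"FLAG\{[0-9a-f]{32}\}")


def extract_flags(output: str) -> list[str]:
    """Return every substring of the output that looks like a flag."""
    return FLAG_PATTERN.findall(output)


def is_solved(agent_output: str, expected_flag: str) -> bool:
    """A benchmark counts as solved only if the exact expected flag appears."""
    return expected_flag in extract_flags(agent_output)
```

A check of this kind gives every benchmark the same pass/fail test regardless of the underlying vulnerability, which is the property the answer above describes.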

Could you provide more information about the novel XBOW benchmarks?

XBOW’s security experts designed a set of unique web benchmarks to ensure that solutions were never included in any training data. The benchmarks cover many vulnerability classes and varying degrees of difficulty.

Will the novel XBOW benchmarks be released?

Yes. The novel XBOW benchmarks will be open-sourced soon. We hope others will join us in using these benchmarks to set a new standard for the evaluation of security tools.

How many benchmarks does XBOW have?

XBOW has collected a corpus of thousands of benchmarks, which it uses both to evaluate performance and to improve it.

Where can I find more details about the benchmarks that XBOW solved?

We provide more details to back up the results reported on this website. See here for the benchmarks that were attempted, and which were solved.

Technology

How does the AI inside XBOW work?

It is an example of ‘agentic AI’. We use many standard techniques, but also plenty of proprietary innovations. Aside from general guidance that is identical for every task, the only directions given to XBOW are the basic benchmark description.

Because we are a growing startup, this intellectual property is our main asset, so we cannot share the details.

Are the example traces shown edited?

The AI reasoning and command outputs shown in our example traces have not been edited in any way (e.g., wrapped lines are still present). We have withheld the general guidance (“prompts”) to protect XBOW’s proprietary technology.

Can XBOW find and exploit vulnerabilities without being given descriptions or without having “flags” as a goal?

Yes. We have run experiments with the descriptions blanked out, and that works fine. Without flags as a goal, XBOW decides on its own when it has finished. You can prompt it to be more or less aggressive: for example, when it discovers a SQL injection, it can either continue (after approval from a human operator) to exfiltrate valuable data from the database, or stop and report the core problem.

Is XBOW useful for everyone or does it require any sort of specific knowledge?

XBOW is useful for anyone looking to improve the security of their web applications. You don’t need to be a security or AI expert to use it, because a lot of deep security knowledge is baked into the XBOW product. This is the magic of our team: it combines security expertise with AI and engineering skills.

Responsible AI

How will you ensure your technology won't be misused?

We will only make our technology available to trusted customers in the cloud. It is not possible to run XBOW as a standalone application outside our control.


Book a demo

Find out more about our technology