August 7, 2025

AI Agents for Offsec with Zero False Positives

11:20am-12:00pm PST

2:20pm-3:00pm EST

Large language models are increasingly helping to automate vulnerability discovery and exploit development in real-world software. However, naïvely asking LLMs to identify vulnerabilities leads to a deluge of false positives that can drown out real findings. In this talk, we will present techniques that enable AI agents to find vulnerabilities at scale, fully autonomously and with zero false positives. The key to our approach is developing robust exploit validators that can conclusively determine whether an exploit claimed by the agent is real, allowing the agent to make arbitrarily many attempts without increasing the amount of human effort needed to review the results. Using these techniques, we were able to test thousands of web apps found on Docker Hub, identifying over 200 zero days and obtaining multiple CVEs.
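The abstract does not say how the validators are built, so the following is only a minimal sketch of one common way to check an exploit claim mechanically: plant a random secret that can be reached only through a successful exploit, and accept a claim only if the agent's evidence contains that secret. The names `CanaryValidator` and `run_attempts` are hypothetical and are not taken from the talk.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class CanaryValidator:
    """Hypothetical validator: a claim counts only if it leaks a planted secret."""

    # Random per-target secret (128 bits). In a real setup it would be placed
    # somewhere reachable only via a successful exploit, e.g. a file outside
    # the web root. That placement step is assumed and not shown here.
    canary: str = field(default_factory=lambda: secrets.token_hex(16))

    def validate(self, evidence: str) -> bool:
        # The agent cannot feasibly guess the canary, so its presence in the
        # evidence shows the exploit really read the planted data.
        return self.canary in evidence


def run_attempts(validator: CanaryValidator, attempts: list[str]) -> list[str]:
    """Keep only the attempts whose evidence passes validation."""
    return [evidence for evidence in attempts if validator.validate(evidence)]
```

Because the check is mechanical, the agent can submit any number of attempts and a human only ever sees the ones that pass, which is the property the abstract describes.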

Location

Black Hat USA 2025 - South Pacific F, Level 0 - North Convention Center

Speakers

Brendan Dolan-Gavitt

AI Researcher

