Tales from the Trace: How Agentic AI Merges Static and Dynamic Testing

Watch XBOW autonomously combine source code analysis with dynamic testing to discover SQL injection in minutes. This episode shows AI reading code, crafting exploits, and validating vulnerabilities - the "holy grail" of application security testing in action.


Welcome to Tales from the Trace

Welcome to our first installment of "Tales from the Trace". Just like the cult TV series Tales from the Crypt, this series will reveal crafty twists, unexpected turns, and sometimes shocking discoveries. But instead of haunted houses and monsters, we'll be diving into what XBOW's AI-powered offensive security platform uncovers in application traces: surprising vulnerabilities, strange behaviors, and lessons learned along the way.

The Holy Grail: Static Meets Dynamic

Today's story demonstrates how effective it can be to combine source-code analysis with dynamic testing in an automated workflow. That combination is notoriously difficult for traditional automated scanners. With XBOW's AI capabilities, we get closer to the "holy grail" of application security testing. XBOW uses static knowledge from the code to guide dynamic attacks, while dynamic evidence confirms and validates real security gaps.

Relying only on traditional SAST solutions often leads to noisy results: false positives, dead code paths, or findings that are purely informational. On the other hand, relying only on traditional DAST tools can leave legitimate vulnerabilities undiscovered. By adding source code context to dynamic testing, XBOW improves both the accuracy of its findings and the likelihood of uncovering real, exploitable vulnerabilities.

This particular trace was generated while testing DVWA (Damn Vulnerable Web Application), a deliberately vulnerable web application used for security training. For this run, I included the application's source code as part of the test configuration. Below we'll walk through the trace and show how XBOW understood the code, crafted a dynamic test, and ultimately surfaced a SQL injection vulnerability.


Chapter 1: Analyzing the Evidence

We can see that XBOW starts by unzipping and analyzing the contents of the source code that was supplied when the test was configured.

Let's examine the provided `DVWA.zip` file to understand the application structure and potential vulnerabilities. I'll unzip it and list its contents.
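As a rough illustration of this first step, the sketch below lists the PHP files inside a zip archive using Python's standard `zipfile` module. The helper name `list_source_files` and the tiny in-memory archive are assumptions made for the example. The trace does not show which commands XBOW actually ran, and Python is used here only for convenience.

```python
import io
import zipfile


def list_source_files(archive, suffixes=(".php",)):
    """Return the sorted names of archive members ending with one of `suffixes`."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.lower().endswith(tuple(suffixes))
        )


if __name__ == "__main__":
    # Build a small in-memory archive so the sketch runs without the real DVWA.zip.
    # The file names below are made up for illustration.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("app/index.php", "<?php echo 'hi';")
        zf.writestr("app/README.md", "docs")
    buf.seek(0)
    print(list_source_files(buf))
```

Passing a path such as `"DVWA.zip"` instead of the in-memory buffer should work the same way, since `zipfile.ZipFile` accepts either a path or a file-like object.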


Chapter 2: Testing the Waters

Next, XBOW analyzes the normal behavior of the endpoint that we are about to test:

This looks like the classic DVWA SQL injection vulnerability where we can input a user ID and it will query the database. Let me first test the normal functionality by entering a valid user ID (like "1") to see what the normal response looks like, then I'll test for SQL injection vulnerabilities.

Let me start by testing with a normal input to understand the baseline behavior.


Chapter 3: When Simple Attacks Fail

With the baseline confirmed, our AI solver tries a classic SQLi payload and watches how the response changes.

Great, that worked as expected. For `id=1`, the application returns "ID: 1, First name: admin, Surname: admin". This confirms the `id` parameter is being used to query the database.

Now, let's try a simple SQL injection payload to confirm the vulnerability. I'll use the classic `' OR '1'='1` payload to see if I can retrieve more data. This payload should effectively make the WHERE clause always true, returning all users in the database. I'll also try a `' AND 1=0` to see if I get no results, which would further confirm the vulnerability.

Let's start with the `' OR '1'='1` payload. I will also use the `#` character to comment out the rest of the query.
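One practical detail when a payload like this travels in a GET request: a raw `#` starts the URL fragment and is never sent to the server, so it has to be percent-encoded as `%23`. The sketch below shows one way to build such a URL with Python's standard library. The base URL, the `Submit` parameter, and the exact way the payload is combined with the trailing `#` are assumptions for illustration, not details taken from the trace.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical target; host and path are assumptions for this sketch.
BASE_URL = "http://localhost/vulnerabilities/sqli/"


def build_probe_url(payload, base_url=BASE_URL):
    """Return a GET URL carrying `payload` in the `id` parameter, percent-encoded."""
    return base_url + "?" + urlencode({"id": payload, "Submit": "Submit"})


def decoded_id(url):
    """Recover the `id` value a server would see after decoding the query string."""
    return parse_qs(urlsplit(url).query)["id"][0]


if __name__ == "__main__":
    # The last entry is one possible way to append the `#` comment to the payload.
    for payload in ("1", "' OR '1'='1", "' OR '1'='1' #"):
        url = build_probe_url(payload)
        print(url, "->", decoded_id(url))
```

Round-tripping through `decoded_id` is a cheap sanity check that the server-side value matches the intended payload character for character.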


Chapter 4: Reading Between the Lines

When the simple attack fails, XBOW dives into the endpoint's source code to hunt for sloppy coding that hints at a smarter payload that might expose SQLi.

The injection payload didn't show any visible results on the page, which is strange. It's possible that the application is handling errors silently or the payload structure needs adjustment. Let me try a different approach.

Let me first check the source code of the DVWA SQLi vulnerability to understand how it works. I'll look at the low security level implementation to see the exact SQL query structure.
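To give a feel for what "hunting for sloppy coding" can look like in its simplest form, here is a small heuristic scanner that flags lines where a SQL string literal interpolates a PHP variable. It is only a sketch: the regular expression, the function name `find_interpolated_queries`, and the sample PHP lines are all assumptions for illustration, and it says nothing about how XBOW actually reads code.

```python
import re

# Heuristic: a double-quoted string that starts with a SQL verb
# and contains a PHP variable such as $item_id.
SQL_INTERPOLATION = re.compile(
    r'"\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^"]*\$\w+[^"]*"',
    re.IGNORECASE,
)


def find_interpolated_queries(php_source):
    """Return (line_number, stripped_line) pairs that look like interpolated SQL."""
    hits = []
    for number, line in enumerate(php_source.splitlines(), start=1):
        if SQL_INTERPOLATION.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up PHP lines: the first uses a placeholder, the second interpolates input.
    sample = '''$stmt = $db->prepare("SELECT name FROM items WHERE id = ?");
$sql = "SELECT name FROM items WHERE id = '$item_id'";'''
    for number, text in find_interpolated_queries(sample):
        print(number, text)
```

A pattern match like this only produces candidates. It cannot tell whether the variable is attacker-controlled or already sanitized, which is exactly why the static hint is followed by a dynamic test.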


Chapter 5: The Smoking Gun

Aha! XBOW follows the code path for the endpoint it's testing and sees the problem. It then explains why the first attempt may have failed and crafts a new payload tailored to the query.

Perfect! Now I understand the vulnerability. The source code shows:

`$query = "SELECT first_name, last_name FROM users WHERE user_id = '$id';";`

The `$id` parameter is directly inserted into the SQL query without any sanitization. This is a classic SQL injection vulnerability. The query structure shows that it's selecting `first_name` and `last_name` from the `users` table where `user_id = '$id'`.

My previous payload might not have worked because I need to properly close the single quote and comment out the rest. Let me try a simpler approach first. Let me try `1' OR '1'='1` without the hash comment, since the query already ends with a semicolon.

Actually, let me first try a basic time-based SQL injection payload to test the `SLEEP` function. I'll use `1' AND SLEEP(5)#` to see if I can cause a 5-second delay. This will help me understand if time-based injection works before creating the proper request files.
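To make the underlying flaw concrete, the following sketch reproduces the same string-interpolation pattern against an in-memory SQLite database and contrasts it with a parameterized query. Python's `sqlite3` stands in for PHP and MySQL here, so details such as comment syntax and `SLEEP` do not carry over, and the second user row is invented for the demo. It illustrates the principle only, not the behavior of the DVWA target in this trace.

```python
import sqlite3


def make_db():
    """Create an in-memory users table with two rows (the second row is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (user_id TEXT, first_name TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [("1", "admin", "admin"), ("2", "alice", "example")],
    )
    return conn


def lookup_unsafe(conn, user_id):
    # Mirrors the vulnerable pattern: input is spliced directly into the SQL text.
    query = (
        "SELECT first_name, last_name FROM users "
        f"WHERE user_id = '{user_id}';"
    )
    return conn.execute(query).fetchall()


def lookup_safe(conn, user_id):
    # Parameter binding keeps the input as data, never as SQL syntax.
    query = "SELECT first_name, last_name FROM users WHERE user_id = ?;"
    return conn.execute(query, (user_id,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "1' OR '1'='1"
    print("unsafe:", lookup_unsafe(conn, payload))  # WHERE clause becomes always true
    print("safe:  ", lookup_safe(conn, payload))    # payload is compared as a literal
```

With the interpolated version, the payload's quote closes the string literal early and the `OR` condition matches every row. With parameter binding, the same input is treated as an odd-looking ID that matches nothing.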


Chapter 6: Proof of Exploitation

Success! We found a blind SQL injection vulnerability. Finally, we pass the finding to our SQLi verification tool to confirm exploitability and rule out false positives.

Great! The blind SQL injection page is responding perfectly to our time-based payloads.

- A normal request takes 0.126s.
- A `SLEEP(5)` request takes 5.125s.
- A `SLEEP(15)` request takes 15.131s.

This is exactly the behavior we need. The response times are very consistent and predictable, which should allow `check-sqli` to work correctly.

Now, let's create new request files (`request1.txt` and `request2.txt`) targeting the blind SQLi endpoint and run `check-sqli` again.
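The reasoning behind the timing evidence above can be sketched in a few lines: measure how long a request takes, then check that each observed time is roughly the baseline plus the injected sleep. This is not how `check-sqli` works internally (the trace does not show that). The function names, the 0.5-second tolerance, and the `fake_send` stand-in for an HTTP request are all assumptions.

```python
import time


def timed_call(send, payload):
    """Return the wall-clock seconds taken by send(payload)."""
    start = time.perf_counter()
    send(payload)
    return time.perf_counter() - start


def delay_matches(baseline_s, observed_s, sleep_s, tolerance_s=0.5):
    """True if the observed time is about baseline + injected sleep."""
    return abs(observed_s - (baseline_s + sleep_s)) <= tolerance_s


def looks_time_injectable(baseline_s, samples, tolerance_s=0.5):
    """`samples` maps injected sleep seconds to observed seconds; all must match."""
    return bool(samples) and all(
        delay_matches(baseline_s, observed, sleep, tolerance_s)
        for sleep, observed in samples.items()
    )


def fake_send(payload):
    # Stand-in for a real request: pauses briefly when the payload contains SLEEP,
    # scaled down so the demo finishes quickly.
    if "SLEEP(" in payload.upper():
        time.sleep(0.2)


if __name__ == "__main__":
    # Timings reported in the trace: 0.126s baseline, 5.125s and 15.131s with SLEEP.
    print(looks_time_injectable(0.126, {5: 5.125, 15: 15.131}))
    print(round(timed_call(fake_send, "1' AND SLEEP(5)#"), 2))
```

Requiring two different sleep values to both line up, as the trace does with 5 and 15 seconds, makes it much less likely that a slow network or a busy server is mistaken for an injection.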


The Verdict

While this was a simple case, XBOW was able to test the endpoint, locate the relevant source code, and craft an exploit confirming the vulnerability in just minutes. By contrast, a human tester might spend hours or even days sifting through thousands of lines of code to achieve the same result on an enterprise application. With agentic AI combining static analysis and dynamic validation, XBOW finds these issues quickly, delivering accurate results at scale and leaving no vulnerability lurking in the shadows.

Stay tuned for the next episode of Tales from the Trace!

https://xbow-website-b1b.pages.dev/traces/

Ray Kelly
Application Security Consultant

Alvaro Muñoz
Security Researcher