Find the Needle in the Haystack: Real-World Vulnerability Hunting Using LLM
Running CodeQL’s built-in queries on Redis gave me over 6,800 potential issues. Doable, maybe. But when I tried FFmpeg, I got over 51,000. That’s way too many for me. And how many of those are real vulnerabilities? Probably around 0.01%. The sheer number of false positives makes static code analysis impractical - who wants to manually sift through tens of thousands of results just to find a few actual security flaws?

To fix this, we built AutoCQL, an open-source tool that fuses CodeQL with an LLM-driven agent. The agent autonomously navigates the code, running targeted queries to extract only the relevant context. On top of that, we introduced Guided Questioning, an advanced reasoning technique that keeps the LLM focused, improving accuracy even for complex vulnerabilities.

Using this approach, we reduced false positives by up to 97% and discovered two real vulnerabilities - CVE-2025-27151 in Redis and CVE-2025-0518 in FFmpeg - within just a few hours of scanning. Join us, and let’s finally make static analysis work as it should.
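To make the idea concrete, here is a minimal sketch of such a triage loop. All names (`gather_context`, `guided_questions`, `classify`) are illustrative, not the real AutoCQL API: for each raw CodeQL finding, an agent answers a fixed checklist of guided questions, pulling in targeted context for each one, and only findings that pass every question survive.

```python
# Hypothetical sketch of an LLM-assisted triage loop over CodeQL findings.
# The CodeQL and LLM calls are stubbed; in a real tool, gather_context would
# run a targeted CodeQL query and ask_llm would call a language model.

def gather_context(finding, question):
    """Stub for a targeted context query, e.g. 'show callers of this sink'."""
    return f"context for {finding['location']} ({question})"

def guided_questions(finding):
    """Guided Questioning: a fixed checklist that keeps the model focused."""
    return [
        "Is the tainted value user-controlled?",
        "Is there a sanitizer on the path to the sink?",
        "Can the bound actually be exceeded at runtime?",
    ]

def classify(finding, ask_llm):
    """Ask one guided question at a time, feeding in targeted context."""
    answers = []
    for question in guided_questions(finding):
        context = gather_context(finding, question)
        answers.append(ask_llm(question, context))
    # Keep only findings where every guided question indicates a real flaw.
    return "true_positive" if all(answers) else "false_positive"

# Usage with a stub "LLM" that flags only the finding in io.c:
findings = [
    {"location": "src/io.c:42"},   # imagined real flaw
    {"location": "src/log.c:7"},   # imagined false positive
]
stub_llm = lambda question, context: "io.c" in context
kept = [f for f in findings if classify(f, stub_llm) == "true_positive"]
print(len(kept))  # prints 1
```

The point of the checklist structure is that the model never has to judge a finding in one shot; each question is small, answerable from targeted context, and a single "no" discards the finding.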