James Wickett’s Post


CEO & Co-Founder, DryRun Security

The authors of this paper (link in comments) "show that LLM agents can autonomously hack websites," with the agents "performing complex tasks without prior knowledge of the vulnerability. For example, these agents can perform complex SQL union attacks, which involve a multi-step process (38 actions) of extracting a database schema, extracting information from the database based on this schema, and performing the final hack. Our most capable agent can hack 73.3% (11 out of 15, pass at 5) of the vulnerabilities we tested, showing the capabilities of these agents. Importantly, our LLM agent is capable of finding vulnerabilities in real-world websites."
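For readers who have not seen a SQL union attack, here is a minimal sketch of the two extraction stages the quote mentions (schema first, then data based on that schema). It is not the paper's agent or its 38-action sequence, just a toy run against an in-memory SQLite database. The table names, the `search_vulnerable` and `search_safe` functions, and the sample data are all hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price TEXT);
        CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, secret TEXT);
        INSERT INTO products (name, price) VALUES ('widget', '9.99');
        INSERT INTO users (username, secret) VALUES ('admin', 'hunter2');
        """
    )
    return conn


def search_vulnerable(conn, term):
    # Vulnerable on purpose: user input is concatenated into the SQL text.
    query = "SELECT name, price FROM products WHERE name = '" + term + "'"
    return conn.execute(query).fetchall()


def search_safe(conn, term):
    # Parameterized query: the input is bound as data, never parsed as SQL.
    query = "SELECT name, price FROM products WHERE name = ?"
    return conn.execute(query, (term,)).fetchall()


conn = build_demo_db()

# Stage 1: extract the schema by UNION-ing in SQLite's schema table.
schema_payload = "' UNION SELECT name, sql FROM sqlite_master WHERE type='table' --"
schema_rows = search_vulnerable(conn, schema_payload)

# Stage 2: use the table and column names learned in stage 1 to pull data.
data_payload = "' UNION SELECT username, secret FROM users --"
leaked_rows = search_vulnerable(conn, data_payload)

# The same payload against the parameterized query matches no product name.
safe_rows = search_safe(conn, data_payload)

print("tables found:", [name for name, _ in schema_rows])
print("leaked rows:", leaked_rows)
print("safe query rows:", safe_rows)
```

The point of the sketch is that the second payload can only be written after the first one succeeds, which is the kind of chained reasoning the authors describe. The parameterized version shows the standard defense.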

James Wickett

CEO & Co-Founder, DryRun Security

2mo

We're replacing legacy SAST using LLMs at DryRun Security, so this paper doesn't surprise me, but I think this will be huge for the pen testing and services industry.

Alberto Alonso

AI/ML Principal Software Architect

2mo

Anything that helps detect vulnerabilities prior to production is a good thing. Of course, it needs to become a step in the QA/dev process.

I know there are a couple of startups in the space that have been working on this over the past year, so we'll see a lot of this very soon.


Great article with some pretty interesting implications. I can't say I am overly surprised. It seems like a fairly natural progression and use of LLM technology in the security field, given that one of the primary failure points of automated tools over the years has consistently been business logic flaws chaining into larger breaches.

Alexander D.

DevOps | Network Engineer

2mo

Maybe this is going to surprise someone, but AI cannot be "creative" at all. All this article is about is simple network scanners and scan flows with some side attack workflow documentation, just like any other regular pen test workflow.
