Autonomous Penetration Testing with CyberGuardian: A Large Language Model-Based Approach

Authors: Jeslin Hashly, Jestin K Sunil, Alby Shinoj

Abstract: This paper introduces CyberGuardian, a new LLM-based agent for autonomous penetration testing. CyberGuardian consists of two modules, a planner and a summarizer, which cooperate to generate and execute commands iteratively. To evaluate CyberGuardian, we introduce two new benchmark suites built on the popular Capture the Flag (CTF) platforms PicoCTF and OverTheWire, comprising 200 challenges across various domains and difficulty levels. Our experiments examine CyberGuardian's most critical parameters, such as the LLM's creativity level and token usage. The results reaffirm the need for sound security procedures and show how LLM-based agents can advance autonomous penetration testing.

DOI: https://doi.org/10.5281/zenodo.16419170
