ANALYSIS OF MACHINE LEARNING METHODS FOR AUTOMATING PENETRATION TESTING
DOI:
https://doi.org/10.28925/2663-4023.2025.27.711
Keywords:
machine learning; penetration testing; deep learning; large language models; decision trees; SVMs
Abstract
Automating penetration testing with machine learning methods is one of the most promising areas in modern cybersecurity. Traditional penetration testing requires significant resources, including financial ones, and the involvement of highly qualified specialists capable of comprehensively assessing system security. This approach does not always detect new threats quickly enough, especially given the ever-increasing complexity of cyberattacks and the large number of vulnerabilities. Introducing machine learning methods into the pentesting process makes it possible to build flexible, adaptive systems that not only automate routine tasks but also increase the accuracy and efficiency of vulnerability detection. This article provides an overview of the key machine learning algorithms that can be used to automate penetration testing, including support vector machines, random forests, naive Bayes, decision trees, and reinforcement learning methods. Each of these algorithms offers certain advantages in vulnerability analysis, threat classification, and prioritisation of critical security issues. Special attention is paid to the role of large language models (LLMs) in the automation process. They can analyse logs, classify threats, generate reports, and even recommend fixes for identified vulnerabilities. Such models can significantly increase the productivity of specialists by performing routine tasks automatically, which is especially useful when integrated with CI/CD processes. At the same time, the use of LLMs has certain limitations, such as dependence on up-to-date data and high computing costs. The article also discusses the challenges and limitations of implementing machine learning algorithms in the pentesting process, such as the need for large amounts of high-quality training data, high demand for computing resources, and the risk of false positives.
The results of the study demonstrate that machine learning algorithms have significant potential to improve the efficiency of automated penetration testing, especially in large infrastructures with numerous vulnerabilities.
License
Copyright (c) 2025 Анастасія Журавчак, Андріян Піскозуб

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.