ANALYSIS OF MACHINE LEARNING METHODS FOR AUTOMATING PENETRATION TESTING

Authors

DOI:

https://doi.org/10.28925/2663-4023.2025.27.711

Keywords:

machine learning; penetration testing; deep learning; large language models; decision trees; SVMs

Abstract

Automating penetration testing with machine learning methods is one of the most promising areas of modern cybersecurity. The traditional approach to penetration testing requires significant resources, including financial ones, and the involvement of highly qualified specialists capable of conducting a comprehensive assessment of system security. It does not always detect new threats quickly enough, especially given the ever-increasing complexity of cyberattacks and the large number of vulnerabilities. Introducing machine learning methods into the pentesting process makes it possible to create flexible, adaptive systems that not only automate routine tasks but also increase the accuracy and efficiency of vulnerability detection. This article provides an overview of the key machine learning algorithms that can be used to automate penetration testing, including support vector machines, random forests, naive Bayes, decision trees, and reinforcement learning methods. Each of these algorithms offers certain advantages for vulnerability analysis, threat classification, and prioritisation of critical security issues. Special attention is paid to the role of large language models (LLMs) in the automation process. They can analyse logs, classify threats, generate reports, and even recommend fixes for identified vulnerabilities. Such models can significantly increase the productivity of specialists by performing routine tasks automatically, which is especially useful when integrated with CI/CD processes. At the same time, the use of LLMs has certain limitations, such as dependence on up-to-date data and high computing costs. The article also discusses the challenges and limitations of implementing machine learning algorithms in the pentesting process, such as the need for a large amount of high-quality data to train models, high demand for computing resources, and the risks associated with possible false positives.
The results of the study demonstrate that machine learning algorithms have significant potential to improve the efficiency of automated penetration testing, especially in large infrastructures with numerous vulnerabilities.
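
As an illustration of the threat classification and prioritisation task mentioned above, the following is a minimal sketch of one of the listed algorithms, naive Bayes, applied to short textual descriptions of findings. It is not taken from the article: the class name `NaiveBayesTriage`, the two severity labels, and the six training descriptions are hypothetical and serve only to make the example self-contained. A practical system would need the large, high-quality training set that the abstract identifies as a limitation.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-written training data (an assumption for illustration only).
TRAINING_FINDINGS = [
    ("remote code execution in exposed admin service", "critical"),
    ("sql injection allows authentication bypass", "critical"),
    ("unauthenticated remote command injection", "critical"),
    ("missing security header on static page", "low"),
    ("verbose server banner discloses version", "low"),
    ("cookie without secure flag on static page", "low"),
]


class NaiveBayesTriage:
    """Multinomial naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()             # number of findings per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocabulary = set()                   # all tokens seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_scores(self, text):
        """Return the unnormalised log posterior of each label for a description."""
        tokens = text.lower().split()
        total_findings = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total_findings)  # log prior
            label_total = sum(self.token_counts[label].values())
            for token in tokens:
                # Add-one (Laplace) smoothing keeps unseen tokens from zeroing the score.
                seen = self.token_counts[label][token]
                score += math.log((seen + 1) / (label_total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


texts, labels = zip(*TRAINING_FINDINGS)
model = NaiveBayesTriage().fit(texts, labels)
print(model.predict("remote sql injection in admin login"))
```

The predicted label could then be used to order findings in a report so that those classified as critical are reviewed first. Misclassifications of the kind the abstract describes as false positives remain possible with any such classifier.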

Published

2025-03-27

How to Cite

Zhuravchak, A., & Piskozub, A. (2025). ANALYSIS OF MACHINE LEARNING METHODS FOR AUTOMATING PENETRATION TESTING. Electronic Professional Scientific Journal «Cybersecurity: Education, Science, Technique», 3(27), 54–62. https://doi.org/10.28925/2663-4023.2025.27.711