A METHOD FOR INCREASING THE DETECTION EFFICIENCY OF INSIDER THREATS USING GAN AUGMENTATION
DOI:
https://doi.org/10.28925/2663-4023.2025.31.1087
Keywords:
insider threats, generative adversarial networks, data augmentation, anomaly detection, information security
Abstract
In modern corporate information systems, insider threats account for a significant proportion of information security incidents, which places new requirements on security event monitoring and analysis systems. Unlike external attacks, insider activity is disguised as the routine work of legitimate users and is therefore difficult to describe with classic signature-based or perimeter protection mechanisms. A further difficulty is the extreme class imbalance in event logs: records of typical daily activity outnumber recorded incidents by a factor of thousands, which degrades the quality of standard machine learning algorithms. The article develops an approach to improving insider threat detection by augmenting data with generative adversarial networks, in particular the Conditional Tabular GAN (CTGAN) architecture.
A process for preparing behavioral logs is proposed. It involves aggregating multi-channel events to the "user-day" level, constructing a vector of dynamic behavioral features and static context, applying logarithmic normalization to heavy-tailed features, and scaling to the range [–1; 1], which ensures stable training of the generative model. CTGAN is configured to model the conditional distribution of tabular data for the minority class (insider attacks), taking into account the user's role and department. Each continuous feature receives specialized normalization that allows multimodal distributions to be reproduced correctly, while discrete variables use the Gumbel-Softmax technique, which makes training by error backpropagation possible. The proposed method is promising for integration into SIEM/UEBA-class systems and for further combination with explainable artificial intelligence methods.
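The preparation steps and the Gumbel-Softmax relaxation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event tuples, the channel names, the use of min-max scaling to reach [-1, 1], and the temperature value are all assumptions made for illustration, and CTGAN's own specialized normalization of continuous features is not reproduced here.

```python
import math
import random
from collections import defaultdict

# Hypothetical raw events as (user, day, channel); names are illustrative only.
EVENTS = [
    ("u1", "2025-01-01", "logon"),
    ("u1", "2025-01-01", "file"),
    ("u1", "2025-01-01", "file"),
    ("u1", "2025-01-02", "email"),
    ("u2", "2025-01-01", "logon"),
    ("u2", "2025-01-01", "usb"),
]
CHANNELS = ["logon", "file", "email", "usb"]


def aggregate_user_day(events):
    """Count events per channel for each (user, day) pair."""
    counts = defaultdict(lambda: dict.fromkeys(CHANNELS, 0))
    for user, day, channel in events:
        counts[(user, day)][channel] += 1
    return dict(counts)


def log_scale(rows):
    """Apply log1p to each count, then min-max scale each feature to [-1, 1].

    Min-max scaling is an assumption; the abstract only states the target range.
    """
    logged = {k: {c: math.log1p(v[c]) for c in CHANNELS} for k, v in rows.items()}
    scaled = {k: {} for k in logged}
    for c in CHANNELS:
        column = [r[c] for r in logged.values()]
        lo, hi = min(column), max(column)
        span = hi - lo
        for k, r in logged.items():
            # A constant column carries no information; map it to 0.0.
            scaled[k][c] = 0.0 if span == 0 else 2.0 * (r[c] - lo) / span - 1.0
    return scaled


def gumbel_softmax(logits, tau=0.5, rng=random):
    """Draw one relaxed (soft) one-hot sample from categorical logits.

    tau is an illustrative temperature; lower values give sharper samples.
    """
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))      # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    top = max(noisy)                          # subtract max for numerical stability
    exps = [math.exp(x - top) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    features = log_scale(aggregate_user_day(EVENTS))
    for key, row in sorted(features.items()):
        print(key, row)
    # Soft sample over three hypothetical role categories.
    print(gumbel_softmax([2.0, 0.5, 0.1], rng=random.Random(0)))
```

Because the soft sample is a differentiable function of the logits, a generator that emits discrete fields this way can still be trained by gradient descent, which is the property the abstract refers to.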
License
Copyright (c) 2025 Сергій Зибін, Віталій Вербиненко

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.