ANALYSIS OF METHODS FOR PREDICTING INTERNAL THREATS BASED ON THE ANALYSIS OF DATA FROM THE SOCIAL NETWORK TWITTER
DOI:
https://doi.org/10.28925/2663-4023.2025.28.818
Keywords:
Natural Language Processing, Long Short-Term Memory, Random Forest, internal threats, forecast quality assessment, forecasting
Abstract
Internal threats are a significant problem for the effective functioning of organizations. Researchers usually identify internal threats by analyzing user behavior and traffic, and most existing works focus on detecting internal threats rather than predicting them. This work focuses on methods for predicting internal threats through the analysis of social network data. The article compares the effectiveness of the following prediction methods: dictionary-based sentiment analysis, machine learning (Random Forest), recurrent neural networks (LSTM), transformers (BERT), a hybrid approach (NLP CBOW + graph neural networks), and a hybrid approach (NLP CBOW + LPA). To ensure the reliability of the results, the methods were evaluated using several key metrics: Precision, Recall, F1-score, ROC-AUC, training time, and inference time. Experiments were conducted on the Sentiment140 dataset, which contains 1.6 million tweets labeled as positive or negative. Python libraries were used to process the data and build the prediction models. 80% of the data was used to train the models, and the remaining 20% was used for prediction. The results showed that dictionary-based sentiment analysis is the fastest method for predicting incidents. It can be used to track changes in sentiment within an organization through online analysis of social networks, and each organization can set its own scale for responding to internal incidents. The analysis also found that hybrid approaches, which analyze both the text content and the structure of social connections, demonstrate the highest accuracy in predicting internal incidents. However, these approaches require significant computing resources, large amounts of training data, and more training time, and they have low interpretability.
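The evaluation setup described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the word lists, the toy tweets, the 1/0 label encoding, and the helper names `lexicon_predict`, `split_80_20`, and `precision_recall_f1` are all assumptions made for illustration. The sketch covers only the dictionary-based method, an 80/20 split, Precision, Recall, F1-score, and inference time; ROC-AUC and the trained models are omitted.

```python
import random
import time

# Hypothetical toy lexicons; a real study would use a published sentiment dictionary.
POSITIVE = {"good", "great", "love", "happy", "thanks"}
NEGATIVE = {"bad", "hate", "angry", "awful", "sad"}


def lexicon_predict(text):
    """Return 1 (positive) if positive words are at least as frequent as negative ones, else 0."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score >= 0 else 0


def split_80_20(rows, seed=0):
    """Shuffle the rows reproducibly and split them 80% / 20%."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]


def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall and F1-score, treating label 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labelled tweets: 1 = positive sentiment, 0 = negative sentiment.
TWEETS = [
    ("love the new office, great team", 1),
    ("thanks for the happy news", 1),
    ("good day at work", 1),
    ("great launch, love it", 1),
    ("happy with the review", 1),
    ("i hate this awful schedule", 0),
    ("angry about the bad decision", 0),
    ("sad and angry today", 0),
    ("awful meeting, bad manager", 0),
    ("hate the new policy", 0),
]

# The lexicon method needs no training, so the 80% part is unused here.
train, test = split_80_20(TWEETS)

start = time.perf_counter()
preds = [lexicon_predict(text) for text, _ in test]
inference_seconds = time.perf_counter() - start

precision, recall, f1 = precision_recall_f1([label for _, label in test], preds)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} "
      f"inference_time={inference_seconds:.6f}s")
```

Because the dictionary-based scorer only counts word matches, it has no training phase, which is consistent with the abstract's observation that it is the fastest of the compared methods.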
License
Copyright (c) 2025 Тамара Радівілова

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.