MATHEMATICAL METHODS IN CYBER SECURITY: CLUSTER ANALYSIS AND ITS APPLICATION IN INFORMATION AND CYBERNETIC SECURITY
DOI:
https://doi.org/10.28925/2663-4023.2024.23.258273
Keywords:
mathematical methods; cluster analysis; information security; cyber security; “nearest neighbor” algorithm; “k-means” algorithm; “fuzzy c-means” algorithm; “cosine similarity” algorithm
Abstract
The number and complexity of information threats drive research into, and modeling of, new methodologies and information protection systems. Developing and improving information and cyber security systems involves building and processing mathematical models with the help of information technologies. This article continues a series of studies on the application of mathematical methods and technologies in cyber security and focuses on methods of cluster analysis. Advances in computer technology and the growth of computing power have led to the wide adoption of data mining algorithms for processing large volumes of information in many areas of society and science, cyber security among them. Cluster analysis divides a set into subsets so that the elements within each subset are similar to one another and the elements of different subsets differ as much as possible. This makes it possible to eliminate the shortcomings of the qualitative approach to assessing information risks. The paper reviews scientific sources on the applied use of clustering methods in security systems, since timely forecasting of possible incidents makes it possible to manage information risks and make effective decisions that ensure the confidentiality, availability, and integrity of information. The stages of the clustering procedure are described, and the choice of distance and similarity measures for the objects under study is discussed. The most popular methods of cluster analysis are compared: the “nearest neighbor” algorithm, “k-means”, “fuzzy c-means”, and “cosine similarity”, and their advantages and disadvantages are identified. The study may be useful in the educational process for students of specialty 125 “Cyber security and information protection”.
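As a rough illustration of the distance and similarity measures and of the k-means procedure named in the abstract, the following Python sketch may help. The function names, the toy two-feature data, and the choice to seed the centroids with the first k points are assumptions made for this example only; they are not taken from the paper and are not a recommended initialisation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_means(points, k, iterations=100):
    """Plain k-means. Assumption: the first k points seed the centroids,
    chosen here only to keep the example deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical toy observations with two numeric features each.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = k_means(points, 2)
print(labels)
print(cosine_similarity((1.0, 2.0), (2.0, 4.0)))
```

Euclidean distance depends on vector magnitude, while cosine similarity depends only on direction, which is one reason the choice of measure matters for the objects under study.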
License
Copyright (c) 2024 Світлана Шевченко, Юлія Жданова, Світлана Спасітєлєва, Наталія Мазур, Павло Складанний, Віталій Негоденко
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.