MATHEMATICAL METHODS IN CYBER SECURITY: CLUSTER ANALYSIS AND ITS APPLICATION IN INFORMATION AND CYBERNETIC SECURITY

Authors

DOI:

https://doi.org/10.28925/2663-4023.2024.23.258273

Keywords:

mathematical methods; cluster analysis; informational security; cyber security; “nearest neighbor” algorithm, “k-means” algorithm, “fuzzy c-means” algorithm, “cosine similarity” algorithm.

Abstract

The huge number of information threats and their complexity prompts research and modeling of new methodologies and information protection systems. The development and improvement of information and cyber security systems includes the creation and processing of mathematical models using information technologies. This article is a follow-up study on the application of mathematical methods and technologies in cyber security, namely: methods of cluster analysis. The modern development of computer technology and the growth of their power have contributed to the wide implementation of Data Mining algorithms for processing large volumes of information in various fields of society and science, in particular in the field of cyber security. Cluster analysis allows the set to be divided into subsets, so that the elements of each subset are similar to each other, and the elements of different subsets are the most different. This provides an opportunity to eliminate the shortcomings of the qualitative approach in assessing information risks. The paper reviews scientific sources regarding the applied aspect of the application of clustering methods in security systems, because timely forecasting of possible incidents allows you to manage information risks and make effective decisions to ensure confidentiality, availability and integrity of information. The stages of the clustering procedure are characterized, the issues of choosing the distance measure and the similarity measure for the objects under study are highlighted. The comparative characteristics of the most popular methods of cluster analysis are presented: the “nearest neighbor” algorithm, “k-means”, “fuzzy c-means”, “cosine similarity”, their advantages and disadvantages are defined. This study can be useful and used in the educational process of students of the specialty 125 “Cyber security and information protection”.

Downloads

Download data is not yet available.

References

Shevchenko, S., et al. (2019) Mathematical Methods in Cybersecurity: Fractals and their Applications in Information And Cyber Security. Cybersecurity: education, science, technique, 1(5), 31–39.

Shevchenko, S., et al. (2021). Mathematical Methods in Cibersecurity: Graphs and their Application in Information and Cybernetic Security. Cybersecurity: education, science, technique, 1(13), 133–144.

Shevchenko, S., et al. (2022). Study of applied aspects of conflict theory in security systems. Cybersecurity: education, science, technique, 2(18), 150–162.

Shevchenko, S., et al. (2023). Conflict Analysis in the Information Security System: Subject – Subject. CEUR Workshop Proceedings, 3421. 56–66.

Shevchenko, S., Zhdanovа, Yu., & Spasiteleva, S. (2023) Mathematical Methods in Cybersecurity: Catastrophe Theory. Cybersecurity: education, science, technique, 3(19), 165–175.

Shevchenko, S., et al. (2023) Game Theoretical Approach to the Modeling Of Conflicts in Information Security Systems. Cybersecurity: education, science, technique, 2(22), 168–178.

Levkin, D., Zhernovnykova, O., & Kotko, Y. (2023). Modern mathematical methods in the cyber security system. Mechanisms for ensuring sustainable development of the economy: problems, prospects, international experience. Materials of the IV international scientific and practical Internet conference.

Lysenko, N., et al. (2021) Review of Mathematical Methods in Cyber Threat Detection and Prevention Systems. Actual problems of automation and information technology, 25, 91–102. http://dx.doi.org/10.15421/432110

Bu, C. (2018). Network Security Based on K-Means Clustering Algorithm in Data Mining Research. Advances in Computer Science Research, 83, 642–645. https://doi.org/10.2991/snce-18.2018.130

Cheon, J., Kim, D., & Park, J. (2019). Towards a Practical Cluster Analysis over Encrypted Data. Conference: Selected Areas in Cryptography (SAC), 1–24.

Raptis, G., Katsini, C., & Alexakos, C. (2021). Towards Automated Matching of Cyber Threat Intelligence Reports based on Cluster Analysis in an Internet-of-Vehicles Environment, 2021 IEEE International Conference on Cyber Security and Resilience (CSR), 366–371, https://doi.org/10.1109/CSR51186.2021.9527983

Gao, Y., et al. (2022). HinCTI: A Cyber Threat Intelligence Modeling and Identification System Based on Heterogeneous Information Network. IEEE Transactions on Knowledge and Data Engineering, 34(2), 708–722. https://doi.org/10.1109/TKDE.2020.2987019

Poh, J., et al. (2020). Physical Access Log Analysis: An Unsupervised Clustering Approach for Anomaly Detection. DSIT 2020: Proceedings of the 3rd International Conference on Data Science and Information Technology, 12–18. https://doi.org/10.1145/3414274.3414285

Rosli, N., et al. (2019). Clustering Analysis for Malware Behavior Detection using Registry Data. International Journal of Advanced Computer Science and Applications (IJACSA), 10(12). http://dx.doi.org/10.14569/IJACSA.2019.0101213

Lysenko, S., & Humenyuk, V. (2017). Malware detection method based on the nearest neighbor algorithm. Bulletin of the Khmelnytskyi National University, 6, 2017 (255), 96–101.

REDDY K.T. (2023). Unveiling the Power of k-Nearest Neighbors in Phishing Detection, Insights2Techinfo. https://insights2techinfo.com/unveiling-the-power-of-k-nearest-neighbors-in-phishing-detection/

Kuehn, P., et al. (2022). Clustering of Threat Information to Mitigate Information Overload for Computer Emergency Response Teams. https://arxiv.org/abs/2210.14067

Patton, R., et al. (2011). Hierarchical clustering and visualization of aggregate cyber data. 2011 7th International Wireless Communications and Mobile Computing Conference, 1287–1291. https://doi.org/10.1109/IWCMC.2011.5982725

Dovbysh, A., et al. (2021). Fundamentals of information-extreme synthesis of an automated cyber defense control system. Modern information technologies in cyber security, 7–75.

Lysenko, S. (2019). A method of ensuring the resilience of computer systems in the face of cyber threats based on self-adaptability. Radioelectronic and computer systems, 4(92), 4–16.

Gerasina, O., et al. (2022).Detecting fishing URLs using fuzzy clustering algorithms with global optimization. System technologies, 2(139), 53–67.

Landauer, M., et al. (2020). System log clustering approaches for cyber security applications: A survey. Computers & Security, 92, 1–18. https://doi.org/10.1016/j.cose.2020.101739

Goncharenko, S. (1997). Ukrainian Pedagogical Dictionary. Lybid.

Jain, A., & Dubes, R. (1988). Algorithms for clustering data. Prentice-Hall, Inc, Upper Saddle River.

Xu, R., & Wunsch, D. (2005) Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3), 645–678. https://doi.org/10.1109/TNN.2005.845141

Yarovy, A., & Strakhov, E. (2015). Multivariate statistical analysis: an introductory methodological guide for students of mathematics and economics. Astroprint.

Xu, D., & Tian, Y. (2015). Comprehensive Survey of Clustering Algorithms. Ann. Data. Sci. 2, 165–193. https://doi.org/10.1007/s40745-015-0040-1

Abdul Nazeer, K., & Sebastian, M. (2009). Improving the Accuracy and Efficiency of the k-means Clustering Algorithm. Proceedings of the World Congress on Engineering, I.

Dunn, J. (1973) A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters. Journal of Cybernetics, 3, 32–57. http://dx.doi.org/10.1080/01969727308546046

Bezdek, J. (1981). Pattern recognition with fuzzy objective function algorithms. Plenum Press.

Chen, Z. (2022) Research and Application of Clustering Algorithm for Text Big Data. Comput Intell Neurosci. https://doi.org/10.1155/2022/7042778

Salton, G. (1988). Automatic text processing. Addison-Wesley Longman Publishing.

Sidorov, G., et al. (2014). Soft Similarity and Soft Cosine Measure: Similarity of Features in Vector Space Model. Computación y Sistemas, 18(3), 491–504. https://doi.org/10.13053/CyS-18-3-2043

Vijaymeena, M., & Kavitha, K. (2016). A Survey on Similarity Measures in Text Mining. Machine Learning and Applications: An International Journal, 3, 19–28. https://doi.org/10.5121/mlaij.2016.3103

Downloads


Abstract views: 301

Published

2024-03-28

How to Cite

Shevchenko, S., Zhdanovа Y., Spasiteleva, S., Mazur , N., Skladannyi, P., & Nehodenko, V. (2024). MATHEMATICAL METHODS IN CYBER SECURITY: CLUSTER ANALYSIS AND ITS APPLICATION IN INFORMATION AND CYBERNETIC SECURITY. Electronic Professional Scientific Journal «Cybersecurity: Education, Science, Technique», 3(23), 258–273. https://doi.org/10.28925/2663-4023.2024.23.258273

Most read articles by the same author(s)

1 2 3 4 > >>