APPLICATION OF METRIC METHODS OF HISTOGRAM COMPARISON FOR DETECTING CHANGES IN ENCRYPTED NETWORK TRAFFIC

Authors

  • Ihor Subach, Institute of Special Communications and Information Security, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, https://orcid.org/0000-0002-9344-713X
  • Dmytro Sharadkin, Institute of Special Communications and Information Security, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, https://orcid.org/0000-0001-6407-8040
  • Ihor Yakoviv, Institute of Special Communications and Information Security, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, https://orcid.org/0000-0001-7432-898X

DOI:

https://doi.org/10.28925/2663-4023.2024.25.434448

Keywords:

cybersecurity; cyber incident; machine learning; encrypted network traffic; time series; histogram similarity metrics; similarity detection algorithms.

Abstract

As the share of encrypted traffic on the Internet grows, the causes of anomalies in network behavior can no longer be identified directly, because the contents of encrypted packets are inaccessible. This significantly complicates the task of identifying information security threats. Only external symptoms are available for analysis; they manifest as changes in basic traffic parameters such as volume, intensity, and delays between packets. As a result, algorithms for detecting changes in traffic have become more important. Using modern methods such as machine learning, these algorithms can identify various types of anomalies, including previously unknown ones. They analyze network traffic parameters that can be measured directly and represent their evolution as time series. One of the least studied classes of such algorithms directly compares histograms of the distribution of time series values over different time intervals; metric algorithms form a subclass of this class. These algorithms rest on the assumption that differences between histograms of time series values at adjacent observation intervals indicate changes in the flow of events that generates network traffic. However, the problem of measuring the difference or similarity between histograms, treated as objects in a multidimensional space, has no unambiguous solution. The paper analyzes existing histogram similarity metrics and describes a series of studies based on statistical modeling. These studies evaluated how algorithm efficiency depends on external parameters and compared algorithms of this class with other change detection algorithms. This analysis made it possible to assess the practical applicability of these algorithms.
The results showed that metric algorithms for comparing histograms can demonstrate high performance and, in some cases, outperform other known algorithms for detecting changes in time series. They ensure a reduction in the number of false positives and a decrease in the delay between the moment a change appears in the observed object and the moment it is detected by the algorithm.
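
The idea of comparing histograms at adjacent observation intervals can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed non-overlapping windows, the shared bin range taken from the whole series, the threshold value, and the choice of two measures (histogram intersection and Bhattacharyya distance) are all assumptions made for illustration only.

```python
import math
import random


def normalized_histogram(values, lo, hi, bins):
    """Relative frequencies of `values` over `bins` equal-width bins on [lo, hi]."""
    if hi <= lo:
        hi = lo + 1.0  # degenerate range: avoid a zero bin width
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        clipped = min(max(v, lo), hi)
        idx = min(int((clipped - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def intersection_distance(p, q):
    """1 minus histogram intersection: 0 for identical, 1 for disjoint histograms."""
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Negative log of the Bhattacharyya coefficient (floored to keep it finite)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(max(bc, 1e-12))


def detect_changes(series, window, bins, threshold, distance):
    """Return boundaries t where histograms of the adjacent windows
    series[t-window:t] and series[t:t+window] differ by more than `threshold`."""
    # Assumption: the bin range comes from the whole series (offline setting).
    lo, hi = min(series), max(series)
    alarms = []
    for t in range(window, len(series) - window + 1, window):
        left = normalized_histogram(series[t - window:t], lo, hi, bins)
        right = normalized_histogram(series[t:t + window], lo, hi, bins)
        if distance(left, right) > threshold:
            alarms.append(t)
    return alarms


if __name__ == "__main__":
    # Synthetic stand-in for a traffic parameter (e.g. inter-packet delay)
    # whose mean shifts halfway through the observation period.
    rng = random.Random(0)
    series = [rng.gauss(1.0, 0.1) for _ in range(400)]
    series += [rng.gauss(2.0, 0.1) for _ in range(400)]
    print(detect_changes(series, window=100, bins=10, threshold=0.5,
                         distance=intersection_distance))
```

In this sketch the window length, the number of bins, and the threshold play the role of the external parameters mentioned above: shorter windows shorten the detection delay but make the histograms noisier, which tends to increase false positives.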

Published

2024-09-25

How to Cite

Subach, I., Sharadkin, D., & Yakoviv, I. (2024). APPLICATION OF METRIC METHODS OF HISTOGRAM COMPARISON FOR DETECTING CHANGES IN ENCRYPTED NETWORK TRAFFIC. Electronic Professional Scientific Journal «Cybersecurity: Education, Science, Technique», 1(25), 434–448. https://doi.org/10.28925/2663-4023.2024.25.434448