STUDYING THE RESISTANCE OF BIOMETRIC AUTHENTICATION SYSTEMS TO ATTACKS USING VOICE CLONING TECHNOLOGY BASED ON DEEP NEURAL NETWORKS
DOI:
https://doi.org/10.28925/2663-4023.2024.26.670
Keywords:
voice cloning, biometric authentication systems, deep neural networks, security, voice synthesis, WaveNet, Tacotron 2
Abstract
As voice synthesis technologies based on deep neural networks have developed, security threats to biometric authentication systems that rely on voice recognition have increased. These systems, once considered reliable, can be easily compromised by fake voices created with advanced models such as WaveNet, Tacotron 2, and other modern algorithms. In today's cybersecurity environment, such attacks jeopardize the confidentiality of personal data, which makes improved protection methods necessary.
The purpose of this article is to study the resilience of biometric authentication systems to attacks using voice cloning technology, to analyze the effectiveness of modern synthesis methods for circumventing such systems, and to provide a comparative overview of various approaches to protect voice biometric data. The article discusses technologies that allow for the creation of accurate and realistic synthetic voices, as well as methods for detecting and protecting against fake signals. The article also analyzes the current vulnerabilities of voice systems and suggests strategies to increase resistance to such attacks, providing users with greater security and privacy.
References
Oleshko, I. (2012). Comparative analysis of biometric authentication methods based on the relative entropy criterion. Bulletin of Lviv Polytechnic National University: Automation, Measurement and Control, 741.
Kustov, A. (2020). Spoofing attacks on biometric authentication systems and methods of countering attacks. Radio electronics and youth in the XXI century: materials of the 24th International Youth Forum, 5, 76–77.
Kishchenko, M. I., & Pastushenko, M. S. (2021). Directions for improving the efficiency of voice authentication systems. Seventh International Scientific and Technical Conference “Problems of electromagnetic compatibility of advanced wireless communication networks (EMC-2021)”, 20–23.
Mohammadi, A., Sood, K., Nazari, A., & Thiruvady, D. (2024). Securing Voice Authentication Applications Against Targeted Data Poisoning. https://doi.org/10.48550/arXiv.2406.17277
Approaches to Address AI-enabled Voice Cloning. (2024). https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/2024/04/approaches-address-ai-enabled-voice-cloning
Milewski, K., Zaporowski, S., & Czyżewski, A. (2023). Comparison of the Ability of Neural Network Model and Humans to Detect a Cloned Voice. Electronics, 12(21). https://doi.org/10.3390/electronics12214458
Maksymenko, O. A. (2019). Bachelor’s thesis: “Generation of the target human voice using neural networks”. National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”.
Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., & Kavukcuoglu, K. (2016). WaveNet: A Generative Model for Raw Audio. https://doi.org/10.48550/arXiv.1609.03499
Victor, A. O., & Ali, M. I. (2024). Enhancing Time Series Data Predictions: A Survey of Augmentation Techniques and Model Performance. ACSW’24: Proceedings of the 2024 Australasian Computer Science Week, 1–13. https://doi.org/10.1145/3641142.364114
Shen, J., et al. (2017). Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. https://doi.org/10.48550/arXiv.1712.05884
Chapuzet, A. (n. d.). Speech Synthesis (TTS), How to Use It and Why Is It So Important? https://vivoka.com/how-to-speech-synthesis-tts
Verma, U., & Padmanaban, R. (2024). Speech Cloning: Text-To-Speech Using VITS. Engineering and Technology Journal, 9(5). https://doi.org/10.47191/etj/v9i05.10
Kim, J., Kong, J., & Son, J. (2021). Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech. https://doi.org/10.48550/arXiv.2106.06103
Malyshev, A. (2023). Voice Cloning: A Blessing or a Curse for the Voice Banking Industry? https://www.finextra.com/blogposting/23813/voice-cloning-a-blessing-or-a-curse-for-the-voice-banking-industry
Cox, J. (2023). How I Broke Into a Bank Account With an AI-Generated Voice. https://www.vice.com/en/article/how-i-broke-into-a-bank-account-with-an-ai-generated-voice/
Audio samples from “Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions”. (n. d.). https://google.github.io/tacotron/publications/tacotron2/index.html
Hulak, H. M., Zhiltsov, O. B., Kyrychok, R. V., Korshun, N. V., & Skladannyi, P. M. (2024). Information and cyber security of the enterprise. Textbook. Lviv: Publisher Marchenko T. V.
License
Copyright (c) 2024 Тетяна Савкова, Іван Опірський, Дмитро Сабодашко
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.