VoiceGuard – AI-Based Voice Authenticity Detection System


Authors: Dr. C. Saravanabhavan, Akhil R

Abstract: Recent advances in deep learning have enabled highly realistic synthetic speech, creating serious risks such as impersonation, fraud, and misuse of voice-based authentication systems. Detecting AI-generated speech is increasingly difficult because modern text-to-speech and voice conversion models can closely imitate human prosody and timbre across languages. This paper proposes VoiceGuard, a hybrid deep learning framework that combines complementary spectral and temporal representations for deepfake voice detection. A Convolutional Neural Network (CNN) branch learns frequency-domain artifacts from spectrograms, while a CNN-GRU branch models temporal inconsistencies from acoustic descriptors. An attention-based fusion mechanism adaptively weights branch outputs to improve discriminative power. The framework is evaluated on benchmark datasets and cross-lingual settings, and it improves performance compared to single-representation approaches while remaining computationally practical for real-world deployment.
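The attention-based fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion formula, so everything below is an assumption made for illustration: each branch is taken to produce a fixed-length embedding, a hypothetical scoring vector (which would be learned in a real model) assigns each branch a scalar score, and a softmax turns those scores into weights for a weighted sum. The names `attention_fuse`, `scoring_vector`, `spectral`, and `temporal` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(branch_embeddings, scoring_vector):
    """Fuse equal-length branch embeddings with softmax attention weights.

    Assumption: each branch is scored by a dot product with a shared
    scoring vector. The paper's actual fusion mechanism may differ.
    """
    # One scalar relevance score per branch.
    scores = [
        sum(w * x for w, x in zip(scoring_vector, emb))
        for emb in branch_embeddings
    ]
    # Normalize scores into weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of the branch embeddings, dimension by dimension.
    dim = len(branch_embeddings[0])
    fused = [
        sum(wt * emb[i] for wt, emb in zip(weights, branch_embeddings))
        for i in range(dim)
    ]
    return fused, weights


# Toy embeddings standing in for the CNN (spectral) and CNN-GRU (temporal)
# branch outputs; the values are made up for illustration.
spectral = [0.9, 0.1, 0.4]
temporal = [0.2, 0.8, 0.5]
scoring_vector = [1.0, -1.0, 0.5]  # hypothetical; learned in practice

fused, weights = attention_fuse([spectral, temporal], scoring_vector)
print("branch weights:", weights)
print("fused embedding:", fused)
```

In this toy setup the branch with the higher score receives the larger weight, so the fused embedding leans toward whichever representation the scoring vector favors for a given input. A classifier head would then operate on the fused embedding.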

DOI: https://doi.org/10.5281/zenodo.19480877

 
