Authors: Jyothsna M, Srinika Kontham, Sharanya Balachandran, Meghana Danta, Sindhu Naine
Abstract: SignVoice is an artificial-intelligence-based system that provides bidirectional communication between deaf, mute, or visually impaired persons and hearing persons. It combines machine learning, deep learning, and computer vision to support multiple modes of communication: sign language, speech, text, and images. Hand gestures are captured through a webcam and processed with MediaPipe to extract hand landmarks, which a machine-learning classifier converts into text output. Spoken input is transcribed to text with Whisper; an AI-based chatbot generates the text response, which is then rendered as audio through text-to-speech. SignVoice follows a hybrid architecture: gesture processing runs on the client side, while computationally intensive operations such as speech recognition are delegated to cloud-based services. In addition, the chatbot accepts image input and produces text and speech output, which is helpful for visually impaired users. The proposed system enables efficient and accurate communication for impaired persons through gestures, speech, and intelligent text-based responses.
DOI: https://doi.org/10.5281/zenodo.19326545
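The gesture-to-text stage described in the abstract relies on MediaPipe hand landmarks fed to a classifier. As a minimal illustrative sketch (not the authors' actual implementation), a common preprocessing step is to normalize the 21 per-hand landmarks relative to the wrist before classification, so that the features are invariant to hand position and scale in the frame:

```python
# Hypothetical preprocessing sketch for MediaPipe hand landmarks.
# MediaPipe Hands returns 21 (x, y) landmark coordinates per detected
# hand; landmark 0 is the wrist. Normalizing relative to the wrist is
# an assumed (common) step, not taken verbatim from the paper.

def normalize_landmarks(landmarks):
    """Translate landmarks so the wrist (index 0) is the origin,
    then scale by the largest absolute coordinate so every feature
    lies in [-1, 1]. Input: list of (x, y) tuples."""
    wx, wy = landmarks[0]
    shifted = [(x - wx, y - wy) for x, y in landmarks]
    scale = max(max(abs(x), abs(y)) for x, y in shifted) or 1.0
    return [(x / scale, y / scale) for x, y in shifted]
```

The resulting flat feature vector could then be passed to any lightweight classifier (e.g. a small feed-forward network or k-NN model) running client-side, consistent with the hybrid architecture the abstract describes.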