Real-Time Sign Language Captioning for Video Chats

Supervisor: Mrs. K. Vindhya Rani (Assistant Professor, MVGRCOE)

Abstract

This project proposes a solution for enabling accessible communication for the deaf and hard-of-hearing community during video chats. The system leverages machine learning and computer vision to detect sign language gestures in real time and translate them into text captions. The goal is a more inclusive communication platform that supports seamless virtual conversations for everyone, regardless of hearing ability. The project integrates tools for video processing, gesture recognition, and low-latency communication into an efficient, accessible web-based platform.

Project Highlights

Dataset & Model Overview

The system is built on a custom-designed dataset of approximately 100 sign language classes, with hundreds of images per class collected to ensure robustness and accuracy.
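The exact dataset layout and preprocessing pipeline are not specified here. As a minimal sketch, assuming one folder of images per class (a hypothetical data/<class_name>/ layout) and MediaPipe Hands for landmark extraction as described in the next paragraph, the images could be converted into flat landmark feature vectors like this; all paths and file names are illustrative.

```python
import os
import pickle

import cv2
import mediapipe as mp

DATA_DIR = "data"  # hypothetical layout: data/<class_name>/*.jpg

# MediaPipe Hands in static-image mode for offline dataset processing
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                       min_detection_confidence=0.5)

features, labels = [], []

for class_name in sorted(os.listdir(DATA_DIR)):
    class_dir = os.path.join(DATA_DIR, class_name)
    if not os.path.isdir(class_dir):
        continue
    for file_name in os.listdir(class_dir):
        image = cv2.imread(os.path.join(class_dir, file_name))
        if image is None:
            continue
        # MediaPipe expects RGB input; OpenCV loads images as BGR
        results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            continue  # skip images where no hand was detected
        # Flatten the 21 (x, y) hand landmarks into a 42-value feature vector
        landmarks = results.multi_hand_landmarks[0].landmark
        features.append([coord for lm in landmarks for coord in (lm.x, lm.y)])
        labels.append(class_name)

# Persist the extracted features and labels for training
with open("landmark_dataset.pkl", "wb") as f:
    pickle.dump({"features": features, "labels": labels}, f)
```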

A Random Forest classifier was chosen for its effectiveness on structured landmark data and its suitability for real-time prediction. MediaPipe is used for hand-landmark extraction, while Flask provides the API layer that ties the components together.
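To illustrate how these pieces could fit together, the following is a minimal sketch of a Flask endpoint that classifies a single frame: it trains a RandomForestClassifier on the landmark features produced by the dataset sketch above, extracts landmarks from the posted frame with MediaPipe, and returns the predicted sign as a JSON caption. The route name, file names, and parameters are assumptions for illustration, not the project's actual API.

```python
import pickle

import cv2
import mediapipe as mp
import numpy as np
from flask import Flask, jsonify, request
from sklearn.ensemble import RandomForestClassifier

# Train the classifier on the landmark features extracted earlier
# (file name is illustrative, matching the dataset sketch above)
with open("landmark_dataset.pkl", "rb") as f:
    dataset = pickle.load(f)
model = RandomForestClassifier(n_estimators=100)
model.fit(dataset["features"], dataset["labels"])

# MediaPipe Hands for per-frame landmark extraction
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=True, max_num_hands=1,
                       min_detection_confidence=0.5)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical endpoint name
def predict():
    # Decode the uploaded video frame (sent as an image file)
    frame_bytes = np.frombuffer(request.files["frame"].read(), dtype=np.uint8)
    frame = cv2.imdecode(frame_bytes, cv2.IMREAD_COLOR)
    if frame is None:
        return jsonify({"caption": ""})

    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return jsonify({"caption": ""})  # no hand detected in this frame

    # Build the same 42-value feature vector used during training
    landmarks = results.multi_hand_landmarks[0].landmark
    feature_vector = [coord for lm in landmarks for coord in (lm.x, lm.y)]
    caption = str(model.predict([feature_vector])[0])
    return jsonify({"caption": caption})


if __name__ == "__main__":
    app.run(port=5000)
```

A client could then submit frames with, for example, `curl -F frame=@frame.jpg http://localhost:5000/predict`.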

System Architecture & Workflow

Key Technologies & Tools

Testing & Validation

Several testing procedures were conducted to ensure system reliability.
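The individual procedures are not listed here. As one plausible example, a hold-out evaluation of the gesture classifier could look like the sketch below, assuming the landmark features and labels saved by the dataset sketch above; the split ratio and file name are illustrative.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Load the landmark features extracted during dataset preparation
# (file name is illustrative)
with open("landmark_dataset.pkl", "rb") as f:
    dataset = pickle.load(f)

# Stratified hold-out split so every sign class appears in both sets
X_train, X_test, y_train, y_test = train_test_split(
    dataset["features"], dataset["labels"],
    test_size=0.2, stratify=dataset["labels"], random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("Hold-out accuracy:", accuracy_score(y_test, predictions))
print(classification_report(y_test, predictions))
```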

Real-World Applications

Future Enhancements

Conclusion

This project successfully demonstrates the feasibility of integrating real-time sign language recognition into video conferencing systems. By combining AI-powered gesture recognition, efficient machine learning models, and robust real-time communication protocols, the system provides a highly functional, scalable, and socially impactful solution. This work lays the foundation for more inclusive digital environments and promotes accessible technology for all.

Source Code