GestureTalk: Real-Time Sign Language Recognition Using Deep Learning

Nilesh Gupta
Rohit Jadhav
Sonia Behra

This paper presents GestureTalk+, a real-time Indian Sign Language (ISL) recognition and speech synthesis system designed for commodity Android devices. It combines MediaPipe landmark extraction with lightweight temporal models (BiLSTM and Transformer encoders) and quantized on-device inference to achieve sub-300 ms end-to-end latency under subject-independent evaluation [1]. The system targets accessibility in education and healthcare, addresses robustness under variable lighting, backgrounds, and camera viewpoints, and includes bilingual text-to-speech for Hindi and English output. Experiments on a 26-letter static ISL set and 20 common dynamic words yield macro-F1 scores of 0.93 and 0.89 respectively, outperforming SVM and CNN-TCN baselines while maintaining 30–45 FPS with NNAPI delegates on mid-range hardware. Ablation studies show that temporal attention and landmark normalization reduce confusion between visually similar signs (e.g., M vs. N), and that int8 quantization preserves accuracy while reducing compute and power consumption. The paper contributes a reproducible edge-AI pipeline, signer-independent evaluation protocols, and deployment telemetry for latency and battery use.
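
The abstract credits landmark normalization with part of the robustness gain but does not say which scheme is used. As a minimal sketch of one common choice, and not necessarily the authors' method, the Python below translates a hand's 21 landmarks so the wrist sits at the origin and rescales them so the farthest landmark has unit distance. This removes the effect of where the hand appears in the frame and how large it is. The function name `normalize_landmarks` and the unit-radius scaling rule are assumptions made for illustration; the only detail taken from MediaPipe Hands is that it reports 21 landmarks with the wrist at index 0.

```python
import math

WRIST = 0  # MediaPipe Hands reports 21 landmarks; index 0 is the wrist


def normalize_landmarks(landmarks):
    """Make (x, y, z) hand landmarks translation- and scale-invariant.

    Hypothetical helper for illustration only: the paper does not
    specify its normalization scheme.
    """
    wx, wy, wz = landmarks[WRIST]
    # Step 1: move the wrist to the origin (removes hand position).
    centered = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    # Step 2: divide by the largest wrist-to-landmark distance
    # (removes hand size and distance from the camera).
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0.0:
        # Degenerate input (all points coincide): nothing to rescale.
        return centered
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]


if __name__ == "__main__":
    # Synthetic landmarks standing in for one detected hand.
    hand = [(0.5 + 0.01 * i, 0.5 - 0.02 * i, 0.001 * i) for i in range(21)]
    features = normalize_landmarks(hand)
    print(len(features), features[WRIST])
```

Under this scheme the same hand shape produces the same feature vector wherever it appears in the frame, which is one plausible reason normalization would help separate similar signs such as M and N. Rotation is not handled here and would need a separate alignment step.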

GestureTalk: Real-Time Sign Language Recognition Using Deep Learning. (2026). International Journal of Latest Technology in Engineering Management & Applied Science, 15(5), 527-539. https://doi.org/10.51583/IJLTEMAS.2026.150500047

References

F. Zhang, V. Bazarevsky, A. Vakunov, A. Tkachenka, G. Sung, C. L. Chang, and M. Grundmann, “MediaPipe hands: On-device real-time hand tracking,” arXiv preprint arXiv:2006.10214, 2020.

P. Kumar and A. Sharma, “Indian Sign Language Recognition using Deep Convolutional Neural Networks,” International Journal of Computer Applications, vol. 175, no. 8, pp. 23–29, 2021.

L. Zhang, X. Wang, Y. Li, and M. Chen, “CNN-RNN Hybrid Architecture for Dynamic Gesture Recognition in Real-time Applications,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 6, pp. 1234–1247, 2022.

J. Shen, R. Pang, R. J. Weiss, M. Schuster, N. Jaitly, Z. Yang,…and Y. Wu, “Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions,” in Proc. IEEE ICASSP, pp. 4779–4783, 2018.

R. Patel, S. Gupta, and M. Jain, “Mobile-based Gesture Recognition System for Augmentative and Alternative Communication,” Journal of Assistive Technologies, vol. 15, no. 2, pp. 89–102, 2021.

Census of India, “Data on Disability – India and States/UTs,” Office of the Registrar General & Census Commissioner, Ministry of Home Affairs, Government of India, 2011.

Microsoft Research, “Kinect-based Sign Language Translation System for American Sign Language,” Microsoft Technical Report MSR-TR-2020-15, 2020.

G. Bradski, “The OpenCV Library,” Dr. Dobb’s Journal of Software Tools, vol. 25, no. 11, pp. 120–125, 2000.

World Health Organization, Global Report on Assistive Technology, WHO Press, Geneva, Switzerland, 2023.

Google AI, “TensorFlow Lite: Machine Learning for Mobile and Edge Devices,” Google Developers Documentation, 2022.

A. Das, R. Kumar, and P. Singh, “Comparative Analysis of Assistive Communication Technologies in Indian Context,” in Proc. International Conference on Accessibility and Assistive Technologies, pp. 145–152, 2023.

V. Sharma and N. Patel, “Cost-effective Solutions for Speech Impairment: A Survey of Emerging Technologies,” Journal of Rehabilitation Research and Development, vol. 58, no. 3, pp. 34–48, 2021.
