INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue X, October 2025
36. S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev,
and P. Woodland, The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, Cambridge,
UK, 2006.
37. A. Lee and T. Kawahara, “Recent development of open-source large vocabulary continuous speech recognition engine
Julius,” Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
(APSIPA ASC), Sapporo, Japan, pp. 131–137, 2009.
38. P. Lamere, P. Kwok, W. Walker, E. Gouvea, P. Wolf, and J. Glass, “CMU Sphinx: Open source speech recognition,”
Proceedings of the Human Language Technology Conference (HLT), Edmonton, Canada, pp. 1–4, 2003.
39. H. Ney, R. Schlüter, T. Niesler, and S. Kanthak, “The RWTH large vocabulary continuous speech recognition system,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Honolulu, HI, USA, pp. 849–852, 2007.
40. S. Watanabe, T. Hori, S. Karita, T. Hayashi, J. Nishitoba, Y. Unno, N. Enrique Yalta Soplin, J. Heymann, M. Wiesner, N.
Chen, A. Renduchintala, and T. Ochiai, “ESPnet: End-to-end speech processing toolkit,” Proceedings of Interspeech,
Hyderabad, India, pp. 2207–2211, 2018.
41. O. Kuchaiev, B. Ginsburg, I. Gitman, V. Lavrukhin, J. M. Cohen, H. Nguyen, and J. Keshet, “NVIDIA NeMo: A toolkit
for building AI applications,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP), Brighton, UK, pp. 8369–8373, 2019.
42. H. Ahlawat, “Automatic speech recognition: A survey of deep learning techniques,” Journal of Speech Technology, vol. 1, no. 1, pp. 1–15, 2025.
43. N. Sethi, “Survey on automatic speech recognition systems for Indic languages,” ResearchGate, 2022.
44. A. Mishra, “Comparative wavelet, PLP, and LPC speech recognition techniques on the Hindi speech digits database,” SPIE Digital Library, 2010.
45. M. Dua, “Optimizing integrated features for Hindi automatic speech recognition,” Journal of Intelligent Systems, vol. 28, no. 5, pp. 123–135, 2019.
46. R. Aggarwal, “Performance evaluation of sequentially combined features for Hindi ASR,” SpringerLink, 2013.
47. S. Chadha, “Multilingual ASR system for six Indic languages,” arXiv, 2022.
48. V. Bhat and P. Bhattacharyya, “Automatic speech recognition for Indian languages,” IIT Bombay, 2023.
49. H. Malik, “Automatic speech recognition: A survey,” INAOE Research Center, 2021.
50. A. Seth, “Leveraging Wav2Vec 2.0 and XLS-R for enhanced Hindi ASR,” ACM Digital Library, 2024.
51. IndicWhisper and IndicWav2Vec models evaluation, ISCA Archive, 2024.
52. J. Goodman, “A bit of progress in language modeling,” Computer Speech & Language, vol. 15, no. 4, pp. 403–434, 2001.
53. M. Mohri, F. Pereira, and M. Riley, “Weighted finite-state transducers in speech recognition,” Computer Speech &
Language, vol. 16, no. 1, pp. 69–88, 2002.
54. Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, “A neural probabilistic language model,” Journal of Machine Learning
Research, vol. 3, pp. 1137–1155, 2003.
55. T. Mikolov, S. Kombrink, L. Burget, J. Černocký, and S. Khudanpur, “Extensions of recurrent neural network language model,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011.
56. T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, “Recurrent neural network based language model,” Proceedings of Interspeech, 2010.
57. A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention is all you need,” Advances in Neural Information Processing Systems (NeurIPS), 2017.
58. S. Upadhyaya, R. Singh, and S. Agrawal, “Hindi automatic speech recognition using hybrid DNN-HMM acoustic model,” International Journal of Speech Technology, vol. 20, no. 4, pp. 867–879, 2017.
59. N. Mittal and S. Jain, “Performance evaluation of deep neural network–hidden Markov model for Hindi ASR,” Procedia Computer Science, vol. 132, pp. 796–803, 2018.
60. P. Sharma, N. Gupta, and R. Singh, “HindiSpeech-Net: A CNN-based end-to-end automatic speech recognition model for Hindi language,” International Journal of Speech Technology, vol. 23, no. 2, pp. 421–430, 2020.
61. M. Dua, S. Singh, N. Aggarwal, and A. Sharma, “Performance analysis of interpolated recurrent neural network language models for continuous Hindi speech recognition,” International Journal of Speech Technology, vol. 22, no. 3, pp. 879–888, 2019.
62. A. Kumar and R. K. Aggarwal, “RNN-based language modeling and speaker adaptation techniques for Hindi automatic speech recognition,” Journal of Intelligent Systems, vol. 29, no. 1, pp. 150–162, 2020.
63. A. Graves, “Sequence transduction with recurrent neural networks,” Proceedings of the ICML Workshop on Representation Learning, pp. 1–9, 2012.
64. A. Graves, A.-R. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6645–6649, 2013.
65. K. Rao and H. Sak, “Multiple encoder-decoder architectures for end-to-end speech recognition,” Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 130–135, 2017.