AI-Based Deepfake Voice Detection Using MFCC Features and Random Forest Classification

Ashish S. Kangane; Animesh S. Bhopale; Waman R. Parulekar

doi:10.51583/IJLTEMAS.2026.150500121

Ashish S. Kangane

Department of MCA, Finolex Academy of Management and Technology, Ratnagiri, India

Animesh S. Bhopale

Department of MCA, Finolex Academy of Management and Technology, Ratnagiri, India

Waman R. Parulekar

Department of MCA, Finolex Academy of Management and Technology, Ratnagiri, India

Published: Jun 6, 2026

The rapid proliferation of AI-generated audio poses a serious threat to digital forensics, voice-based authentication, and information integrity. This paper presents a deepfake voice detection system that combines Mel-Frequency Cepstral Coefficient (MFCC) feature extraction with a Random Forest ensemble classifier to distinguish real human speech from synthetically generated audio. The system accepts audio files in WAV or MP3 format, extracts 40 MFCCs from each file as its feature representation, and classifies the sample as real or fake with a trained Random Forest model. The complete pipeline is deployed as a Flask-based web application, so it can be used from a browser without specialist software.
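The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the use of librosa and scikit-learn, the function names, the time-averaging of frame-level MFCCs into one 40-dimensional vector, the label encoding (0 = real, 1 = fake), and the forest size are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_MFCC = 40  # number of coefficients stated in the abstract


def extract_features(path):
    """Return one 40-dimensional feature vector for an audio file.

    Assumption: frame-level MFCCs are averaged over time; the abstract
    does not say how frames are pooled into a single vector.
    """
    import librosa  # imported lazily so the classifier code runs without it

    signal, sample_rate = librosa.load(path, sr=None)  # keep native sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=N_MFCC)
    return mfcc.mean(axis=1)  # shape: (N_MFCC,)


def train_classifier(features, labels, n_trees=100, seed=0):
    """Fit a Random Forest on fixed-length vectors (assumed 0 = real, 1 = fake)."""
    model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    model.fit(features, labels)
    return model


def classify(model, feature_vector):
    """Map the model's prediction for one feature vector to a readable label."""
    prediction = model.predict(np.asarray(feature_vector).reshape(1, -1))[0]
    return "fake" if prediction == 1 else "real"
```

In a web deployment of the kind the abstract mentions, an upload handler would presumably call `extract_features` on the saved file and pass the result to `classify`. Decoding MP3 input with librosa may require an additional audio backend, depending on the installation.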

Experimental evaluation was conducted on a balanced binary dataset of 300 real voice recordings and 300 AI-generated voice samples (600 samples in total), split 80/20 into training and test sets. The system was compared with two baseline classifiers trained on the same features. The proposed Random Forest model achieves an accuracy of 92.7%, precision of 91.9%, recall of 93.5%, and an F1-score of 92.7%, which indicates that the approach is effective for practical deepfake audio detection. Its accuracy is 16.3 percentage points above the SVM baseline (accuracy: 76.4%) and 11.5 percentage points above the Decision Tree baseline (accuracy: 81.2%).
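As a quick consistency check on the reported figures, the short sketch below recomputes the split sizes and the F1-score from the stated precision and recall. The helper names are hypothetical. Only the numbers quoted in the abstract are used, and no per-class counts are assumed.

```python
def split_sizes(total, train_percent):
    """Return (train, test) sample counts for a percentage split."""
    train = total * train_percent // 100  # integer arithmetic avoids float rounding
    return train, total - train


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


train_count, test_count = split_sizes(600, 80)
print(train_count, test_count)  # 480 120

reported_f1 = f1_score(0.919, 0.935)  # precision and recall from the abstract
print(f"{reported_f1:.3f}")  # 0.927
```

The recomputed value agrees with the reported F1-score of 92.7%.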

AI-Based Deepfake Voice Detection Using MFCC Features and Random Forest Classification. (2026). International Journal of Latest Technology in Engineering Management & Applied Science, 15(5), 1528-1536. https://doi.org/10.51583/IJLTEMAS.2026.150500121

This work is licensed under a Creative Commons Attribution 4.0 International License.