INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Special Issue | Volume XIV, Issue XIII, October 2025
www.ijltemas.in Page 184
A Comparative Study of Machine Learning Models for Gender
Recognition from Voice Samples
Manasi Manoj Sukale*, Pradip Ravindra Jagdale
Department of Statistics, Dr. D. Y. Patil Arts, Commerce, & Science College Pimpri, Pune-411018, Maharashtra, India
*Corresponding Author
DOI: https://doi.org/10.51583/IJLTEMAS.2025.1413SP038
Received: 26 June 2025; Accepted: 30 June 2025; Published: 25 October 2025
Abstract: Gender recognition from voice has become a prominent area of study in machine learning and speech processing.
Dimorphism, the clear physiological and acoustic distinction between male and female voices, is a property of human voices that
allows automated systems to determine gender based on vocal traits such as pitch, frequency, intonation, and speech rate. This
study investigates how acoustic features extracted from recorded speech can be used to classify gender using machine learning
algorithms. The accuracy and effectiveness of several classification algorithms are compared through implementation and evaluation.
According to the analysis, female voices have slightly higher frequencies than male voices. The mean frequency of both male and
female voices is concentrated between 0.15 and 0.20.
Keywords: Data mining; Classifiers: logistic regression, decision tree, SVM, ANN, naïve Bayesian classifier; Python.
I. Introduction
Gender identification is an important problem today. Gender can be detected from acoustic data, i.e. measures such as the median
frequency. Machine learning provides promising results for classification problems across many research domains, and several
performance metrics are available to evaluate an algorithm in a given domain. We present a comparative study that evaluates
different machine learning algorithms on acoustic data. Dimorphism is a characteristic of the voice that is strongly observed in
humans. Intonation, speech rate, and duration are some of the properties that distinguish human voices, mainly male and female
voices. Voice datasets are converted into parameters such as vocal power, pitch, frequency, and frequency quantiles such as Q25;
models are then trained and tested with different algorithms to predict gender. In the real world, a person can be identified by
voice. The voice carries many linguistic properties, and these voice features are treated as a voiceprint for recognizing the
gender of the speaker. The recorded voice is the input to the system, which processes it to extract voice features. The system
checks the input, compares it against a trained model built with the chosen algorithm, and returns the closest matching output.
Gender recognition can be combined with various other applications. Examples include detecting emotion by gender (such as sadness
in male voices or anger in female voices), separating audio and video content using tags, and helping personal assistants answer
questions with gender-specific results. Further applications include effective advertising and marketing strategies in Customer
Relationship Management (CRM) systems.
II. Literature Review
Harb and Chen (2005) presented one of the earliest research studies in the area of voice-based
gender recognition, particularly in multimedia systems. They sought to design a system for identifying a speaker's
gender based on acoustic signals alone, which has direct applications in personalized content delivery, surveillance, and
human-computer interaction.
Max Kuhn and Kjell Johnson (2013): Applied Predictive Modeling is a core text that gives a complete picture of the end-to-end
process of developing predictive models. Although the book does not focus on gender identification by voice,
it provides crucial methodologies and techniques directly applicable to such tasks, especially in feature selection, model
evaluation, and classification algorithms.
Ahmad and Tariq (2020) present an elaborate overview of voice-based gender classification approaches and applications.
They compare both traditional and contemporary approaches, highlighting feature extraction methods such as MFCC, pitch,
formants, spectral and prosodic features, and classifiers such as SVM, KNN, Naïve Bayes, decision trees, and neural networks.
They point out an important trend: hybrid feature sets combined with ensemble or deep models yield the best accuracy (about 95% or higher).
III. Methodology
This dataset was designed to identify a voice as male or female based on the acoustic properties of voice and speech. The dataset
consists of 3,168 recorded voice samples collected from male and female speakers. The voice samples were pre-processed by
acoustic analysis in R using the seewave and tuneR packages, with an analyzed frequency range of 0 Hz-280 Hz (human vocal range).
Missing values were checked using descriptive analysis and removed where present. Graphical presentation was used for the
comparison of male and female voice frequency, the comparison of male and female mean dominant voice frequency,
the distribution of variables, the density plot of mean frequency, and outlier detection. A heat map was used to investigate the
dependence between multiple variables at the same time. The resulting graph contains the correlation coefficients between each pair of variables. For
variable selection, the Gini index was used; it is calculated by subtracting the sum of the squared probabilities of each class from
one. After data pre-processing, the logistic regression, support vector machine, K-nearest neighbour, naïve Bayesian classifier,
artificial neural network, and decision tree algorithms were used.
Descriptive Analysis:
Table 1: Descriptive Analysis of data
Sr. No.  Variable  Type     Min     Mean     Median  Max        Missing values
1        Meanfreq  Numeric  0.0394  0.1809   0.1848  0.2511     0
2        Sd        Numeric  0.0184  0.0571   0.0592  0.1153     0
3        Median    Numeric  0.0110  0.1856   0.1900  0.2612     0
4        Q25       Numeric  0.0002  0.1405   0.1403  0.2473     0
5        Q75       Numeric  0.0429  0.2248   0.2257  0.2735     0
6        IQR       Numeric  0.0146  0.0843   0.0946  0.2522     0
7        Skew      Numeric  0.1417  3.1402   2.1971  34.7255    0
8        Kurt      Numeric  2.0685  36.5685  8.3185  1309.612   0
9        sp.ent    Numeric  0.7387  0.8951   0.9018  0.9820     0
10       Sfm       Numeric  0.0369  0.4082   0.3963  0.8429     0
11       Mode      Numeric  0.0000  0.1653   0.1866  0.2800     0
12       Centroid  Numeric  0.0394  0.1809   0.1848  0.2511     0
13       Meanfun   Numeric  0.0556  0.1428   0.1405  0.2376     0
14       Minfun    Numeric  0.0098  0.0368   0.0461  0.2041     0
15       Maxfun    Numeric  0.1031  0.2588   0.2712  0.2791     0
16       Meandom   Numeric  0.0078  0.8292   0.7658  2.9577     0
17       Mindom    Numeric  0.0049  0.0526   0.0234  0.4590     0
18       Maxdom    Numeric  0.0078  5.0473   4.9922  21.8672    0
19       Dfrange   Numeric  0.0000  4.9946   4.9453  21.8438    0
20       Modindx   Numeric  0.0000  0.1738   0.1394  0.9324     0
21       Label     Factor   0.0000  0.0000   0.0000  0.0000     0
No missing values are present in the data.
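The summary columns of Table 1 can be reproduced for any single variable with a few lines of code. The following is a minimal sketch in Python using only the standard library; the function name `describe`, the use of `None` to mark a missing entry, and the toy column are assumptions made for illustration and are not taken from the study's dataset.

```python
import statistics

def describe(values):
    """Return min, mean, median, max and missing-value count for one column.

    Missing entries are assumed to be represented by None.
    """
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "max": max(present),
        "missing": len(values) - len(present),
    }

# Hypothetical toy column with one missing entry (not the study's data).
toy_column = [0.12, 0.18, None, 0.20, 0.16]
summary = describe(toy_column)
print(summary)
```

Applying such a function to each column gives one row of the table per variable; a missing-value count of zero in every row corresponds to the statement above.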
Graphical Presentation: Comparison Between Male and Female Voice Frequency
Fig. 1: Comparison between male and female voice frequency
The mean voice frequency of female speakers is slightly higher than that of male speakers.
[Bar chart: number of male and female samples with mean frequency below and above the mean; bar labels: 915, 669, 471, 1113]
Comparison Between Male and Female Mean Dominant Voice Frequency
Fig. 2: Comparison between male and female mean dominant voice frequency
The average mean dominant frequency of female voices is greater than that of male voices.
Distribution of Variables:
Fig. 3: Distribution of variables
From the plots above, no variable follows a normal distribution.
Density plot for mean frequency:
Fig. 4: Density plot
The mean frequency of both male and female voices is concentrated between 0.15 and 0.20.
Outlier Detection:
Fig. 5: Outlier Detection
From the boxplots above, the only variable with outliers is the mean frequency.
Heatmap: A heatmap is used to investigate the dependence between multiple variables at the same time. The resulting graph contains
the correlation coefficients between each pair of variables.
Fig. 6: Heatmap
The correlation coefficients are large for some pairs of variables, which indicates dependency among the variables. Hence, we select
the variables that give higher accuracy.
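The quantity shown in each cell of such a heatmap is a pairwise correlation coefficient. The sketch below computes Pearson correlations in plain Python and lists strongly correlated pairs. It assumes Pearson's coefficient (the paper does not name the coefficient used), and the 0.9 threshold and the toy columns are hypothetical values chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def highly_correlated_pairs(columns, threshold=0.9):
    """Return pairs of column names whose |r| is at least the threshold."""
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(columns[first], columns[second])) >= threshold:
                pairs.append((first, second))
    return pairs

# Hypothetical toy columns (not the study's data).
toy = {
    "meanfreq": [0.10, 0.15, 0.20, 0.25],
    "centroid": [0.10, 0.15, 0.20, 0.25],
    "sd": [0.08, 0.05, 0.06, 0.03],
}
pairs = highly_correlated_pairs(toy)
print(pairs)
```

When two variables are flagged in this way, keeping only one of them is a common way to reduce redundancy before model building.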
Variable Selection: The Gini index was used as the variable selection method. The Gini index is calculated by subtracting the sum of
the squared probabilities of each class from one.
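The definition above translates directly into code. The following is a minimal sketch of the Gini index, together with the decrease in Gini obtained when a set of samples is split in two; how the study aggregated such decreases into a per-feature importance is not stated in the paper, so the helper `gini_decrease` is an illustrative assumption.

```python
from collections import Counter

def gini_index(labels):
    """Gini index: one minus the sum of squared class probabilities."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())

def gini_decrease(parent, left, right):
    """Reduction in Gini index when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * gini_index(left) + (len(right) / n) * gini_index(right)
    return gini_index(parent) - weighted

# Hypothetical toy labels (not the study's data).
mixed = ["male"] * 4 + ["female"] * 4
print(gini_index(mixed))                      # balanced two-class set
print(gini_index(["male"] * 4))               # pure set
print(gini_decrease(mixed, mixed[:4], mixed[4:]))
```

A balanced two-class set has the maximum two-class Gini index of 0.5 and a pure set has 0, so a feature whose splits separate the classes well produces a large decrease.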
Fig. 7: Variable Selection
Here, 12 of the 21 features have considerable importance. The cumulative feature importance shows that 9 of the 21
features cover 90% of the total feature importance.
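The cumulative-importance cut-off described above can be sketched as follows: features are ranked by importance and added until their combined share reaches the target. The feature names and importance scores below are hypothetical and only illustrate the procedure; they are not the values obtained in the study.

```python
def features_covering(importances, target=0.90):
    """Return the top-ranked features whose cumulative share reaches `target`."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= target:
            break
    return selected

# Hypothetical importance scores (not the study's results).
toy_importances = {
    "feature_a": 50.0,
    "feature_b": 25.0,
    "feature_c": 17.0,
    "feature_d": 5.0,
    "feature_e": 3.0,
}
selected = features_covering(toy_importances)
print(selected)
```

In this toy example the first three features account for 92% of the total importance, so the remaining two would be dropped.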
Model building: The methodology and performance of each model are as follows:
Fig. 8: Model building
1) Logistic Regression: Logistic regression is a statistical method for predicting binary classes, where the outcome or target
variable is binary. It computes the probability of an event occurring. We selected this model for prediction, i.e. to identify gender.
After variable selection, the data are divided into training and testing data. The training data are used to build the model and the
testing data are used for prediction.
Using R software, the results are as follows:
Confusion Matrix:
            Predicted
Actual      Female    Male
Female      302       14
Male        6         310
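The two steps described for logistic regression, splitting the data and turning a linear score into a class probability, can be sketched in Python as below. The paper's own model was built in R; the 20% test fraction, the random seed, the 0.5 threshold, the choice of "male" as the event being predicted, and the coefficient values are all assumptions made for illustration.

```python
import math
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and return (training rows, testing rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_probability(features, weights, bias):
    """Logistic model: probability of the event given a linear score."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def predict_label(features, weights, bias, threshold=0.5):
    """Assign 'male' when the predicted probability reaches the threshold."""
    probability = predict_probability(features, weights, bias)
    return "male" if probability >= threshold else "female"

# Hypothetical usage with ten row indices and made-up coefficients.
train_rows, test_rows = train_test_split(list(range(10)))
print(len(train_rows), len(test_rows))
print(predict_label([1.0, 2.0], weights=[-1.5, 0.5], bias=1.0))
```

In practice the weights and bias are estimated from the training rows, and the confusion matrix is obtained by comparing predicted and actual labels on the testing rows.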
2) Decision Tree: A decision tree is a simple representation for classifying examples. It is a supervised machine learning method
that provides a graphical representation of how different factors influence the target. It breaks a dataset down into smaller and
smaller subsets. The literature contains many popular decision tree algorithms, such as ID3, C4.5, and C5. As the name implies, these
techniques recursively separate observations into branches that form a tree, with the aim of improving prediction accuracy.
            Predicted
Actual      Female    Male
Female      304       10
Male        12        306
Fig. 9: Decision Tree
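The recursive partitioning described for decision trees repeats one basic step: choosing the threshold on a feature that best separates the classes. The sketch below shows that single step, using the weighted Gini index as the split criterion. The criterion, the function names, and the toy values are assumptions for illustration; the paper does not state which splitting rule its tree used.

```python
def gini(labels):
    """Gini index of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted Gini) of the best split on one feature."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical feature values and labels (not the study's data).
toy_values = [0.10, 0.11, 0.12, 0.17, 0.18, 0.19]
toy_labels = ["male", "male", "male", "female", "female", "female"]
threshold, score = best_split(toy_values, toy_labels)
print(threshold, score)
```

A full tree applies the same search to each resulting subset, over all features, until a stopping rule is met.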
3) Artificial Neural Network: An artificial neural network (ANN) is a biologically inspired, sophisticated analytical
technique capable of modelling extremely complex non-linear functions. We used a popular ANN architecture called the multilayer
perceptron (MLP) with backpropagation. It is the most commonly used and well-studied ANN architecture.
Using R software, the model and corresponding results are as follows:
            Predicted
Actual      Female    Male
Female      297       8
Male        19        308
Fig. 10: ANN
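To make the idea of an MLP trained by backpropagation concrete, the sketch below implements one forward pass and one gradient-descent update for a network with a single hidden layer. The layer sizes, sigmoid activations, cross-entropy loss, learning rate, and all numeric values are assumptions for illustration; the paper does not report the architecture or settings of its R model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> sigmoid hidden layer -> sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def cross_entropy(output, target):
    return -(target * math.log(output) + (1 - target) * math.log(1 - output))

def backprop_step(x, target, w_hidden, b_hidden, w_out, b_out, lr=0.5):
    """One gradient-descent update on a single sample; returns new parameters."""
    hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
    delta_out = output - target  # gradient at the output for sigmoid + cross-entropy
    # Propagate the output error back to each hidden unit (old output weights).
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_b_out = b_out - lr * delta_out
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    new_b_hidden = [b - lr * d for b, d in zip(b_hidden, delta_hidden)]
    return new_w_hidden, new_b_hidden, new_w_out, new_b_out

# Hypothetical toy sample and starting weights (not the study's model).
x, target = [0.5, -0.2], 1.0
w_hidden, b_hidden = [[0.1, -0.3], [0.4, 0.2]], [0.0, 0.0]
w_out, b_out = [0.3, -0.1], 0.0

_, output_before = forward(x, w_hidden, b_hidden, w_out, b_out)
updated = backprop_step(x, target, w_hidden, b_hidden, w_out, b_out)
_, output_after = forward(x, *updated)
loss_before = cross_entropy(output_before, target)
loss_after = cross_entropy(output_after, target)
print(loss_before, loss_after)
```

Training repeats this update over many samples and passes through the data; the loss on the toy sample falls after the single step shown here.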
4) Support Vector Machine: The Support Vector Machine (SVM) is a supervised machine learning algorithm used for
classification and regression tasks. It tries to find the best boundary, known as a hyperplane, that separates the different classes
in the data. It is particularly useful for binary classification.
Using Python software, the model and corresponding results are as follows:
            Predicted
Actual      Female    Male
Female      333       4
Male        11        286
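Once an SVM has been trained, classification reduces to checking on which side of the separating hyperplane a sample falls. The sketch below illustrates this for a linear hyperplane. The linear kernel, the weight vector, the bias, and the mapping of the positive side to "male" are all assumptions for illustration; the paper does not report the kernel or parameters of its Python model.

```python
import math

def decision_value(x, weights, bias):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def classify(x, weights, bias):
    """Map the positive side of the hyperplane to 'male' (an assumed convention)."""
    return "male" if decision_value(x, weights, bias) >= 0 else "female"

def margin_width(weights):
    """Width of the margin between the two supporting hyperplanes: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(w * w for w in weights))

# Hypothetical hyperplane parameters (not fitted to the study's data).
weights, bias = [3.0, 4.0], -5.0
print(classify([1.0, 1.0], weights, bias))
print(classify([0.5, 0.5], weights, bias))
print(margin_width(weights))
```

Training an SVM amounts to choosing the weights and bias that make this margin as wide as possible while keeping misclassifications small.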
5) Naïve Bayesian Classifier: The Naïve Bayes algorithm is a simple probabilistic classifier that calculates a set of probabilities by
counting the frequency and combinations of values in a given dataset. The algorithm uses Bayes' theorem and assumes all attributes
to be independent given the value of the class variable. The Naïve Bayes classifier is based on Bayes' theorem and the theorem of
total probability. In this classifier we compute the conditional probability P(Cj | X) for each class and assign X to the class Cj
with the largest probability, i.e. X ∈ Cj if P(Cj | X) > P(Ci | X) for all i ≠ j, where i, j = 1, 2, …, m.
Using Python software, the model and corresponding results are as follows:
            Predicted
Actual      Female    Male
Female      298       39
Male        38        259
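The decision rule above, assigning X to the class with the largest posterior probability, can be sketched as follows. Because the acoustic features are continuous, this sketch assumes a Gaussian likelihood for each feature within each class; that choice, the small variance floor, and the toy data are assumptions for illustration and may differ from the implementation used in the study.

```python
import math
from collections import defaultdict

def fit(rows, labels):
    """Estimate the prior, per-feature means and variances for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(column) / n for column in columns]
        # A tiny constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in column) / n + 1e-9
                     for column, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def log_posterior(x, prior, means, variances):
    """log P(C) + sum of log P(x_k | C); the common term P(X) is omitted."""
    total = math.log(prior)
    for value, mean, variance in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * variance)
        total -= (value - mean) ** 2 / (2 * variance)
    return total

def predict(model, x):
    """Assign x to the class with the largest (unnormalised) posterior."""
    return max(model, key=lambda label: log_posterior(x, *model[label]))

# Hypothetical two-feature toy data (not the study's data).
toy_rows = [[0.10, 0.05], [0.12, 0.06], [0.11, 0.04],
            [0.18, 0.03], [0.19, 0.02], [0.17, 0.03]]
toy_labels = ["male"] * 3 + ["female"] * 3
model = fit(toy_rows, toy_labels)
print(predict(model, [0.105, 0.05]), predict(model, [0.185, 0.025]))
```

Dropping P(X) does not change the result, since it is the same for every class and the rule only compares posteriors.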
Comparison of Models:
Sr. No.  Model                Accuracy  Sensitivity  Specificity
1        Logistic Regression  0.9683    0.9810       0.9556
2        Decision Tree        0.9651    0.9622       0.9681
3        ANN                  0.9572    0.9418       0.9737
4        SVM                  0.9763    0.9629       0.9881
5        Naïve Bayes          0.8785    0.8720       0.8842
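The three measures in the comparison table can be recomputed from the confusion matrices reported for each model. The sketch below does so under the assumption that "male" is treated as the positive class, with rows as actual labels and columns as predicted labels; this convention is inferred from the reported numbers rather than stated in the paper. Under it, the computed values agree with the table to within about 0.0001.

```python
# Confusion matrices as reported: rows are actual (Female, Male),
# columns are predicted (Female, Male).
confusion = {
    "Logistic Regression": ((302, 14), (6, 310)),
    "Decision Tree": ((304, 10), (12, 306)),
    "ANN": ((297, 8), (19, 308)),
    "SVM": ((333, 4), (11, 286)),
    "Naive Bayes": ((298, 39), (38, 259)),
}

def evaluate(matrix):
    """Accuracy, sensitivity and specificity with 'male' as the positive class (assumed)."""
    (female_as_female, female_as_male), (male_as_female, male_as_male) = matrix
    total = female_as_female + female_as_male + male_as_female + male_as_male
    return {
        "accuracy": (female_as_female + male_as_male) / total,
        "sensitivity": male_as_male / (male_as_female + male_as_male),
        "specificity": female_as_female / (female_as_female + female_as_male),
    }

results = {name: evaluate(matrix) for name, matrix in confusion.items()}
for name, scores in results.items():
    print(name, {key: round(value, 4) for key, value in scores.items()})

best_accuracy = max(results, key=lambda name: results[name]["accuracy"])
best_sensitivity = max(results, key=lambda name: results[name]["sensitivity"])
print(best_accuracy, best_sensitivity)
```

Accuracy is the share of all test samples classified correctly, sensitivity is the share of actual positives recovered, and specificity is the share of actual negatives recovered.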
IV. Conclusion
The voice frequency of female speakers is slightly higher than that of male speakers. The distributions of the mean dominant frequency
of male and female voices are nearly the same. No variable follows a normal distribution. The mean frequency of both male and female
voices is concentrated between 0.15 and 0.20. The Support Vector Machine (SVM) model has the highest accuracy, and logistic regression
has the highest sensitivity.
V. Future Scope
This type of study can be extended to problems in medical research, such as predicting cancer survivability.
References
1. Harb, H., & Chen, L. (2005). Voice-based gender identification in multimedia applications. Journal of Intelligent Information
Systems, 24(2–3), 179–198.
2. Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. Springer.
3. Nair, R. R., & Vijayan, B. (2019). Voice-based gender recognition. International Research Journal of Engineering and
Technology (IRJET), 6(5), 2109–2112.
4. Tiwari, V., & Pandey, V. (2021). Gender classification using voice features and machine learning techniques. International
Journal of Computer Applications, 174(3), 28–34.
5. Venkatesh, S., & Patra, M. (2018). Gender identification from voice using spectral and prosodic features. International Journal
of Speech Technology, 21(2), 231–239.
6. Bocklet, J., & Cordier, J. (2019). Deep learning-based gender classification using speech signals. Journal of Computer Science
and Technology, 34(6), 1183–1192.
7. Khan, S., & Khan, S. (2018). Gender detection using voice: A hybrid approach based on MFCC and LPC. International Journal
of Artificial Intelligence & Applications, 9(5), 12–19.
8. Ahmad, S., & Tariq, S. (2020). Gender classification using voice signals: A review of methods and applications. International
Journal of Speech Processing, 35(1), 11–23.
9. Kumar, S., & Rani, R. (2020). Gender classification using voice features and deep learning techniques. IEEE Transactions on
Audio, Speech, and Language Processing, 28, 1550–1558.