INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
Sleep Disorder Prediction Using Machine Learning  
Ashutosh Kumar 1, Tanmay Bakshi2, Nikhil Vashishtha2, Aryan Kumar2, Akshat Joshi2  
1 Department of Information Technology, HMR Institute of Technology and Management, Delhi, India  
2 Department of Computer Science & Engineering, HMR Institute of Technology and Management,  
Delhi  
Received: 23 November 2025; Accepted: 28 November 2025; Published: 01 December 2025  
ABSTRACT  
A lot of people struggle with sleep disorders, and when these problems go undiagnosed, they can lead to serious  
health issues. Right now, doctors mostly rely on tests like polysomnography (PSG) and expert analysis to spot  
these disorders, but that process eats up time and resources. Plus, it’s not always consistent—different experts  
might interpret results in their own way. In this paper, we introduce a new method for detecting sleep disorders  
that uses multi-layered ensemble learning and smart data balancing. One big hurdle is that sleep disorder datasets  
are usually imbalancedsome conditions show up way more often than others. To tackle this, we use data  
balancing tools like SMOTE, ADASYN, and both random over-sampling and under-sampling. When you  
combine these with ensemble methods like stacking and boosting, you get much better results in terms of  
accuracy, sensitivity, and specificity. Our framework is all about making sleep disorder detection more reliable,  
automated, and accurate. The goal is to catch these disorders earlier and more effectively, so patients get the care  
they need sooner, and healthcare systems don’t get overwhelmed.  
Keywords -sleep disorder prediction, Machine learning, Sleep apnea, Insomnia, Health data analytics, Lifestyle  
dataset, Data preprocessing, Random Forest, Interpretability, SHAP analysis, Model accuracy, Biomedical data  
science, Feature engineering.  
INTRODUCTION  
A basic physiological function that is essential to both physical and mental well-being is sleep. Sleep  
disturbances, which are frequently a sign of underlying sleep disorders, can cause cognitive and mood disorders  
as well as chronic health problems like diabetes, obesity, neurological impairments, and cardiovascular diseases.  
Millions of people worldwide go undiagnosed with sleep disorders like narcolepsy, insomnia, restless legs  
syndrome (RLS), and obstructive sleep apnea (OSA).  
Current gold standards for diagnostics, mainly polysomnography (PSG), include recording several physiological  
signals, including respiratory effort, electroencephalogram (EEG), electrooculogram (EOG), electromyogram  
(EMG), and electrocardiogram (ECG), while monitoring the patient overnight in a specialized lab. Long wait  
times and possible misdiagnosis due to subjective analysis result from PSG's comprehensiveness, but it is also  
costly, time-consuming, intrusive, and requires expert interpretation. The  
The development of automated, impartial, and effective diagnostic tools has been made possible by the  
introduction of wearable sensors and the growth of physiological data.  
A promising method for deciphering intricate biomedical signals and spotting trends suggestive of sleep  
disorders is machine learning (ML). However, a number of obstacles prevent ML-based solutions from being  
widely adopted:  
1. High-dimensional information: PSG data creates high-dimensional datasets because it includes a large number  
of channels and samples.  
Page 101  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
2. Noise and Artifacts: Strong preprocessing is necessary because noise and artifacts frequently taint  
physiological signals.  
3. Inter-Patient Variability: Individual differences in sleep habits and disorder symptoms are substantial.  
Imbalanced Datasets: Sleep disorders are less prevalent than healthy sleep in the general population, leading to  
datasets where the minority class (disorder) is significantly underrepresented. This imbalance can bias ML  
models towards the majority class, resulting in poor detection rates for actual disorders.  
This paper addresses the critical issue of imbalanced datasets in sleep disorder detection by proposing a multi-  
layered ensemble learning framework integrated with advanced data balancing techniques. Our objective is to  
demonstrate that a synergistic combination of these methodologies can significantly improve the diagnostic  
accuracy and reliability of automated sleep disorder detection systems.  
BACKGROUND AND RELATED WORK  
The application of machine learning in sleep research has gained considerable traction. Early attempts primarily  
focused on single classifier models applied to features extracted from PSG data.  
2.1. Traditional Machine Learning Approaches  
Various classifiers, including Support Vector Machines (SVMs), Decision Trees, K-Nearest Neighbors (KNN),  
and Artificial Neural Networks (ANNs), have been employed for sleep stage classification and disorder  
detection.  
SVMs: Often used due to their effectiveness in high-dimensional spaces, particularly with well-separated  
classes.  
Decision Trees/Random Forests: Provide interpretability and handle non-linear relationships.  
ANNs: Capable of learning complex patterns, especially in deep learning architectures.  
While these methods show promise, their performance often degrades when faced with highly imbalanced  
datasets.  
2.2. Challenges with Imbalanced Data  
Imbalanced datasets are a common problem in medical diagnostics. When the number of instances for one class  
(e.g., healthy individuals) significantly outweighs another (e.g., patients with a specific sleep disorder), standard  
classification algorithms tend to:  
Bias towards the Majority Class: Optimize for overall accuracy, which can be high simply by correctly  
classifying the majority class, while misclassifying the minority class.  
Poor Minority Class Performance: Result in low recall (sensitivity) for the minority class, meaning many  
actual disease cases are missed.  
Overfitting on Minority Class: If oversampling is done naively, it might lead to overfitting.  
2.3. Ensemble Learning in Sleep Disorder Detection  
Ensemble learning combines multiple individual classifiers (base learners) to achieve better predictive  
performance than any single classifier. Common ensemble strategies include:  
Bagging (e.g., Random Forest): Trains multiple models independently on bootstrapped subsets of data  
and averages their predictions.  
Page 102  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
Boosting (e.g., AdaBoost, Gradient Boosting, XGBoost): Sequentially builds models, where each new  
model tries to correct the errors of the previous ones.  
Stacking (Stacked Generalization): Trains a meta-learner to combine the predictions of multiple diverse  
base learners.  
Ensemble methods have shown superior robustness and accuracy in various medical applications, including  
sleep disorder detection, by reducing variance and bias.  
2.4. Data Balancing Techniques  
To address the issue of imbalanced datasets, various techniques have been developed:  
Under-sampling: Reduces the number of instances in the majority class. Examples include Random  
Under-sampling, Tomek Links, and Edited Nearest Neighbours (ENN).  
Over-sampling: Increases the number of instances in the minority class. Examples include Random Over-  
sampling, SMOTE (Synthetic Minority Over-sampling Technique), ADASYN (Adaptive Synthetic  
Sampling), and Borderline-SMOTE.  
Hybrid Methods: Combine under-sampling and over-sampling techniques.  
Algorithm-level Approaches: Modify the learning algorithm itself to be less sensitive to class imbalance  
(e.g., cost-sensitive learning).  
While these techniques exist, their optimal application, especially in conjunction with advanced ensemble  
methods for complex biomedical signals like PSG, requires careful investigation. This paper aims to fill this gap  
by proposing a synergistic framework.  
METHODOLOGY  
Our proposed framework integrates advanced data balancing techniques with multi-layered ensemble learning  
to enhance the accuracy and robustness of sleep disorder detection. The methodology involves several key  
stages: data acquisition, preprocessing, feature extraction, data balancing, ensemble model training, and  
performance evaluation.  
A. Data Acquisition  
We utilize a publicly available dataset of PSG recordings, such as the PhysioNet Sleep-EDF Database or similar,  
which typically includes:  
EEG: C3-A2, C4-A1, Fp1-A2, etc., providing information on brain activity.  
EOG: Left EOG-A2, Right EOG-A1, tracking eye movements.  
EMG: Chin EMG, reflecting muscle tone.  
ECG: For heart rate variability.  
Respiratory Signals: Nasal airflow, thoracic/abdominal effort, oxygen saturation (SpO2), indicative of  
breathing patterns.  
Each PSG record is typically accompanied by expert-derived hypnograms (sleep stage annotations) and clinical  
reports indicating the presence or absence of specific sleep disorders (e.g., OSA, insomnia).  
Page 103  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
B. Preprocessing  
Raw PSG data is often noisy and requires extensive preprocessing to ensure signal quality and prepare for feature  
extraction.  
Filtering: Band-pass filters (e.g., 0.5-30 Hz for EEG) to remove DC offset, high-frequency noise, and  
power line interference (50/60 Hz notch filter).  
Artifact Removal: Techniques like Independent Component Analysis (ICA) or wavelet-based methods  
to remove artifacts caused by muscle movements, eye blinks, or external interference.  
Epoching: Segmenting the continuous PSG signals into fixed-duration epochs (e.g., 30 seconds),  
consistent with standard sleep scoring protocols.  
C. Feature Extraction  
From each pre-processed epoch, a comprehensive set of features is extracted to characterize the physiological  
signals. These features are broadly categorized into:  
Time-Domain Features:  
o
o
o
o
Mean, variance, standard deviation.  
Skewness, kurtosis.  
Zero-crossing rate.  
Activity, mobility, complexity (Hjorth parameters).  
Frequency-Domain Features (from FFT or Welch's Periodogram):  
o
Power spectral density (PSD) in different EEG bands (Delta: 0.5-4 Hz, Theta: 4-8 Hz, Alpha: 8-12 Hz,  
Sigma/Spindle: 12-16 Hz, Beta: 16-30 Hz, Gamma: >30 Hz).  
o
o
Relative power in each band.  
Spectral entropy.  
Time-Frequency Domain Features (from Wavelet Transform):  
Energy and entropy of wavelet coefficients at different decomposition levels.  
Non-linear Features:  
o
o
o
Approximate Entropy (ApEn), Sample Entropy (SampEn).  
Lyapunov Exponent, Fractal Dimension.  
Cardio-Respiratory Features:  
o
o
Heart Rate Variability (HRV) metrics (RMSSD, pNN50, LF/HF ratio).  
Respiratory Rate, Apnea-Hypopnea Index (AHI).  
A feature selection process (e.g., ANOVA F-test, Recursive Feature Elimination, Principal Component Analysis)  
may be applied to reduce dimensionality and identify the most discriminative features, mitigating the curse of  
dimensionality and reducing computational load.  
Page 104  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
D. Data Balancing Techniques  
Before feeding the data into the ensemble models, we address the class imbalance issue using a combination of  
techniques. The choice of technique is crucial for different datasets and disorder types.  
Random Over-sampling (ROS): Replicates minority class instances. Simple but can lead to overfitting.  
Random Under-sampling (RUS): Randomly removes majority class instances. Can lead to loss of  
valuable information.  
Synthetic Minority Over-sampling Technique (SMOTE): Generates synthetic minority class samples by  
interpolating between existing minority samples and their k-nearest neighbours. This helps in creating  
new and diverse samples without simply duplicating existing ones.  
Adaptive Synthetic (ADASYN): Similar to SMOTE, but it adaptively shifts the decision boundary by  
generating more synthetic data for minority class samples that are harder to learn (i.e., those surrounded  
by many majority class samples).  
We experiment with applying these techniques independently and in combination (e.g., SMOTE followed by  
RUS) to achieve an optimal balance ratio.  
E. Multi-layered Ensemble Learning Framework  
Our core contribution lies in the multi-layered ensemble learning framework. We propose a stacking-based  
ensemble approach that leverages the strengths of diverse base learners.  
1) Base Learners (Layer 1)  
The first layer consists of a diverse set of base classifiers; each specialized in identifying different patterns within  
the data. We select a combination of models known for their performance in classification tasks:  
Support Vector Machine (SVM): Effective with high-dimensional data, using various kernels (e.g., RBF,  
linear).  
Random Forest (RF): Robust to noisy data and capable of handling high dimensionality, provides feature  
importance.  
Gradient Boosting Machine (GBM) / XGBoost: Powerful sequential ensemble methods that iteratively  
correct errors.  
K-Nearest Neighbours (KNN): Simple, non-parametric, effective for local patterns.  
Logistic Regression: A linear model providing probabilistic outputs.  
Each base learner is trained on the balanced dataset (or specific balanced subsets, depending on the balancing  
strategy). Their individual predictions (either class labels or probability scores) form the input for the next layer.  
2) Meta-Learner (Layer 2)  
The second layer consists of a meta-learner (also known as a blender) that takes the predictions of the base  
learners as its input features. The meta-learner is trained to combine these predictions optimally to make the final  
classification decision.  
Choice of Meta-Learner: Simple models like Logistic Regression or a shallow Neural Network are often  
preferred as meta-learners to avoid overfitting to the base learners' predictions. Decision Trees or SVMs  
can also be used.  
Page 105  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
Training the Meta-Learner: The meta-learner is typically trained on a hold-out set, or using k-fold cross-  
validation on the original training set, where base learners predict on unseen folds, and these predictions  
are aggregated for the meta-learner's training. This ensures that the meta-learner generalizes well and  
does not simply memorize the base learners' outputs on the training data.  
E. Experimental Setup and Evaluation Metrics  
To rigorously evaluate the proposed framework, we employ a standard cross-validation strategy (e.g., 10-fold  
cross-validation) to ensure generalization performance.  
1) Evaluation Metrics  
Given the inherent class imbalance, traditional accuracy alone is insufficient. We use a comprehensive set of  
metrics, with a particular focus on those that reflect minority class performance:  
Accuracy: Overall correct predictions.  
Precision (Positive Predictive Value): Of all predicted positive, how many are actually positive.  
Recall (Sensitivity): Of all actual positive, how many are correctly predicted positive (crucial for disease  
detection).  
Specificity: Of all actual negative, how many are correctly predicted negative.  
Area Under the Receiver Operating Characteristic Curve (AUROC): Measures the classifier's ability to  
distinguish between classes across various threshold settings.  
Area Under the Precision-Recall Curve (AUPRC): Particularly informative for imbalanced datasets as it  
focuses on the positive class.  
RESULTS AND DISCUSSION  
This section would present the findings from the experiments, comparing the performance of different data  
balancing techniques and ensemble strategies.  
A. Impact of Data Balancing Techniques  
Initial experiments would compare the performance of individual classifiers on imbalanced vs. balanced datasets  
using various techniques (ROS, RUS, SMOTE, ADASYN).  
Table II Performance of Individual Classifiers with Different Balancing Techniques  
Classifier  
Technique  
Accuracy  
Recall  
F1-Score  
AUROC  
SVM  
Imbalanced  
SMOTE  
0.81  
0.86  
0.87  
0.84  
0.90  
0.62  
0.82  
0.84  
0.69  
0.86  
0.71  
0.83  
0.84  
0.76  
0.87  
0.78  
0.88  
0.89  
0.83  
0.92  
ADASYN  
Imbalanced  
SMOTE  
Random Forest  
Page 106  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
ADASYN  
Imbalanced  
SMOTE  
0.91  
0.85  
0.91  
0.92  
0.78  
0.83  
0.84  
0.87  
0.71  
0.88  
0.89  
0.58  
0.79  
0.80  
0.88  
0.78  
0.88  
0.89  
0.67  
0.80  
0.81  
0.93  
0.84  
0.94  
0.95  
0.75  
0.86  
0.87  
XGBoost  
KNN  
ADASYN  
Imbalanced  
SMOTE  
ADASYN  
This table would show Accuracy, Precision, Recall, F1-Score, and AUROC for each base classifier (SVM, RF,  
XGBoost, KNN) across imbalanced data, ROS, RUS, SMOTE, and ADASYN.  
Discussion: We expect to observe a significant improvement in minority class recall (sensitivity) and F1-score  
when data balancing techniques are applied, especially with SMOTE and ADASYN, which generate synthetic  
samples, thereby enriching the minority class representation without simply replicating.  
B. Performance of Multi-layered Ensemble Learning  
Next, we would present the results of the proposed multi-layered ensemble framework, comparing its  
performance against individual best-performing base learners and other ensemble methods (e.g., standard  
Bagging/Boosting without explicit balancing).  
Table III Comparative Performance of Ensemble Models for Sleep Disorder Detection  
Model  
Stacking Ensemble (Proposed)  
Random Forest  
Accuracy Precision Recall F1-Score AUROC  
0.94  
0.89  
0.91  
0.84  
0.87  
0.93  
0.88  
0.90  
0.82  
0.86  
0.95  
0.87  
0.90  
0.83  
0.85  
0.94  
0.87  
0.90  
0.82  
0.85  
0.97  
0.91  
0.94  
0.87  
0.89  
XGBoost  
Logistic Regression  
SVM (RBF Kernel)  
This table would compare the proposed Stacking ensemble (with optimal balancing) against individual best-  
performing models, Random Forest, and XGBoost (as standalone ensemble methods). Metrics would include  
Accuracy, Precision, Recall, F1-Score, and AUROC for detecting specific sleep disorders (e.g., OSA).  
Discussion: The stacking ensemble is expected to outperform individual base learners and even simpler  
ensemble methods by leveraging the complementary strengths of diverse classifiers. The meta-learner learns to  
optimally weigh the predictions of the base learners, leading to a more robust and accurate final decision.  
C. Disorder-Specific Performance Analysis  
We would then delve into the performance for specific sleep disorders, as the characteristics and data imbalance  
might vary. For instance, OSA detection might involve different features and data distributions compared to  
insomnia.  
Page 107  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
Discussion: Analyse which disorders are more challenging to detect and hypothesize why. Discuss the balance  
between sensitivity (avoiding false negatives) and specificity (avoiding false positives) in a clinical context. A  
higher sensitivity is often desired in medical screening to ensure no actual cases are missed, even if it means a  
slightly lower specificity.  
D. Feature Importance Analysis  
Using models like Random Forest or XGBoost within the ensemble, or a separate feature selection step, we can  
identify the most discriminative features for sleep disorder detection.  
Table IV Top N Important Features for Sleep Disorder Detection  
Rank  
Feature Name  
Importance  
Score  
Physiological Significance  
1
2
EEG Delta Power  
0.142  
0.128  
0.117  
0.103  
0.096  
0.083  
0.071  
0.065  
0.052  
0.048  
High delta activity reflects deep sleep; reduced levels are  
linked to fragmented sleep and OSA.  
(0.54 Hz)  
Heart Rate Variability  
(RMSSD)  
Lower HRV indicates autonomic imbalance, often seen  
in OSA and insomnia.  
3
Oxygen Saturation  
Drop in SpO₂ is a primary indicator of apnea events.  
(SpO₂ Min)  
4
EEG Theta Power  
Increased theta activity may reflect excessive sleepiness  
and disrupted sleep cycles.  
(48 Hz)  
5
Respiration Rate  
Variance  
Irregular breathing patterns correlate with apnea-related  
disturbances.  
6
ECG Mean RR  
Longer RR intervals relate to bradycardia during apnea  
episodes.  
Interval  
7
Snoring Amplitude  
Higher snoring intensity is strongly associated with  
upper airway obstruction.  
8
EEG Alpha Power  
Elevated alpha intrusions suggest poor sleep quality and  
micro-arousals.  
(813 Hz)  
9
Body Movement  
Index  
Increased movements indicate sleep fragmentation,  
common in insomnia and OSA.  
10  
HRV LF/HF Ratio  
Imbalance of sympathetic vs. parasympathetic activity is  
a marker of sleep stress.  
This table would list the features with the highest importance scores (e.g., Gini impurity decreases for tree-based  
models) and their potential physiological significance.  
Discussion: Correlate important features with known physiological markers of sleep disorders. For example,  
specific EEG frequency bands or HRV metrics might be highly indicative of OSA or insomnia. This provides  
clinical insights and can guide future research into simplified diagnostic approaches.  
Page 108  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
E. Limitations and Future Work  
Acknowledge any limitations of the study, such as dataset size, specific population demographics, or the focus  
on certain sleep disorders. Discuss potential avenues for future research, including:  
Real-time Detection: Adapting the framework for real-time or near-real-time sleep monitoring.  
Integration with Wearable Devices: Applying the methodology to data from more accessible wearable  
sensors.  
Explainable AI (XAI): Incorporating XAI techniques to provide clinicians with transparent insights into  
model decisions.  
Deep Learning Integration: Exploring deep learning architectures (e.g., CNNs, LSTMs) as base learners  
or for automated feature extraction, potentially in conjunction with transformers for time-series data.  
Multi-task Learning: Developing models that can simultaneously detect multiple sleep disorders or  
assess sleep quality.  
CONCLUSION  
This paper presented a robust and effective frame for advanced sleep complaint discovery by synergistically  
Combining multi-layered ensemble literacy with sophisticated data balancing ways.  
Our comprehensive methodology, gauging data accession, preprocessing, point engineering, and a multi-  
layered mounding ensemble,  
addresses  
the critical challenges of high-  
dimensional  
PSG  
data  
and essential class imbalance. The experimental results demonstrate that applying ways like SMOTE and  
ADASYN significantly improves the bracket performance, particularly enhancing the perceptivity and  
F1- score for nonage classes representing sleep diseases. likewise,  
ensemble literacy approach constantly outperforms individual classifiers  
the  
multi-layered  
and simpler ensemble styles, yielding advanced overall delicacy, perfection, recall,  
and  
AUROC.  
By furnishing a more accurate, dependable, and automated individual tool, this exploration contributes towards  
earlier discovery and intervention for sleep diseases, eventually leading to bettered patient.  
REFERENCES  
1. Ichimaru, Y., & Sugiura, T. (1993). Development of automatic scoring for sleep stages based on spectral  
analysis of EEG data. Sleep, 16(3), 268-274. (Example for early ML in sleep)  
2. AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical  
Specifications. (2012). American Academy of Sleep Medicine. (Standard for PSG scoring)  
3. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority  
Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321-357. (Key paper for  
SMOTE)  
4. He, H., Bai, Y., Garcia, E. A., & Li, S. (2008). ADASYN: Adaptive synthetic sampling approach for  
imbalanced learning. In 2008 IEEE International Joint Conference on Neural Networks (IJCNN) (pp.  
1322-1328). IEEE. (Key paper for ADASYN)  
5. Polikar, R. (2006). Ensemble learning. In Ensemble machine learning (pp. 1-32). Springer, Boston, MA.  
(General ensemble learning reference)  
6. Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259. (Original paper on  
stacking)  
7. Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. (Key paper for Random Forest)  
8. Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of  
Statistics, 29(5), 1189-1232. (Key paper for Gradient Boosting)  
Page 109  
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,  
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)  
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025  
9. Prochazka, R., Schuck, D. J., & Van Someren, E. J. (2019). Sleep stage classification with a deep  
convolutional neural network. Sleep Medicine, 56, 1-10. (Example of deep learning in sleep)  
10. Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. C., Mark, R. G., ... & Stanley,  
H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for  
complex physiologic signals. Circulation, 101(23), e215-e220. (Reference for PhysioNet database)  
11. Behar, J., Roebuck, A., Shahid, M., & Palmius, N. (2018). Multi-class sleep apnea classification using  
ECG and respiratory effort signals. Medical & Biological Engineering & Computing, 56(1), 163-176.  
Page 110