A Dual-Phase Hyperparameter Tuning Approach for Emotion Detection Using Boosting-Based Machine Learning Algorithms
This study develops and evaluates a dual-phase hyperparameter tuning approach for improving the performance of emotion detection systems built on boosting-based machine learning algorithms. The methodology covers data collection and preprocessing, feature engineering, model definition, training, and evaluation. Specifically, the study applies a two-stage optimization process, coarse tuning with RandomizedSearchCV followed by fine-tuning with GridSearchCV, to four models: XGBoost, LightGBM, CatBoost, and GradientBoosting. LightGBM achieved the highest overall accuracy at 92.20%, followed by XGBoost at 91.47%, GradientBoosting at 91.19%, and CatBoost at 88.23%. Confusion matrix analysis showed that LightGBM and XGBoost produced more balanced and accurate classifications across the six emotion classes, while CatBoost had higher misclassification rates in the more challenging classes. In terms of computational efficiency, LightGBM offered the best balance between accuracy and training speed, XGBoost had the lowest memory usage, and CatBoost had the fastest prediction time. GradientBoosting achieved competitive accuracy but required significantly more computational resources. Based on these findings, LightGBM was identified as the most suitable boosting algorithm for emotion classification because of its balance of predictive performance, efficiency, and reliability. Future work should explore hybrid and deep learning approaches, larger datasets, and real-time implementation strategies to further improve emotion classification systems.
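The two-stage tuning process described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline. It assumes scikit-learn and SciPy are installed, uses synthetic six-class data in place of the emotion features, and uses scikit-learn's GradientBoostingClassifier as a stand-in for any of the four boosting models. The search ranges, fold count, and iteration budget are hypothetical choices, not values reported by the authors.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

# Assumption: synthetic six-class data standing in for the emotion dataset.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Phase 1 (coarse): sample a few configurations from wide distributions.
# The ranges below are illustrative, not taken from the study.
coarse = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 60),
        "learning_rate": uniform(0.01, 0.29),  # samples from [0.01, 0.30]
        "max_depth": randint(2, 5),
    },
    n_iter=5, cv=3, scoring="accuracy", random_state=0,
)
coarse.fit(X_train, y_train)
best = coarse.best_params_

# Phase 2 (fine): exhaustive grid in a narrow neighbourhood of the phase 1
# winner. n_estimators is held at its phase 1 value to keep the grid small.
fine_grid = {
    "learning_rate": [best["learning_rate"] * f for f in (0.5, 1.0, 1.5)],
    "max_depth": sorted({max(1, best["max_depth"] - 1),
                         best["max_depth"],
                         best["max_depth"] + 1}),
}
fine = GridSearchCV(
    GradientBoostingClassifier(n_estimators=best["n_estimators"], random_state=0),
    param_grid=fine_grid, cv=3, scoring="accuracy",
)
fine.fit(X_train, y_train)

# Evaluate the refit best model on held-out data.
y_pred = fine.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print("coarse best CV accuracy:", round(coarse.best_score_, 4))
print("fine best CV accuracy:  ", round(fine.best_score_, 4))
print("held-out accuracy:      ", round(test_accuracy, 4))
print(cm)
```

Because the fine grid contains the phase 1 winner and both searches use the same folds and seed, the second phase cannot score worse in cross-validation than the first. It can only confirm the coarse result or improve on it. Swapping in XGBoost, LightGBM, or CatBoost would mainly mean replacing the estimator and its parameter names.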
This work is licensed under a Creative Commons Attribution 4.0 International License.