Imbalance-Aware Evaluation and Hyperparameter Optimization of Supervised Machine Learning Models for Credit Card Fraud Detection
Credit card fraud detection is a critical problem in the financial sector because fraudulent transactions are vastly outnumbered by legitimate ones. This paper presents an imbalance-aware evaluation and hyperparameter optimization of three supervised machine learning (SML) models (Logistic Regression, Random Forest, and XGBoost) on the European credit card fraud dataset (n = 284,807; fraud rate = 0.172%). The study followed the CRISP-DM process model. SMOTE was applied only to the training partition, after an 80/20 stratified split, to avoid data leakage, and hyperparameters were optimized using stratified 3-fold cross-validation. The decision threshold of each tuned model was then set to a probability of 0.70 to improve its Precision-Recall operating point. All experiments were run in Google Colaboratory on Python 3.10. Model performance was evaluated with Precision, Recall, F1-Score, ROC-AUC, and the Area Under the Precision-Recall Curve (AUPRC), with AUPRC chosen as the primary metric because of the extreme class imbalance. XGBoost emerged as the most effective model overall, with the highest AUPRC (0.817) and ROC-AUC (0.970) and a balanced operating point of Precision = Recall = F1 = 0.81 at the 0.70 threshold. Random Forest had the highest Precision (0.93) with an AUPRC of 0.805, which makes it the most appropriate model when minimizing false positives is the priority. Logistic Regression achieved the highest Recall (0.86) but a low Precision (0.10), which limited its operational feasibility even after threshold adjustment. These results indicate that XGBoost, combined with SMOTE, systematic hyperparameter optimization, and threshold calibration, offers the most balanced fraud detection under extreme class imbalance.
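To illustrate how a probability threshold determines the Precision-Recall operating point, the following is a minimal sketch rather than the authors' code. The function name `precision_recall_f1`, the toy labels and scores, and the convention that a score at or above the threshold counts as fraud are all assumptions made for illustration; none of the numbers come from the paper.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.70,
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive (fraud) class.

    Assumption: a transaction is flagged as fraud when its predicted
    probability is greater than or equal to `threshold`.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Toy, made-up data (NOT from the paper): 4 fraud cases, 6 legitimate ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.95, 0.80, 0.65, 0.30, 0.75, 0.60, 0.55, 0.20, 0.10, 0.05]

for thr in (0.50, 0.70):
    p, r, f = precision_recall_f1(y_true, y_prob, thr)
    print(f"threshold={thr:.2f} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

On this toy data, raising the threshold from 0.50 to 0.70 increases precision from 0.50 to 0.67 and lowers recall from 0.75 to 0.50, which is the kind of trade-off that threshold tuning controls. Because F1 is the harmonic mean of precision and recall, it equals both whenever they coincide, which is consistent with the reported Precision = Recall = F1 = 0.81.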
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in our journal are licensed under CC-BY 4.0, which permits authors to retain copyright of their work. This license allows for unrestricted use, sharing, and reproduction of the articles, provided that proper credit is given to the original authors and the source.