Phishing Emails: Analysis and Detection with Comparison of Three Machine Learning Models (LR, NB and MLP)
Phishing attacks are carried out by writing and sending falsified email messages that appear to come from a legitimate, trusted source to a single victim or to different categories of victims. They aim to acquire users' sensitive data or to transfer and install malware on users' computers. Consequently, this study implements an AI-driven approach that analyzes email features to detect messages that appear to be phishing. The project leverages three machine learning models: Multilayer Perceptron (MLP), a deep learning algorithm; Naïve Bayes (NB); and Logistic Regression (LR). A phishing email dataset from the Kaggle Machine Learning Repository was used; it contains more than 5,000 instances of phishing and ham (legitimate) emails. The following performance metrics were obtained. For the MLP model: accuracy 98.57%, precision 100.00%, recall 90.00%, and F1 score 94.74%. For the NB model: accuracy 96.95%, precision 100.00%, recall 78.75%, and F1 score 88.11%. For the LR model: accuracy 94.71%, precision 99.03%, recall 63.75%, and F1 score 77.57%. The results suggest that the MLP classifier may better capture complex patterns in phishing emails, leading to higher detection rates. Naïve Bayes remains a strong choice, especially for simpler or smaller datasets, because of its speed and efficiency. Logistic Regression is reliable but slightly less accurate on this particular task, with the lowest recall of the three models (63.75%).
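The reported F1 scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two: F1 = 2PR / (P + R). The following is a minimal sketch of that arithmetic only. It is not the study's code; the helper name `f1_from_pr` and the dictionary layout are illustrative assumptions, and the input values are simply the percentages quoted above converted to fractions.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # convention when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall) as quoted in the abstract, converted to fractions.
reported = {
    "MLP": (1.0000, 0.9000),
    "NB": (1.0000, 0.7875),
    "LR": (0.9903, 0.6375),
}

# F1 per model, expressed as a percentage rounded to two decimals.
f1_percent = {
    name: round(100 * f1_from_pr(p, r), 2) for name, (p, r) in reported.items()
}

for name, value in f1_percent.items():
    print(f"{name}: F1 = {value:.2f}%")
```

Under these inputs the recomputed values agree with the reported F1 scores (94.74%, 88.11%, and 77.57%). The arithmetic also makes clear that, with precision at or near 100% for all three models, the differences in F1 are driven almost entirely by recall.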

This work is licensed under a Creative Commons Attribution 4.0 International License.