INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025
Phishing Emails: Analysis and Detection with Comparison of Three
Machine Learning Models (LR, NB and MLP)
Theophilus Bamise Ajala
Department of Computer Science, Caleb University, Imota, Lagos, Nigeria
Received: 10 November 2025; Accepted: 20 November 2025; Published: 08 December 2025
ABSTRACT
Phishing attacks are carried out by composing and sending falsified email messages that appear to come from a
legitimate, trusted source to a single victim or to several categories of victims. They aim to acquire users'
sensitive data or to deliver and install malware on users' computers. Consequently, this study implements an
AI-driven approach to detect emails that appear to be phishing and analyzes their features. The project
leverages three machine learning models: Multilayer Perceptron (MLP), a deep learning algorithm; Naïve Bayes
(NB); and Logistic Regression (LR). The following performance metrics were obtained. For the MLP model:
accuracy 98.57%, precision 100.00%, recall 90.00%, and F1 score 94.74%. For the NB model: accuracy 96.95%,
precision 100.00%, recall 78.75%, and F1 score 88.11%. For the LR model: accuracy 94.71%, precision 99.03%,
recall 63.75%, and F1 score 77.57%. The results suggest that the MLP classifier may better capture complex
patterns in phishing emails, leading to higher detection rates. Naïve Bayes remains a strong choice, especially
for simpler or smaller datasets, owing to its speed and efficiency. Logistic Regression is reliable but less
accurate on this particular task, with notably lower recall. For this project, a phishing email dataset from
the Kaggle repository was used. This dataset contains more than 5,000 instances of phishing and ham
(legitimate) emails.
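As a consistency check on the reported figures, the F1 score can be recomputed from precision and recall as their harmonic mean, F1 = 2PR / (P + R). The minimal sketch below is not the author's code: the function name f1_score and the dictionary reported are illustrative assumptions, and the only inputs are the precision and recall values quoted above, converted to fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both precision and recall are 0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, expressed as fractions.
reported = {
    "MLP": (1.0000, 0.9000),
    "NB": (1.0000, 0.7875),
    "LR": (0.9903, 0.6375),
}

# F1 as a percentage, rounded to two decimals to match the reporting precision.
f1_percent = {
    name: round(100 * f1_score(p, r), 2) for name, (p, r) in reported.items()
}

for name, f1 in f1_percent.items():
    print(f"{name}: F1 = {f1:.2f}%")
```

Under these inputs the recomputed values agree with the reported F1 scores of 94.74% (MLP), 88.11% (NB), and 77.57% (LR).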
Keywords: Multilayer Perceptron Neural Network (MLP), Naïve Bayes (NB), Logistic Regression (LR), Deep
Learning, AI, Phishing Email
How to cite: Ajala, T.B. (2025). Phishing Emails: Analysis and Detection with Comparison of Three Machine
Learning Models (LR, NB and MLP).
INTRODUCTION
In 2021, there were over seven billion registered email accounts worldwide, and people sent more than three
million emails per second. For both professional and personal transactions, email services are a vital tool
for handling such matters smoothly and without stress. However, attackers have seized on the massive use of
email services to launch successful and growing attacks. It is very difficult to compromise an email account
directly because of the end-to-end (E2E) encryption strategies that email service providers integrate into
their services. Faced with this obstacle, attackers instead employ social engineering tactics, manipulating
email users by exploiting human judgment to obtain sensitive and critical information (Salahdine & Kaabouch,
2019).
Phishing attacks are carried out by composing and sending falsified email messages that appear to come from a
legitimate, trusted source to a single victim or to several categories of victims (Mohammad et al., 2014).
They aim to acquire users' sensitive data or to deliver and install malware on users' computers. For example,
an attacker sends an email containing a link that redirects the user to a website with malicious content,
where the user is asked to provide confidential information such as passwords, login details, and bank card
details such as the CVV/CVC and expiry date. The attacker can also attach a file to the forged email; when the
victim opens it, the malware hidden in the file may be executed. Phishing is a specific form of cybercrime
that allows offenders to deceive users and steal their sensitive data. Victims of phishing attacks can suffer
noteworthy losses and