INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue IV, April 2025
www.ijltemas.in Page 563
Credit Card Fraud Detection Using Random Forest and CART
Algorithms: A Machine Learning Perspective
Syed Saaduddin Azhaan¹, Syed Mohiuddin Jeelani Jaffri², Mirza Younus Ali Baig³, Adeeba Anjum⁴
¹,² UG Scholar, Lords Institute of Engineering and Technology
³,⁴ Assistant Professor, Lords Institute of Engineering and Technology
DOI: https://doi.org/10.51583/IJLTEMAS.2025.140400059
Received: 28 April 2025; Accepted: 30 April 2025; Published: 13 May 2025
Abstract: The increasing adoption of online payments and e-commerce platforms has amplified the threat of credit card fraud. As
fraudsters continuously develop advanced techniques to bypass traditional security systems, it becomes essential to deploy smart,
adaptive solutions. This study focuses on leveraging machine learning—specifically Random Forest and Classification and
Regression Trees (CART)—to build a high-performance fraud detection system. Using a publicly available dataset from Kaggle,
the model analyzes transaction records to uncover patterns indicative of fraudulent behavior. Emphasis is placed on accuracy,
scalability, and the potential for real-time deployment. The implemented model achieved an accuracy of 99.78%, with
strong precision and recall scores. The paper describes the methodology, evaluates the results, and recommends
directions for future work.
Keywords: Fraud Detection, Random Forest, CART, Credit Card Transactions, Machine Learning, PCA, Supervised Learning,
Data Imbalance
I. Introduction
In the current digital landscape, credit card transactions have become commonplace due to their ease of use and speed. However,
this convenience brings with it the growing threat of fraudulent activity. Credit card fraud typically involves unauthorized access
to sensitive user data to carry out illicit financial transactions. As the financial sector experiences mounting losses due to such
activities, the need for efficient, automated detection systems becomes paramount.
Traditional fraud detection techniques often rely on predefined rules, which lack the adaptability needed to keep up with ever-
changing fraud tactics. Machine learning provides a more flexible solution by learning from historical data to identify fraudulent
patterns. This study introduces a fraud detection system based on two prominent algorithms: Random Forest and CART. These
algorithms are known for their robustness and interpretability, making them well-suited for classification tasks like fraud
detection. The system is also designed to support real-time analytics and visualization.
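CART grows a binary tree by choosing, at each node, the threshold split that most reduces class impurity, commonly measured with the Gini index; this is one reason its decisions can be read as simple rules. The following is a minimal sketch of that split search on a single feature. The function names `gini` and `best_split` and the toy amounts and labels are hypothetical and serve only as illustration; they are not taken from the study's implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    If no split improves on the parent node, the threshold is None.
    """
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    n = len(values)
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: transaction amounts and fraud labels (1 = fraud).
amounts = [12.0, 25.0, 40.0, 310.0, 450.0, 900.0]
is_fraud = [0, 0, 0, 1, 0, 1]

threshold, impurity = best_split(amounts, is_fraud)
print(threshold, round(impurity, 3))
```

A full CART implementation repeats this search over every feature and recurses on the two child nodes; a Random Forest then trains many such trees on bootstrap samples and combines their predictions.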
II. Literature Review
Several studies have attempted to address the problem of credit card fraud using various machine learning and statistical
techniques:
Kosemani Temitayo Hafiz et al. [1] analyzed predictive analytics tools used in Canada, highlighting challenges and limitations in
current solutions.
Kundu et al. [2] proposed a hybrid sequence alignment technique using BLAST and SSAHA to compare transaction patterns with
known fraudulent behavior.
Wen-Fang Yu and Na Wang [3] developed an outlier detection model based on distance sum to identify anomalous transactions.
Nipane et al. [4] utilized a hybrid SVM and decision tree approach, demonstrating improved accuracy but requiring complex
tuning.
Sahin and Duman [5] compared SVM and decision trees using real-world datasets, providing valuable insights into the
effectiveness of supervised learning.
While these methods have made progress, they often struggle with data imbalance, real-time processing, or generalization. Our
study aims to address these limitations by combining Random Forest's ensemble structure with CART's interpretable decision rules.
III. Methodology
Dataset
The dataset comes from Kaggle and contains 284,807 transaction records, of which only 492 (about 0.17%) are labeled as fraudulent.
Each transaction has 30 features: 'Time', 'Amount', and 28 anonymized features (V1–V28) obtained through Principal
Component Analysis (PCA). The binary 'Class' label marks each transaction as fraudulent or legitimate.
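The degree of class imbalance follows directly from the two counts above. The short sketch below works through that arithmetic. It also reports the accuracy of a trivial classifier that labels every transaction as legitimate. The helper name `imbalance_summary` is hypothetical, and the calculation uses only the record counts stated in this section, not the dataset file itself.

```python
def imbalance_summary(n_total, n_fraud):
    """Summarize class imbalance from the total and fraudulent record counts."""
    n_legit = n_total - n_fraud
    fraud_rate = n_fraud / n_total
    return {
        "fraud_rate": fraud_rate,
        # Accuracy obtained by predicting 'legitimate' for every transaction.
        "baseline_accuracy": n_legit / n_total,
        # Number of legitimate transactions per fraudulent one.
        "legit_per_fraud": n_legit / n_fraud,
    }


# Counts as reported for the Kaggle dataset in this section.
summary = imbalance_summary(n_total=284_807, n_fraud=492)

print(f"fraud rate:                 {summary['fraud_rate']:.4%}")
print(f"always-legitimate accuracy: {summary['baseline_accuracy']:.4%}")
print(f"legitimate per fraud:       {summary['legit_per_fraud']:.1f}")
```

Because fraud makes up roughly 0.17% of the records, a model that never flags fraud would already reach about 99.83% accuracy. For this reason, precision and recall on the fraud class are more informative than accuracy alone for this dataset.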