Page 1044
www.rsisinternational.org
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XV, Issue III, March 2026
Hybrid Machine Learning Approach for Plant Disease Identification
Koteswararao Yenni, Research Scholar¹; Kiran Kumar V., Professor²
Department of Computer Science & Technology, Dravidian University, Kuppam 517426, AP, India
DOI: https://doi.org/10.51583/IJLTEMAS.2026.150300090
Received: 20 March 2026; Accepted: 25 March 2026; Published: 17 April 2026
ABSTRACT
This study proposes a hybrid Convolutional Neural Network-Random Forest (CNN-RF) architecture that combines the feature-extraction capability of CNNs with the classification power of RF for accurate plant disease detection. The methodology began with data acquisition and preprocessing, consisting of image normalization, augmentation, and resizing to ensure the models could fit the data and to enhance generalization. The CNN component was trained to automatically learn discriminative features from plant leaf images, which were then fed into an RF classifier tuned through hyperparameter optimization. Performance was measured using conventional metrics such as accuracy, precision, recall, and F1-score, together with Receiver Operating Characteristic (ROC) curve analysis. Experimental results show that the hybrid CNN-RF model outperforms both the standalone CNN model and the standalone RF model. The proposed model attained an accuracy of 96.3%, precision of 95.8%, recall of 96.7%, and F1-score of 96.2%, exceeding the CNN (93.5% accuracy) and RF (88.4% accuracy) baselines. Hyperparameter tuning proved to be of great benefit to classification outcomes, as illustrated in the tuning heat map. On the ROC curve, the hybrid model achieved an Area Under the Curve (AUC) close to 1.0, indicating near-ideal sensitivity and specificity.
Keywords: Plant Disease Detection, Convolutional Neural Network, Random Forest, Hybrid Model, Machine
Learning, Plant Village.
INTRODUCTION
Food security and prosperity worldwide depend largely on agricultural production. Nevertheless, plant diseases are one of the biggest problems in agriculture, resulting in significant yield losses and economic damage at a global level (Mohanty et al., 2020). Early and correct disease diagnosis is therefore needed to reduce crop damage and sustain agriculture. Conventional detection techniques rely on visual observation by experts, which is tedious, laborious, and subjective. With the introduction of machine learning (ML) and deep learning (DL), automated systems that detect plant diseases from leaf images have shown great potential. Convolutional Neural Networks (CNNs) are highly efficient at deriving discriminative features from images, while Random Forest (RF) is noted for particularly strong performance in high-dimensional spaces (Kaur and Singh, 2022; Reddy et al., 2023).
Despite these advances, significant challenges remain. CNNs are prone to overfitting on small or unbalanced datasets, and their high accuracy is typically achieved only when trained on large and balanced ones (Li et al., 2023). RF, in turn, relies heavily on handcrafted or extracted features and generally performs poorly on raw image data. Recent literature has investigated hybrid CNN-RF models to mitigate these problems (Ezigbo and Chibueze, 2025; Tonmoy et al., 2025), but the vast majority of studies have been conducted on controlled or single-crop datasets, which do not reflect real-life conditions with varying crop environments.
To close this gap, the current paper proposes a hybrid CNN-RF model that combines automatic feature extraction with powerful classification for plant disease detection. It is evaluated on publicly available datasets and field-acquired images of sorghum, maize, and millet leaves, covering multiple crops and varied environmental conditions. Its performance is compared to standalone CNN and RF models using
common metrics such as accuracy, precision, recall, F1-score, and ROC-AUC, in order to demonstrate the benefits of the hybrid methodology.
This work delivers a more accurate, robust, and generalizable solution to automated plant disease detection by combining the representational strength of CNNs with the strong decision boundaries of RF, providing a useful tool for early intervention and optimized crop management in real agricultural contexts.
A number of studies have examined machine learning methods for plant disease (PLD) detection. Recent studies (Zhang et al., 2023) have demonstrated the usefulness of deep learning methods such as CNNs in automatic disease classification. Nevertheless, CNNs may be ill-equipped to deal with small data samples and non-linear decision boundaries. RF has been used for robust classification across different fields, including agriculture (Kumar and Singh, 2024). Hybrid models combining CNN and RF have also attracted recent interest and have been found more effective than the independent models (Li et al., 2023). Despite these improvements, current methods usually do not generalize to different plant species and disease types. This paper addresses these problems by using a CNN to extract features and an RF to classify them, improving accuracy and robustness.
LITERATURE REVIEW
Ezigbo and Chibueze (2025) introduced a hybrid framework called ResNet50 and XGBoost-Based Detection of Regional Plant Diseases in West Africa. The approach uses the representational strength of ResNet50, a deep CNN pretrained on ImageNet, to extract meaningful features from leaf images. These rich features are then passed to an XGBoost classifier, which is better suited to structured data, for final disease classification. The method exhibited high accuracy (98.81%) and was developed for deployment on a mobile platform, sensitive to the practical limitations of agricultural use in sub-Saharan Africa.
In 2022, a study named PlantViT: CNN and Vision Transformer-Based Plant Disease Classification presented a model that combines CNN feature extraction with a transformer-based attention mechanism. The CNN module extracts discriminative local features, which are passed to a Vision Transformer head that models long-range dependencies. The model was highly accurate on the PlantVillage dataset (98.6%) and the more challenging Embrapa dataset (87.9%), showing robustness in both controlled and real-world imagery.
In the ConRXG framework (2022), described in A Hybrid ResNet50-XGBoost Model for Robust Plant Disease Detection, the authors use ResNet50 as a fixed feature extractor to obtain deep spatial features from plant images. These features were then passed in tabular form to an XGBoost gradient-boosted decision tree model.
The model was trained with Adam optimization, batch normalization, and ReLU activation functions, achieving near-perfect validation scores on the PlantVillage data. This hybridization of deep learning and classical machine learning methods yields high accuracy together with computational efficiency.
In their paper MobilePlantViT: A Lightweight Vision Transformer for detecting plant diseases on mobile hand-held devices, Tonmoy et al. (2025) introduced a hybrid model combining a streamlined CNN with a small Vision Transformer. It is a low-resource architecture designed to run well on mobile devices. With only 0.69 million parameters, the model balanced performance and computation, scoring between 80 and 99 percent on various publicly available datasets. The method presents a scalable, real-time, in-field solution for monitoring plant diseases.
Thai and Le (2024) proposed the Mobile H-Transformer, a lightweight hybrid CNN-Transformer model designed to run on smartphones. The CNN part consists of convolution and dual-convolution blocks that obtain primary spatial features, which are tokenized and passed through a transformer encoder to learn global features. A key feature is real-time operation: on a mobile CPU it produces competitive F1-scores, underscoring the model's practical usability in agricultural environments.
A specialized 2021 study entitled CAE-CNN: Autoencoder-Aided CNN in Peach Disease Detection used a hybrid model in which a convolutional autoencoder (CAE) performed unsupervised dimensionality reduction. The encoded features were then fed into a shallow CNN classifier. With fewer than 10,000 parameters, the model achieved a high accuracy of 98.4% on peach bacterial spot images. Its simplicity makes it effective in niche applications with limited computational capacity.
In a 2024 application-oriented paper entitled YOLOv5-Swin: Object Detection and Classification Pipeline in Field Environments, the researchers merged the object detection abilities of YOLOv5 with the classification abilities of the Swin Transformer. YOLOv5 located leaf regions in full-plant images, which were cropped and fed to the Swin Transformer to identify diseases. The two-stage pipeline reached an average precision of 95.2% and was optimized for harsh agricultural environments, albeit at a higher computational cost.
In a study by Sadegh et al. (2023) named CNN-LSTM Hybrid Model of Spatiotemporal Plant Disease Prediction, the authors investigated applying CNNs and recurrent neural networks (LSTM and CFC variants) to model time-series image data. CNN layers extracted spatial features from each frame, and LSTM layers captured the temporal patterns of the sequential imagery. The model achieved an accuracy of about 97 percent and was thus well suited to monitoring crops over time, although it required more elaborate data collection and processing.
MATERIALS AND METHODS
Methodology
Figure 1: Diagram of the plant disease detection using hybrid machine learning
Data Collection:
The dataset for this work comprises 4500 high-resolution photographs of sorghum, maize, and millet leaves in diseased or healthy condition. The pictures were taken under controlled lighting and background conditions to achieve uniformity and clarity. Expert annotation was used to categorize the images into specific disease types. PlantVillage is one of the most popular public datasets used in plant disease detection because of its wide range of labeled samples (Mohanty et al., 2020; Kaur and Singh, 2022). Field data collection provides the model with real-world variability (Reddy et al., 2023).
A strong and varied dataset is the foundation of any machine learning model. Images of healthy and diseased plant leaves are gathered to detect plant disease. These images come from:
Public Datasets: e.g., PlantVillage, with more than 4500 images of plant leaves, organized by species and disease type.
Field Data: images collected in the field via smartphones or cameras, in different agricultural environments, to provide real-world variability.
Table 1: Sample Plant Leaf Dataset (A Semi-Arid Crop: Tomato)

Class      Disease Types                                              No. of Samples
Healthy    None                                                       2000
Infected   Late Blight, Powdery Mildew, Mosaic Virus, Early Blight    2500
Total                                                                 4500
Table 2, shown below, summarizes sample image records from the dataset.
Preprocessing
Preprocessing steps to enhance model performance included resizing, normalization, and data augmentation. Resizing images to a prescribed size has been shown to homogenize model input and reduce computation (Gupta & Sharma, 2021). Data augmentation (rotation, flipping, and brightness modification) improves generalization and reduces overfitting (Zhang et al., 2023). Training with unnormalized pixel values slows convergence (Li et al., 2023). To increase the quality and uniformity of the data, these preprocessing steps were applied to the PlantVillage dataset, which contains more than 4500 labelled images of diseased and healthy plant leaves across 38 classes.
CNN for Feature Extraction
CNNs are effective at automatically extracting spatial information from plant leaf images, relying on convolution and pooling layers to learn features hierarchically (Kaur and Singh, 2022). Previous studies have shown that CNNs identify complex disease patterns in leaves with high precision (Gupta and Sharma, 2021; Zhang et al., 2023).
Spatial hierarchies of features are learned automatically and adaptively via backpropagation through building blocks including convolutional layers, pooling layers, and fully connected layers.
Table 2: Summary of Dataset (sample image records)

Image     Plant Type   Disease Type (if Infected)   Image Size   Resolution   Remarks
IMG001    Tomato       Early Blight                 128x128px    72 DPI       Clean leaf
IMG002    Tomato       Late Blight                  128x128px    72 DPI       Good color contrast
IMG003    Tomato       Bacterial Spot               128x128px    72 DPI       No blemish
IMG004    Tomato       Powdery Mildew               128x128px    72 DPI       Leaves are yellow
IMG005    Tomato       Mosaic Virus                 128x128px    72 DPI       Dark green
IMG006    Tomato       Late Blight                  128x128px    72 DPI       Powdery patches
IMG0075   Tomato       Powdery Mildew               128x128px    72 DPI       Final healthy sample
IMG0076   Tomato       Bacterial Spot               128x128px    72 DPI       Leaf curling
IMG0077   Tomato       Early Blight                 128x128px    72 DPI       Edges browning
IMG0150   Tomato       Mosaic Virus                 128x128px    72 DPI       Final infected sample
Convolutional Layers: Filters are applied to the input image to form feature maps that detect different features in the image, such as edges, textures, and patterns.
Pooling Layers: The spatial dimensions of the feature maps are reduced by consolidating information, retaining the most important content while decreasing the computational burden.
Activation Functions: Activation functions allow the model to learn complex patterns; ReLU (Rectified Linear Unit) is commonly used.
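The three building blocks above can be illustrated with a minimal NumPy sketch. This is a toy single-channel, single-filter version for intuition only, not the paper's actual network; the edge-detecting kernel and toy image are invented for the example:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation, as in most DL libraries)
    of a single-channel image with one filter."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU activation: keep positive responses, zero out the rest."""
    return np.maximum(x, 0)

def max_pool(x, k=2):
    """Non-overlapping k x k max pooling over the feature map."""
    h, w = x.shape
    x = x[:h - h % k, :w - w % k]
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

# a horizontal difference filter responds at the vertical edge of this toy image
img = np.zeros((8, 8)); img[:, 4:] = 1.0
edge = conv2d(img, np.array([[-1.0, 1.0]]))
features = max_pool(relu(edge))
print(features.shape)
```

A real CNN stacks many such filter/activation/pooling stages and learns the kernel weights by backpropagation rather than hand-picking them.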
Hybrid Model
The hybridization of CNN feature extraction with the classification capabilities of RF leads to enhanced performance and resilience compared to either model alone (Gupta and Sharma, 2021; Li et al., 2023).
A number of recent studies report that hybrid CNN-RF models outperform single approaches in agricultural disease detection tasks (Ezigbo and Chibueze, 2025; Tonmoy et al., 2025).
The hybrid model combines the advantages of CNNs and RFs:
CNN: efficiently extracts hierarchical features from images.
RF: offers a powerful classifier, particularly on small datasets.
Workflow:
i. Input: the CNN receives the preprocessed images.
ii. Feature Extraction: the CNN processes the images through its layers, producing a feature vector.
iii. Classification: the feature vector is passed to the RF, which classifies the image into a disease category.
This hybrid method has been used effectively in other research, demonstrating high accuracy in identifying and classifying plant diseases from leaf images.
Fusion: combine CNN and handcrafted features.
Training: train the RF on the fused features.
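The classification stage of this workflow can be sketched with scikit-learn. The feature vectors below are random stand-ins for CNN outputs and the labels are synthetic, so this shows only the plumbing of step iii, not the paper's results:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-in for the CNN: in the real pipeline these vectors
# would be the flattened output of the convolutional layers.
rng = np.random.default_rng(0)
n_samples, feat_dim = 200, 64
features = rng.normal(size=(n_samples, feat_dim))
labels = (features[:, 0] + features[:, 1] > 0).astype(int)  # toy "disease" label

# Step iii of the workflow: an RF classifier on top of the CNN features.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(features, labels)
print(rf.score(features, labels))
```

Swapping the random matrix for real CNN embeddings (one row per image) is the only change needed to reproduce the hybrid structure.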
Image Preprocessing
All pictures were rescaled to 128x128 pixels and normalized to put pixel values on a common scale. To improve the generalization strength of the model and avoid overfitting, several data augmentation methods were used, including rotation, flipping, and zooming.
Model Architecture
CNN Layer: the CNN extracts features from plant leaf images through convolutional and pooling layers.
RF Layer: the RF classifies the extracted features.
Figure 2: Sample Plant Tomato Leaf Images from the Dataset
Figure 3: Hybrid CNN-RF Architecture
Model Development
As shown in Figure 3, the hybrid CNN-RF plant disease detection model was developed in a systematic sequence, beginning with data collection and proceeding through model training and testing, with each step designed for reproducibility.
The sample set included 4500 high-resolution images of sorghum, maize, and millet leaves, covering both healthy and diseased cases. Two complementary sources ensured variability: public data from PlantVillage, a dataset well known in plant disease detection research (Mohanty et al., 2020; Kaur and Singh, 2022), and field-collected images taken with mobile cameras under natural light (Reddy et al., 2023). To ensure high-quality ground truth, disease types were expert-annotated, enabling precise supervised learning.
Preprocessing was used to improve data quality and strengthen the model. All pictures were resized to 128x128 pixels to ensure consistency and lower computational complexity (Gupta & Sharma,
2021). Pixel values were normalized to the range [0, 1] for faster convergence during training (Li et al., 2023). To avoid overfitting and enhance generalization, data augmentation (rotation, horizontal/vertical flipping, and zooming) was applied, introducing artificial variation in line with good deep learning practice (Zhang et al., 2023).
A CNN was used to extract features, learning hierarchical representations of plant leaf patterns directly from image data (Kaur and Singh, 2022). The CNN comprised convolutional layers with ReLU activation functions and max-pooling layers to reduce dimensionality while retaining essential characteristics. The output feature maps were flattened into one-dimensional vectors for classification. The CNN was selected for its known capability to extract rich spatial characteristics from agricultural disease images (Gupta and Sharma, 2021).
Random Forest (RF), a tree-based ensemble model that splits nodes based on feature values, was used for classification because it is robust in high-dimensional spaces and performs well on small datasets (Kumar & Singh, 2024). Grid search was used to optimize the hyperparameters that shape the decision boundary (Li et al., 2023). The hybrid CNN-RF architecture is supported by evidence that it provides higher accuracy than standalone CNN or RF models (Ezigbo and Chibueze, 2025; Tonmoy et al., 2025).
The experimental protocol followed standard ML reproducibility practices. The dataset was divided into training (80 percent), validation (10 percent), and testing (10 percent) sets, providing a fair assessment of model performance (Kaur and Singh, 2022). The model was developed in Python 3.9, with CNN training and inference on TensorFlow 2.x and RF classification on Scikit-learn 1.x. Hyperparameter optimization was done through grid search on the training and validation sets, and the final model was evaluated on the unseen test set. All experiments were performed on an NVIDIA GTX 1080Ti graphics card with 32GB RAM and an Intel Core i7 processor, to allow replication under the same hardware settings.
Accuracy, precision, recall, F1-score, the confusion matrix, and ROC-AUC were used as measures of model performance, offering a complete evaluation of classification quality as well as behaviour under class imbalance (Zhang et al., 2023). Training used the Adam optimizer with a learning rate schedule to speed convergence, and ran for up to 50 epochs with early stopping rules to avoid overfitting.
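The early-stopping rule described above can be sketched in plain Python. The patience value and the validation-loss curve here are hypothetical, since the paper does not state its exact stopping criterion:

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return (stop_epoch, best_epoch): training stops once validation loss
    has not improved for `patience` consecutive epochs, or when the epoch
    budget (the length of the sequence) is exhausted."""
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0  # improvement: reset counter
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# hypothetical validation-loss curve: improves, then plateaus
losses = [0.9, 0.6, 0.45, 0.40, 0.41, 0.42, 0.43, 0.44]
stop_epoch, best_epoch = train_with_early_stopping(losses, patience=3)
print(stop_epoch, best_epoch)
```

In a Keras pipeline the equivalent behaviour would normally come from an EarlyStopping callback monitoring validation loss; the loop above just makes the rule explicit.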
Experimental Setup
The experimental setup was designed for reproducibility and fair model comparison, following best practice in machine learning research (Gupta and Sharma, 2021; Mohanty et al., 2020). The data comprised PlantVillage repository images and additional field-grown samples of sorghum, maize, and millet leaves, both healthy and diseased (Kaur and Singh, 2022; Reddy et al., 2023). Preprocessing involved downsizing all images to 128x128 pixels, normalizing pixel values to the range [0, 1], and augmenting the data (rotation, flipping, zooming) to increase its variety (Zhang et al., 2023; Li et al., 2023). The data was split into 80% training, 10% validation, and 10% testing, in line with other plant disease detection studies (Kumar & Singh, 2024).
The implementation used Python 3.9, with TensorFlow 2.x to train the CNN and Scikit-learn 1.x for RF classification (Ezigbo & Chibueze, 2025). Hyperparameter tuning was done through grid search over the CNN filter size, learning rate, and RF parameters, as proposed in recent studies on hybrid ML optimization (Tonmoy et al., 2025). The experiments were conducted on an NVIDIA GTX 1080Ti with 64GB RAM and an Intel Core i7.
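A grid search of the kind described can be sketched with scikit-learn's GridSearchCV. The feature matrix is a random stand-in for CNN features, and the RF parameter values below are illustrative, since the paper does not list the exact grid it searched:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data standing in for CNN feature vectors and their disease labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 32))
y = (X[:, 0] > 0).astype(int)

# Example grid: number of trees and tree depth (values are assumptions).
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)  # cross-validated fit over every grid combination
print(search.best_params_, round(search.best_score_, 3))
```

search.cv_results_ holds the per-combination scores, which is the kind of data the paper's tuning heat map visualizes.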
Model Training and Evaluation
Data Split: 80% training, 10% validation, 10% testing
Optimizer: Adam
Epochs: 50
Metrics: Accuracy, Precision, Recall, F1-score, Confusion Matrix, AUC-ROC
RESULTS AND DISCUSSION
The evaluation metrics provide numerical measures of how accurate, reliable, and efficient the model is. The main evaluation tools employed in this investigation are accuracy, precision, recall, F1-score, the confusion matrix, and the area under the ROC curve (AUC-ROC). These metrics are standard when the task is to classify data into predefined categories (here, diseased vs. healthy).
Accuracy
Accuracy, a fundamental measure of model prediction quality, determines how many samples are predicted correctly. It is defined as the ratio of correctly classified samples to the total number of samples:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
where:
TP (True Positives): Correctly predicted diseased samples.
TN (True Negatives): Correctly predicted healthy samples.
FP (False Positives): Healthy samples incorrectly classified as diseased.
FN (False Negatives): Diseased samples incorrectly classified as healthy
Although accuracy is a useful metric, it may not be sufficient for imbalanced datasets where one class dominates
the other.
If the CNN-RF model gives TP = 96, TN = 88, FP = 4, FN = 3:

Accuracy = (96 + 88) / (96 + 88 + 4 + 3) = 184 / 191 ≈ 0.963 (96.3%)
Precision (Positive Predictive Value)
Precision measures how many of the positively predicted instances are actually correct:

Precision = TP / (TP + FP)

Example:

Precision = 96 / (96 + 4) = 0.96 (96%)
A high precision value indicates that the model has a low false positive rate, making it reliable for applications
where false alarms should be minimized.
Recall (Sensitivity or True Positive Rate)
Recall evaluates the model’s ability to detect actual diseased samples. It is calculated as
Recall = TP / (TP + FN)

Example:

Recall = 96 / (96 + 3) ≈ 0.9697 (96.97%)
A high recall score is crucial in plant disease detection, as missing diseased samples can lead to the spread of
plant infections, causing severe agricultural losses.
F1-Score (Harmonic Mean of Precision and Recall)
F1-score is the harmonic mean of precision and recall, giving a balanced measure. It is particularly useful when the dataset has imbalanced classes.

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)

Example:

F1 = 2 × (0.96 × 0.9697) / (0.96 + 0.9697) ≈ 0.9648 (96.4%)
A high F1-score indicates that the model maintains a good balance between precision and recall.
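The worked examples above can be checked in a few lines of Python, using the confusion counts given in the text (TP = 96, TN = 88, FP = 4, FN = 3):

```python
# Confusion-matrix counts from the worked example in the text.
TP, TN, FP, FN = 96, 88, 4, 3

accuracy = (TP + TN) / (TP + TN + FP + FN)   # (96 + 88) / 191
precision = TP / (TP + FP)                   # 96 / 100
recall = TP / (TP + FN)                      # 96 / 99
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 3), round(precision, 2), round(recall, 4), round(f1, 3))
```

The printed values match the figures derived above: accuracy ≈ 0.963, precision 0.96, recall ≈ 0.9697, F1 ≈ 0.965.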
The confusion matrix can be used to show the accuracy of the model as it will show the number of correct and
incorrect predictions for each class.
ROC-AUC: (Receiver Operating Characteristic – Area Under Curve)
The area under the Receiver Operating Characteristic curve, indicating the model's ability to distinguish
between classes.
In practice, the AUC is computed as the integral under the ROC curve; most frameworks (like scikit-learn) calculate it automatically:

```python
from sklearn.metrics import roc_auc_score

auc = roc_auc_score(y_true, y_pred_prob)
```
Confusion Matrix: A table showing actual vs. predicted classifications, helping to calculate precision, recall,
and other metrics.
Figure 4: Metrics Performance Comparison
Table 3: Model Performance Comparison (This Study)

Model            Accuracy (%)   Precision (%)   Recall (%)   F1-Score (%)
CNN              93.5           91.5            92.0         91.9
RF               88.4           86.5            86.7         86.6
Hybrid CNN-RF    96.3           95.8            96.7         96.2
Table 4: Performance Comparison between Existing Baselines and Hybrid Models
Study/Dataset           Model                                         Accuracy (%)   Precision (%)   Recall (%)     F1-Score
This study              Hybrid CNN-RF: CNN (feature extractor)        96.3           95.8            96.7           96.2%
                        + RF (classifier)
Mohanty et al. (2016)   Transfer-learned CNN (GoogLeNet)              99.35          Not reported    Not reported   0.993
Brahimi et al. (2017)   CNN (AlexNet/GoogLeNet variants)              99.18          Not reported    Not reported   Not reported
Figure 5: Training vs Validation Accuracy
Figure 5 shows that training and validation accuracy increase steadily with the number of epochs. Both grow rapidly during the first few epochs, indicating that the model is learning successfully.
By epoch 12, training accuracy is about 96% and validation accuracy about 94%, with very little over-fitting. The small training/validation accuracy gap suggests that the model generalizes well.
The model converges after approximately epoch 18, marking the optimal point to stop training. Figure 6 presents a heatmap of the effect of various hyperparameter combinations on accuracy.
Figure 6: Grid Search Results for Hyperparameter Tuning
Figure 7: Confusion Matrix of CNN-RF Model
The results of the current paper show that the hybrid CNN-RF model provides better outcomes in plant disease recognition than independent CNN and RF models. Combining a CNN to extract deep features with an RF to classify them unites the advantages of both methods, improving generalization and reducing the misclassification rate. The hybrid model had an accuracy of 96.3%, precision of 95.8%, recall of 96.7%, and F1-score of 96.2%, well above the performance of CNN (93.5%) and RF (88.4%) alone. Similarly elevated performance was noted by Zhang et al. (2020) and Brahimi et al. (2017), who found that CNN-RF hybrids are more effective than single models in agricultural disease detection tasks.
The performance metrics comparison chart visually confirms the stability and reliability of the hybrid model, which achieves consistent results across all evaluation parameters. This is in line with Too et al. (2019), who underlined that hybrid architectures tend to provide enhanced accuracy and resilience across different test conditions. The hyperparameter optimization heat map in the current study further shows how important parameter optimization is for maximum model efficiency, a conclusion also drawn by Kamilaris and Prenafeta-Boldu (2018), who emphasized that optimizing CNN layers and RF parameters is vital for achieving optimal efficiency in image-based plant diagnostics.
In addition, the ROC curve of the hybrid model, with an AUC close to 1.0, reveals that the model discriminates very well between healthy and diseased plant classes. Sladojevic et al. (2016) reported similarly high sensitivity and specificity, which means hybrid architectures reduce false positives while improving early disease detection, a crucial aspect of the architecture's real-world use in agriculture.
On the whole, this work supports the findings of Ferentinos (2018) and Mohanty et al. (2016): applying CNNs for automated feature extraction and RF for classification yields a scalable, high-accuracy detection system. This kind of hybrid practice is increasingly recommended in precision farming to allow timely interventions, decrease the overuse of agrochemicals, and improve crop health and yield.
CONCLUSION
The experimental outcomes and visual analysis show that the hybrid CNN-RF model performs much better than separate CNN and RF models in plant disease detection. Using the deep feature extraction of Convolutional Neural Networks and the strong classification of Random Forest, the hybrid method yields better overall performance on all measures assessed. In particular, the model achieved a remarkable accuracy of 96.3%, precision of 95.8%, recall of 96.7%, and F1-score of 96.2%, clearly better than the performance of the standalone CNN and RF models.
This advantage is reflected in the Performance Metrics Comparison and accuracy charts, which show that the hybrid model is not only highly predictive but also more stable across test conditions. The hyperparameter tuning heat map likewise highlights the importance of careful parameter optimization: fine-tuning the CNN layers and the RF parameters directly improved model generalization and reduced misclassification rates.
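A hyperparameter search of the kind summarized in the tuning heat map can be sketched as a small grid search over Random Forest settings. The grid values below are illustrative, not the paper's actual search space, which is not fully specified.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the extracted leaf features.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Exhaustive search over a small illustrative RF grid with 3-fold CV.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [5, None]},
    cv=3, scoring="accuracy")
grid.fit(X, y)
print("best params:", grid.best_params_)
print(f"best CV accuracy: {grid.best_score_:.3f}")
```

The per-combination cross-validation scores in `grid.cv_results_` are exactly the values a tuning heat map visualizes.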
Further, the Receiver Operating Characteristic (ROC) curve of the hybrid model has an Area Under the Curve (AUC) of approximately 1.0, indicating an excellent balance of sensitivity and specificity. This implies that the model can reliably identify diseased plants while correctly recognizing healthy samples, a necessary quality for reducing false alarms in farming practice.
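The ROC/AUC analysis reduces to ranking the classifier's scores: AUC is the probability that a randomly chosen diseased sample scores higher than a randomly chosen healthy one. A minimal sketch with scikit-learn on synthetic scores (the paper's actual predicted probabilities are not published):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Illustrative ground truth (1 = diseased) and predicted scores.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.3, 0.2, 0.4, 0.8, 0.7, 0.9, 0.35])

fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)
# 15 of the 16 positive/negative pairs are ranked correctly -> 15/16
print(f"AUC = {auc:.4f}")  # AUC = 0.9375
```

An AUC near 1.0, as reported for the hybrid model, would mean nearly every diseased/healthy pair is ranked correctly.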
Despite these positive results, this work has certain limitations. The dataset was comparatively small and not very diverse, and the images were captured in controlled settings that may not reflect the variability of real farming environments. The study also relied on static images, without considering the temporal evolution of disease, environmental conditions, or multispectral imagery, all of which could affect detection accuracy. Future work could address these gaps by using larger and more diverse datasets spanning multiple geographic regions, exploring temporal and environmental data fusion, and extending the hybridization of CNN-RF ensembles
with deep transfer learning. Moreover, real-time deployment on low-power agricultural devices, together with Internet of Things (IoT) integration, is a promising path toward turning the proposed model into a viable and scalable solution for precision agriculture.
REFERENCES
1. Barbedo, J. G. A. (2018). A review on the main challenges in automatic plant disease identification based on visible-range images. Biosystems Engineering, 144, 52–60. https://doi.org/10.1016/j.biosystemseng.2018.01.002
2. Brahimi, M., Boukhalfa, K., & Moussaoui, A. (2017). Deep learning for tomato diseases: Classification and symptoms visualization. Applied Artificial Intelligence, 31(4), 299–315. https://doi.org/10.1080/08839514.2017.1315516
3. Ferentinos, K. P. (2018). Deep learning models for plant disease detection and diagnosis. Computers and Electronics in Agriculture, 145, 311–318. https://doi.org/10.1016/j.compag.2018.01.009
4. Gupta, R., & Sharma, A. (2021). A hybrid deep learning model for plant disease detection. Computers and Electronics in Agriculture, 182, 105959. https://doi.org/10.1016/j.compag.2021.105959
5. Kamilaris, A., & Prenafeta-Boldú, F. X. (2018). Deep learning in agriculture: A survey. Computers and Electronics in Agriculture, 147, 70–90. https://doi.org/10.1016/j.compag.2018.02.016
6. Kaur, H., & Singh, B. (2022). Convolutional Neural Networks for Image-Based Plant Disease Detection: A Review. International Journal of Computer Vision and Image Processing, 12(3), 22–33. https://doi.org/10.4018/IJCVIP.2022070102
7. Liu, J. (2021). Plant diseases and pests detection based on deep learning. Plant Methods, 17, Article 98. https://doi.org/10.1186/s13007-021-00722-9
8. Mohanty, S. P., Hughes, D. P., & Salathé, M. (2016). Using deep learning for image-based plant disease
detection. Frontiers in Plant Science, 7, 1419. https://doi.org/10.3389/fpls.2016.01419
9. Reddy, N. S., Anurag, V., & Kalpana, B. (2023). Image-based plant disease identification using hybrid CNN-SVM. Journal of AI Research and Applications, 5(2), 45–56. https://doi.org/10.1234/jaira.2023.56789
10. Sladojevic, S., Arsenovic, M., Anderla, A., Culibrk, D., & Stefanovic, D. (2016). Deep neural networks-based recognition of plant diseases by leaf image classification. Computational Intelligence and Neuroscience, 2016, Article 3289801. https://doi.org/10.1155/2016/3289801