www.rsisinternational.org
INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XV, Issue IV, April 2026
A Comparative Analysis of Random Forest and Gradient Tree
Boosting for Cropland Mapping Using Multi-Sensor Sentinel Data: A
Case Study of Abuja Municipal Area Council, Nigeria
Idris Ibrahim¹*, Salman Salis Khalid¹, Hudu Hamza Musa², Nafisah Abdullahi Ahmed²
¹ Strategic Space Applications Department, National Space Research and Development Agency, Abuja, Nigeria
² Atlantic International Research Center, National Space Research and Development Agency, Abuja, Nigeria
*Corresponding Author
DOI: https://doi.org/10.51583/IJLTEMAS.2026.150400014
Received: 01 April 2026; 06 April 2026; Published: 02 May 2026
ABSTRACT
Accurate mapping of cropland in urbanizing areas is critical for sustainable land use planning and climate-
resilient urban development. In this study, the suitability of two machine learning algorithms, namely Random
Forest and Gradient Tree Boosting, was tested and compared in the context of land use/land cover classification
in the Abuja Municipal Area Council in Nigeria. The evaluation was carried out with the integration of multi-
sensor data obtained from Sentinel-1 Synthetic Aperture Radar and Sentinel-2 optical imagery in the Google
Earth Engine platform. In order to improve the overall accuracy of the classification results, an extensive feature
set was developed and incorporated into the machine learning algorithms. The feature set was developed based
on the integration of spectral and phenological features with texture features. Both algorithms achieved
overall accuracies above 77%. Gradient Tree Boosting was marginally higher, with an overall accuracy of 77.5%
and a Kappa coefficient of 0.719. When class-specific accuracy was evaluated, however, the Random Forest
classifier performed better on the cropland class, with a precision of 78.9%, a recall of 75.7%, and an F1 score
of 77.2%. Therefore, the Random Forest classifier was selected for the final
classification of the study area. The classified image obtained with the Random Forest classifier estimated the
total extent of the cropland class to be around 47,924 hectares in the study area in the year 2025. The estimated
extent of the cropland class accounts for around 27.1% of the total area. The results obtained in this study
demonstrate the suitability of the integration of multi-sensor data in improving the overall accuracy of the
classification results. In addition, the results obtained in this study demonstrate the importance of class-specific
accuracy in the selection of the machine learning algorithms for the purpose of classifying the land cover class.
Keywords: Random Forest, Gradient Tree Boosting, Cropland Mapping, Sentinel-1, Sentinel-2, Data Fusion.
INTRODUCTION
Agricultural land makes up 38% of the Earth’s terrestrial surface and is of crucial importance to food security
and ecosystem services (Azadi et al., 2021). Accurate and timely information regarding cropland is essential in
monitoring and forecasting crop yield, ensuring the sustainable management of cropland in the face of climate
change and land degradation (Dubey et al., 2025).
Remote sensing enables observations of the land surface in a continuous and repetitive way, unfettered by
geographical boundaries (Du et al., 2019). The European Space Agency's Sentinel program has significantly
improved the capabilities of remote sensing for land observation and management. Sentinel-2 provides
multispectral observations at a spatial resolution of 10-20 m with a 5-day revisit time, allowing for crop
phenology characterization, while Sentinel-1 provides all-weather observations from C-band SAR, which
supply complementary information on crop structure (Li & Xiao, 2025). The synergistic use of optical and SAR
data has also improved land classification accuracy by combining biochemical indices with backscatter and
texture information, which is not affected by clouds (Li & Xiao, 2025). Ensemble machine learning algorithms
are also particularly useful in dealing
with the high-dimensional nature of remote sensing data (Kalantar et al., 2020). Random Forest (RF) uses
bagging and random selection of features, which ensures robust performance with low chances of overfitting
(Arora & Kaur, 2020). Gradient Tree Boosting (GTB) uses sequential tree building to reduce residuals, which
can provide high accuracy but requires hyperparameter tuning (Bogale et al., 2025). These algorithms can also
be used in an efficient manner through Google Earth Engine (GEE), which is a cloud-based platform for
planetary-scale geospatial analysis (Gorelick et al., 2017).
Comparative analysis of these algorithms has also shown that algorithm performance is context-specific. RF
performed better than GTB in mangrove monitoring (Simarmata et al., 2025) and some cases of fused optical-
SAR tree species classification. However, it is also evident that cropland mapping is a particularly challenging
task due to similarity in spectral signatures of crops with other vegetation, phenological variability, and
management heterogeneity. Therefore, it is expected that findings of other LULC-specific works may not be
applicable in this context. This research therefore compares the performance of RF and GTB for
cropland mapping in a tropical savanna environment in a developing country context. The specific objectives
are to: (1) construct an exhaustive feature stack of Sentinel-1 and Sentinel-2 data in GEE, (2) compare and
evaluate the performance of RF and GTB in cropland mapping in GEE, and (3) map the 2025 cropland extent
using the best-performing algorithm.
MATERIALS AND METHODS
Study Area
The study area is the Abuja Municipal Area Council (AMAC) in the Federal Capital Territory, Nigeria, covering
1,769 km² (45′–9°10′N, 7°10′–7°35′E) (Figure 1). The climate is classified as tropical savanna,
with 1,500 mm of rainfall annually and temperatures ranging from 26 to 35 °C. The land cover types include urban,
informal settlement, savanna, cropland, and water. The study area is experiencing high levels of urbanization
(Abubakar, 2014).
Figure 1: The study area within Federal Capital Territory, Abuja
Satellite Data and Preprocessing
All data processing was implemented in the cloud environment of Google Earth Engine. The optical dataset was
Sentinel-2 Level 2A Surface Reflectance (COPERNICUS/S2_SR_HARMONIZED), covering January to December
2025. Six spectral bands were used: B2, B3, B4, B8, B11, and B12. Clouds and shadows were masked using the
Scene Classification Layer, and reflectance values were scaled to the range 0 to 1. The Normalized Difference
Vegetation Index (NDVI) and the Normalized Difference Built-up Index (NDBI) were computed. Median
composites were used to derive five phenological metrics from the NDVI time series: minimum, maximum,
mean, standard deviation, and amplitude.
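The index and phenology calculations described above can be sketched for a single pixel as follows. This is a minimal illustration, not the study's GEE code: the band pairings (B8 and B4 for NDVI, B11 and B8 for NDBI) are the conventional Sentinel-2 choices and are assumed here, the monthly NDVI values are hypothetical, and the use of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    # Assumed pairing: Sentinel-2 B8 (NIR) and B4 (red) reflectance.
    return normalized_difference(nir, red)


def ndbi(swir1, nir):
    # Assumed pairing: Sentinel-2 B11 (SWIR1) and B8 (NIR) reflectance.
    return normalized_difference(swir1, nir)


def phenology_metrics(series):
    """Five summary metrics of an NDVI time series for one pixel."""
    lowest, highest = min(series), max(series)
    return {
        "min": lowest,
        "max": highest,
        "mean": mean(series),
        "std": pstdev(series),  # population standard deviation (assumption)
        "amplitude": highest - lowest,
    }


# Hypothetical monthly NDVI values for one cropland-like pixel.
monthly_ndvi = [0.20, 0.22, 0.30, 0.45, 0.60, 0.72,
                0.75, 0.70, 0.55, 0.40, 0.28, 0.21]
metrics = phenology_metrics(monthly_ndvi)
```

A large amplitude relative to the mean is the kind of signal that would separate a seasonally cultivated pixel from more stable natural vegetation.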
For the Synthetic Aperture Radar (SAR) data, the study used Sentinel-1 Ground Range Detected (GRD) imagery
acquired in Interferometric Wide swath mode (COPERNICUS/S1_GRD) in both VV and VH polarizations, also
for 2025. GEE's built-in preprocessing was applied, comprising thermal noise removal,
radiometric calibration, and terrain correction using SRTM. Median composites were used to derive five
phenological metrics of VH backscatter: minimum, maximum, mean, standard deviation, and temporal variance.
In addition, three GLCM texture measures (contrast, entropy, and variance) were calculated
from scaled VH backscatter with a 3 x 3 moving window.
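To make the three texture measures concrete, the sketch below computes a grey-level co-occurrence matrix (GLCM) for one 3 x 3 window of quantized values and derives contrast, entropy, and variance from it. It is a simplified illustration under stated assumptions: only horizontal neighbour pairs are counted, the window values and the four grey levels are hypothetical, and the texture function in GEE may use different offsets, averaging, and normalization.

```python
import math


def glcm(window, levels):
    """Symmetric, normalized co-occurrence matrix for horizontal neighbours."""
    counts = [[0] * levels for _ in range(levels)]
    for row in window:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            counts[right][left] += 1  # symmetric: count both directions
    total = sum(map(sum, counts))
    return [[c / total for c in r] for r in counts]


def texture(p):
    """Contrast, entropy, and variance of a normalized GLCM."""
    levels = range(len(p))
    mu = sum(i * p[i][j] for i in levels for j in levels)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in levels for j in levels)
    variance = sum(p[i][j] * (i - mu) ** 2 for i in levels for j in levels)
    entropy = -sum(v * math.log(v) for r in p for v in r if v > 0)
    return {"contrast": contrast, "entropy": entropy, "variance": variance}


# Hypothetical 3 x 3 window of VH backscatter quantized to 4 grey levels.
window = [[0, 0, 1],
          [0, 1, 1],
          [2, 2, 3]]
features = texture(glcm(window, levels=4))
```

Contrast grows with the grey-level difference between neighbours, entropy with the disorder of the co-occurrence distribution, and variance with the spread of grey levels around their GLCM mean.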
Feature Stack and Reference Data
A feature stack of 20 bands was created, comprising 6 spectral bands, 2 spectral indices (NDVI
and NDBI), 5 NDVI phenological metrics, 3 SAR backscatter values, 1 temporal variance image, and 3 GLCM
texture values. All images were resampled to 10 m spatial resolution using bilinear resampling.
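The composition of the stack can be written down as a small bookkeeping structure, which also confirms that the groups sum to 20 bands. The band labels below are hypothetical names chosen for illustration; the three SAR backscatter entries follow the Discussion, which names VV, VH, and the VH/VV ratio.

```python
# Hypothetical band labels; only the group sizes come from the text.
FEATURE_GROUPS = {
    "spectral_bands": ["B2", "B3", "B4", "B8", "B11", "B12"],
    "spectral_indices": ["NDVI", "NDBI"],
    "ndvi_phenology": ["NDVI_min", "NDVI_max", "NDVI_mean",
                       "NDVI_std", "NDVI_amplitude"],
    "sar_backscatter": ["VV", "VH", "VH_VV_ratio"],
    "sar_temporal_variance": ["VH_temporal_variance"],
    "glcm_texture": ["VH_contrast", "VH_entropy", "VH_variance"],
}

# Flatten the groups into the ordered list of band names for the stack.
BAND_NAMES = [name for group in FEATURE_GROUPS.values() for name in group]

print(len(BAND_NAMES))
```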
Stratified random sampling was used to generate 250 reference points from five land use/land cover (LULC)
classes, namely cropland, forest/savanna, built-up, bare soil, and water, which were further divided into training
and validation sets (30% and 70%, respectively). Image feature values were extracted using nearest neighbor
resampling.
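A class-by-class split of the reference points can be sketched as below. This is an illustration only: the even allocation of 50 points per class and the point identifiers are hypothetical, the training fraction follows the percentages as stated in the text, and the seed value of 42 mirrors the one reported for the study without implying it was used in this way.

```python
import random

CLASSES = ["cropland", "forest/savanna", "built-up", "bare soil", "water"]


def stratified_split(points, train_fraction, seed):
    """Split (id, label) points into train and validation sets per class."""
    rng = random.Random(seed)
    train, validation = [], []
    for label in sorted({lab for _, lab in points}):
        members = [p for p in points if p[1] == label]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        validation.extend(members[cut:])
    return train, validation


# Hypothetical: 250 points spread evenly over the five classes (50 each).
points = [(i, CLASSES[i % 5]) for i in range(250)]
train, validation = stratified_split(points, train_fraction=0.30, seed=42)
```

Splitting within each class keeps the class proportions the same in both sets, which matters when one class, such as cropland, is the main target of the accuracy assessment.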
Classification and Accuracy Assessment
The RF classifier was applied with 200 trees, the default number of variables per split, and a minimum leaf
population of 1 (Arora & Kaur, 2020).
The GTB classifier was applied with 100 trees and a learning rate of 0.05 (Bogale et al., 2025).
A fixed random seed of 42 was set in the code to ensure reproducibility of results.
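The two configurations can be recorded as plain parameter dictionaries, sketched below. The keys follow the argument names of the Earth Engine classifiers (numberOfTrees, minLeafPopulation, shrinkage, seed); mapping the learning rate to shrinkage and passing the seed to the classifiers themselves are assumptions, since the text does not say where the seed was applied. The helper describe is hypothetical.

```python
# Parameter values come from the text; key names follow Earth Engine's
# classifier arguments. variablesPerSplit is omitted so the default applies.
RF_PARAMS = {"numberOfTrees": 200, "minLeafPopulation": 1, "seed": 42}

# Assumption: the reported learning rate corresponds to `shrinkage`.
GTB_PARAMS = {"numberOfTrees": 100, "shrinkage": 0.05, "seed": 42}


def describe(name, params):
    """Render a configuration as a readable call signature."""
    return name + "(" + ", ".join(f"{k}={v}" for k, v in params.items()) + ")"


# With the earthengine-api package initialised, the dictionaries could be
# passed on as, for example:
#   ee.Classifier.smileRandomForest(**RF_PARAMS)
#   ee.Classifier.smileGradientTreeBoost(**GTB_PARAMS)
print(describe("smileRandomForest", RF_PARAMS))
print(describe("smileGradientTreeBoost", GTB_PARAMS))
```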
Accuracy was assessed on the validation set using a confusion matrix, from which overall accuracy, the
Kappa coefficient, and the precision, recall, and F1-score of the cropland class were derived.
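For reference, these measures have standard definitions, stated here in notation introduced only for illustration: n_{ij} is the number of validation samples predicted as class i whose reference class is j, n_{i+} and n_{+i} are the predicted and reference totals of class i, and N is the total number of validation samples.

```latex
\begin{align}
\mathrm{OA} &= \frac{1}{N}\sum_{i} n_{ii}, &
p_e &= \frac{1}{N^{2}}\sum_{i} n_{i+}\, n_{+i}, &
\kappa &= \frac{\mathrm{OA} - p_e}{1 - p_e}, \\
\mathrm{Precision}_c &= \frac{n_{cc}}{n_{c+}}, &
\mathrm{Recall}_c &= \frac{n_{cc}}{n_{+c}}, &
F_{1,c} &= \frac{2\,\mathrm{Precision}_c\,\mathrm{Recall}_c}
               {\mathrm{Precision}_c + \mathrm{Recall}_c}.
\end{align}
```

Precision corresponds to user's accuracy (one minus the commission error rate) and recall to producer's accuracy (one minus the omission error rate).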
Cropland Area Estimation and Implementation
The RF classifier, which performed better on the cropland class, was used to produce the final classified image.
A binary image was then generated showing cropland and non-cropland classes.
Cropland area was estimated using ee.Image.pixelArea() and reduceRegion (30 m scale) in GEE.
All code was written and executed in Google Earth Engine (Gorelick et al., 2017).
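The area calculation amounts to summing per-pixel areas over the cropland mask and converting square metres to hectares. The sketch below shows the unit arithmetic only, under the simplifying assumption that every pixel has the same nominal area; in GEE, ee.Image.pixelArea() supplies the area of each pixel in square metres. The function name and the pixel count are hypothetical.

```python
SQUARE_METRES_PER_HECTARE = 10_000


def area_hectares(pixel_count, scale_m):
    """Area of `pixel_count` square pixels of side `scale_m`, in hectares."""
    return pixel_count * scale_m ** 2 / SQUARE_METRES_PER_HECTARE


# At a 30 m scale each pixel covers 900 m², i.e. 0.09 ha.
one_pixel_ha = area_hectares(1, 30)

# Hypothetical count of cropland pixels in a binary mask.
example_area_ha = area_hectares(100_000, 30)
```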
RESULTS
Classification Accuracy
Both classifiers achieved overall accuracies above 77% (Table 1). The GTB classifier was slightly higher in
overall accuracy (77.5%) and Kappa (0.719) than the RF classifier (77.2% and 0.715, respectively). The RF
classifier, however, was better at detecting the cropland class, with a precision of 78.9%, recall of 75.7%, and
F1-score of 77.2%, compared with 78.3%, 73.0%, and 75.5% for GTB.
Table 1: Comparative accuracy metrics for RF and GTB classifiers
Metric              RF      GTB
Overall Accuracy    0.772   0.775
Kappa               0.715   0.719
Precision           0.789   0.783
Recall              0.757   0.730
F1 Score            0.772   0.755
Classification Confusion Matrix
The confusion matrix for the RF classification (Table 2) shows that the algorithm performed well across all
classes, with the highest producer's accuracy observed for water bodies (Class 5). Cropland (Class 1) had 56 true
positives, with 18 omission errors (misclassified as other classes) and 15 commission errors.
Table 2. Confusion matrix for Random Forest classification
Class     TP   FP   Predicted Total   Reference Total
Class 1   56   15   71                74
Class 2   38   21   59                56
Class 3   49   19   68                69
Class 4   50   23   73                71
Class 5   75    1   76                77
Overall Accuracy: 0.772334
Kappa Coefficient: 0.714816
From the confusion matrix, the following accuracy metrics were derived:
Precision (User's Accuracy): 0.789
Recall (Producer's Accuracy): 0.757
F1-Score: 0.772
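The reported values can be reproduced from the TP, Predicted Total, and Reference Total columns of Table 2, as the following sketch shows. The Kappa calculation assumes the standard chance-agreement term built from the predicted and reference totals; the function names are illustrative. The per-class figures listed above are reproduced by the Class 1 row.

```python
# Columns of Table 2 (RF), in row order Class 1 to Class 5.
TP = [56, 38, 49, 50, 75]
PREDICTED = [71, 59, 68, 73, 76]
REFERENCE = [74, 56, 69, 71, 77]


def overall_accuracy(tp, reference):
    """Share of validation samples on the diagonal of the matrix."""
    return sum(tp) / sum(reference)


def kappa(tp, predicted, reference):
    """Cohen's Kappa from the diagonal and the two sets of marginal totals."""
    n = sum(reference)
    p_o = sum(tp) / n
    p_e = sum(p * r for p, r in zip(predicted, reference)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


def class_metrics(tp, predicted, reference):
    """Precision, recall, and F1-score for a single class."""
    precision, recall = tp / predicted, tp / reference
    return precision, recall, 2 * precision * recall / (precision + recall)


oa = overall_accuracy(TP, REFERENCE)
k = kappa(TP, PREDICTED, REFERENCE)
precision, recall, f1 = class_metrics(TP[0], PREDICTED[0], REFERENCE[0])
print(round(oa, 6), round(k, 6))
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

The same functions applied to the corresponding columns of the GTB table give that classifier's overall accuracy, Kappa, and per-class values.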
Gradient Tree Boosting Performance
The confusion matrix for the GTB classification (Table 3) indicates a similar pattern, with slightly different error
distributions. Notably, cropland recall for GTB (73.0%) was lower than for RF (75.7%).
Table 3. Confusion matrix for Gradient Tree Boosting classification
Class     TP   FP   Predicted Total   Reference Total
Class 1   54   15   69                74
Class 2   40   24   64                56
Class 3   49   15   64                69
Class 4   50   23   73                71
Class 5   76    1   77                77
Overall Accuracy: 0.775216
Kappa Coefficient: 0.718622
Derived accuracy metrics for GTB:
Cropland-specific metrics (class 1):
Precision (User's Accuracy): 0.783
Recall (Producer's Accuracy): 0.730
F1-Score: 0.755
Classification Maps
The LULC maps produced by the two algorithms (Figure 2a for RF and Figure 2b for GTB)
distinguished the five land cover classes (cropland, vegetation, built-up, bare soil, and water
bodies) in the AMAC area.
Based on the accuracy comparison results, RF was finally used to classify the cropland land cover class in the
area. Figure 3b shows the binary cropland and non-cropland classification map derived by the RF classifier. The
figure indicates that cropland is mainly located in the southern river valleys and plains of the area.
Visual comparison of the maps derived by the two classifiers (Figure 3a and Figure 3b, respectively, for GTB
and RF classifiers) indicates that RF performed better in terms of clarity of boundaries and homogeneity of the
derived cropland class.
The total cropland area in the year 2025 is estimated to be 47,924 hectares, which accounts for 27.1% of the
AMAC area. The GTB classifier estimated the cropland area to be larger (49,361 ha), and this could
be explained by the higher commission error of the classifier for the cropland class.
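The share of the study area follows from a unit conversion, sketched below using the study-area size of 1,769 km² given in the Study Area section. The function name is illustrative.

```python
HECTARES_PER_KM2 = 100


def share_of_area(area_ha, total_km2):
    """Percentage of a total area (in km²) covered by `area_ha` hectares."""
    return 100 * area_ha / (total_km2 * HECTARES_PER_KM2)


rf_share = share_of_area(47_924, 1_769)    # RF cropland estimate
gtb_share = share_of_area(49_361, 1_769)   # GTB cropland estimate
print(round(rf_share, 1), round(gtb_share, 1))
```

The difference between the two estimates, 1,437 ha, is a little under one percentage point of the study area.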
Figure 2a: RF Classified Image
Figure 2b: GTB Classified Image
Figure 3a: GTB Cropland
Figure 3b: RF Cropland
DISCUSSION
The results show that both classifiers, Random Forest (RF) and Gradient Tree Boosting (GTB), were
effective in classifying the LULC types in the complex urban and peri-urban area using multi-sensor Sentinel
data, with overall accuracies greater than 77%. Although GTB performed slightly better in terms of overall
accuracy and Kappa, RF performed better on the cropland class, as indicated by its
precision, recall, and F1-score. This result underlines the importance
of choosing classifiers based on class-wise accuracy rather than global accuracy alone.
The better performance of RF on the cropland class is probably due to the inherent
robustness of the classifier. The bagging method and random selection of features in RF make the classifier more
robust and less prone to overfitting and noise, which are more likely in complex landscapes where the spectral
signatures of cropland, fallow land, and natural vegetation are similar (Belgiu and Drăguţ, 2016). The
balanced omission and commission errors of RF for the cropland class are important for generating
accurate agricultural statistics. The sequential learning in GTB might be more prone to overfitting and noise,
which could have led to its slightly higher cropland commission error (21.7%, i.e. 1 − precision) compared to RF (21.1%).
The effectiveness of the 20-band multi-sensor feature stack can be clearly observed. The incorporation of NDVI-
derived phenological metrics was instrumental in distinguishing the phenology of crops from the stable
phenology of natural vegetation in the savanna environment. At the same time, the incorporation of SAR
backscatter (VV, VH), the VH/VV ratio, temporal variance, and GLCM textures provided essential structural
and spatial information, which can be invaluable in areas with persistent cloudy conditions. This synergy was
able to mitigate the common confusion between croplands, grasslands, and bare soil, which is a common
challenge in agricultural areas, particularly in peri-urban areas (Van Tricht et al., 2018).
The results are in line with previous comparative studies. Simarmata et al. (2025) similarly observed that RF
outperformed GTB in a complex mangrove environment, which is also characterized by high spectral variability.
These findings support the reliability of RF as a robust, computationally efficient, and scalable algorithm for
cropland monitoring in the GEE environment, particularly in areas where tuning of hyperparameters is limited.
Despite the robustness of the RF algorithm, the study has some limitations. First, the number of
samples (n=250) is small for an area of high landscape variability; a larger, field-validated dataset could
improve the accuracy of the models. Second, the hyperparameters used were standard values; a rigorous
optimization process, such as cross-validation, could improve the performance of the
models. For future work, the following
improvements are proposed: (1) the development of a rigorous hyperparameter tuning process for both models,
(2) the use of ensemble models to potentially harness the best of both models, (3) using SHAP analysis to
improve model interpretability, and (4) the generalization of the developed framework to other agro-ecological
zones in Nigeria and West Africa.
CONCLUSIONS
This study compared the Random Forest and Gradient Tree Boosting algorithms
for cropland mapping using multi-sensor Sentinel imagery in Google Earth Engine. The key findings of the study
were:
First, although Gradient Tree Boosting had a marginally higher overall accuracy (77.5%), Random Forest had
higher cropland mapping performance in terms of F1-score (77.2%), which better reflects the precision-recall
trade-off.
Second, the feature stack used in this study, which combines Sentinel-2 optical imagery, Sentinel-1 radar
imagery, and derived phenological and texture features, proved effective for cropland mapping in a complex
tropical savanna setting with high levels of urbanization.
Third, Random Forest was selected as the most effective algorithm for cropland mapping, generating a map of
cropland extent in 2025 for the Abuja Municipal Area Council in Nigeria, with an estimated cropland area of
47,924 hectares or 27.1% of the study area.
Finally, the study also shows that in land cover mapping, using class-specific metrics is more effective in
selecting the most appropriate algorithm than using global metrics.
The scalable approach developed in this study using GEE is highly effective in cropland mapping, which is
highly valuable in sustainable management and food security assessments in data-deficient regions.
REFERENCES
1. Abubakar, I. R. (2014). Abuja city profile. Cities, 41, 81-91.
2. Arora, N., & Kaur, P. D. (2020). A Bolasso based consistent feature selection enabled random forest
classification algorithm: An application to credit risk assessment. Applied Soft Computing, 86, 105936.
3. Azadi, H., Taheri, F., Burkart, S., Mahmoudi, H., De Maeyer, P., & Witlox, F. (2021). Impact of
agricultural land conversion on climate change. Environment, Development and
Sustainability, 23(3), 3187-3198.
4. Belgiu, M., & Drăguţ, L. (2016). Random forest in remote sensing: A review of applications and future
directions. ISPRS Journal of Photogrammetry and Remote Sensing, 114, 24-31.
5. Bogale, T., Degefa, S., Dalle, G., & Abebe, G. (2025). Machine learning-based analysis of land use and
land cover trends in southeastern Ethiopia using Google Earth Engine. Discover Sustainability, 6(1), 878.
6. Du, J., Watts, J. D., Jiang, L., Lu, H., Cheng, X., Duguay, C., & Tarolli, P. (2019). Remote sensing of
environmental changes in cold regions: Methods, achievements and challenges. Remote Sensing, 11(16),
1952.
7. Dubey, A., Singh, P. K., Singh, S., Manzoor, U., Sahoo, C., & Saini, A. (2025). Impact of Climate Change
and Land Degradation on Agriculture. In Eco-Resilience: Climate Change, Land Degradation and
Sustainable Solutions (pp. 49-95). Cham: Springer Nature Switzerland.
8. Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth
Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18-27.
9. Kalantar, B., Ueda, N., Saeidi, V., Ahmadi, K., Halin, A. A., & Shabani, F. (2020). Landslide
susceptibility mapping: Machine and ensemble learning based on remote sensing big data. Remote
Sensing, 12(11), 1737.
10. Li, Y., & Xiao, X. (2025). Deep learning-based fusion of optical, radar, and LiDAR data for advancing
land monitoring. Sensors, 25(16), 4991.
11. Simarmata, N., Wikantika, K., Tarigan, T. A., Aldyansyah, M., Tohir, R. K., Fauzi, A. I., & Fauzia, A.
R. (2025). Comparison of random forest, gradient tree boosting, and classification and regression trees
for mangrove cover change monitoring using Landsat imagery. The Egyptian Journal of Remote Sensing
and Space Sciences, 28(1), 138-150.
12. Van Tricht, K., Gobin, A., Gilliams, S., & Piccard, I. (2018). Synergistic use of radar Sentinel-1 and
optical Sentinel-2 imagery for crop mapping: A case study for Belgium. Remote Sensing, 10(10), 1642.