Page 156

www.rsisinternational.org

INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,

MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)

ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XV, Issue IV, April 2026

A Comparative Analysis of Random Forest and Gradient Tree Boosting for Cropland Mapping Using Multi-Sensor Sentinel Data: A Case Study of Abuja Municipal Area Council, Nigeria

Idris Ibrahim, Salman Salis Khalid, Hudu Hamza Musa, Nafisah Abdullahi Ahmed

Strategic Space Applications Department, National Space Research and Development Agency, Abuja, Nigeria

Atlantic International Research Center, National Space Research and Development Agency, Abuja, Nigeria

*Corresponding Author

DOI: https://doi.org/10.51583/IJLTEMAS.2026.150400014

Received: 01 April 2026; Accepted: 06 April 2026; Published: 02 May 2026

ABSTRACT

Accurate cropland mapping in urbanizing areas is critical for sustainable land use planning and climate-resilient urban development. This study compared two machine learning algorithms, Random Forest (RF) and Gradient Tree Boosting (GTB), for land use/land cover classification in the Abuja Municipal Area Council, Nigeria. Both classifiers were run on the Google Earth Engine platform with multi-sensor data from Sentinel-1 Synthetic Aperture Radar and Sentinel-2 optical imagery. To improve classification accuracy, a feature set combining spectral, phenological, and texture features was built and supplied to both algorithms. Both classifiers reached an overall accuracy above 77%. GTB was slightly better overall, with an accuracy of 77.5% and a Kappa coefficient of 0.719. For the cropland class, however, RF performed better, with a precision of 78.9%, a recall of 75.7%, and an F1 score of 77.2%. RF was therefore selected for the final classification of the study area. The RF map put the 2025 cropland extent at about 47,924 hectares, or about 27.1% of the study area. The results demonstrate the value of integrating multi-sensor data for classification accuracy, and the importance of class-specific accuracy when selecting a machine learning algorithm for land cover classification.

Keywords: Random Forest, Gradient Tree Boosting, Cropland Mapping, Sentinel-1, Sentinel-2, Data Fusion.

INTRODUCTION

Agricultural land makes up 38% of the Earth’s terrestrial surface and is of crucial importance to food security

and ecosystem services (Azadi et al., 2021). Accurate and timely information regarding cropland is essential in

monitoring and forecasting crop yield, ensuring the sustainable management of cropland in the face of climate

change and land degradation (Dubey et al., 2025).

Remote sensing enables continuous, repetitive observation of the land surface, unfettered by geographical boundaries (Du et al., 2019). The European Space Agency's Sentinel program has significantly improved remote sensing capabilities for land observation and management. Sentinel-2 provides multispectral observations at a spatial resolution of 10-20 m with a 5-day revisit time, which allows crop phenology to be characterized, while Sentinel-1 provides all-weather C-band SAR observations that add complementary information on crop structure (Li & Xiao, 2025). The synergistic use of optical and SAR data has improved land classification accuracy by combining biochemical indices with backscatter and texture information, which is not affected by clouds (Li & Xiao, 2025). Ensemble machine learning algorithms are particularly useful for handling the high dimensionality of remote sensing data (Kalantar et al., 2020). Random Forest (RF) uses bagging and random feature selection, which gives robust performance with a low risk of overfitting (Arora & Kaur, 2020). Gradient Tree Boosting (GTB) builds trees sequentially to reduce residuals, which can yield high accuracy but requires hyperparameter tuning (Bogale et al., 2025). Both algorithms can be run efficiently in Google Earth Engine (GEE), a cloud-based platform for planetary-scale geospatial analysis (Gorelick et al., 2017).

Comparative analyses of these algorithms have shown that their performance is context-specific. RF performed better than GTB in mangrove monitoring (Simarmata et al., 2025) and in some cases of fused optical-SAR tree species classification. Cropland mapping, however, is particularly challenging because crops share spectral signatures with other vegetation and vary in phenology and management. Findings from other LULC-specific studies may therefore not apply in this context. This research compares the performance of RF and GTB for cropland mapping in a tropical savanna environment in a developing country. Its objectives are to: (1) construct an exhaustive feature stack of Sentinel-1 and Sentinel-2 data in GEE, (2) compare and evaluate the performance of RF and GTB for cropland mapping in GEE, and (3) map the 2025 cropland extent using the best-performing algorithm.

MATERIALS AND METHODS

Study Area

The study area is the Abuja Municipal Area Council (AMAC) in the Federal Capital Territory, Nigeria, covering 1,769 km² (8°45′–9°10′N, 7°10′–7°35′E) (Figure 1). The climate is tropical savanna, with 1,500 mm of annual rainfall and temperatures ranging from 26 to 35 °C. Land cover types include urban, informal settlement, savanna, cropland, and water. Urbanization levels in the study area are high (Abubakar, 2014).

Figure 1: The study area within Federal Capital Territory, Abuja


Satellite Data and Preprocessing

All data processing was carried out in the Google Earth Engine cloud environment. The optical dataset was Sentinel-2 Level 2A Surface Reflectance (COPERNICUS/S2_SR_HARMONIZED) covering January to December 2025. Six spectral bands were used: B2, B3, B4, B8, B11, and B12. Clouds and shadows were removed using the Scene Classification Layer, and reflectance values were scaled to the range 0 to 1. The Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Built-up Index (NDBI) were computed. Median composites were used to derive five phenological metrics from the NDVI time series: minimum, maximum, mean, standard deviation, and amplitude.
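As a rough illustration of the index and phenology step, the sketch below computes NDVI and NDBI from the standard band formulas and summarizes an NDVI time series with the five metrics listed. It is plain Python, not the GEE code used in the study; the reflectance values are invented, and treating amplitude as maximum minus minimum and using the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir: float, red: float) -> float:
    # NDVI = (B8 - B4) / (B8 + B4)
    return normalized_difference(nir, red)


def ndbi(swir1: float, nir: float) -> float:
    # NDBI = (B11 - B8) / (B11 + B8)
    return normalized_difference(swir1, nir)


def phenology_metrics(series):
    """Summarize an index time series with five per-pixel metrics."""
    lowest, highest = min(series), max(series)
    return {
        "min": lowest,
        "max": highest,
        "mean": mean(series),
        "std": pstdev(series),          # population std (assumption)
        "amplitude": highest - lowest,  # max minus min (assumption)
    }


# Hypothetical B8 (NIR) and B4 (red) reflectance for one pixel, scaled 0 to 1.
monthly_nir = [0.20, 0.22, 0.30, 0.40, 0.45, 0.30]
monthly_red = [0.10, 0.11, 0.10, 0.08, 0.05, 0.10]

ndvi_series = [ndvi(n, r) for n, r in zip(monthly_nir, monthly_red)]
metrics = phenology_metrics(ndvi_series)
```

A pixel with a strong seasonal cycle yields a large amplitude and standard deviation, which is the property the phenological metrics are meant to capture.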

The Synthetic Aperture Radar (SAR) dataset was Sentinel-1 Ground Range Detected imagery in Interferometric Wide swath mode (COPERNICUS/S1_GRD), with both VV and VH polarizations, also for 2025. Preprocessing relied on GEE's built-in processing, which includes thermal noise removal, radiometric calibration, and terrain correction using SRTM. Median composites were used to derive five phenological metrics of VH backscatter: minimum, maximum, mean, standard deviation, and temporal variance. Gray-level co-occurrence matrix (GLCM) texture measures (contrast, entropy, and variance) were calculated from scaled VH backscatter with a 3 × 3 moving window.
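To make the texture measures concrete, the following sketch builds a gray-level co-occurrence matrix for a single 3 × 3 window and derives contrast, entropy, and variance from it. It is a simplified stand-in, not the GEE routine: it assumes backscatter has already been quantized to a few integer levels, considers only horizontal neighbours, and uses invented window values.

```python
from math import log


def glcm(window, levels):
    """Symmetric, normalized co-occurrence matrix for horizontal neighbours."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in window:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            counts[right][left] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2: high where neighbouring levels differ."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def entropy(p):
    """Negative sum of p * ln(p) over non-zero cells: high for disordered texture."""
    return -sum(v * log(v) for row in p for v in row if v > 0)


def glcm_variance(p):
    """Sum of p(i, j) * (i - mu)^2, with mu the GLCM mean gray level."""
    n = len(p)
    mu = sum(i * p[i][j] for i in range(n) for j in range(n))
    return sum(p[i][j] * (i - mu) ** 2 for i in range(n) for j in range(n))


# Hypothetical 3 x 3 windows of VH backscatter quantized to 4 gray levels.
textured = [[0, 0, 1], [0, 1, 1], [2, 2, 3]]
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

textured_contrast = contrast(glcm(textured, 4))
flat_contrast = contrast(glcm(flat, 4))
```

A homogeneous window gives zero contrast, entropy, and variance, while mixed surfaces such as field edges give higher values.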

Feature Stack and Reference Data

A feature stack of 20 bands was created, comprising 6 spectral bands, 2 spectral indices (NDVI and NDBI), 5 NDVI phenological metrics, 3 SAR backscatter values, 1 temporal variance image, and 3 GLCM texture values. All images were resampled to 10 m spatial resolution using bilinear resampling.
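The composition of the stack can be written down as a small lookup to confirm that the groups add up to 20 bands. The band names below are hypothetical labels chosen for illustration; only the group sizes come from the text, and the three SAR backscatter values are assumed to be VV, VH, and the VH/VV ratio.

```python
# Hypothetical band names; group sizes follow the description of the stack.
FEATURE_GROUPS = {
    "spectral_bands": ["B2", "B3", "B4", "B8", "B11", "B12"],
    "spectral_indices": ["NDVI", "NDBI"],
    "ndvi_phenology": ["NDVI_min", "NDVI_max", "NDVI_mean", "NDVI_std", "NDVI_amp"],
    "sar_backscatter": ["VV", "VH", "VH_VV_ratio"],
    "sar_temporal_variance": ["VH_variance"],
    "glcm_texture": ["VH_contrast", "VH_entropy", "VH_glcm_variance"],
}

# Flatten the groups into the ordered list of input features.
feature_names = [name for group in FEATURE_GROUPS.values() for name in group]
group_sizes = {group: len(names) for group, names in FEATURE_GROUPS.items()}
```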

Stratified random sampling was used to generate 250 reference points across five land use/land cover classes (cropland, forest/savanna, built-up, bare soil, and water), which were then divided into training and validation sets (30% and 70%, respectively). Image feature values were extracted at the points using nearest-neighbor resampling.
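A per-class split of the reference points can be sketched as follows. This is not the GEE code from the study: the function name is hypothetical, the equal allocation of 50 points per class is an assumption, and the training fraction is left as a parameter set here to the 30% stated in the text.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction, seed=42):
    """Split (point_id, label) pairs into train and validation lists per class."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)

    train, validation = [], []
    for label in sorted(by_class):
        members = list(by_class[label])
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        validation.extend(members[cut:])
    return train, validation


CLASSES = ["cropland", "forest/savanna", "built-up", "bare soil", "water"]

# Assumed layout: 250 points spread evenly over the five classes.
samples = [(f"{label}-{i}", label) for label in CLASSES for i in range(50)]
train, validation = stratified_split(samples, train_fraction=0.3)
```

Splitting inside each class keeps the class proportions identical in both sets, which matters when class-specific metrics are compared.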

Classification and Accuracy Assessment

The RF classifier was applied with 200 trees, the default number of variables per split, and a minimum leaf population of 1 (Arora & Kaur, 2020).

The GTB classifier was applied with 100 trees and a learning rate of 0.05 (Bogale et al., 2025).

A fixed random seed of 42 was set in the code to ensure reproducibility of the results.
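The stated settings might map onto the Earth Engine Python API roughly as below. This is a sketch under assumptions: the text does not name the classifier constructors, so the use of `ee.Classifier.smileRandomForest` and `ee.Classifier.smileGradientTreeBoost` is assumed, as is the mapping of the learning rate onto the `shrinkage` parameter. The helper `build_classifiers` is hypothetical and expects an already initialized `ee` module.

```python
# Parameter names follow the Earth Engine classifier constructors;
# the values are the ones stated in the text.
RF_PARAMS = {
    "numberOfTrees": 200,
    "minLeafPopulation": 1,
    "seed": 42,
    # variablesPerSplit is omitted so the GEE default applies.
}

GTB_PARAMS = {
    "numberOfTrees": 100,
    "shrinkage": 0.05,  # assumed to correspond to the learning rate
    "seed": 42,
}


def build_classifiers(ee):
    """Return (rf, gtb) classifiers built from an initialized `ee` module."""
    rf = ee.Classifier.smileRandomForest(**RF_PARAMS)
    gtb = ee.Classifier.smileGradientTreeBoost(**GTB_PARAMS)
    return rf, gtb
```

Passing the module in as an argument keeps the parameter dictionaries inspectable without an authenticated Earth Engine session.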

Accuracy was assessed on the validation set: a confusion matrix was used to derive overall accuracy, the Kappa coefficient, and the precision, recall, and F1-score of the cropland class.
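The metrics named here follow directly from the confusion matrix. The sketch below shows the standard formulas on a small made-up three-class matrix; the matrix values and the function name are illustrative, and rows are assumed to hold reference classes and columns predicted classes.

```python
def accuracy_metrics(matrix, target):
    """Derive global and class-specific metrics from a square confusion matrix.

    matrix[i][j] is the number of validation points whose reference class is i
    and whose predicted class is j; `target` is the index of the class of interest.
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    row_totals = [sum(row) for row in matrix]                              # reference totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]   # predicted totals

    observed = sum(matrix[i][i] for i in range(n)) / total                 # overall accuracy
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)

    true_positives = matrix[target][target]
    precision = true_positives / col_totals[target]  # user's accuracy
    recall = true_positives / row_totals[target]     # producer's accuracy
    f1 = 2 * precision * recall / (precision + recall)

    return {
        "overall_accuracy": observed,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 3-class matrix, not the study's data.
example = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
result = accuracy_metrics(example, target=0)
```

Overall accuracy and Kappa summarize all classes at once, whereas precision, recall, and F1 isolate one class, which is why the two kinds of metric can rank classifiers differently.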

Cropland Area Estimation and Implementation

The RF classifier was used to produce the final classification because it performed better on the cropland-specific accuracy metrics.

A binary image showing cropland and non-cropland classes was then generated.

Cropland area was estimated using ee.Image.pixelArea() and reduceRegion (30 m scale) in GEE.

All code was written and executed in Google Earth Engine (Gorelick et al., 2017).
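The area arithmetic behind this step can be illustrated outside GEE. The sketch below assumes square pixels of constant size, which only approximates what `ee.Image.pixelArea()` returns per pixel; the function names are hypothetical. The share calculation uses the study-area size of 1,769 km² and the RF cropland estimate of 47,924 ha reported in the paper.

```python
def cropland_area_ha(pixel_count: int, pixel_size_m: float) -> float:
    """Area in hectares of `pixel_count` square pixels (1 ha = 10,000 m²)."""
    return pixel_count * pixel_size_m ** 2 / 10_000


def share_of_area(area_ha: float, total_km2: float) -> float:
    """Percentage of a region of `total_km2` covered by `area_ha` (1 km² = 100 ha)."""
    return 100 * area_ha / (total_km2 * 100)


# 1,000 cropland pixels at 30 m correspond to 90 ha.
example_area = cropland_area_ha(1_000, 30)

# Reported RF cropland extent relative to the reported study-area size.
cropland_share = share_of_area(47_924, 1_769)
```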


RESULTS

Classification Accuracy

Both classifiers reached an overall accuracy above 77% (Table 1). Overall accuracy and the Kappa statistic were slightly higher for the GTB classifier (77.5% and 0.719) than for the RF classifier (77.2% and 0.715). The RF classifier was better at detecting the cropland class, with a precision of 78.9%, a recall of 75.7%, and an F1-score of 77.2%, against 78.3%, 73.0%, and 75.5% for GTB.

Table 1: Comparative accuracy metrics for RF and GTB classifiers

Metric                  RF       GTB
Overall Accuracy        0.772    0.775
Kappa                   0.715    0.719
Precision (cropland)    0.789    0.783
Recall (cropland)       0.757    0.730
F1 Score (cropland)     0.772    0.755

Classification Confusion matrix

The confusion matrix for the RF classification (Table 2) shows that the algorithm performed well across all

classes, with the highest producer's accuracy observed for water bodies (Class 5). Cropland (Class 2) had 38 true

positives, with 18 omission errors (misclassified as other classes) and 21 commission errors.

Table 2. Confusion matrix for Random Forest classification

[Confusion matrix: Class 1 to Class 5, with Predicted Total and Reference Total; cell values not recoverable]

Overall Accuracy: 0.772334

Kappa Coefficient: 0.714816

From the confusion matrix, the following accuracy metrics were derived:

Precision (User's Accuracy): 0.789

Recall (Producer's Accuracy): 0.757

F1-Score: 0.772

Gradient Tree Boosting Performance

The confusion matrix for the GTB classification (Table 3) indicates a similar pattern, with slightly different error

distributions. Notably, cropland recall for GTB (73.0%) was lower than for RF (75.7%).


Table 3. Confusion matrix for Gradient Tree Boosting classification

[Confusion matrix: Class 1 to Class 5, with Predicted Total and Reference Total; cell values not recoverable]

Overall Accuracy: 0.775216

Kappa Coefficient: 0.718622

Derived accuracy metrics for GTB:

Cropland-specific metrics (class 1):

Precision (User's Accuracy): 0.783

Recall (Producer's Accuracy): 0.730

F1-Score: 0.755
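The reported cropland metrics can be cross-checked against one another, since the F1-score is the harmonic mean of precision and recall, and commission and omission errors are the complements of precision and recall. The sketch below applies those identities to the rounded values reported for both classifiers; because the inputs are rounded to three decimals, the recomputed F1 may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Cropland metrics as reported for each classifier.
REPORTED = {
    "RF": {"precision": 0.789, "recall": 0.757, "f1": 0.772},
    "GTB": {"precision": 0.783, "recall": 0.730, "f1": 0.755},
}

checks = {}
for name, metric in REPORTED.items():
    checks[name] = {
        "f1_recomputed": f1_score(metric["precision"], metric["recall"]),
        "commission_error": 1 - metric["precision"],  # 1 - user's accuracy
        "omission_error": 1 - metric["recall"],       # 1 - producer's accuracy
    }
```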

Classification Maps

The initial LULC maps produced by the two algorithms (Figure 2a for RF and Figure 2b for GTB) successfully distinguished the five land cover classes (cropland, vegetation, built-up, bare soil, and water bodies) in the AMAC area.

Based on the accuracy comparison, RF was used for the final cropland classification. Figure 3b shows the binary cropland and non-cropland map derived by the RF classifier. Cropland is mainly located in the southern river valleys and plains of the area.

Visual comparison of the maps from the two classifiers (Figure 3a for GTB and Figure 3b for RF) indicates that RF produced clearer boundaries and a more homogeneous cropland class.

The total cropland area in 2025 is estimated at 47,924 hectares, or 27.1% of the AMAC area. The GTB classifier estimated a larger cropland area (49,361 ha), which could be explained by its higher commission error for the cropland class.


Figure 2a: RF Classified Image

Figure 2b: GTB Classified Image


Figure 3a: GTB Cropland

Figure 3b: RF Cropland


DISCUSSION

The results show that both classifiers, Random Forest (RF) and Gradient Tree Boosting (GTB), were effective in classifying LULC types in the complex urban and peri-urban area using multi-sensor Sentinel data, with overall accuracies above 77%. Although GTB performed slightly better in overall accuracy and Kappa, RF performed better on the cropland class, as indicated by its precision, recall, and F1-score. This finding underlines the importance of choosing classifiers based on class-wise accuracy rather than global accuracy alone.

RF's better performance on the cropland class is likely due to the inherent robustness of the classifier. Bagging and random feature selection make RF less prone to overfitting and noise, which are more likely in complex landscapes where the spectral signatures of cropland, fallow land, and natural vegetation are similar (Belgiu and Drăguţ, 2016). The balanced omission and commission errors of RF for the cropland class matter for generating accurate agricultural statistics. The sequential learning in GTB may be more prone to overfitting and noise, which could explain its higher cropland commission error (about 22%, from a precision of 0.783) compared to RF (about 21%).

The 20-band multi-sensor feature stack proved effective. The NDVI-derived phenological metrics were instrumental in distinguishing the phenology of crops from the stable phenology of natural vegetation in the savanna environment. The SAR backscatter (VV, VH), the VH/VV ratio, temporal variance, and GLCM textures provided structural and spatial information, which is invaluable in areas with persistent cloud cover. This synergy mitigated the confusion between croplands, grasslands, and bare soil that is a common challenge in agricultural areas, particularly peri-urban ones (Van Tricht et al., 2018).

The results are in line with previous comparative studies. Simarmata et al. (2025) similarly observed that RF

outperformed GTB in a complex mangrove environment, which is also characterized by high spectral variability.

These findings support the reliability of RF as a robust, computationally efficient, and scalable algorithm for

cropland monitoring in the GEE environment, particularly in areas where tuning of hyperparameters is limited.

Despite the robustness of the RF algorithm, the study has limitations. The number of samples (n=250) is small for an area of high landscape variability; a larger, field-validated dataset could improve the accuracy of the models. Moreover, the hyperparameters were standard values; a rigorous optimization process, such as cross-validation, could improve model performance. For future work, the following improvements are proposed: (1) a rigorous hyperparameter tuning process for both models, (2) ensemble models that combine the strengths of both classifiers, (3) SHAP analysis to improve model interpretability, and (4) generalization of the framework to other agro-ecological zones in Nigeria and West Africa.

CONCLUSIONS

This study compared the Random Forest and Gradient Tree Boosting algorithms for cropland mapping using multi-sensor Sentinel imagery in Google Earth Engine. The key findings were:

First, although Gradient Tree Boosting had a marginally higher overall accuracy (77.5%), Random Forest had better cropland mapping performance in terms of F1-score (77.2%), which reflects the precision-recall trade-off.

Second, the feature stack used in this study, which combines Sentinel-2 optical imagery, Sentinel-1 radar imagery, and derived features, proved effective for cropland mapping in a complex tropical savanna setting with high levels of urbanization.


Third, Random Forest was selected as the most effective algorithm for cropland mapping and was used to generate a map of cropland extent in 2025 for the Abuja Municipal Area Council, Nigeria, with an estimated cropland area of 47,924 hectares, or 27.1% of the study area.

Finally, the study shows that, in land cover mapping, class-specific metrics are more effective than global metrics for selecting the most appropriate algorithm.

The scalable GEE-based approach developed in this study is effective for cropland mapping and is valuable for sustainable land management and food security assessments in data-deficient regions.

REFERENCES

1. Abubakar, I. R. (2014). Abuja city profile. Cities, 41, 81-91.
2. Arora, N., & Kaur, P. D. (2020). A Bolasso based consistent feature selection enabled random forest classification algorithm: An application to credit risk assessment. Applied Soft Computing, 86, 105936.
3. Azadi, H., Taheri, F., Burkart, S., Mahmoudi, H., De Maeyer, P., & Witlox, F. (2021). Impact of agricultural land conversion on climate change. Environment, Development and Sustainability, 23(3), 3187-3198.
4. Belgiu, M., & Drăguţ, L. (2016). Random forest in remote sensing: A review of applications and future directions. ISPRS Journal of Photogrammetry and Remote Sensing, 114, 24-31.
5. Bogale, T., Degefa, S., Dalle, G., & Abebe, G. (2025). Machine learning-based analysis of land use and land cover trends in southeastern Ethiopia using Google Earth Engine. Discover Sustainability, 6(1), 878.
6. Du, J., Watts, J. D., Jiang, L., Lu, H., Cheng, X., Duguay, C., & Tarolli, P. (2019). Remote sensing of environmental changes in cold regions: Methods, achievements and challenges. Remote Sensing, 11(16), 1952.
7. Dubey, A., Singh, P. K., Singh, S., Manzoor, U., Sahoo, C., & Saini, A. (2025). Impact of Climate Change and Land Degradation on Agriculture. In Eco-Resilience: Climate Change, Land Degradation and Sustainable Solutions (pp. 49-95). Cham: Springer Nature Switzerland.
8. Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18-27.
9. Kalantar, B., Ueda, N., Saeidi, V., Ahmadi, K., Halin, A. A., & Shabani, F. (2020). Landslide susceptibility mapping: Machine and ensemble learning based on remote sensing big data. Remote Sensing, 12(11), 1737.
10. Li, Y., & Xiao, X. (2025). Deep learning-based fusion of optical, radar, and LiDAR data for advancing land monitoring. Sensors, 25(16), 4991.
11. Simarmata, N., Wikantika, K., Tarigan, T. A., Aldyansyah, M., Tohir, R. K., Fauzi, A. I., & Fauzia, A. R. (2025). Comparison of random forest, gradient tree boosting, and classification and regression trees for mangrove cover change monitoring using Landsat imagery. The Egyptian Journal of Remote Sensing and Space Sciences, 28(1), 138-150.
12. Van Tricht, K., Gobin, A., Gilliams, S., & Piccard, I. (2018). Synergistic use of radar Sentinel-1 and optical Sentinel-2 imagery for crop mapping: A case study for Belgium. Remote Sensing, 10(10), 1642.