District-Level Crop Yield Prediction in India: A Random Forest Framework with SHAP-Enhanced Explainability and Spatial Residual Analysis.

Abdulmumini Imam Ibrahim
Amina Muhammad Dawud
Jidda Harun Abba

Accurate estimation of district-level crop yields is crucial for food security planning and targeted agricultural interventions in India; however, conventional statistical methods fail to account for spatial variability and nonlinear relationships among agronomic variables. This study developed a Random Forest-based framework for predicting crop yield across Indian districts using multi-year data on crop type, season, production, and cultivated area, complemented by open-source agronomic datasets. Yield was log-transformed to stabilise variance, and the model was trained with an 80:20 train–test split and tuned via grid search with cross-validation. Permutation importance and SHAP analyses were applied to interpret feature contributions and district-level residual patterns. The Random Forest model achieved strong predictive performance on the test set, with low RMSE and MAE and close alignment between predicted and observed yields for most districts. Feature attribution indicated that production, cultivated area, and season were the most influential predictors, and spatial aggregation of residuals revealed clusters of systematic over- and under-prediction linked to data-poor or agro-ecologically complex regions. These results suggest that an explainable machine learning pipeline resolved at the district level can accurately forecast crop yield variability in India, providing more detailed insights than conventional regression techniques and supporting region-specific policy and management decisions. Improving the framework will require better regional data quality and the incorporation of more comprehensive meteorological and soil information to better support operational agricultural monitoring.
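
The workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the column names (`district`, `season`, `crop`, `area`, `production`, `yield`), the integer encodings, and the small hyperparameter grid are all assumptions made for illustration, and scikit-learn's permutation importance stands in for the SHAP analysis so that the example needs no extra dependency.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for district records; every value here is made up.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "district": rng.integers(0, 30, n),     # hypothetical district id
    "season": rng.integers(0, 3, n),        # hypothetical encoded season
    "crop": rng.integers(0, 5, n),          # hypothetical encoded crop type
    "area": rng.uniform(100.0, 5000.0, n),  # cultivated area
})
base_yield = 1.0 + 0.5 * df["crop"] + 0.3 * df["season"]
df["yield"] = base_yield * rng.lognormal(0.0, 0.2, n)
df["production"] = df["yield"] * df["area"]

features = ["season", "crop", "area", "production"]
X = df[features]
y = np.log1p(df["yield"])  # log transform to stabilise variance

# 80:20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Grid search with cross-validation over an assumed, deliberately small grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)
model = search.best_estimator_

# Test-set metrics, computed on the log scale.
pred = model.predict(X_test)
metrics = {
    "r2": r2_score(y_test, pred),
    "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
    "mae": mean_absolute_error(y_test, pred),
}

# Permutation importance on held-out data.
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
importance = pd.Series(perm.importances_mean, index=features).sort_values(
    ascending=False
)

# Mean residual per district: positive means under-prediction on average.
residuals = pd.DataFrame({
    "district": df.loc[X_test.index, "district"],
    "residual": y_test - pred,
})
district_bias = residuals.groupby("district")["residual"].mean().sort_values()

print(metrics)
print(importance)
print(district_bias.head())
```

In a real application, `district_bias` would be joined to district boundaries and mapped to look for the spatial clusters of over- and under-prediction that the abstract describes. The metrics above are on the log scale; back-transforming predictions with `np.expm1` before scoring would give errors in the original yield units.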

District-Level Crop Yield Prediction in India: A Random Forest Framework with SHAP-Enhanced Explainability and Spatial Residual Analysis. (2025). International Journal of Latest Technology in Engineering Management & Applied Science, 14(12), 242-250. https://doi.org/10.51583/IJLTEMAS.2025.1412000021
