Crop yield forecasting on the Canadian Prairies by remotely sensed vegetation indices and machine learning methods.

Johnson, M.D., Hsieh, W.W., Cannon, A.J., Davidson, A.M., and Bedard, F. (2016). "Crop yield forecasting on the Canadian Prairies by remotely sensed vegetation indices and machine learning methods." Agricultural and Forest Meteorology, 218-219, pp. 74-84. doi: 10.1016/j.agrformet.2015.11.003

Abstract

Crop yield forecast models for barley, canola and spring wheat grown on the Canadian Prairies were developed using vegetation indices derived from satellite data and machine learning methods. Hierarchical clustering was used to group the crop yield data from 40 Census Agricultural Regions (CARs) into several larger regions for building the forecast models. The Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) derived from the Moderate-resolution Imaging Spectroradiometer (MODIS), and NDVI derived from the Advanced Very High Resolution Radiometer (AVHRR), were considered as predictors for crop yields. Multiple linear regression (MLR) and two nonlinear machine learning models, Bayesian neural networks (BNN) and model-based recursive partitioning (MOB), were used to forecast crop yields, with various combinations of MODIS-NDVI, MODIS-EVI and NOAA-NDVI (the AVHRR-derived NDVI) as predictors. Crop yield forecasts made using predictors from July and earlier were evaluated by the cross-validated mean absolute error skill score (in reference to climatological forecasts) during 2000–2011. MODIS-NDVI was found to be the most effective predictor for all three crops, and having MODIS-EVI as an additional predictor enhanced the forecast skill. Although MLR, BNN and MOB all showed significantly higher skill than climatological forecasts for all three crops, barley was the only case where the nonlinear BNN and MOB models showed slightly higher skill than MLR. The lack of skill improvement by nonlinear models over MLR is likely due to the short (12 years) record available for MODIS data, which limits our study to 2000–2011, a period in which very low yields occurred in only a single severe drought year (2002).
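
The evaluation measure named in the abstract, a cross-validated mean absolute error (MAE) skill score relative to climatology, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' procedure: the skill score is taken in its common form, 1 - MAE_model / MAE_reference; the cross-validation is leave-one-year-out; the climatological forecast is the mean yield of the training years; and the NDVI, EVI and yield values are synthetic. The function names `mae_skill_score` and `loo_forecasts` are hypothetical.

```python
import numpy as np


def mae_skill_score(y_true, y_pred, y_ref):
    """MAE skill score: 1 is a perfect forecast, 0 matches the reference."""
    mae_model = np.mean(np.abs(y_true - y_pred))
    mae_ref = np.mean(np.abs(y_true - y_ref))
    return 1.0 - mae_model / mae_ref


def loo_forecasts(X, y):
    """Leave-one-year-out forecasts from MLR and from climatology.

    For each held-out year, the regression is fitted on the remaining
    years, and the climatological forecast is their mean yield.
    """
    n = len(y)
    mlr = np.empty(n)
    clim = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        # Design matrix with an intercept column for least squares.
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        mlr[i] = np.concatenate([[1.0], X[i]]) @ coef
        clim[i] = y[train].mean()
    return mlr, clim


# Synthetic stand-in data (assumption): 12 years, two predictors.
rng = np.random.default_rng(0)
n_years = 12
ndvi = rng.uniform(0.3, 0.8, n_years)
evi = 0.6 * ndvi + rng.normal(0.0, 0.02, n_years)
X = np.column_stack([ndvi, evi])
y = 1.0 + 3.0 * ndvi + rng.normal(0.0, 0.1, n_years)

mlr_pred, clim_pred = loo_forecasts(X, y)
skill = mae_skill_score(y, mlr_pred, clim_pred)
print(f"Cross-validated MAE skill score of MLR vs climatology: {skill:.2f}")
```

A positive score means the model's cross-validated MAE is lower than that of the climatological forecast. With only 12 years of data, each held-out year removes a sizeable fraction of the training set, which is one way to see why a short record can limit more flexible models.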
