Project
Modelling potato yield response to fertilizers and soil nutrients using semi-mechanistic models and machine learning
In this project, we will combine laboratory measurements on soil and potato nutrients with machine learning and semi-mechanistic models, and consider independent variables such as climate, terrain, soil nutrient concentrations and other soil properties, potato uptake, management, and fertilizer application, to explore and model relationships between fertilizer and soil nutrient supply, potato uptake and yield. To this end, large datasets from field experiments and farms will be used to: 1) predict potato yield across space and time; 2) quantify potato yield gaps in China; 3) develop and test a hybrid modelling approach; 4) provide regional recommendations for optimizing fertilizer application.
Potato (Solanum tuberosum L.) has a high yield potential and is an important crop for food security in temperate regions. China is the world's largest potato producer, representing 28.3% of the total potato planting area in the world (FAOSTAT, 2021). Over the past decades, potato production in China has shown an upward trend, with a total production of 99 million tons in 2017, ranking fourth after wheat, corn and rice (Cui et al., 2019). However, the difference between potential yield and actual yield in important potato-producing regions in China is substantial. The gap between potential yield and actual yield ranges from 17.6 to 48.9 t FM per ha, while the gap between water-limited yield and actual yield ranges from 16.2 to 37.8 t FM per ha (Wang et al., 2019). These large yield gaps indicate that there are opportunities to enhance yields by improving management practices.
Mechanistic, semi-mechanistic and machine learning models are recognized as crucial operational tools in yield prediction and agricultural impact studies. However, both mechanistic models and machine learning models have intrinsic drawbacks for modelling and predicting crop yield and nutrient use efficiencies. Combining the two approaches might improve both prediction accuracy and interpretability.
The first part of the project aims to explain the relationship between explanatory variables (i.e., covariates) and potato yield using statistical machine learning approaches and the “field experiment dataset”. Random forest (RF) models with different cross-validation schemes will be used to predict potato yield across space and time. The performance of the models will be evaluated, and the uncertainty and errors of the RF models will be quantified.
In the second part of the project, we will use the field experiment dataset and part of the farms dataset to quantify the potato yield gap and identify factors that can narrow the gap. Potential yield (YP) and water-limited yield (YW) will be estimated by the WOFOST model (World Food Studies), and nutrient-limited yield (YNPK) will be derived from the QUEFTS model (Quantitative Evaluation of the Fertility of Tropical Soils).
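As a minimal sketch of the yield gap arithmetic, the snippet below computes the difference between each benchmark yield and the actual yield. The function name `yield_gaps`, the relative-gap definition and all numbers are illustrative assumptions, not project results. Treating the gap to YNPK in the same way as the gaps to YP and YW is also an assumption.

```python
def yield_gaps(actual, benchmarks):
    """Return absolute (t/ha) and relative gaps between benchmark and actual yield.

    `benchmarks` maps a label (e.g. "YP", "YW", "YNPK") to a simulated yield.
    The relative gap is expressed as a fraction of the benchmark yield
    (an assumed convention, chosen here for illustration only).
    """
    if actual < 0:
        raise ValueError("actual yield must be non-negative")
    gaps = {}
    for name, value in benchmarks.items():
        absolute = value - actual
        relative = absolute / value if value > 0 else float("nan")
        gaps[name] = {"absolute": absolute, "relative": relative}
    return gaps


# Hypothetical values in t FM per ha, not taken from any dataset.
example = yield_gaps(30.0, {"YP": 70.0, "YW": 55.0, "YNPK": 40.0})
for name, gap in example.items():
    print(f"{name}: gap = {gap['absolute']:.1f} t/ha ({gap['relative']:.0%} of {name})")
```

In practice the benchmark yields would come from the WOFOST and QUEFTS simulations rather than being typed in by hand.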
In the third part, we will develop the hybrid approach, which uses the outputs of semi-mechanistic models, such as YW, YP, water use efficiency and leaf area index from WOFOST and soil nutrient availability, fertilizer nutrient requirements and YNPK from QUEFTS, as covariates in the machine learning model.
In the fourth part, we will provide site-specific fertilizer recommendations based on the models developed in the previous parts. This will be done by running the trained models with different fertilizer application levels, so that predictions of potato yield and nutrient use efficiency can be obtained for different rates and combinations of fertilizers.
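The scenario runs described above could look roughly like the following sketch. Here `predict_yield` is a hypothetical placeholder standing in for a trained model, with an invented response surface. The feature names (`n_rate`, `k_rate`, `soil_ph`) and the rate grids are likewise assumptions made only for illustration.

```python
from itertools import product


def predict_yield(features):
    """Placeholder for a trained model's prediction (hypothetical response surface)."""
    n, k = features["n_rate"], features["k_rate"]
    return 20.0 + 0.2 * n - 0.0005 * n ** 2 + 0.05 * k - 0.0001 * k ** 2


def scan_scenarios(site_features, n_rates, k_rates, model=predict_yield):
    """Predict yield for every combination of N and K application rates at one site."""
    rows = []
    for n, k in product(n_rates, k_rates):
        # Site covariates (including mechanistic model outputs such as YW)
        # stay fixed; only the fertilizer rates change between scenarios.
        features = dict(site_features, n_rate=n, k_rate=k)
        rows.append({"n_rate": n, "k_rate": k, "yield": model(features)})
    return rows


# Hypothetical site covariates and rate grids (kg per ha).
site = {"soil_ph": 6.5, "YW": 55.0}
scenarios = scan_scenarios(site, n_rates=[0, 100, 200, 300], k_rates=[0, 150, 300])
best = max(scenarios, key=lambda row: row["yield"])
print(best)
```

Picking the scenario with the highest predicted yield is only one possible criterion. A recommendation would more plausibly weigh yield against nutrient use efficiency.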
Results
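Model performance was first evaluated by 10-fold cross-validation. Because random folds can place observations from the same block, site or year in both the training and the test set, which may give overly optimistic accuracy estimates, model performance was also evaluated by leave-block-out (LBOCV), leave-site-out (LSOCV) and leave-year-out cross-validation (LYOCV).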
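The grouped schemes differ from random 10-fold cross-validation only in how the test sets are formed: all observations sharing a block, site or year are held out together. The sketch below is a minimal illustration with a hypothetical helper, `leave_group_out`, and made-up records. It is not the project's actual evaluation code.

```python
from collections import defaultdict


def leave_group_out(groups):
    """Yield (held_out_label, train_indices, test_indices), one split per group label."""
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    for label in sorted(by_group):
        test_idx = by_group[label]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield label, train_idx, test_idx


# Hypothetical observations identified by (site, year).
records = [("A", 2015), ("A", 2016), ("B", 2015), ("B", 2016), ("C", 2016)]

# Leave-site-out: each site is held out once.
site_splits = list(leave_group_out([site for site, _ in records]))
# Leave-year-out: each year is held out once.
year_splits = list(leave_group_out([year for _, year in records]))

for label, train_idx, test_idx in site_splits:
    print(f"hold out site {label}: train={train_idx} test={test_idx}")
```

Leave-block-out follows the same pattern with block identifiers as the group labels.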