Input data
Soil data
AquaCrop requires the soil water content at permanent wilting point (PWP), field capacity (FC), and saturation (SAT), as well as the saturated hydraulic conductivity (Ksat). These parameters were estimated with pedotransfer functions based on soil texture (particle-size distribution), gravel content, and organic matter, following Saxton and Rawls (2006). Soil texture, organic carbon content (from which organic matter was derived), gravel content, and rootable soil depth were taken from the Harmonized World Soil Database version 2.0 (HWSD v2.0), a global soil inventory that describes the morphological, chemical, and physical properties of soils on a 30 arc-second grid. HWSD v2.0 was developed jointly by the International Institute for Applied Systems Analysis (IIASA) and FAO, in partnership with the International Soil Reference and Information Centre (ISRIC), the European Soil Bureau Network (ESBN), and the Institute for Soil Sciences, Chinese Academy of Sciences (CAS).
The HWSD comprises a raster image file linked to an attribute database in Microsoft Access format, which describes the soil composition of each of the 29,538 soil association mapping units (SMUs). Each SMU can include up to 12 soil unit/soil phase combination records, with standardized soil parameter values for seven depth layers. To integrate these data with the simulation grid, the HWSD raster was clipped in QGIS (open-source geographic information system software), and the predominant SMU value of each grid cell was determined with zonal statistics. Coastal areas were handled with special care, and pixels corresponding to water bodies were excluded. Because an SMU can consist of several soil units, each with its own 'SHARE' (%) field, the SHARE-weighted mean of each target soil variable was calculated for each SMU across the seven depth layers. After this step, some areas still had empty or null soil parameter values. These correspond to soils classified as sand dunes, salt flats, urban areas, and mining zones, among others. In total, 303 SMUs containing such soil units were identified: 125 with a single soil unit (SHARE = 100%) and 178 with multiple soil units (SHARE < 100%). For the latter, the 'SHARE' percentages of the soil units with non-null values were reweighted, excluding the percentage of the soil units without information, using the formula:
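Read literally, the reweighting described above divides the SHARE of each non-null soil unit by the sum of the non-null SHAREs in the same SMU, that is, SHARE'_i = 100 × SHARE_i / Σ_j SHARE_j, where j runs over the soil units with data. This notation is an assumption made for illustration. A minimal Python sketch of the resulting weighted mean, with a hypothetical function name and hypothetical example values, could look like this:

```python
from typing import Optional, Sequence


def reweighted_mean(shares: Sequence[float],
                    values: Sequence[Optional[float]]) -> Optional[float]:
    """SHARE-weighted mean of one soil variable for one SMU and depth layer.

    Soil units whose value is None (no data) are dropped, and the remaining
    SHARE percentages are rescaled so that they sum to 100 again.
    """
    valid = [(s, v) for s, v in zip(shares, values) if v is not None]
    total = sum(s for s, _ in valid)
    if total == 0:
        return None  # every soil unit in this SMU is null
    # Rescale each remaining share to a percentage of the non-null total.
    new_shares = [100.0 * s / total for s, _ in valid]
    return sum(w * v for w, (_, v) in zip(new_shares, valid)) / 100.0


# Hypothetical SMU: 60% unit with 20% clay, 30% dunes (no data), 10% unit with 48% clay.
clay_pct = reweighted_mean([60, 30, 10], [20.0, None, 48.0])
only_null = reweighted_mean([100], [None])
```

An SMU whose only soil unit is null returns None in this sketch, which corresponds to the 125 single-unit SMUs that need separate treatment.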

For the 125 SMUs with SHARE = 100% (i.e., those made up of a single soil unit), a new grid was generated in QGIS with the r.reclass algorithm, in which these SMUs were assigned a NULL value. A second zonal statistics analysis was then run to assign new SMU values to the affected cells (Figure 1). Finally, the weighted mean values of the soil variables were calculated for each SMU across all available horizons.

Figure 1. Screenshot of the raster reclassification of the soil layer under the cropland layer. The blank areas correspond to SMU = NULL.
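The two zonal statistics passes can be pictured as a majority vote over the HWSD pixels inside each simulation grid cell, with the reclassified SMUs left out of the second pass. The sketch below is a plain-Python illustration of that idea, not the QGIS or r.reclass implementation. The function name and the SMU identifiers are hypothetical.

```python
from collections import Counter
from typing import Collection, Iterable, Optional


def predominant_smu(pixels: Iterable[Optional[int]],
                    excluded: Collection[int] = ()) -> Optional[int]:
    """Most frequent SMU id among the pixels of one simulation grid cell.

    Pixels set to None (e.g. water bodies) and ids listed in `excluded`
    are ignored, which mimics reclassifying those SMUs to NULL.
    """
    counts = Counter(p for p in pixels
                     if p is not None and p not in excluded)
    if not counts:
        return None  # no usable soil pixel in this cell
    return counts.most_common(1)[0][0]


# Hypothetical cell: three pixels of SMU 4512, two of SMU 3071, one water pixel.
cell_pixels = [4512, 4512, 4512, 3071, 3071, None]

first_pass = predominant_smu(cell_pixels)                    # all SMUs allowed
second_pass = predominant_smu(cell_pixels, excluded={4512})  # 4512 set to NULL
```

In this example the first pass selects SMU 4512. Once that SMU is treated as NULL, the second pass falls back to the next most frequent SMU in the cell.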
Organic matter content was estimated from organic carbon using the 'van Bemmelen' conversion factor (about 1.724), which assumes that organic matter contains 58% organic carbon. The hydraulic soil properties (PWP, FC, SAT, Ksat) were estimated with the Saxton and Rawls (2006) pedotransfer functions (PTFs), as implemented in an R package whose main purpose is to provide wrapper functions for running DSSAT CSM and analysing its outputs. The R package was called from the Python code through the 'Robjects' library. Minimum and maximum limits were set for each soil parameter so that the PTFs operated properly:
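To make the chain from organic carbon to hydraulic properties concrete, the sketch below transcribes the main Saxton and Rawls (2006) regression equations directly into Python. It is not the R package used in the study, and it omits the gravel, density, and salinity adjustments. The bounds in LIMITS are placeholders chosen for illustration, not the limits applied by the authors, and all function and variable names are hypothetical.

```python
import math

VAN_BEMMELEN = 1.0 / 0.58  # about 1.724: organic matter assumed to be 58% carbon

# Placeholder bounds for illustration only, NOT the limits used in the study.
LIMITS = {"sand": (0.0, 1.0), "clay": (0.0, 0.6), "om": (0.0, 8.0)}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def saxton_rawls(sand, clay, organic_carbon_pct):
    """Estimate PWP, FC, SAT (m3/m3) and Ksat (mm/h).

    sand and clay are mass fractions (0-1); organic carbon is in % by weight.
    Very sandy, low-organic inputs can drive the regressions to non-physical
    values, so inputs should stay within sensible bounds.
    """
    s = clamp(sand, *LIMITS["sand"])
    c = clamp(clay, *LIMITS["clay"])
    om = clamp(organic_carbon_pct * VAN_BEMMELEN, *LIMITS["om"])

    # Water content at 1500 kPa (permanent wilting point).
    t1500 = (-0.024 * s + 0.487 * c + 0.006 * om + 0.005 * s * om
             - 0.013 * c * om + 0.068 * s * c + 0.031)
    pwp = t1500 + (0.14 * t1500 - 0.02)

    # Water content at 33 kPa (field capacity).
    t33 = (-0.251 * s + 0.195 * c + 0.011 * om + 0.006 * s * om
           - 0.027 * c * om + 0.452 * s * c + 0.299)
    fc = t33 + (1.283 * t33 ** 2 - 0.374 * t33 - 0.015)

    # Water content between saturation and 33 kPa, then saturation.
    ts33 = (0.278 * s + 0.034 * c + 0.022 * om - 0.018 * s * om
            - 0.027 * c * om - 0.584 * s * c + 0.078)
    s33 = ts33 + (0.636 * ts33 - 0.107)
    sat = fc + s33 - 0.097 * s + 0.043

    # Slope of the log-log retention curve, then saturated conductivity.
    lam = (math.log(fc) - math.log(pwp)) / (math.log(1500) - math.log(33))
    ksat = 1930.0 * (sat - fc) ** (3.0 - lam)
    return {"PWP": pwp, "FC": fc, "SAT": sat, "Ksat": ksat}


# Hypothetical loam-like layer: 40% sand, 20% clay, 1.45% organic carbon.
props = saxton_rawls(0.40, 0.20, 1.45)
```

For this example input the sketch yields roughly 0.14, 0.28, and 0.46 m3/m3 for PWP, FC, and SAT, and a Ksat of about 15 mm/h. Clamping the inputs before evaluating the regressions is one simple way to keep them inside the range where the equations behave.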
