Code comments #11

jrising · 2025-01-27T18:02:52Z

Comments on code in R:

gaia/R/crop_calendars.R

Line 26 in f4a0e78

d$plant <- ifelse(d$Plant.start < d$Plant.end, ((d$Plant.start + d$Plant.end) / 2 / 30), ((d$Plant.start + d$Plant.end + 365) / 2 / 30))

: Day-of-year divided by 30 isn't a great approximation for month, unless you're working with 360-day calendar data.
gaia/R/crop_calendars.R

Line 47 in f4a0e78

# Deal with exceptions (countries by states or regions)

: This would be less error-prone if encoded in a data file.
gaia/R/data_aggregation.R

Line 29 in f4a0e78

input_file = "crop_calendar.csv"

: If you intend for users to modify this, provide them a way to use their own filename.
gaia/R/data_aggregation.R

Line 51 in f4a0e78

pattern = utils::glob2rx("*precip*rfc*"),

: Why is only the RFC weighted data used?
gaia/R/weighted_climate.R

Line 98 in f4a0e78

dplyr::left_join(mirca_harvest_area, by = c("lon", "lat")) %>%

: This assumes a 1-to-1 mapping between region and grid cell. At the country level, this is usually fine, but would not work at higher resolutions.
gaia/R/yield_impacts_functions.R

Line 439 in f4a0e78

reg$se_robust <- sandwich::vcovHC(reg, type = "HC", weights = iso)

: The SE should also be clustered, since there will be correlation between error terms in the each country.

The text was updated successfully, but these errors were encountered:

For #11 #10 - flexibility to use user provided crop calendar - flexibility to select crops - updated to the latest FAO data - updated with FAO crop to MIRCA2000 crop mapping - updated default MIRCA2000 crops to GCAM commodity mapping - make cleaning sage data its own function

mengqi-z · 2025-02-20T01:17:55Z

Dear @jrising,

Thanks for reviewing the code! We've made modifications based on your feedback and have also provided more detailed responses to some of your comments:

Day-of-year divided by 30 isn't a great approximation for month, unless you're working with 360-day calendar data.

Thank you for your comment. We use 30 as a divisor because the model operates at a monthly temporal resolution. The final planting and harvesting months are determined by rounding up the computed value using the ceiling function:
d$plant <- ceiling(ifelse(d$plant > 12, d$plant - 12, d$plant)).
Since the output is ultimately mapped to discrete months, small variations in day length (e.g., using 30 vs. 31 days) usually do not affect the final values of planting and harvesting months.

This would be less error-prone if encoded in a data file.

Thank you for your suggestion. We see the benefit of encoding this in the data file to reduce errors. However, our goal is to keep the data in its raw form while providing the processing codes for transparency. If this part were embedded in the data file, it wouldn’t be as clear for users to check the code. Additionally, the processing behaves differently depending on the selected crops. To improve clarity, we’ve cleaned up the crop calendar and moved the exception-handling code into a separate function.

Why is only the RFC weighted data used?

Thank you for your question. The aggregate nature of the yield data at the country level limits our ability to separately analyze the impacts of climate change on irrigated versus rainfed crops. We use rainfed-land-weighted climate data because rainfed crops are directly dependent on precipitation and temperature, making them more sensitive to climate variability than irrigated crops, which have more stable water availability. By incorporating rainfed land-weighted precipitation and temperature, the model better captures yield responses to changing climate conditions in historical periods. To ensure consistency, future climate projections also use rainfed land-weighted climate to project yield shocks.

This assumes a 1-to-1 mapping between region and grid cell. At the country level, this is usually fine, but would not work at higher resolutions.

Thank you for your insightful comment. Gaia is designed to operate at the country level for several reasons. First, it is intended to support global integrated assessment modeling, which typically relies on coarser spatial resolutions, such as country and regional scales. Additionally, the availability of global crop yield data imposes constraints; for example, FAOSTAT provides yield data at the country level, which is essential for gaia's empirical model fitting.
In addition, gaia's design aligns with available data standards. Key datasets, such as MIRCA2000, which provides crop-specific harvest area data at a 0.5-degree resolution, and ISIMIP’s climate data, which follows the same resolution, dictate the model's spatial scale. Given these constraints, gaia's finest resolution is set at 0.5 degrees, which ensures consistency across input datasets and model outputs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code comments #11

Code comments #11

jrising commented Jan 27, 2025

mengqi-z commented Feb 20, 2025