import Pkg;
Adding stations and weather variables
This tutorial has two objectives
- Train a multivariate multisite (simplistic) model, reusing the hidden states trained in the other tutorial.
- How to use this model with a crop model to generate annual crop yield for maize.
In the first part, the tutorial shows how to easily train weather stations given the hidden states sequence z
obtained in the previous tutorial. We will show how to make a (simplistic) mutlisite SWG with multiple correlated weather variables such as daily Rain RR
($\mathrm{m}\mathrm{m}/\mathrm{m}^2$), daily Temperature minimum TN
(°C), maximum TX
(°C), total daily solar irradiance QQ
(MJ/$\mathrm{m}^2$) and daily evapotranspiration Penman ETPP
($\mathrm{m}\mathrm{m}/\mathrm{m}^2$). This model will be trained with respect to the given hidden states, and the parameters will be periodic and vary smoothly during a calendar year.
The hidden states and the seasonality are enough to correlate well the weather variables without extra codependency between simulated variables.
This multisite, multivariable model has been used as input of the STIC crop model to generate data of annual crop yield for maize in the GenHack 3 Hackathon. See the associated published dataset https://doi.org/10.57745/C3FNBY. In the second part, we show what steps to follow to generate a similar dataset.
Set up
Package and functions
using CSV, DelimitedFiles# File Read/Load/Save/dwl
import JLD# File Read/Load/Save/dwl
import Downloads
using DataFrames, DataFramesMeta # DataFrames
using Dates
using Random, Distributions
using LaTeXStrings
The two main packages for this tutorial are not yet registered in the official Julia registry, since they are not quite fully ready. They can be either add
ed through my local Julia registry with the LocalRegistry.jl package i.e.
using Pkg
pkg"registry add https://github.com/dmetivie/LocalRegistry"
Pkg.add("SmoothPeriodicStatsModels")
Pkg.add("StochasticWeatherGenerators")
using SmoothPeriodicStatsModels
using StochasticWeatherGenerators
Data extraction and settings
To get the interesting weather variables, we use weather station provided by a the French research institute for agronomy and environment (INRAE). This data is available through the INRAE CLIMATIK platform[climatik] (https://agroclim.inrae.fr/climatik/, in French) managed by the AgroClim laboratory of Avignon, France. Unfortunately, these data are not yet open access (they should be soon). Météo France do have a version of this data and it is accessible through an API on the website Data.Gouv.fr. This package provide a simple command to extract the data of one station (given its STAtionID) from the API.
# Download the four stations used in this tutorial from MeteoFrance collection
dfs = collect_data_MeteoFrance.([49215002, 80557001, 40272002, 63345002])
For example
collect_data_MeteoFrance(49215002)[1:10,:]
Row | __id | STAID | STANAME | LAT | LON | ALTI | DATE | RR | QRR | TN | QTN | HTN | QHTN | TX | QTX | HTX | QHTX | TM | QTM | TNTXM | QTNTXM | TAMPLI | QTAMPLI | TNSOL | QTNSOL | TN50 | QTN50 | DG | QDG | FFM | QFFM | FF2M | QFF2M | FXY | QFXY | DXY | QDXY | HXY | QHXY | FXI | QFXI | DXI | QDXI | HXI | QHXI | FXI2 | QFXI2 | DXI2 | QDXI2 | HXI2 | QHXI2 | FXI3S | QFXI3S | DXI3S | QDXI3S | HXI3S | QHXI3S | DRR | QDRR | DHUMEC | QDHUMEC | PMERM | QPMERM | PMERMIN | QPMERMIN | INST | QINST | GLOT | QGLOT | DIFT | QDIFT | DIRT | QDIRT | INFRART | QINFRART | UV | QUV | UV_INDICEX | QUV_INDICEX | SIGMA | QSIGMA | UN | QUN | HUN | QHUN | UX | QUX | HUX | QHUX | UM | QUM | DHUMI40 | QDHUMI40 | DHUMI80 | QDHUMI80 | TSVM | QTSVM | ETPMON | QETPMON | ETPGRILLE | QETPGRILLE | ECOULEMENTM | QECOULEMENTM | HNEIGEF | QHNEIGEF | NEIGETOTX | QNEIGETOTX | NEIGETOT06 | QNEIGETOT06 | NEIG | QNEIG | BROU | QBROU | ORAG | QORAG | GRESIL | QGRESIL | GRELE | QGRELE | ROSEE | QROSEE | VERGLAS | QVERGLAS | SOLNEIGE | QSOLNEIGE | GELEE | QGELEE | FUMEE | QFUMEE | BRUME | QBRUME | ECLAIR | QECLAIR | NB300 | QNB300 | BA300 | QBA300 | TMERMIN | QTMERMIN | TMERMAX | QTMERMAX |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Int64 | Int64 | String31 | Float64 | Float64 | Int64 | Date | Float64? | Int64? | Float64 | Int64 | Int64? | Int64? | Float64 | Int64 | Int64? | Int64? | Float64? | Int64? | Float64 | Int64 | Float64 | Int64 | Float64? | Int64? | Float64? | Int64? | Int64? | Int64? | Missing | Missing | Float64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Float64? | Int64? | Int64? | Int64? | Int64? | Int64? | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Int64? | Int64? | Int64? | Missing | Missing | Missing | Missing | Missing | Missing | Int64? | Int64? | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Int64? | Float64? | Int64? | Float64? | Int64? | Float64? | Int64? | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | String1? | Int64? | String1? | Int64? | String1? | Int64? | Missing | Missing | String1? | Int64? | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | Missing | |
1 | 566387 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-01 | missing | missing | -4.3 | 1 | missing | missing | 7.0 | 1 | missing | missing | missing | missing | 1.4 | 1 | 11.3 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
2 | 566388 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-02 | missing | missing | 5.5 | 1 | missing | missing | 8.0 | 1 | missing | missing | missing | missing | 6.8 | 1 | 2.5 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
3 | 566389 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-03 | missing | missing | 5.7 | 1 | missing | missing | 8.0 | 1 | missing | missing | missing | missing | 6.9 | 1 | 2.3 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
4 | 566390 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-04 | missing | missing | -1.3 | 1 | missing | missing | 5.0 | 1 | missing | missing | missing | missing | 1.9 | 1 | 6.3 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
5 | 566391 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-05 | missing | missing | 1.4 | 1 | missing | missing | 7.3 | 1 | missing | missing | missing | missing | 4.4 | 1 | 5.9 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
6 | 566392 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-06 | missing | missing | 2.4 | 1 | missing | missing | 8.0 | 1 | missing | missing | missing | missing | 5.2 | 1 | 5.6 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
7 | 566393 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-07 | missing | missing | 2.1 | 1 | missing | missing | 8.0 | 1 | missing | missing | missing | missing | 5.1 | 1 | 5.9 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
8 | 566394 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-08 | missing | missing | 4.4 | 1 | missing | missing | 11.0 | 1 | missing | missing | missing | missing | 7.7 | 1 | 6.6 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
9 | 566395 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-09 | missing | missing | 5.9 | 1 | missing | missing | 8.0 | 1 | missing | missing | missing | missing | 7.0 | 1 | 2.1 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
10 | 566396 | 49215002 | MONT-BELLAY-INRAE | 47.13 | -0.1415 | 58 | 1986-01-10 | missing | missing | 2.7 | 1 | missing | missing | 11.4 | 1 | missing | missing | missing | missing | 7.1 | 1 | 8.7 | 1 | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing | missing |
However, the data there does not exactly match the available on CLIMATIK, (less data, different values ...). For now I stored the CLIMATIK data on a private repo until the Météo France data is fixed.
While testing this function, it appears that MeteoFrance API might bug sometimes returning an error for some stations (and working for others). In that case, you can check the API directly here.
local_order = 1
memory_order = 2^local_order
K = 4
degree = 1
T = 366
366
Weather stations
We select four INRAE weather stations Montreuil-Bellay, Mons-en-Chaussée, Saint-Martin-de-Hinx and Saint-Gènes-Champanelle. Note that these are not the training stations used in the paper tutorial, however we will still use the weather regime trained there. The rain occurrence generation will be conditionally independent to this weather regime across all these four stations.
Station French department number.
station_dep = [49, 80, 40, 63]
station_name = ["Montreuil-Bellay", "Mons-en-Chaussée", "Saint-Martin-de-Hinx", "Saint-Gènes-Champanelle"]
station_gps = [(LAT = 47.13, LON = -0.1415), (LAT = 49.875, LON = 3.031), (LAT = 43.5708, LON = -1.29933), (LAT = 45.723, LON = 3.019) ]
D = length(station_name)
station_path = string.("https://forgemia.inra.fr/david.metivier/weather_data_mistea/-/raw/main/INRAE_stations/INRAE_STATION_", [49215002, 80557001, 40272002, 63345002], ".csv") .|> download
station_ndep = string.(station_name, " (", station_dep, ")")
4-element Vector{String}:
"Montreuil-Bellay (49)"
"Mons-en-Chaussée (80)"
"Saint-Martin-de-Hinx (40)"
"Saint-Gènes-Champanelle (63)"
Load
Load the AutoRegressive Seasonal HMM computed in this tutorial.
file_hmm = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/refs/heads/master/assets/tuto_1/hmm_fit_K_4_d_1_m_1.jld")
begin
hmm_infos = JLD.load(file_hmm)
hmm_fit_full = hmm_infos["hmm"]
hist = hmm_infos["hist"]
θq_fit = hmm_infos["Q_param"]
end
4×3×3 Array{Float64, 3}:
[:, :, 1] =
2.27287 1.61791 1.09247
0.047264 0.845656 -1.02341
-0.352188 -0.537942 0.725081
-2.86711 -1.38575 -2.01317
[:, :, 2] =
0.239175 -0.214254 0.200498
0.408978 0.148962 0.203501
0.377251 0.148114 0.28492
0.164178 0.0208895 -0.13681
[:, :, 3] =
0.283683 0.0230918 0.107845
0.0602381 -0.01249 0.103605
-0.0774946 -0.145427 0.0673092
0.315783 0.0398954 0.0329445
Load the sequence of estimated hidden states for the historical sequence.
file_df_z = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/refs/heads/master/assets/tuto_1/z_hat_K_4_d_1_m_1.csv")
df_z = CSV.File(file_df_z) |> DataFrame
df_z[1:10, :] # Show the first lines of the dataframes
Row | DATE | z |
---|---|---|
Date | Int64 | |
1 | 1956-01-01 | 1 |
2 | 1956-01-02 | 2 |
3 | 1956-01-03 | 4 |
4 | 1956-01-04 | 4 |
5 | 1956-01-05 | 4 |
6 | 1956-01-06 | 3 |
7 | 1956-01-07 | 1 |
8 | 1956-01-08 | 1 |
9 | 1956-01-09 | 3 |
10 | 1956-01-10 | 2 |
Load and filter the data by date. There are a few missing values that we impute using Impute.Interpolate
.
begin
data_stations_full = collect_data_INRAE.(station_path; show_warning=[:RR, :TX], impute_missing=[:RR, :TX])
for df in data_stations_full
@transform!(df, :RO = onefy.(:RR))
end
end
data_stations = [innerjoin(df, df_z; on=:DATE) for df in data_stations_full]
4-element Vector{DataFrames.DataFrame}:
12418×10 DataFrame
Row │ STAID DATE TN TX QQ ETPP RR ⋯
│ Int64 Date Float64? Float64 Float64? Float64? Float64 ⋯
───────┼────────────────────────────────────────────────────────────────────────
1 │ 49215002 1986-01-01 -5.7 8.7 2.0 0.9 13.6 ⋯
2 │ 49215002 1986-01-02 3.5 8.8 3.0 0.1 14.1
3 │ 49215002 1986-01-03 5.4 7.6 2.0 0.5 1.0
4 │ 49215002 1986-01-04 -2.9 6.8 6.1 0.0 6.0
5 │ 49215002 1986-01-05 2.6 7.5 3.4 1.4 14.0 ⋯
6 │ 49215002 1986-01-06 2.2 8.1 6.0 0.4 5.0
7 │ 49215002 1986-01-07 3.4 7.6 1.9 0.0 8.0
8 │ 49215002 1986-01-08 4.6 10.8 5.7 0.0 0.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
12412 │ 49215002 2019-12-25 6.4 8.3 2.0 0.1 0.0 ⋯
12413 │ 49215002 2019-12-26 4.6 13.6 3.76 0.1 0.0
12414 │ 49215002 2019-12-27 10.3 13.6 2.0 0.5 1.0
12415 │ 49215002 2019-12-28 5.6 11.0 5.01 0.0 0.5
12416 │ 49215002 2019-12-29 0.1 8.4 4.5 0.0 0.0 ⋯
12417 │ 49215002 2019-12-30 -1.8 8.2 6.71 0.0 0.0
12418 │ 49215002 2019-12-31 -2.4 6.6 3.02 0.0 0.5
3 columns and 12403 rows omitted
12449×10 DataFrame
Row │ STAID DATE TN TX QQ ETPP RR ⋯
│ Int64 Date Float64? Float64 Float64? Float64? Float64 ⋯
───────┼────────────────────────────────────────────────────────────────────────
1 │ 80557001 1985-12-01 7.5 13.5 4.1 0.2 0.0 ⋯
2 │ 80557001 1985-12-02 8.1 13.1 2.1 0.5 0.0
3 │ 80557001 1985-12-03 10.0 12.0 0.7 0.1 2.0
4 │ 80557001 1985-12-04 10.0 13.7 2.7 0.2 0.5
5 │ 80557001 1985-12-05 9.1 14.6 3.1 0.7 0.5 ⋯
6 │ 80557001 1985-12-06 5.5 10.6 3.6 0.4 2.5
7 │ 80557001 1985-12-07 8.2 11.6 1.3 0.4 1.5
8 │ 80557001 1985-12-08 2.9 8.6 1.4 0.4 1.5
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
12443 │ 80557001 2019-12-25 5.6 9.1 3.29 0.1 0.0 ⋯
12444 │ 80557001 2019-12-26 1.9 10.1 1.2 0.3 10.0
12445 │ 80557001 2019-12-27 7.3 8.5 1.07 0.2 0.0
12446 │ 80557001 2019-12-28 3.3 7.4 3.07 0.0 0.0
12447 │ 80557001 2019-12-29 -0.7 4.2 4.6 0.0 0.0 ⋯
12448 │ 80557001 2019-12-30 -1.0 5.7 5.24 0.0 0.0
12449 │ 80557001 2019-12-31 -1.3 3.9 4.76 0.0 0.0
3 columns and 12434 rows omitted
9845×10 DataFrame
Row │ STAID DATE TN TX QQ ETPP RR ⋯
│ Int64 Date Float64? Float64 Float64? Float64? Float64 ⋯
──────┼─────────────────────────────────────────────────────────────────────────
1 │ 40272002 1993-01-17 19.0 20.6 0.0 0.5 0.0 ⋯
2 │ 40272002 1993-01-18 16.7 19.4 0.0 0.5 0.0
3 │ 40272002 1993-01-19 7.9 16.0 0.7 0.5 15.0
4 │ 40272002 1993-01-20 -0.9 16.7 7.4 0.5 0.0
5 │ 40272002 1993-01-21 -0.6 15.9 7.7 0.5 0.0 ⋯
6 │ 40272002 1993-01-22 -1.2 17.7 7.8 0.5 0.0
7 │ 40272002 1993-01-23 0.2 16.0 6.3 0.6 0.0
8 │ 40272002 1993-01-24 -0.7 13.6 6.1 0.4 0.5
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
9839 │ 40272002 2019-12-25 6.7 19.3 6.46 0.1 0.0 ⋯
9840 │ 40272002 2019-12-26 7.3 15.7 5.29 0.3 0.0
9841 │ 40272002 2019-12-27 0.6 14.2 6.65 0.0 0.5
9842 │ 40272002 2019-12-28 5.3 10.7 2.32 0.2 0.0
9843 │ 40272002 2019-12-29 4.0 6.4 2.64 0.1 0.0 ⋯
9844 │ 40272002 2019-12-30 1.6 13.4 6.65 0.1 0.0
9845 │ 40272002 2019-12-31 -0.4 11.7 2.15 0.2 0.5
3 columns and 9830 rows omitted
12449×10 DataFrame
Row │ STAID DATE TN TX QQ ETPP RR ⋯
│ Int64 Date Float64? Float64 Float64? Float64? Float64 ⋯
───────┼────────────────────────────────────────────────────────────────────────
1 │ 63345002 1985-12-01 2.9 13.5 6.9 0.1 0.0 ⋯
2 │ 63345002 1985-12-02 4.5 16.7 6.5 0.3 0.0
3 │ 63345002 1985-12-03 6.6 21.1 5.6 1.4 0.0
4 │ 63345002 1985-12-04 7.8 15.8 4.4 1.1 0.0
5 │ 63345002 1985-12-05 3.8 12.2 6.6 0.7 0.0 ⋯
6 │ 63345002 1985-12-06 4.1 6.4 3.9 0.4 0.0
7 │ 63345002 1985-12-07 3.3 7.9 1.8 0.3 0.0
8 │ 63345002 1985-12-08 3.1 8.1 2.4 0.4 0.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
12443 │ 63345002 2019-12-25 -0.1 10.0 6.54 0.0 0.0 ⋯
12444 │ 63345002 2019-12-26 -1.7 10.2 4.13 0.7 0.0
12445 │ 63345002 2019-12-27 4.4 7.9 2.72 0.5 8.5
12446 │ 63345002 2019-12-28 2.0 4.5 3.07 0.1 0.0
12447 │ 63345002 2019-12-29 -2.6 12.2 7.01 0.2 0.0 ⋯
12448 │ 63345002 2019-12-30 0.4 14.3 6.73 0.4 0.0
12449 │ 63345002 2019-12-31 -0.7 12.9 6.53 0.2 0.0
3 columns and 12434 rows omitted
We compute the mean temperature during the growth month of maize (May to September) at each location.
mean_summer_TX = round.([@combine(@subset(df, month.(:DATE) .∈ tuple([5,6,7,8,9])), :meanTX=mean(:TX))[1,1] for df in data_stations_full], digits = 1)
file_for_maps_with_geomakie = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/master/examples/utilities_geo_makie_features.jl")
include(file_for_maps_with_geomakie)
FR_map = map_with_stations(last.(station_gps), first.(station_gps), mean_summer_TX; station_name=string.(" ",station_ndep), show_value=true, colorbar_show=true, precision_scale = 50, colorbar_label = "°C")

Fitting
We now fit the observed weather variables to seasonal models w.r.t. the hidden variables i.e. the models we fit depend continuously on the day of the year $t\in [1,366]$ and on the provided hidden state $Z \in [1,K]$.
Rain Occurrences
Fit the Rain Occurrences of INRAE stations with respect to the given hidden states sequence. These distributions are then cast into the HMM emission distribution. The pre-trained transitons matrices Qₜ
are kept the same. We could have added these new emission distributions to the existing one, however here we focus only on these four stations.
θ = zeros(K, D, memory_order, 2degree + 1)
B = Array{Bernoulli}(undef, K, T, D, memory_order)
@time "Fit RO" for (j, df) in enumerate(data_stations)
B[:, :, j, :], θ[:, j, :, :] = fit_mle_RO(df, local_order, degree)
end
θy_fit = θ # hmm_infos["Y_param"] for old stations
a = zeros(K)
a[data_stations[1].z[1]] = 1.0 # Initial state of the HMM
hmm_fit = ARPeriodicHMM(a, hmm_fit_full.A, B)
SmoothPeriodicStatsModels.ARPeriodicHMM{Distributions.Univariate, Float64}([1.0, 0.0, 0.0, 0.0], [0.5868668408604313 0.19286713354416585 0.17289938700187546 0.04736663859352811; 0.2759747892651204 0.4722340192476778 0.07708716933059408 0.17470402215660846; 0.18794832946617515 0.12398289093764574 0.5045183073090319 0.18355047228714813; 0.04680120290737543 0.17755340772353972 0.08096831926793513 0.6946770701011507;;; 0.5877868257152563 0.19234608241581733 0.17265160720924666 0.047215484659680514; 0.27614582817303934 0.471991540065958 0.0771993838260455 0.17466324793495813; 0.18768201945106514 0.12367549022779331 0.5050775685980138 0.18356492172312863; 0.04703227662293221 0.17759850456149864 0.08098486902189786 0.6943843497936721;;; 0.588687334120533 0.19184574755786424 0.1724003270660093 0.04706659125559436; 0.27629965819575786 0.4717560982598423 0.07731156187308509 0.17463268167131557; 0.18740794640994776 0.12337190379695806 0.5056284970073495 0.1835916527857457; 0.04726201155774799 0.1776422856497569 0.08100458503728997 0.6940911177552062;;; … ;;; 0.5839921989986937 0.19455414318307435 0.1736203837119659 0.04783327410626695; 0.27535913872978807 0.47300281386298804 0.07675060216742301 0.17488744523980176; 0.1886995358089772 0.12492728775056734 0.5027923864426367 0.1835807899978197; 0.046100916578977115 0.17741031768250684 0.08093767132668162 0.6955510944118354;;; 0.5849693064664692 0.19397121053213456 0.17338390952093885 0.04767557348045827; 0.27558136578616993 0.47273974707608046 0.07686274949815931 0.17481613763959114; 0.18845719968939276 0.1246088606036976 0.5033755292747968 0.18355841043211377; 0.04633542670594446 0.17745930425501852 0.08094472055045258 0.6952605484885854;;; 0.5859275947510023 0.19340885802718405 0.17314353250367023 0.04752001471814443; 0.27578661001423593 0.4724834511962677 0.07697494813716882 0.17475499065232838; 0.18820676065352623 0.12429403772922211 0.5039508986966097 0.1835483029206428; 0.04656888744079467 0.17750700436717928 0.08095493640358074 0.6949691717884463], Distributions.Bernoulli[Distributions.Bernoulli{Float64}(p=0.9114349905656545) Distributions.Bernoulli{Float64}(p=0.9112922227762373) … Distributions.Bernoulli{Float64}(p=0.9116527083444091) Distributions.Bernoulli{Float64}(p=0.9115551366640436); Distributions.Bernoulli{Float64}(p=0.5303550953788534) Distributions.Bernoulli{Float64}(p=0.5297718795388718) … Distributions.Bernoulli{Float64}(p=0.531453610503336) Distributions.Bernoulli{Float64}(p=0.5309157247945008); Distributions.Bernoulli{Float64}(p=0.3759328543838933) Distributions.Bernoulli{Float64}(p=0.3752646381095858) … Distributions.Bernoulli{Float64}(p=0.37715296876738574) Distributions.Bernoulli{Float64}(p=0.3765623864448382); Distributions.Bernoulli{Float64}(p=0.15100146048415572) Distributions.Bernoulli{Float64}(p=0.1505457512469645) … Distributions.Bernoulli{Float64}(p=0.15182716935111723) Distributions.Bernoulli{Float64}(p=0.1514287102290464);;; Distributions.Bernoulli{Float64}(p=0.7800442659210688) Distributions.Bernoulli{Float64}(p=0.7797646619945506) … Distributions.Bernoulli{Float64}(p=0.7805789168842295) Distributions.Bernoulli{Float64}(p=0.7803157038345774); Distributions.Bernoulli{Float64}(p=0.6790258003793276) Distributions.Bernoulli{Float64}(p=0.678877149512602) … Distributions.Bernoulli{Float64}(p=0.679280530240206) Distributions.Bernoulli{Float64}(p=0.6791602707062911); Distributions.Bernoulli{Float64}(p=0.1304367921685503) Distributions.Bernoulli{Float64}(p=0.12967758806682217) … Distributions.Bernoulli{Float64}(p=0.13196995314982773) Distributions.Bernoulli{Float64}(p=0.13120097985325627); Distributions.Bernoulli{Float64}(p=0.12576150835485392) Distributions.Bernoulli{Float64}(p=0.12452859087713151) … Distributions.Bernoulli{Float64}(p=0.12822440436803237) Distributions.Bernoulli{Float64}(p=0.1269936446145172);;; Distributions.Bernoulli{Float64}(p=0.7726526902118604) Distributions.Bernoulli{Float64}(p=0.7730694552252773) … Distributions.Bernoulli{Float64}(p=0.7717979836228935) Distributions.Bernoulli{Float64}(p=0.7722288331823005); Distributions.Bernoulli{Float64}(p=0.30703958620094723) Distributions.Bernoulli{Float64}(p=0.30616316971985086) … Distributions.Bernoulli{Float64}(p=0.30881795687334856) Distributions.Bernoulli{Float64}(p=0.30792459231123015); Distributions.Bernoulli{Float64}(p=0.5580165648042019) Distributions.Bernoulli{Float64}(p=0.558503754086631) … Distributions.Bernoulli{Float64}(p=0.5569999337015598) Distributions.Bernoulli{Float64}(p=0.5575152450375114); Distributions.Bernoulli{Float64}(p=0.18581502417222773) Distributions.Bernoulli{Float64}(p=0.18515837773862875) … Distributions.Bernoulli{Float64}(p=0.1871290801233476) Distributions.Bernoulli{Float64}(p=0.1864719933970075);;; Distributions.Bernoulli{Float64}(p=0.6991690180326523) Distributions.Bernoulli{Float64}(p=0.6994675898266853) … Distributions.Bernoulli{Float64}(p=0.6987069677180473) Distributions.Bernoulli{Float64}(p=0.698915430708985); Distributions.Bernoulli{Float64}(p=0.36109654815637093) Distributions.Bernoulli{Float64}(p=0.3611277721431516) … Distributions.Bernoulli{Float64}(p=0.36108008370248895) Distributions.Bernoulli{Float64}(p=0.36108065074119566); Distributions.Bernoulli{Float64}(p=0.3181915765477496) Distributions.Bernoulli{Float64}(p=0.3189970961118282) … Distributions.Bernoulli{Float64}(p=0.3166809462714016) Distributions.Bernoulli{Float64}(p=0.31741949192055735); Distributions.Bernoulli{Float64}(p=0.07318304065257011) Distributions.Bernoulli{Float64}(p=0.07275954854020077) … Distributions.Bernoulli{Float64}(p=0.07406022588364866) Distributions.Bernoulli{Float64}(p=0.07361659694920993);;;; Distributions.Bernoulli{Float64}(p=0.8749201681312516) Distributions.Bernoulli{Float64}(p=0.8751807114132103) … Distributions.Bernoulli{Float64}(p=0.8743658445953179) Distributions.Bernoulli{Float64}(p=0.8746485390124702); Distributions.Bernoulli{Float64}(p=0.6318069097739621) Distributions.Bernoulli{Float64}(p=0.6316468839407342) … Distributions.Bernoulli{Float64}(p=0.63209218401333) Distributions.Bernoulli{Float64}(p=0.6319553557741113); Distributions.Bernoulli{Float64}(p=0.47614525021227294) Distributions.Bernoulli{Float64}(p=0.47655165249760256) … Distributions.Bernoulli{Float64}(p=0.4752788885564046) Distributions.Bernoulli{Float64}(p=0.47572095221794075); Distributions.Bernoulli{Float64}(p=0.2138403985769788) Distributions.Bernoulli{Float64}(p=0.21349321209196448) … Distributions.Bernoulli{Float64}(p=0.21449754193357112) Distributions.Bernoulli{Float64}(p=0.2141752226754711);;; Distributions.Bernoulli{Float64}(p=0.8855381689637662) Distributions.Bernoulli{Float64}(p=0.8855994326828246) … Distributions.Bernoulli{Float64}(p=0.8853789246502635) Distributions.Bernoulli{Float64}(p=0.8854646658878261); Distributions.Bernoulli{Float64}(p=0.7262791925442851) Distributions.Bernoulli{Float64}(p=0.7262049366779089) … Distributions.Bernoulli{Float64}(p=0.7264231979776209) Distributions.Bernoulli{Float64}(p=0.7263519532117009); Distributions.Bernoulli{Float64}(p=0.2813410920433409) Distributions.Bernoulli{Float64}(p=0.2817276962593952) … Distributions.Bernoulli{Float64}(p=0.2805729977664643) Distributions.Bernoulli{Float64}(p=0.28095615580149214); Distributions.Bernoulli{Float64}(p=0.21281685123594582) Distributions.Bernoulli{Float64}(p=0.21226880534077966) … Distributions.Bernoulli{Float64}(p=0.21391349246218255) Distributions.Bernoulli{Float64}(p=0.21336513578697588);;; Distributions.Bernoulli{Float64}(p=0.870567218510227) Distributions.Bernoulli{Float64}(p=0.8708100922012092) … Distributions.Bernoulli{Float64}(p=0.8700674604087172) Distributions.Bernoulli{Float64}(p=0.8703196573151201); Distributions.Bernoulli{Float64}(p=0.6167401331750285) Distributions.Bernoulli{Float64}(p=0.6169445406220616) … Distributions.Bernoulli{Float64}(p=0.6163013392463161) Distributions.Bernoulli{Float64}(p=0.6165257137308044); Distributions.Bernoulli{Float64}(p=0.7507401976118325) Distributions.Bernoulli{Float64}(p=0.7517108100822073) … Distributions.Bernoulli{Float64}(p=0.7487418021915991) Distributions.Bernoulli{Float64}(p=0.749750473330656); Distributions.Bernoulli{Float64}(p=0.3351287130156477) Distributions.Bernoulli{Float64}(p=0.3357203475273395) … Distributions.Bernoulli{Float64}(p=0.3339011660201009) Distributions.Bernoulli{Float64}(p=0.3345222453661863);;; Distributions.Bernoulli{Float64}(p=0.8953214479977459) Distributions.Bernoulli{Float64}(p=0.8955404277662539) … Distributions.Bernoulli{Float64}(p=0.8948863178617229) Distributions.Bernoulli{Float64}(p=0.8951033876229403); Distributions.Bernoulli{Float64}(p=0.5917565119637603) Distributions.Bernoulli{Float64}(p=0.591876379493273) … Distributions.Bernoulli{Float64}(p=0.5915356844996875) Distributions.Bernoulli{Float64}(p=0.5916429352602304); Distributions.Bernoulli{Float64}(p=0.6153019573600416) Distributions.Bernoulli{Float64}(p=0.6157963173846315) … Distributions.Bernoulli{Float64}(p=0.6143520886637719) Distributions.Bernoulli{Float64}(p=0.6148204940328377); Distributions.Bernoulli{Float64}(p=0.3562513096573049) Distributions.Bernoulli{Float64}(p=0.3555951583400004) … Distributions.Bernoulli{Float64}(p=0.3575286519958104) Distributions.Bernoulli{Float64}(p=0.35689588133965017)])
Rain Amounts
Here for simplicity we select the double exponential model for the rain amount. We aim for a generic interface where one could easily change this choice to other univariate distribution e.g. mix₀ = MixtureModel([Exponential(1), Gamma(1,2)], [1/2,1/2])
or mix₀ = Pareto(1)
. For now this is not supported (because of the seasonal fit).
@time "Fit Rain amounts" mix_allE = fit_mle_RR.(data_stations, local_order; mix₀=StochasticWeatherGenerators.mix_ini(T));
Fit Rain amounts: 52.778774 seconds (249.23 M allocations: 22.943 GiB, 6.19% gc time, 12.33% compilation time: <1% of which was recompilation)
The Gaussian copula covariance matrices are then estimated.
Σ²RR = cov_RR(data_stations, K)
4-element Vector{Matrix{Float64}}:
[31.76415957843173 6.844726152914493 5.566374230877233 5.894147720497379; 6.844726152914493 30.671740091699586 1.6051317713619329 4.122138695924489; 5.566374230877233 1.6051317713619329 133.83512052388917 11.38371098788533; 5.894147720497379 4.122138695924489 11.38371098788533 60.84843886399006]
[18.814522481337494 4.499677821682111 3.218040160985254 3.6981648899561517; 4.499677821682111 21.924109242248473 -0.5112451865357651 3.29411091932108; 3.218040160985254 -0.5112451865357651 57.20484223034485 7.0741224740724675; 3.6981648899561517 3.29411091932108 7.0741224740724675 28.4748198971268]
[22.526397523231637 0.11751293460188918 3.8806035837705375 6.103057015755162; 0.11751293460188918 10.383559603648603 -0.03334966631737586 -1.2943160782297487; 3.8806035837705375 -0.03334966631737586 125.2955119008292 9.147986986520573; 6.103057015755162 -1.2943160782297487 9.147986986520573 47.19469913508749]
[17.71913430653424 1.3398724527144155 9.253376683896645 5.046150952980411; 1.3398724527144155 14.265975889201796 -0.3526574216776574 1.162605150714381; 9.253376683896645 -0.3526574216776574 42.60506798496414 3.0376039948202553; 5.046150952980411 1.162605150714381 3.0376039948202553 21.463744546532258]
Temperature
We first fit the daily maximal temperature (we could have started with minimal temperatures). We use the same principle as for RR
i.e. first fit each univariate distribution and then the copula.
Temperature Max
@time "Fit TX" ar1sTX = fit_AR1.(data_stations, :TX, degree)
Σ²TX = cov_ar1(data_stations, ar1sTX, :TX)
4-element Vector{Matrix{Float64}}:
[0.9734492960626859 0.9088263527797048 0.7512396297213331 0.736871432126974; 0.9088263527797048 0.933801684912108 0.661682831481654 0.7133204422045198; 0.7512396297213331 0.661682831481654 0.606779453845207 0.5543809286933857; 0.736871432126974 0.7133204422045198 0.5543809286933857 0.5747770257017922]
[0.9681534836532066 0.9255394453442793 0.7682632586198843 0.6966057633142798; 0.9255394453442793 0.9073092315973019 0.7078278148395432 0.6856275091055775; 0.7682632586198843 0.7078278148395432 0.6494675049440323 0.5330859233911215; 0.6966057633142798 0.6856275091055775 0.5330859233911215 0.5251315387620278]
[0.991656860471339 1.0286449977971168 0.8380972089080905 0.7554770426011381; 1.0286449977971168 1.0795832666032297 0.870028075571526 0.7842843563337052; 0.8380972089080905 0.870028075571526 0.8405559690908149 0.7146659734476936; 0.7554770426011381 0.7842843563337052 0.7146659734476936 0.6294793610747098]
[1.0010212957320863 1.003583031214807 0.7917838135602174 0.7596658530140675; 1.003583031214807 1.0209045986058716 0.808431211641072 0.767719911807825; 0.7917838135602174 0.808431211641072 0.6624416461850272 0.6171847440192805; 0.7596658530140675 0.767719911807825 0.6171847440192805 0.6196734998261336]
Minimal Temperature
To directly fit the TN
one could do the following
ar1sTN = fit_AR1.(data_stations, :TN, degree)
Σ²TN = cov_ar1(dropmissing.(data_stations), ar1sTN, :TN)
However this produce TN
independantly of TX
(and we can have TN>TX
). To prevent that we fit the positive difference ΔT = TX-TN
with a Gamma distribution. We will then simulate the TN
conditionally to the TX
@time "Fit TN residuals" θ_ΔT = fit_TN.(data_stations, 1, T; print_level=0) # 1 is the degree
θ_cor = cor_groupbyTXTN.(data_stations, T)
f(θ) = Gamma(θ[1], θ[2]) # other options e.g. MixtureModel([Exponential(θ[1]), Exponential(θ[2])], [θ[3], 1 - θ[3]])
f(t, θ) = f([σₜ(t, θ[1:(2+1)]), σₜ(t, θ[(2+2):end])])
f (generic function with 2 methods)
Solar Irradiance (QQ)
QQ
must be positive, we will truncate at simulation time negative instances.
@time "Fit QQ" ar1sQQ = fit_AR1.(data_stations, :QQ, degree)
Σ²QQ = cov_ar1(dropmissing.(data_stations), ar1sQQ, :QQ)
4-element Vector{Matrix{Float64}}:
[0.9735613898067461 1.1029355193846306 0.814648884912151 0.9025672376967814; 1.1029355193846306 1.305225934873927 0.9060038459791245 1.0074785955906629; 0.814648884912151 0.9060038459791245 0.7008343359057471 0.7752971343567613; 0.9025672376967814 1.0074785955906629 0.7752971343567613 0.8858298601200435]
[0.981504445223468 1.073677113167918 0.8687252007004189 0.9197330221334017; 1.073677113167918 1.218351425886187 0.9141419674317055 0.9807740739861867; 0.8687252007004189 0.9141419674317055 0.8095775631892538 0.8488618691933953; 0.9197330221334017 0.9807740739861867 0.8488618691933953 0.9000030967163165]
[0.9738802939529032 1.1130761587867382 0.8789283476900667 0.8893323206430374; 1.1130761587867382 1.3140896483701416 0.982708443862374 1.0100904237182031; 0.8789283476900667 0.982708443862374 0.8830876567737447 0.8806116916739374; 0.8893323206430374 1.0100904237182031 0.8806116916739374 0.894197436332702]
[0.9671506438805116 0.9364348356078387 0.8157114299248541 0.7802365615724078; 0.9364348356078387 0.9407785777896013 0.7911515784805272 0.7683093116416554; 0.8157114299248541 0.7911515784805272 0.7736155734508874 0.7083813255465897; 0.7802365615724078 0.7683093116416554 0.7083813255465897 0.670721249798178]
Evapotranspiration Penman (ETPP)
ETPP
must be positive, we will truncate at simulation time negative instances.
@time "Fit ETPP" ar1sETPP = fit_AR1.(data_stations, :ETPP, degree)
Σ²ETPP = cov_ar1(dropmissing.(data_stations), ar1sETPP, :ETPP)
4-element Vector{Matrix{Float64}}:
[0.9478427476558838 0.9447369158652795 0.8427423734682039 0.8692905240091324; 0.9447369158652795 0.9728115210519872 0.8310547639257787 0.8603216721396257; 0.8427423734682039 0.8310547639257787 0.80096586386207 0.8165236524118246; 0.8692905240091324 0.8603216721396257 0.8165236524118246 0.8456995674372729]
[0.9740984494523715 0.8545353116107742 0.9809628778627023 0.9211740262500745; 0.8545353116107742 0.7827166998025054 0.8654739483361198 0.8457217575730882; 0.9809628778627023 0.8654739483361198 1.0223346266051496 0.9409708482499695; 0.9211740262500745 0.8457217575730882 0.9409708482499695 0.9323982375888193]
[0.9833026543073872 0.9477149584375228 0.8672595548121652 0.920012070035954; 0.9477149584375228 0.9576105459162413 0.84112815349153 0.9089411028313419; 0.8672595548121652 0.84112815349153 1.0062338409833154 0.9806980180951468; 0.920012070035954 0.9089411028313419 0.9806980180951468 1.0077467983137338]
[1.000683353043992 0.9207526012694283 0.8873198655981687 0.9379518232084835; 0.9207526012694283 0.8565900281404355 0.827276194246968 0.8759653645923384; 0.8873198655981687 0.827276194246968 0.9685438651404111 0.9152931194494438; 0.9379518232084835 0.8759653645923384 0.9152931194494438 0.9335307717402674]
Simulation
Initial conditions
y_ini = [@subset(df, :DATE .== Date(2000) - Day(1)).RO[1] for df in data_stations]'
tx_ini = [@subset(df, :DATE .== Date(2000)).TX[1] for df in data_stations]
z_ini = @subset(data_stations[1], :DATE .== Date(2000)).z[1]
tn_ini = [@subset(df, :DATE .== Date(2000)).TN[1] for df in data_stations]
qq_ini = [@subset(df, :DATE .== Date(2000)).QQ[1] for df in data_stations]
et_ini = [@subset(df, :DATE .== Date(2000)).ETPP[1] for df in data_stations]
4-element Vector{Float64}:
0.1
0.0
0.3
0.0
Generation
indicatrix(x) = x > zero(x) ? x : zero(x) # used to manually suppress negative `QQ` and `ETPP`
Random.seed!(50000)
Random.TaskLocalRNG()
NSIMU = 300
NYEAR = 50
year_start = 1986
date_range = Date(year_start):Day(1):Date(year_start + NYEAR - 1, 12, 31)
n2t = dayofyear_Leap.(date_range)
years = unique(year.(date_range))
N = length(n2t)
18262
@time "Total simulation $NSIMU of $NYEAR years" begin
zs = zeros(Int, N, NSIMU)
ys = zeros(Bool, N, D, NSIMU)
rs = zeros(D, N, NSIMU)
txs = zeros(D, N, NSIMU)
ϵ_TX = zeros(D, N, NSIMU)
ΔTs = zeros(D, N, NSIMU)
qqs = zeros(D, N, NSIMU)
ets = zeros(D, N, NSIMU)
@time "HMM" for i in 1:NSIMU
zs[:, i], ys[:, :, i] = rand(hmm_fit, n2t; y_ini=y_ini, z_ini=z_ini, seq=true)
end
@time "Rain" for i in 1:NSIMU
rs[:, :, i] = rand_RR(mix_allE, n2t, zs[:, i], ys[:, :, i]', Σ²RR)
end
@time "T_max" for i in 1:NSIMU
txs[:, :, i], ϵ_TX[:, :, i] = rand(ar1sTX, n2t, zs[:, i], Σ²TX; y₁=tx_ini, output_ϵ=true)
end
@time "T_min" for i in 1:NSIMU
ΔTs[:, :, i] = reduce(hcat, [rand_cond(ϵ_TX[j, :, i], zs[:, i], θ_ΔT[j], θ_cor[j], n2t, T) for j in 1:D]) |> permutedims
end
tns = txs - ΔTs #rand(ar1sTN, n2t, zs, Σ²TN; y₁=tn_ini)
@time "Solar Radiation" for i in 1:NSIMU
qqs[:, :, i] = indicatrix.(rand(ar1sQQ, n2t, zs[:, i], Σ²QQ; y₁=qq_ini))
end
@time "Evapotranspiration" for i in 1:NSIMU
ets[:, :, i] = indicatrix.(rand(ar1sETPP, n2t, zs[:, i], Σ²ETPP, y₁=et_ini))
end
end
HMM: 5.897623 seconds (90.92 M allocations: 4.068 GiB, 19.80% gc time, 18.52% compilation time)
Rain: 22.677732 seconds (86.53 M allocations: 3.908 GiB, 1.79% gc time, 7.12% compilation time)
T_max: 4.972885 seconds (29.16 M allocations: 2.705 GiB, 5.37% gc time, 21.19% compilation time)
T_min: 34.441156 seconds (396.33 M allocations: 15.930 GiB, 4.34% gc time, 2.56% compilation time)
Solar Radiation: 5.142782 seconds (27.59 M allocations: 2.788 GiB, 18.62% gc time, 1.33% compilation time)
Evapotranspiration: 3.864231 seconds (27.43 M allocations: 2.780 GiB, 4.91% gc time)
Total simulation 300 of 50 years: 77.710164 seconds (658.38 M allocations: 33.406 GiB, 5.96% gc time, 6.53% compilation time)
Results
Cast the results into NSIMU = 300
$\times$ D=4
DataFrame
s of NYEAR = 50
years each, with columns [:DATE, :RR, :RN, :TX, :QQ, :ETPP, :STAID]
where STAID
is the unique station identifier. Also cast everything into a Dict
for convenience.
dicts_simu = Dict(:RR => rs, :TN => tns, :TX => txs, :QQ => qqs, :ETPP => ets)
dfs_simus = [[DataFrame(:DATE => date_range, :RR => rs[j, :, i], :TN => tns[j, :, i], :TX => txs[j, :, i], :QQ => qqs[j, :, i], :ETPP => ets[j, :, i], :STAID => fill(data_stations[j].STAID[1], length(n2t))) for j in 1:D] for i in 1:NSIMU];
Plots
Settings for plotting
Some settings and packages to have nice plots.
using StatsPlots
using StatsPlots.PlotMeasures # To play with margin in Plots
gr() # plotly() # for interactive plots
default(fontfamily="Computer Modern")
cur_colors = get_color_palette(:auto, 100);
my_palette(K) = palette(vcat(cur_colors[1], [cur_colors[c] for c in 3:4], cur_colors[2]), K)
file_for_plot_utilities = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/master/examples/utilities_plot.jl")
include(file_for_plot_utilities)
vars = [:RR, :TN, :TX, :QQ, :ETPP]
5-element Vector{Symbol}:
:RR
:TN
:TX
:QQ
:ETPP
Plot correlations
In this section, we show different correlations between variables and stations.
Multisite Correlation for each variables
@time begin
plt_cor_mutlisite = [plot(-1:0.1:1, -1:0.1:1, lw=2, label=:none, aspect_ratio=:equal, foreground_color_legend=nothing, background_color_legend=nothing, legend_columns=2, tickfont=12, legendfontsize=12, xlabelfontsize=16, ylabelfontsize=16) for i in 1:length(vars)]
for (i, vari) in enumerate(vars)
annotate!(plt_cor_mutlisite[i], 0.5, 1.15, ("$(string(vari))", 16))
X = dicts_simu[vari]
for station_1 in 1:D-1
@views X₁ = X[station_1, :, :]
for station_2 in station_1+1:D
df1 = dropmissing(innerjoin(data_stations[station_1], data_stations[station_2], on=:DATE, makeunique=true))
@views X₂ = X[station_2, :, :]
@views mean_cor = mean(cor(X₁[:, r], X₂[:, r]) for r in 1:NSIMU)
arr = ([cor(df1[:, vari], df1[:, Symbol(string(vari, "_1"))])], [mean_cor])
scatter!(plt_cor_mutlisite[i], arr, label=ifelse(i == 2, "$(station_dep[station_1]) vs $(station_dep[station_2])", :none), markersize=6)
end
end
xlabel!("Observation")
i ∈ [1, 4] ? ylabel!("Simulation") : nothing
end
plt_cor_mutlisites = plot(plt_cor_mutlisite..., size=(1000, 1000), layout=(3, 3), top_margin=7px)
end
Correlations between weather variables at each sites.
markers = filter((m -> begin
m in Plots.supported_markers()
end), Plots._shape_keys)
begin
plt_var = [plot(-1:0.1:1, -1:0.1:1, lw=2, label=:none, aspect_ratio=:equal, foreground_color_legend=nothing, background_color_legend=nothing, legend_columns=1, tickfont=13, legendfontsize=14, xlabelfontsize=16, ylabelfontsize=16, titlefontsize=16) for j in 1:D]
for station_1 in 1:D
title!(plt_var[station_1], "$(station_ndep[station_1])")
for (i, var_1) in enumerate(vars)
for j in (i+1):length(vars)
var_2 = vars[j]
df1 = dropmissing(data_stations[station_1])
@views mean_cor = mean(cor(dicts_simu[var_1][station_1, :, r], dicts_simu[var_2][station_1, :, r]) for r in 1:NSIMU)
scatter!(plt_var[station_1], [cor(df1[:, var_1], df1[:, var_2])], [mean_cor], label=ifelse(station_1 == 1, "$var_1 vs $var_2", :none), m=markers[i], ms=8)
end
end
end
plt_vars = plot(plt_var..., size=(1000, 1000))
end
Univariate distributions
In this section we plot the marginal distribution of each variables at each sites.
Marginal Distributions
begin
plt_dist_univ = Dict()
plts_dist_univ = Dict()
xlabel_string = Dict(:RR => L"Rain ($\mathrm{m}\mathrm{m}/\mathrm{m}^2$)", :TX => L"$T_{\mathrm{max}}$ (°C)", :TN => L"$T_{\mathrm{min}}$ (°C)", :QQ => L"Solar irradiance (MJ/$\mathrm{m}^2$)", :ETPP => L"Evapotranspiration ($\mathrm{m}\mathrm{m}/\mathrm{m}^2$)")
for XX in vars
plt_dist_univ[XX] = [plot(tickfont=13, legendfontsize=14, xlabelfontsize=16, ylabelfontsize=16, titlefontsize=16) for j = 1:D]
for (j, df) in enumerate(data_stations)
dist_j = XX == :RR ? [filter(!iszero, x) for x in eachcol(dicts_simu[XX][j, :, :])] : [x for x in eachcol(dicts_simu[XX][j, :, :])] # RR>0
errorlinehist!(plt_dist_univ[XX][j], dist_j, groupcolor=:grey, legend=:topright, label=islabel(j, [1], L"Simu $q_{0,100}$"), norm=:pdf, errortype=:percentile, percentiles=[0, 100], fillalpha=0.4, centertype=:median)
errorlinehist!(plt_dist_univ[XX][j], dist_j, groupcolor=:red, label=islabel(j, [1], L"Simu $q_{25,75}$"), norm=:pdf, errortype=:percentile, percentiles=[25, 75], fillalpha=0.5, centertype=:median)
dist_j_histo = XX == :RR ? filter(!iszero, dropmissing(df)[!, XX]) : dropmissing(df)[!, XX] # RR>0
errorlinehist!(plt_dist_univ[XX][j], [dist_j_histo], label=islabel(j, [1], "Obs"), groupcolor=:blue, lw=1.5, norm=:pdf, errortype=:percentile)
XX == :RR ? ylims!(plt_dist_univ[XX][j], 1e-4, 0, yaxis=:log10) : nothing
XX == :RR && j == 2 ? xlims!(plt_dist_univ[XX][j], -1, 100, yaxis=:log10) : nothing # some simulated RR are super extreme and messing with the xaxis
end
[xlabel!(plt_dist_univ[XX][j], xlabel_string[XX]) for j in [3, 4]]
[ylabel!(plt_dist_univ[XX][j], "PDF") for j in [1, 3]]
station_ndep
[title!(plt_dist_univ[XX][j], station_ndep[j]) for j = 1:D]
plts_dist_univ[XX] = plot(plt_dist_univ[XX]..., size=(1000, 700), bottom_margin=11px, left_margin=15px)
end
end
Rainfall RR
plts_dist_univ[:RR]
Temperature max TX
plts_dist_univ[:TX]
Temperature min TN
plts_dist_univ[:TN]
Solar irradiance QQ
plts_dist_univ[:QQ]
Evapotranspiration ETPP
plts_dist_univ[:ETPP]
Monthly statistics
year_range = unique(year.(date_range))
idx_year = [findall(x -> year.(x) == m, date_range) for m in year_range]
idx_month = [findall(x -> month.(x) == m, date_range) for m in 1:12]
idx_all = [intersect(yea, mon) for yea in idx_year, mon in idx_month]
agg_fun(XX) = ifelse(XX == :RR, sum, mean)
month_rain_simu = Dict(key => [monthly_agg(xx[j, :, i], idx_all, agg_fun(key)) for j in 1:D, i in 1:NSIMU] for (key, xx) in dicts_simu)
month_rain_histo = Dict(key => [monthly_agg(@subset(df, :DATE .≥ Date(1986)), key, agg_fun(key)) for df in data_stations] for key in vars)
qs = [0.9, 0.5, 0.1]
@time "Plot monthly quantile" begin
plt_month = Dict()
plts_month = Dict()
ylabel_string = Dict(:RR => L"Cumulated Rain ($\mathrm{m}\mathrm{m}/\mathrm{m}^2$)", :TX => L"$T_{\mathrm{max}}$ (°C)", :TN => L"$T_{\mathrm{min}}$ (°C)", :QQ => L"Solar irradiance (MJ/$\mathrm{m}^2$)", :ETPP => L"Evapotranspiration ($\mathrm{m}\mathrm{m}/\mathrm{m}^2$)")
for XX in vars
plt_month[XX] = [plot(xtickfontsize=10, ytickfontsize=11, ylabelfontsize=12, legendfontsize=12, titlefontsize=12, foreground_color_legend=nothing) for j = 1:D]
for j = 1:D
for (α, per) in enumerate([[0, 100], [25, 75]])
for (cc, q) in enumerate(qs)
errorline!(plt_month[XX][j], [quantile(month_rain_simu[XX][j, i][:, m], q) for m in 1:12, i in 1:NSIMU], label=(α == 1 ? islabel(j, 4, L"Simu $q_{%$(Int(q*100))}$") : :none), fillalpha=0.18 * α^2, centertype=:median, errortype=:percentile, percentiles=per, groupcolor=my_palette(length(qs))[cc])
end
end
for q in qs
scatter!(plt_month[XX][j], m -> quantile(skipmissing(month_rain_histo[XX][j][:, m]), q), 1:12, label=(q == qs[1] ? islabel(j, 3, "Obs") : :none), legend=:topleft, ms=2.5, c=:blue)
plot!(plt_month[XX][j], m -> quantile(skipmissing(month_rain_histo[XX][j][:, m]), q), 1:12, label=:none, c=:blue, lw=1.75)
end
xticks!(plt_month[XX][j], 1:12, string.(first.(monthabbr.(1:12))))
XX == :RR ? ylims!(plt_month[XX][j], 0, Inf) : nothing
end
[ylabel!(plt_month[XX][j], ylabel_string[XX]) for j in [1, 3]]
[title!(plt_month[XX][j], station_ndep[j]) for j = 1:D]
plts_month[XX] = plot(plt_month[XX]..., size=(1000, 700), left_margin=19px)
end
end
Plot monthly quantile: 2.512541 seconds (5.63 M allocations: 576.993 MiB, 5.25% gc time, 62.15% compilation time: 2% of which was recompilation)
Rainfall RR
plot(plts_month[:RR])
Temperature max TX
plts_month[:TX]
Temperature min TN
plts_month[:TN]
Solar irradiance QQ
plts_month[:QQ]
Evapotranspiration ETPP
plts_month[:ETPP]
Stochastic Weather Generator + Crop model STICS
In this section, we demonstrate how stochastic weather simulations can serve as inputs for a crop model to study the climate sensitivity of a crop model. This is a proof of concept showing how SWG can be useful when combine with other models.
Specifically, we use the STICS crop model[STICS]. To download STICS, go to this page (in French...) and follow these instructions:
Register on the dedicated website, or log in if you already have an account.
You will receive an e-mail confirming the creation of your account.
Download files here after logging in
The download files will have
- STICS executable, here we will use the lower level one
stics_modulo
and not the Java version which is slower. In this tutorial, we used the versionSTICS-10.0.0
. - Parameter files of different crop.
Description
For this tutorial, we use the default STICS parameters for maize with the following modifications: no irrigation (to highlight the effect of hydric stress), pgrainmaxi = 0.35
, and nbgrmin = 0
, nbgrmax = 4500
(minimum and maximum number of fruits per m$^2$).
Typically, the final yield ranges between 0 and 15 t/ha and is highly dependent on rainfall and temperature.
Running STICS
In the file file_stics
, we implemented functions to streamline calls to the STICS executable (either .exe
or .sh
), which can be obtained from the STICS downloads. For each simulation, the script updates the STICS weather files based on the input data frames (dfs_simus
). It extracts the final yield (along with some other quantities not used here). If STICS encounters an error, the YIELD value is set to missing
.
Since repeated calls to STICS are time-intensive, results are saved for reuse. Below, we show the executed code:
using Suppressor # to handle the STICS hard-printed output
file_for_stics_utilities = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/master/examples/utilities_stics.jl")
include(file_for_stics_utilities)
period_range_func (generic function with 1 method)
stics_path = joinpath(STICS_PATH, "JavaSTICS-1.5.1-STICS-10.0.0", "bin", "stics_modulo.exe") # or stics_modulo for Linux
work_path = joinpath("stics_files", "maize") # folder where all maize files are
@time res_YIELDs = map(Iterators.product(1:300, 1:D)) do (i, j)
GC.gc() # empirically needed
@show i,j
run_stics_mod_yield(dfs_simus[i][j], infos=100, stics_path=stics_path, work_path=work_path).YIELD
end
# 15650.246490 seconds (1.03 G allocations: 108.568 GiB, 1.13% gc time, 0.01% compilation time: <1% of which was recompilation)
cd(@__DIR__)
JLD.save("results_yield.jld", Dict("res_YIELDs" => res_YIELDs))
file_yield = download("https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/refs/heads/master/assets/tuto_2/results_yield.jld")
res_YIELDs = JLD.load(file_yield)["res_YIELDs"]
300×4 Matrix{Vector{Union{Missing, Float64}}}:
[11.16, 8.26, 7.65, 8.46, 4.23, 8.57, 4.5, 4.06, 3.67, 9.84 … 7.37, 7.3, 7.74, missing, 7.82, 4.53, 4.07, 4.73, 10.39, 6.96] … [2.07, 4.16, 8.01, 4.59, 4.84, 3.47, 3.15, 6.38, 1.02, 7.4 … 3.26, 0.0, 4.74, 5.36, 1.63, 0.75, 0.0, 7.61, 0.67, 4.12]
[10.0, 10.61, 9.13, 7.55, 7.21, 8.12, 4.42, 4.52, 6.35, 5.84 … 6.5, 3.27, 5.14, 5.95, 5.09, 5.6, 9.51, 3.83, 4.32, 3.68] [1.91, 2.61, 3.99, 6.13, 4.34, 0.0, 5.08, 4.7, 0.0, missing … 0.27, 7.09, 3.33, 3.16, 2.62, 0.0, 5.2, 2.14, 2.87, 8.66]
[9.19, missing, 7.0, 12.8, 7.36, 8.98, 6.54, 5.28, 8.85, 2.94 … 4.15, 3.84, 5.02, 8.26, 13.36, 6.38, 8.65, missing, missing, 7.85] [9.03, 1.23, 2.39, 3.76, 7.06, 2.57, 1.63, 2.35, 1.71, 4.89 … 3.48, 4.64, 1.64, 0.98, 0.0, 2.36, 4.48, 1.13, 1.25, 2.42]
[12.94, missing, 10.04, 5.41, 7.01, missing, missing, 5.23, missing, missing … 5.44, 11.03, 5.82, 4.06, missing, 3.87, 5.12, 10.15, 10.32, missing] [3.99, 2.67, 3.6, 4.17, 10.11, 2.56, 0.56, 3.62, 4.63, 1.95 … 4.64, 2.59, 3.61, 0.82, 4.41, 2.78, 3.82, 0.71, 2.23, 3.02]
[5.82, 6.14, 10.49, 8.87, 3.89, 6.42, 5.62, 8.92, 9.23, 11.64 … 10.04, 4.76, 7.61, 8.6, 6.0, 4.0, 7.58, 4.34, 3.05, missing] [2.74, 1.77, 7.87, 2.36, 0.23, 5.88, 1.23, 1.25, 4.08, 0.33 … 0.0, 3.16, 0.68, 0.96, 2.71, 0.23, 7.71, 1.49, 2.87, 8.29]
[9.83, 8.82, 3.99, 5.2, 9.05, 5.32, 11.49, 8.52, 7.18, 5.83 … 8.45, 5.5, 4.92, 4.97, 6.88, 5.68, 9.92, 8.95, 6.04, 6.06] … [1.45, 5.49, 5.94, 4.06, 1.25, 5.82, 5.59, 2.6, missing, 6.76 … 0.88, 9.77, 2.94, 4.7, 2.03, 6.09, 2.25, 2.65, 7.66, 3.61]
[7.57, 3.52, 6.99, 9.29, 5.79, missing, 7.77, 9.74, 5.88, 2.82 … 4.95, 4.44, 5.71, 6.05, 8.24, 6.95, 3.77, missing, 2.82, 7.08] [0.87, 8.83, 8.74, 0.75, 6.14, 3.33, 3.03, 2.81, 2.62, 4.6 … 0.44, 5.15, 1.63, 2.4, 7.01, 1.37, 3.27, 0.0, missing, 3.62]
[10.55, 9.36, 5.47, 5.86, 6.98, 10.61, 3.7, missing, 6.88, missing … 6.6, 10.76, 7.3, 4.49, 4.85, 8.27, 10.35, 13.02, 4.09, 3.98] [4.28, 1.33, 2.76, 3.0, 0.01, 1.99, 2.48, 1.03, 0.3, 2.73 … missing, 1.2, 4.94, 1.26, 1.73, 0.81, 1.95, 0.0, 2.35, 2.58]
[3.38, 4.51, 6.66, 3.71, 6.67, 3.72, 12.34, 7.45, 10.54, 4.55 … 10.7, 6.65, 6.74, 7.01, 8.17, missing, 7.79, 10.3, 10.37, 7.62] [5.11, 1.92, 3.32, 1.17, 3.47, 2.32, 0.0, 1.95, 5.19, 1.87 … 1.34, 3.06, 1.92, 1.69, 0.34, 8.73, 5.64, 0.91, 1.56, 7.27]
[5.77, 4.16, 7.37, 6.92, 4.3, 4.71, 7.39, 5.89, 11.05, 5.36 … 3.21, 6.27, 8.81, missing, 11.21, 4.03, 3.66, 8.77, 8.95, 8.57] [1.87, 5.36, 3.27, 4.3, 0.11, 0.2, 5.82, 1.92, 1.85, 0.21 … 1.79, 1.11, 5.2, 2.91, 1.96, 1.15, 0.21, 0.0, 7.94, 0.0]
⋮ ⋱
[10.88, missing, 5.67, 8.91, 7.69, missing, 5.55, 10.71, 2.91, 3.7 … 4.87, 4.14, 13.35, 4.61, 3.76, 10.12, 11.94, 8.65, missing, missing] [0.37, 4.65, 5.95, 0.77, 0.78, 0.93, 0.69, 3.82, 4.72, 5.03 … 4.18, 1.06, 3.91, 2.98, 0.15, 2.95, 0.92, 8.35, 4.72, 4.7]
[6.77, 8.79, missing, 6.77, 10.78, 4.47, 4.31, 4.35, 7.46, 5.22 … 4.63, 5.79, 13.65, 8.27, 6.52, 10.38, 3.56, 8.47, 4.85, 9.33] [3.89, 0.67, 7.12, 0.37, 0.64, 4.8, 0.7, 2.72, 2.92, 5.51 … 4.53, 1.24, 3.29, 3.98, 1.76, 4.24, 7.21, 7.81, 0.43, 1.27]
[missing, 7.93, 3.85, 3.94, 6.43, missing, 8.22, 13.09, 5.45, 4.68 … 5.14, 5.67, missing, 4.61, 5.51, 7.7, 5.24, 6.96, 5.37, 5.39] [9.28, 4.34, 0.0, 7.3, 4.39, 3.23, 6.43, 4.81, 2.74, 3.09 … 0.69, 1.74, 1.34, 2.68, 0.0, 2.59, 0.14, 4.64, 1.28, 4.4]
[14.51, 5.57, 8.45, 7.21, 4.35, 9.6, 5.8, 3.56, 4.82, 7.01 … 5.0, 5.13, 6.39, 4.29, missing, 6.26, 5.86, 7.19, 8.91, 10.02] [0.88, 3.98, 2.0, 7.68, 2.43, 4.14, 2.32, 0.0, 4.4, 2.66 … 2.16, missing, 3.97, 5.79, 0.81, 2.21, 4.56, 0.0, 2.83, 3.16]
[5.15, 5.95, 7.82, 4.42, 8.02, 5.77, 6.83, missing, 3.45, 4.78 … 5.07, missing, 8.54, 11.29, 3.69, 5.24, missing, 5.15, missing, 12.09] … [0.96, 4.03, 1.24, 4.2, 7.05, 5.93, 0.99, 0.86, 2.69, 1.87 … 7.73, 2.15, 3.49, 3.5, 4.31, 2.25, 1.27, 3.01, 1.86, 7.71]
[9.23, 5.48, 4.8, 7.06, 4.63, 6.61, missing, 8.0, 4.25, 4.06 … 8.85, 10.74, 5.18, 7.39, 8.08, 9.76, 5.73, 6.23, 6.6, 7.62] [1.51, 2.93, 4.08, 0.12, 3.94, 3.06, 4.4, 3.23, 5.31, 4.54 … 4.79, 3.18, 8.15, 0.0, 0.07, 3.71, missing, 2.2, 3.83, 2.49]
[missing, 10.37, missing, 8.96, 5.52, 3.46, 9.99, 7.3, 8.91, 9.38 … 5.29, 10.33, 10.6, 7.09, 10.97, 6.39, 12.25, 7.21, 10.56, 2.82] [1.92, 5.64, 1.61, missing, missing, 0.18, 5.17, 4.32, 6.83, 0.41 … 4.18, 5.76, 6.24, 1.98, 3.22, 4.69, 2.23, 0.0, 6.12, 5.3]
[4.63, 6.45, 4.49, 3.76, 9.78, missing, 4.39, missing, 4.66, 12.29 … 6.52, missing, 5.81, 11.69, 6.43, 3.74, 4.92, 8.6, 8.38, 3.51] [0.09, 4.81, 6.58, 4.98, 6.4, 0.36, 2.11, 0.27, 6.28, 1.86 … 1.94, 2.01, 0.99, 2.67, 0.32, 5.7, 1.68, 2.66, 0.0, 3.2]
[5.98, 4.25, 11.98, 7.2, 10.38, 10.63, 6.47, 6.49, 5.77, 11.37 … 7.31, 6.92, 8.18, 7.07, 4.72, 3.78, 3.71, missing, missing, 5.22] [missing, 5.98, 5.76, 5.87, 3.76, 2.83, 3.95, 1.37, 1.83, 2.31 … 0.8, 2.89, 6.01, missing, 0.83, 2.09, 2.88, 1.9, 5.92, 0.0]
In the GenHack 3 Hackathon, the participants were given one file per station with $10^5$ year of annual yield. The validation set was composed of $10^6$ years. Here, we generate 300 realizations of 50 year crop yield distributions (to access uncertainty).
Yield distributions with uncertainty
The following plot shows the typical distribution of maize yield at four location over a 50-year span. We added the envelope of these distributions to highlight uncertainty. Each location has a very different weather (see previous sections).
begin
plt_dist_yield = [plot(tickfont=13, legendfontsize=14, xlabelfontsize=16, ylabelfontsize=16, titlefontsize=16, legend=:topright) for j = 1:D]
for j in 1:D
dist_j = Vector{Vector{Float64}}(filter.(!ismissing, res_YIELDs[:, j]))
errorlinehist!(plt_dist_yield[j], dist_j, groupcolor=:gray, label=islabel(j, [1], L"Simu $q_{0,100}$"), norm=:pdf, errortype=:percentile, percentiles=[0, 100], fillalpha=0.5, centertype=:median)
errorlinehist!(plt_dist_yield[j], dist_j, groupcolor=:red, label=islabel(j, [1], L"Simu $q_{25,75}$"), norm=:pdf, errortype=:percentile, percentiles=[25, 75], fillalpha=0.5, centertype=:median)
xlims!(plt_dist_yield[j], 0, 16)
end
[xlabel!(plt_dist_yield[j], "YIELD (t/ha)") for j in [3, 4]]
[ylabel!(plt_dist_yield[j], "PDF") for j in [1, 3]]
station_ndep
[title!(plt_dist_yield[j], station_ndep[j]) for j = 1:D]
plts_dist_yield = plot(plt_dist_yield..., size=(1000, 700), bottom_margin=11px, left_margin=15px)
end
Sensitivity of maize on rainfall during key growth periods
To identify which rainfall period between April and October has the greatest impact on maize yield, we divide the growing season into four periods. For each period, we calculate the mean rainfall, both conditionally (blue) and unconditionally (orange), on final yields exceeding the median. Each distribution is displayed along with its interquartile range.
For the selected station, the most critical period is from June 12 to July 27, as evidenced by the significant difference between the two distributions. Specifically, the mean rainfall, when conditioned on a high yield, is approximately 20% higher during this period. In contrast, for the other periods, the mean rainfall remains nearly identical, indicating a much lower sensitivity.
week_date = (date_begin=Date(1996, 4, 27), date_end=Date(1996, 10, 27), N_period=4)
groups = chunky(Dates.value(week_date.date_end - week_date.date_begin) + 1, week_date.N_period)
dfs_stat = [[stats_fortnightly(dfsim, week_date, years) for dfsim in dfs_simus[i]] for i in axes(res_YIELDs, 1)]
for i in eachindex(dfs_stat)
for (j, df) in enumerate(dfs_stat[i])
@transform!(df, :YIELD = res_YIELDs[i, j])
end
end
dfs_stat[1][1][1:10,:] # show how it looks like at one station
Row | YEAR | MEAN_TX_1 | MEAN_TX_2 | MEAN_TX_3 | MEAN_TX_4 | MEAN_RR_1 | MEAN_RR_2 | MEAN_RR_3 | MEAN_RR_4 | YIELD |
---|---|---|---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64? | |
1 | 1986 | 20.457 | 24.7809 | 23.554 | 21.1293 | 1.37129 | 1.91468 | 2.42863 | 1.87161 | 11.16 |
2 | 1987 | 19.5929 | 24.3262 | 27.5608 | 23.3728 | 1.72801 | 1.24399 | 1.27608 | 0.835774 | 8.26 |
3 | 1988 | 24.3713 | 24.7251 | 26.736 | 19.5601 | 0.32479 | 2.6228 | 0.492459 | 1.97264 | 7.65 |
4 | 1989 | 21.7094 | 24.5979 | 26.1679 | 21.8371 | 0.817122 | 1.96792 | 1.59204 | 2.36471 | 8.46 |
5 | 1990 | 21.8382 | 26.9997 | 26.8214 | 25.5684 | 2.07868 | 0.354247 | 0.633167 | 1.81674 | 4.23 |
6 | 1991 | 20.8871 | 25.6638 | 25.2258 | 20.0659 | 0.781276 | 2.32205 | 1.69724 | 3.57942 | 8.57 |
7 | 1992 | 21.2009 | 24.4955 | 24.5737 | 23.0732 | 0.883475 | 0.822688 | 2.11817 | 0.423854 | 4.5 |
8 | 1993 | 22.4568 | 25.809 | 28.7643 | 20.9434 | 2.20868 | 0.50843 | 2.23784 | 1.5855 | 4.06 |
9 | 1994 | 22.6829 | 25.8347 | 24.15 | 17.9013 | 0.307037 | 0.482274 | 2.48349 | 1.08125 | 3.67 |
10 | 1995 | 22.4755 | 26.711 | 25.4279 | 20.6923 | 1.33009 | 2.21916 | 1.4287 | 0.830187 | 9.84 |
In the GenHack 3 Hackathon, the number of period was 9, so that there were $9+9=18$ weather variables. Instead of being named MEAN_TX_j
and MEAN_RR_j
with j
going from 1 to 9, there were named W_i
with i
going from 1 to 18.
begin
j = 1 # station number
dfi = [dropmissing(dfs_stat[i][j]) for i in eachindex(dfs_stat)]
df_sub = [@subset(df, :YIELD .> quantile(:YIELD, 0.5)) for df in dfi]
plt_scat = [plot(legend=:topright, tickfont=14, legendfontsize=14, xlabelfontsize=14, ylabelfontsize=14, titlefontsize=14) for i in 1:week_date.N_period]
for i in 1:week_date.N_period
errorlinehist!(plt_scat[i], [dfi[ii][:, Symbol("MEAN_RR_$i")] for ii in eachindex(dfi)], groupcolor=2, label=islabel(i, 1, L"R"), norm=:pdf, errortype=:percentile, percentiles=[25, 75], fillalpha=0.1, lw = 2, centertype=:median)
errorlinehist!(plt_scat[i], [df_sub[ii][:, Symbol("MEAN_RR_$i")] for ii in eachindex(dfi)], groupcolor=1, label=islabel(i, 1, L"R \mid (\mathrm{Yield} > \mathrm{median})"), norm=:pdf, errortype=:percentile, percentiles=[25, 75], fillalpha=0.1, lw = 2, centertype=:median)
# stephist!(plt_scat[i], dfi[:, Symbol("MEAN_RR_$i")], label=islabel(i, 2, L"R"), norm=:pdf, lw=1.75)
# stephist!(plt_scat[i], df_sub[:, Symbol("MEAN_RR_$i")], label=islabel(i, 2, L"R \mid (\mathrm{Yield} > \mathrm{median})"), norm=:pdf, lw=1.75)
vline!([mean.([dfi[ii][:, Symbol("MEAN_RR_$i")] for ii in eachindex(dfi)]) |> mean], label=:none, s=:dot, c=2, lw=2)
vline!([mean.([df_sub[ii][:, Symbol("MEAN_RR_$i")] for ii in eachindex(dfi)]) |> mean], label=:none, s=:dot, c=1, lw=2)
period_range = period_range_func(groups[i])
title!("$(day(period_range[1])) $(monthabbr(period_range[1])) - $(day(period_range[2])) $(monthabbr(period_range[2]))")
xlabel!("Rain (mm/period)")
ylabel!("PDF")
end
annotate!(plt_scat[3], 5.5, 1.1, station_ndep[j])
plt_sensitivity = plot(plt_scat..., size=(1000, 700), left_margin=3Plots.PlotMeasures.mm)
end
Reproducibility
import InteractiveUtils
InteractiveUtils.versioninfo()
Julia Version 1.11.2
Commit 5e9a32e7af2 (2024-12-01 20:02 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 4 × AMD EPYC 7763 64-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Package list and version
import Pkg;
Pkg.status();
Status `~/work/StochasticWeatherGenerators.jl/StochasticWeatherGenerators.jl/docs/Project.toml`
[336ed68f] CSV v0.10.15
⌃ [13f3f980] CairoMakie v0.12.18
[a93c6f00] DataFrames v1.7.0
[1313f7d8] DataFramesMeta v0.15.4
[8bb1440f] DelimitedFiles v1.9.1
[31c24e10] Distributions v0.25.116
[e30172f5] Documenter v1.8.0
[db073c08] GeoMakie v0.7.10
[4138dd39] JLD v0.13.5
[b964fa9f] LaTeXStrings v1.4.0
[98b081ad] Literate v2.20.1
[436b0209] NaturalEarth v0.1.0
[3d8e1505] SmoothPeriodicStatsModels v2.0.1-DEV
[2913bbd2] StatsBase v0.34.4
[f3b207a7] StatsPlots v0.15.7
[3eadb0b9] StochasticWeatherGenerators v1.2.1-DEV `~/work/StochasticWeatherGenerators.jl/StochasticWeatherGenerators.jl`
[fd094767] Suppressor v0.2.8
[ade2ca70] Dates v1.11.0
[b77e0a4c] InteractiveUtils v1.11.0
[9a3f8284] Random v1.11.0
Info Packages marked with ⌃ have new versions available and may be upgradable.
This page was generated using Literate.jl.
- climatikDelannoy, David; Maury, Olivier; Décome, Jérémie, 2022, “CLIMATIK : système d’information pour les données du réseau agroclimatique INRAE”, https://doi.org/10.57745/AJNXEN, Recherche Data Gouv, V1
- STICSBrisson et al. (2003). An overview of the crop model STICS. European Journal of agronomy, 18(3-4), 309-332.