cd(@__DIR__)
import Pkg
Pkg.activate(".");
using PrettyTables Activating project at `C:\Users\metivier\Dropbox\PC (2)\Documents\GitHub\Pluto_export\SWG_coding_challenges`
The notebook is the companion to the talk Coding Stochastic Weather Generators: Challenges and Perspectives given at the SWGEN 2025 - Conference on Stochastic Weather Generators.
It showcases two challenges encountered when developing Stochastic Weather Generators (SWGs):
cd(@__DIR__)
import Pkg
Pkg.activate(".");
using PrettyTables Activating project at `C:\Users\metivier\Dropbox\PC (2)\Documents\GitHub\Pluto_export\SWG_coding_challenges`
# import PythonCall #if you want to test Python version
import RCallAll implementations have been generated by a LLM. All versions could be further optimized, but the goal here is to illustrate typical performance differences “out of the box”. We will illustrate with a simple Hidden Markov Model (HMM) with 4 states and 12-dimensional multivariate normal emissions.
Using packages such as RCall.jl and PythonCall.jl, Julia can seamlessly call R and Python code from within Julia scripts or notebooks. C code can be compiled into shared libraries and called using Julia’s ccall interface.
In the Julia version, we use Distributions.jl for an arbitrary distribution interface. Thanks to Julia’s multiple dispatch, switching to other emission distributions (e.g., Poisson, Gamma) is straightforward.
using Distributions, LinearAlgebra
function rand_HMM(Q, dist, N)
Z = zeros(Int, N)
d = length(dist[1]) # dimension of observations
Y = zeros(d, N)
Z[1] = 1
Y[:, 1] = rand(dist[Z[1]])
for t in 2:N
Z[t] = rand(Categorical(Q[Z[t-1], :]))
Y[:, t] = rand(dist[Z[t]])
end
return Z, Y
endrand_HMM (generic function with 1 method)
In the R version, we use the mvtnorm package for multivariate normal sampling. Compared to the Julia version above, the MvNormal distribution has to be hard-coded.
library(mvtnorm)
rand_HMM <- function(Q, means_list, Sigma_list, N) {
n_states <- nrow(Q)
d <- length(means_list[[1]]) # dimension
Z <- integer(N)
Y <- matrix(0, nrow = d, ncol = N)
Z[1] <- 1
Y[, 1] <- rmvnorm(1, mean = means_list[[Z[1]]], sigma = Sigma_list[[Z[1]]])
for (t in 2:N) {
Z[t] <- sample(1:n_states, size = 1, prob = Q[Z[t-1], ])
Y[, t] <- rmvnorm(1, mean = means_list[[Z[t]]], sigma = Sigma_list[[Z[t]]])
}
return(list(Z = Z, Y = Y))
};┌ Warning: RCall.jl: Warning: package 'mvtnorm' was built under R version 4.4.2 └ @ RCall C:\Users\metivier\.julia\packages\RCall\HIJsY\src\io.jl:171
RCall.RFunction{RCall.RObject{RCall.ClosSxp}}(RCall.RObject{RCall.ClosSxp}
function (Q, means_list, Sigma_list, N)
{
n_states <- nrow(Q)
d <- length(means_list[[1]])
Z <- integer(N)
Y <- matrix(0, nrow = d, ncol = N)
Z[1] <- 1
Y[, 1] <- rmvnorm(1, mean = means_list[[Z[1]]], sigma = Sigma_list[[Z[1]]])
for (t in 2:N) {
Z[t] <- sample(1:n_states, size = 1, prob = Q[Z[t - 1],
])
Y[, t] <- rmvnorm(1, mean = means_list[[Z[t]]], sigma = Sigma_list[[Z[t]]])
}
return(list(Z = Z, Y = Y))
}
)
The Python version also has to hard-code the MvNormal distribution. It was not tested because Python is not installed on my computer.
import numpy as np
def rand_HMM(Q, means_list, Sigma_list, N):
n_states = len(Q)
d = len(means_list[0]) # dimension
Z = np.zeros(N, dtype=int)
Y = np.zeros((d, N))
Z[0] = 0
Y[:, 0] = np.random.multivariate_normal(means_list[Z[0]], Sigma_list[Z[0]])
for t in range(1, N):
Z[t] = np.random.choice(n_states, p=Q[Z[t-1], :])
Y[:, t] = np.random.multivariate_normal(means_list[Z[t]], Sigma_list[Z[t]])
return Z, YHere is the C code generated by the LLM on the first try. It performs manual Cholesky decomposition and Box-Muller sampling of standard normals.
This C code could be further optimized using optimizations flags e.g. -O3 -ffast-math -msse3 during compilation, but the goal here is to illustrate typical performance of C code written by a non-expert. This type of optimization is also possible in Julia with macros such as @inbounds, @fastmath and @simd. Note that this type of optimization trades off safety for speed and should be used with caution.
using Libdl # Dynamic Linker to find path for CThe C code is folded below.
C_code_v1 = """
#include <stddef.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
// Cholesky decomposition (simple version, assumes positive definite)
void cholesky(double *A, double *L, size_t d) {
memset(L, 0, d * d * sizeof(double));
for (size_t i = 0; i < d; i++) {
for (size_t j = 0; j <= i; j++) {
double sum = 0.0;
for (size_t k = 0; k < j; k++) {
sum += L[i * d + k] * L[j * d + k];
}
if (i == j) {
L[i * d + j] = sqrt(A[i * d + i] - sum);
} else {
L[i * d + j] = (A[i * d + j] - sum) / L[j * d + j];
}
}
}
}
void rand_HMM_c_v1(size_t N, size_t d, double *Q, size_t n_states,
double *means, double *Sigmas,
int *Z, double *Y, unsigned int seed) {
srand(seed);
// Precompute Cholesky decompositions
double *L_all = (double *)malloc(n_states * d * d * sizeof(double));
for (size_t s = 0; s < n_states; s++) {
cholesky(&Sigmas[s * d * d], &L_all[s * d * d], d);
}
Z[0] = 0;
// Sample first observation
for (size_t j = 0; j < d; j++) {
double z = 0.0;
for (size_t k = 0; k < d; k++) {
double u1 = (double)rand() / RAND_MAX;
double u2 = (double)rand() / RAND_MAX;
double std_normal = sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
z += L_all[Z[0] * d * d + j * d + k] * std_normal;
}
Y[j * N + 0] = means[Z[0] * d + j] + z;
}
for (size_t t = 1; t < N; t++) {
// Sample next state
double r = (double)rand() / RAND_MAX;
double cum_prob = 0.0;
for (size_t s = 0; s < n_states; s++) {
cum_prob += Q[Z[t-1] * n_states + s];
if (r <= cum_prob) {
Z[t] = s;
break;
}
}
// Sample multivariate normal observation
for (size_t j = 0; j < d; j++) {
double z = 0.0;
for (size_t k = 0; k < d; k++) {
double u1 = (double)rand() / RAND_MAX;
double u2 = (double)rand() / RAND_MAX;
double std_normal = sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
z += L_all[Z[t] * d * d + j * d + k] * std_normal;
}
Y[j * N + t] = means[Z[t] * d + j] + z;
}
}
free(L_all);
}
""""#include <stddef.h>\n#include <stdlib.h>\n#include <math.h>\n#include <string.h>\n\n// Cholesky decomposition (simple version, assumes positive definite)\nvoid cholesky(double *A, double *L, size_t d) {\n memset(L, 0, d * d * sizeof(double));\n for (size_t i = 0; i < d; i++) {\n " ⋯ 1872 bytes ⋯ "double)rand() / RAND_MAX;\n double std_normal = sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);\n z += L_all[Z[t] * d * d + j * d + k] * std_normal;\n }\n Y[j * N + t] = means[Z[t] * d + j] + z;\n }\n }\n \n free(L_all);\n}\n"
The Julia wrapper to compile and call the C function:
const Clib_path_v1 = tempname()
open(`gcc -fPIC -xc -shared -o $(Clib_path_v1 * "." * Libdl.dlext) -`, "w") do f
print(f, C_code_v1)
end
function c_rand_HMM_v1(Q::Matrix{Float64}, means::Vector{Vector{Float64}},
Sigmas::Vector{Matrix{Float64}}, N::Int)
n_states = size(Q, 1)
d = length(means[1])
Z = zeros(Int32, N)
Y = zeros(Float64, d, N)
# Flatten means and Sigmas for C
means_flat = reduce(vcat, means)
Sigmas_flat = reduce(vcat, [vec(Sigmas[s]') for s in 1:n_states])
seed = rand(UInt32)
ccall(("rand_HMM_c_v1", Clib_path_v1), Cvoid,
(Csize_t, Csize_t, Ptr{Float64}, Csize_t, Ptr{Float64}, Ptr{Float64},
Ptr{Int32}, Ptr{Float64}, UInt32),
N, d, Q, n_states, means_flat, Sigmas_flat, Z, Y, seed)
return Z, Y
endc_rand_HMM_v1 (generic function with 1 method)
Here is another version assuming that the Cholesky factors of the covariance matrices are precomputed in Julia and passed to C, along with standard normal random variables. So the C code only handles a loop, state sampling and matrix-vector products. After some try, I could not get a pure C version with native C libraries for Cholesky and normal sampling to work on my computer.
The C code is folded below.
C_code_v2 = """
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
// rand_HMM_c: assumes Julia provides L_all (Cholesky factors) and E (standard normals)
// Q: n_states x n_states row-stochastic transition matrix
// means: n_states x d means (flattened row-major per state)
// L_all: n_states x d x d lower-triangular Cholesky factors (flattened)
// E: d x N matrix of standard normals (flattened column-major as passed from Julia)
void rand_HMM_c_v2(size_t N, size_t d, double *Q, size_t n_states,
double *means, double *L_all, double *E,
int *Z, double *Y, unsigned int seed) {
// Simple LCG for reproducible state sampling (optional: srand)
srand(seed);
// Initial state
Z[0] = 0;
// First observation: y = μ + L * e
for (size_t j = 0; j < d; j++) {
double z = 0.0;
for (size_t k = 0; k < d; k++) {
double e = E[k * N + 0];
z += L_all[Z[0] * d * d + j * d + k] * e;
}
Y[j * N + 0] = means[Z[0] * d + j] + z;
}
for (size_t t = 1; t < N; t++) {
// Sample next state from Q[Z[t-1], :]
double r = (double)rand() / RAND_MAX;
double cum_prob = 0.0;
for (size_t s = 0; s < n_states; s++) {
cum_prob += Q[Z[t-1] * n_states + s];
if (r <= cum_prob) {
Z[t] = s;
break;
}
}
// Observation: y = μ + L * e
for (size_t j = 0; j < d; j++) {
double z = 0.0;
for (size_t k = 0; k < d; k++) {
double e = E[k * N + t];
z += L_all[Z[t] * d * d + j * d + k] * e;
}
Y[j * N + t] = means[Z[t] * d + j] + z;
}
}
}
""""#include <stddef.h>\n#include <stdlib.h>\n#include <string.h>\n\n// rand_HMM_c: assumes Julia provides L_all (Cholesky factors) and E (standard normals)\n// Q: n_states x n_states row-stochastic transition matrix\n// means: n_states x d means (flattened row-major per state)\n// L_all" ⋯ 1173 bytes ⋯ "_t j = 0; j < d; j++) {\n double z = 0.0;\n for (size_t k = 0; k < d; k++) {\n double e = E[k * N + t];\n z += L_all[Z[t] * d * d + j * d + k] * e;\n }\n Y[j * N + t] = means[Z[t] * d + j] + z;\n }\n }\n}\n"
const Clib_path_v2 = tempname()
open(`gcc -fPIC -xc -shared -o $(Clib_path_v2 * "." * Libdl.dlext) -`, "w") do f
print(f, C_code_v2)
end
function c_rand_HMM_v2(Q::Matrix{Float64}, means::Vector{Vector{Float64}},
Sigmas::Vector{Matrix{Float64}}, N::Int)
n_states = size(Q, 1)
d = length(means[1])
Z = zeros(Int32, N)
Y = zeros(Float64, d, N)
# Flatten means for C
means_flat = reduce(vcat, means)
# Precompute Cholesky factors in Julia and flatten for C
L_all = reduce(vcat, [vec(cholesky(Sigmas[s]).L') for s in 1:n_states])
# Generate standard normals in Julia (d x N) and flatten column-major
E = randn(d, N)
seed = rand(UInt32)
ccall(("rand_HMM_c_v2", Clib_path_v2), Cvoid,
(Csize_t, Csize_t, Ptr{Float64}, Csize_t, Ptr{Float64}, Ptr{Float64}, Ptr{Float64},
Ptr{Int32}, Ptr{Float64}, UInt32),
N, d, Q, n_states, means_flat, L_all, E, Z, Y, seed)
return Z, Y
endc_rand_HMM_v2 (generic function with 1 method)
Define a 4-state HMM with 12-dimensional multivariate normal emissions:
using BenchmarkTools
using Random
Random.seed!(123)
# Transition matrix
Q = [0.7 0.15 0.10 0.05;
0.2 0.6 0.15 0.05;
0.1 0.2 0.6 0.10;
0.05 0.1 0.15 0.7]
# Generate parameters for 4 states, 12 dimensions
d = 12
n_states = 4
means_vec = [randn(d) .+ i for i in 1:n_states]
Sigmas_vec = [Matrix(Diagonal(rand(d) .+ 0.5)) for _ in 1:n_states]
# Julia version with Distributions.jl
dist = [MvNormal(means_vec[i], Sigmas_vec[i]) for i in 1:n_states]
N = 10_000*10 # 10k*10 time steps ≃ 27*10 years of daily data100000
time_julia = @belapsed rand_HMM(Q, dist, N);RCall.@rput Q
RCall.@rput N
RCall.@rput d
RCall.@rput n_states
RCall.@rput means_vec
RCall.@rput Sigmas_vec;time_R = @belapsed RCall.R"rand_HMM(Q, means_vec, Sigmas_vec, N)";time_C_v1 = @belapsed c_rand_HMM_v1(Q, means_vec, Sigmas_vec, N);time_C_v2 = @belapsed c_rand_HMM_v2(Q, means_vec, Sigmas_vec, N);ratio_C_v1 = time_C_v1 / time_julia
ratio_C_v2 = time_C_v2 / time_julia
ratio_R = time_R / time_julia
results_table = [
"Julia (baseline)" time_julia 1.0;
"C v1" time_C_v1 ratio_C_v1;
"C v2" time_C_v2 ratio_C_v2;
"R" time_R ratio_R
]
pretty_table(results_table,
column_labels=["Language", "Time (s)", "Relative Speed"], backend=:html)| Language | Time (s) | Relative Speed |
|---|---|---|
| Julia (baseline) | 0.128532 | 1.0 |
| C v1 | 6.46593 | 50.306 |
| C v2 | 0.192528 | 1.4979 |
| R | 89.3993 | 695.542 |
mclapply gaining only limited speedup (in practice speedup due to distributed computing increases slower than the number of cores) or rewrite bottlenecks in C/C++ via Rcpp, but this adds complexity and maintenance burden.2-v1. C is not always faster: Surprisingly, the first C implementation (v1) produced by the LLM is ~50.0× slower than Julia! This demonstrates that:
2-v2. As a sanity check the v2 C version, that uses precomputed Cholesky factors and standard normals from dedicated (Julia) packages, is much faster. It has the same speed as the Julia version (the small difference might be due to the interface between Julia and C).
Maximum Likelihood Estimation is the most common method to fit SWGs to data. Typical workflows involve using a package or directly using the optim R function (that is used under the hood by many R packages). The default method is Nelder-Mead, a gradient-free optimization method. The second choice is the BFGS quasi-Newton method where the gradients are approximated with finite differences. Both approaches can be very slow and/or inaccurate for complex models with many parameters. See the optim documentation for more details.
R as well as native Python/Matlab lack automatic differentiation (AD) capabilities.
In Julia, the support of AD is automatic: almost any Julia code can be differentiated with Automatic Differentiation. Of course, Julia optimization packages also support finite differences and derivative-free methods.
We consider \(N\) independent samples \(Y^{(1)}, \ldots, Y^{(N)}\) from a spatial Gaussian Random Field (GRF) at \(d\) locations, where each sample follows:
\[ Y^{(i)} \sim \mathcal{N}(0, \sigma^2 R(\nu, \rho)) \]
The correlation matrix \(R\) has a Matérn structure:
\[ R_{jk} = \rho_{\text{Matérn}}(h_{jk}; \nu, \rho) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{h_{jk}}{\rho}\right)^\nu K_\nu\left(\frac{h_{jk}}{\rho}\right) \]
where:
For i.i.d. samples, the log-likelihood is:
\[ \ell(\nu, \rho) = -\frac{N}{2}\left[\log|R(\nu, \rho)| + d\log(2\pi)\right] - \frac{1}{2}\sum_{i=1}^{N} (Y^{(i)})^\top R(\nu, \rho)^{-1} Y^{(i)} \]
We estimate \((\nu, \rho, \sigma)\) by minimizing \(\ell\) using gradient-based optimization with automatic differentiation.
Generate synthetic data from a Matérn GRF with known parameters.
using GaussianRandomFields
using BesselK: _gamma, adbesselkxv
using Distances
using LinearAlgebra
using Random
using Optimization, OptimizationOptimJL
import FiniteDiff, ForwardDiffRandom.seed!(1234)
# Create a d×d grid of 2D spatial locations
d = 15 # number of spatial locations per dimension (total: d²)
x_pts = range(0, stop=10, length=d)
y_pts = range(0, stop=10, length=d)
# Build location matrix (d² × 2)
locs = [(x, y) for x in x_pts, y in y_pts]
locs = hcat([[loc[1], loc[2]] for loc in locs]...)'
# Compute pairwise distances
H = pairwise(Euclidean(), locs, dims=1)225×225 Matrix{Float64}:
0.0 0.714286 1.42857 … 13.1708 13.6464 14.1421
0.714286 0.0 0.714286 12.7175 13.1708 13.6464
1.42857 0.714286 0.0 12.289 12.7175 13.1708
2.14286 1.42857 0.714286 11.8881 12.289 12.7175
2.85714 2.14286 1.42857 11.5175 11.8881 12.289
3.57143 2.85714 2.14286 … 11.1803 11.5175 11.8881
4.28571 3.57143 2.85714 10.8797 11.1803 11.5175
5.0 4.28571 3.57143 10.6186 10.8797 11.1803
5.71429 5.0 4.28571 10.4002 10.6186 10.8797
6.42857 5.71429 5.0 10.227 10.4002 10.6186
⋮ ⋱
10.8797 10.6186 10.4002 4.28571 5.0 5.71429
11.1803 10.8797 10.6186 3.57143 4.28571 5.0
11.5175 11.1803 10.8797 2.85714 3.57143 4.28571
11.8881 11.5175 11.1803 2.14286 2.85714 3.57143
12.289 11.8881 11.5175 … 1.42857 2.14286 2.85714
12.7175 12.289 11.8881 0.714286 1.42857 2.14286
13.1708 12.7175 12.289 0.0 0.714286 1.42857
13.6464 13.1708 12.7175 0.714286 0.0 0.714286
14.1421 13.6464 13.1708 1.42857 0.714286 0.0
Define the Matérn correlation function with the adbesselkxv and _gamma functions from the BesselK.jl package as they are AD-compatible:
function cov_matérn(h, ν)
iszero(h) && return one(h) # Return type-stable 1
return 2^(1 - ν) / _gamma(ν) * adbesselkxv(ν, h)
endcov_matérn (generic function with 1 method)
Generate \(N\) i.i.d. samples from a Matérn GRF
θ_true = [2.5, 1.2] # (ρ, ν)2-element Vector{Float64}:
2.5
1.2
cov = CovarianceFunction(2, Matern(θ_true[1], θ_true[2]))
grf = GaussianRandomField(cov, GaussianRandomFields.Cholesky(), x_pts, y_pts)Gaussian random field with 2d Matérn covariance function (λ=2.5, ν=1.2, σ=1.0, p=2.0) on a 15×15 structured grid, using a Cholesky decomposition
N = 700
Y = [sample(grf) for _ in 1:N]
println("Generated $N samples on a $(d)×$(d) grid ($(size(H,1)) locations)")Generated 700 samples on a 15×15 grid (225 locations)
Implement the NLL using:
function nll(u, p)
ρ, ν = u[1], u[2]
Y, H = p[1], p[2]
N = length(Y)
d = size(H, 1)
# Build correlation matrix R(ν, ρ) with broadcasting
R = cov_matérn.(H / ρ, ν)
# Cholesky decomposition (handles Σ⁻¹ and log|Σ| efficiently)
Σ = cholesky!(Symmetric(R))
# Negative log-likelihood:
# ℓ = (N/2)[log|R| + d·log(2π)] + (1/2)∑ᵢ yᵢᵀ R⁻¹ yᵢ
ℓ = N / 2 * (logdet(Σ) + d * log(2π)) +
sum(dot(vec(y), Σ \ vec(y)) for y in Y) / 2
return ℓ
endnll (generic function with 1 method)
Compare optimization methods: Automatic Differentiation vs Finite Differences vs Derivative-Free
# Initial guess (true values: ρ=2.5, ν=1.2)
θ₀ = [5.2, 0.5]
# 1. Automatic Differentiation (AD) - exact gradients
optf = OptimizationFunction(nll, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, θ₀, [Y, H], lb=[0.0, 0.0], ub=[Inf, Inf])
# 2. Finite Differences (FD) - approximate gradients
optfFD = OptimizationFunction(nll, Optimization.AutoFiniteDiff())
probFD = OptimizationProblem(optfFD, θ₀, [Y, H], lb=[0.0, 0.0], ub=[Inf, Inf])OptimizationProblem. In-place: true u0: 2-element Vector{Float64}: 5.2 0.5
Apparently, Nelder-Mead uses one evaluation of gradient for some simplex stuff see here and here. So we use the Finite Differences version for consistency with the R version.
To limit computation time, we limit the maximum number of iterations as Finite Differences is super slow. You can increase it to reach better convergence.
maxiters = 400400
(solAD, timeAD) = @timed solve(prob, IPNewton(), maxiters=maxiters)
solAD.stats┌ Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided. │ So a `SecondOrder` with ADTypes.AutoForwardDiff() for both inner and outer will be created, this can be suboptimal and not work in some cases so │ an explicit `SecondOrder` ADtype is recommended. └ @ OptimizationBase C:\Users\metivier\.julia\packages\OptimizationBase\R2xIG\src\cache.jl:51
SciMLBase.OptimizationStats
Number of iterations: 29
Time in seconds: 192.830000
Number of function evaluations: 74
Number of gradient evaluations: 74
Number of hessian evaluations: 29
(solFD, timeFD) = @timed solve(probFD, IPNewton(), maxiters=maxiters)
solFD.stats┌ Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided. │ So a `SecondOrder` with ADTypes.AutoFiniteDiff() for both inner and outer will be created, this can be suboptimal and not work in some cases so │ an explicit `SecondOrder` ADtype is recommended. └ @ OptimizationBase C:\Users\metivier\.julia\packages\OptimizationBase\R2xIG\src\cache.jl:51
SciMLBase.OptimizationStats
Number of iterations: 400
Time in seconds: 453.919000
Number of function evaluations: 529
Number of gradient evaluations: 529
Number of hessian evaluations: 401
(solNoDiff, timeNoDiff) = @timed solve(probFD, NelderMead(), maxiters=maxiters)
solNoDiff.statsSciMLBase.OptimizationStats
Number of iterations: 5
Time in seconds: 15.423000
Number of function evaluations: 487
Number of gradient evaluations: 1
Number of hessian evaluations: 0
results_table = [
"True" θ_true[1] θ_true[2] "N/A";
"AD" round(solAD.u[1], digits=3) round(solAD.u[2], digits=3) timeAD;
"FD" round(solFD.u[1], digits=3) round(solFD.u[2], digits=3) timeFD;
"NM" round(solNoDiff.u[1], digits=3) round(solNoDiff.u[2], digits=3) timeNoDiff
]
pretty_table(results_table,
column_labels=["Method", "ρ", "ν", "Time (s)"], backend=:html)| Method | ρ | ν | Time (s) |
|---|---|---|---|
| True | 2.5 | 1.2 | N/A |
| AD | 2.516 | 1.195 | 205.186 |
| FD | 2.871 | 1.118 | 457.771 |
| NM | 2.516 | 1.195 | 16.2413 |
θσ_true = [2.5, 1.2, 4.0] # (ρ, ν, σ)3-element Vector{Float64}:
2.5
1.2
4.0
covσ = CovarianceFunction(2, Matern(θσ_true[1], θσ_true[2], σ=θσ_true[3]))
grfσ = GaussianRandomField(covσ, GaussianRandomFields.Cholesky(), x_pts, y_pts)
Yσ = [sample(grfσ) for _ in 1:N]700-element Vector{Matrix{Float64}}:
[-2.908789397950098 -3.8131606954589463 … 2.1799794522902656 4.596670635099563; 0.11906230828325759 -1.1200414240148966 … 0.7310653607067716 3.2218717476072607; … ; -3.3298667384113565 -3.4485990712482915 … 1.851762239582804 0.33598070020734483; -3.5281571196689114 -2.469843084132078 … 2.0786003052783015 2.2067623466468334]
[-0.313092254671193 -0.6952943404963746 … 7.093191293580664 7.255686785741399; -3.04940693514618 -3.202361341070152 … 6.3909650094538275 6.578454974732634; … ; -4.477612774250883 -4.244753240458948 … 4.949807486617989 2.6337494394464143; -4.067743540401492 -3.4918743980425733 … 2.121380102158777 1.609712371730212]
[-0.08451936980921823 1.3443976005655083 … -0.47142832756923886 0.2625890895516912; 0.7750323392359102 3.1527001162621002 … 0.44974661369119606 0.16415799665088204; … ; -8.165703436450121 -8.813877348866782 … -3.1156279769047415 -0.3879050059615239; -7.554554583789133 -7.758913834321561 … -3.3778080532296975 -1.240276468606131]
[-5.676515966859646 -6.716685318367137 … 1.1369109918964289 3.2347819481159226; -5.895749134693372 -5.098952852195997 … 1.2606823402520817 3.686720551891458; … ; 3.5131749565860018 4.189357149735303 … -6.934798611064781 -4.662758977888864; 5.599097883613813 5.9553517953183706 … -4.8282843374871565 -3.3682827499020185]
[1.7339207170392155 2.322741455121328 … -8.127329784300995 -5.727244717650702; 2.085657998190487 1.8108420652039547 … -8.472794353242957 -6.491141320348742; … ; 5.213523793252757 3.376616624133374 … -3.3237930098913777 -2.06791067069522; 6.481376200083426 5.381115750848656 … -2.8116397062662832 -0.05224306321142902]
[-1.633560704076629 -0.3595365630862855 … -0.24969665672829425 -0.34394631395533665; -1.4779095730876675 -0.9413336061672843 … -2.0973063192688306 -2.016989620303522; … ; -1.386754134015919 -1.6217364408241104 … -1.8362684460254384 -1.1043748550866055; -0.9229258249564791 0.0392410697080412 … -3.299544916763974 -1.5806815955476854]
[-5.290246384521325 -4.5022009087722035 … -7.666662515900084 -5.984660828256332; -4.372175018547575 -3.8002572482946015 … -7.549476927796233 -5.734528976775111; … ; -4.672732410194486 -4.788264512294825 … 5.0552972364680615 5.51746250758299; -6.778288457511182 -6.326729073888694 … 4.925936375368744 5.416843346158407]
[-3.0832222044749904 -4.302480890420956 … 1.1941700367472539 0.6218192805152737; -1.336121120806356 -2.223844366280061 … 1.8840192198774357 1.5929577583256038; … ; -0.42063772223176654 -1.2150581056241614 … 1.4106669967056704 0.050247108332530876; 0.6177976934503271 0.0793484404884428 … 3.1293367349276053 1.881441216724007]
[-2.4546302654166614 -2.016642806126867 … -0.7019843958779142 -0.29714555390124786; -2.1951287727681925 -1.8126456819829178 … -0.7444362147916765 -0.15346935586833887; … ; -2.480543584657277 -3.2317242833242146 … 3.027928672397817 5.324174042677577; -3.0314321624627656 -4.373592980519557 … 2.507429939440178 2.798379151642449]
[4.78204397472337 2.4463424940950604 … 0.8210860973949246 0.23207574303935044; 5.7413773724474595 3.4747108943183544 … 1.9078907374287999 1.481235504196272; … ; -1.4097992333052762 -3.161642103908008 … 1.6329258061555523 4.486541843425547; -2.0954955820260173 -3.371506553563967 … 1.7525186624116396 2.7787361175836813]
⋮
[2.526504307776522 -0.5847799761052339 … 0.29965532278756324 0.4240473963351052; -2.021392457653805 -2.648417948989092 … 0.917961973589297 0.28738887483136855; … ; 6.59707784165706 6.042459463855335 … 2.6017701942788856 3.2747496694198404; 5.309849811493078 5.204631071316478 … 2.576088218521617 3.1914362348104146]
[-0.44049596592253704 -1.9777331875924107 … 0.412011950200139 2.9056766799873466; 0.9038349644701564 -0.561041692044341 … -0.3618454049533446 1.3488068409579839; … ; 5.412064157780901 2.391218999855472 … -9.282943772912104 -9.233555804007667; 4.266229309591138 2.8374530505903968 … -9.425347389768108 -9.754728455949365]
[-0.28475469701010214 1.1001766540481925 … 1.600514853699297 2.7118075329270934; -0.5697853430646705 1.4401770090838288 … -0.6828174553503688 1.6657797982748135; … ; 3.7440139090116857 2.0768368409453264 … 5.396916300495214 5.129191480855952; 6.015021207004874 4.257866286559388 … 5.769484677895831 4.71061678311146]
[-3.188028190815699 -3.359640584677632 … 1.3557516107975274 -0.1388938388383145; -3.8710585453041375 -4.001455922305908 … 0.4343308250061231 -0.5233291343188942; … ; -3.967809264805045 -3.184588880537353 … 0.6489500382398137 1.0592860503183903; -2.3893513789524996 -0.859509260753168 … 0.9214782762785829 0.23437136986225815]
[-0.09670717025197507 -0.19189193937892765 … -0.4860045994573195 -0.1752761301483978; -1.4829104528641177 0.0025063582641504933 … 0.9999686829041503 0.1639107576190973; … ; 3.2668570795780245 5.716426108602874 … 1.8149671149761357 1.344971325398709; 1.8126088479126672 1.7850094922579074 … -0.27854407132346115 -0.40689749020303384]
[-5.822843100723238 -3.973250899903822 … -4.12918582209273 -2.6027512827605546; -6.602395251810106 -6.131332389022809 … -3.2096059155636265 -3.72348450736055; … ; -0.7785456273546474 -0.9186385412820087 … 8.263413235836842 7.030564212468362; -2.380483246214559 -2.4546152198065996 … 8.18636463419122 7.813930742438426]
[3.960708464997427 5.4964917833571585 … -1.304882317952616 -1.7013950196041603; 2.4827678410065404 4.5965405775169055 … -1.881955529462397 -1.6175246759169428; … ; 10.444819508881277 10.67813598858717 … -4.134015386176744 -4.035045443593478; 9.507305235402903 10.13729267232085 … -4.305964195395933 -5.568150978574627]
[-1.0823772157612597 1.105657976804282 … 8.311003893951643 6.428549247345561; 0.6840986172085235 2.85366358824699 … 8.301786420234777 6.214449730539613; … ; -2.1260710615001566 -4.501645010387021 … -4.0008100305918175 -2.9475338619153515; 0.46249287120664107 -1.7888746933039674 … -3.780282406516182 -1.9722418922947416]
[3.8484383176364076 5.44552172636197 … -0.7565132972602653 1.0855799275964333; 2.4396625814313637 4.271836880977826 … 0.8054201037994785 2.4563274869622727; … ; 5.463672975178838 5.6590107771176505 … 2.5074168712247427 0.7505902477446413; 5.029091961300201 5.238688979569602 … 3.222128418942532 3.2620639216563845]
function nllσ(u, p)
ρ, ν, σ = u[1], u[2], u[3]
Y, H = p[1], p[2]
N = length(Y)
d = size(H, 1)
# Build correlation matrix R(ν, ρ) with broadcasting
R = σ^2 * cov_matérn.(H / ρ, ν)
# Cholesky decomposition (handles Σ⁻¹ and log|Σ| efficiently)
Σ = cholesky!(Symmetric(R))
# Negative log-likelihood:
# ℓ = (N/2)[log|R| + d·log(2π)] + (1/2)∑ᵢ yᵢᵀ R⁻¹ yᵢ
ℓ = N / 2 * (logdet(Σ) + d * log(2π)) +
sum(dot(vec(y), Σ \ vec(y)) for y in Y) / 2
return ℓ
endnllσ (generic function with 1 method)
Compare optimization methods: Automatic Differentiation vs Finite Differences vs Derivative-Free
# Initial guess (true values: ρ=2.5, ν=1.2, σ = 4.0)
θσ₀ = [0.2, 0.5, 0.5]
# With automatic differentiation
optfσ = OptimizationFunction(nllσ, Optimization.AutoForwardDiff())
probσ = OptimizationProblem(optfσ, θσ₀, [Yσ, H], lb=[0.0, 0.0, 0.0], ub=[Inf, Inf, Inf])
# With finite differences
optfσFD = OptimizationFunction(nllσ, Optimization.AutoFiniteDiff())
probσFD = OptimizationProblem(optfσFD, θσ₀, [Yσ, H], lb=[0.0, 0.0, 0.0], ub=[Inf, Inf, Inf])OptimizationProblem. In-place: true u0: 3-element Vector{Float64}: 0.2 0.5 0.5
(solAD, timeAD) = @timed solve(probσ, IPNewton(), maxiters=maxiters)
solAD.stats┌ Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided. │ So a `SecondOrder` with ADTypes.AutoForwardDiff() for both inner and outer will be created, this can be suboptimal and not work in some cases so │ an explicit `SecondOrder` ADtype is recommended. └ @ OptimizationBase C:\Users\metivier\.julia\packages\OptimizationBase\R2xIG\src\cache.jl:51
SciMLBase.OptimizationStats
Number of iterations: 32
Time in seconds: 145.094000
Number of function evaluations: 76
Number of gradient evaluations: 76
Number of hessian evaluations: 32
Since Finite Differences is very slow, we catch potential failures to converge within the iteration budget.
(solFD, timeFD) = @timed try
solve(probσFD, IPNewton(), maxiters=maxiters)
catch
@warn "The solve failed."
(u=[missing, missing, missing], stats=nothing)
end
solFD.stats┌ Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided. │ So a `SecondOrder` with ADTypes.AutoFiniteDiff() for both inner and outer will be created, this can be suboptimal and not work in some cases so │ an explicit `SecondOrder` ADtype is recommended. └ @ OptimizationBase C:\Users\metivier\.julia\packages\OptimizationBase\R2xIG\src\cache.jl:51
SciMLBase.OptimizationStats
Number of iterations: 400
Time in seconds: 1360.517000
Number of function evaluations: 783
Number of gradient evaluations: 783
Number of hessian evaluations: 401
In this example, Nelder-Mead also fails to converge (producing non positive definite matrices).
(solNoDiff, timeNoDiff) = @timed try
solve(probσ, NelderMead(), maxiters=maxiters)
catch
@warn "The solve failed."
(u=[missing, missing, missing], stats=nothing)
end
solNoDiff.stats┌ Warning: The solve failed. └ @ Main.Notebook C:\Users\metivier\Dropbox\PC (2)\Documents\GitHub\Pluto_export\SWG_coding_challenges\Talk_DM_quarto.qmd:698
results_table = [
"True" θσ_true[1] θσ_true[2] θσ_true[3] "N/A";
"AD" round(solAD.u[1], digits=3) round(solAD.u[2], digits=3) round(solAD.u[3], digits=3) timeAD;
"FD" round(solFD.u[1], digits=3) round(solFD.u[2], digits=3) round(solFD.u[3], digits=3) timeFD;
"NM" round(solNoDiff.u[1], digits=3) round(solNoDiff.u[2], digits=3) round(solNoDiff.u[3], digits=3) timeNoDiff
]
pretty_table(results_table,
column_labels=["Method", "ρ", "ν", "σ", "Time (s)"], backend=:html)| Method | ρ | ν | σ | Time (s) |
|---|---|---|---|---|
| True | 2.5 | 1.2 | 4.0 | N/A |
| AD | 2.508 | 1.196 | 3.987 | 146.412 |
| FD | 2.405 | 1.207 | 3.87 | 1364.05 |
| NM | missing | missing | missing | 38.5543 |
To fit a two or three parameter Matérn GRF model to spatial data on a small 2D grid with 700 i.i.d. samples:
This does not mean that AD is always the best choice, but it should be seriously considered when fitting complex SWG models with many parameters.
using InteractiveUtils
InteractiveUtils.versioninfo()
Pkg.status();Julia Version 1.12.1 Commit ba1e628ee4 (2025-10-17 13:02 UTC) Build Info: Official https://julialang.org release Platform Info: OS: Windows (x86_64-w64-mingw32) CPU: 28 × 13th Gen Intel(R) Core(TM) i7-13850HX WORD_SIZE: 64 LLVM: libLLVM-18.1.7 (ORCJIT, alderlake) GC: Built with stock GC Threads: 1 default, 0 interactive, 1 GC (on 28 virtual cores) Environment: JULIA_LOAD_PATH = @;@stdlib JULIA_PROJECT = @. Status `C:\Users\metivier\Dropbox\PC (2)\Documents\GitHub\Pluto_export\SWG_coding_challenges\Project.toml` [6e4b80f9] BenchmarkTools v1.6.3 [432ab697] BesselK v0.5.7 [a93c6f00] DataFrames v1.8.1 [b4f34e82] Distances v0.10.12 [31c24e10] Distributions v0.25.122 [6a86dc24] FiniteDiff v2.29.0 [f6369f11] ForwardDiff v1.3.0 [e4b2fa32] GaussianRandomFields v2.2.7 ⌃ [7f7a1694] Optimization v5.1.0 [36348300] OptimizationOptimJL v0.4.8 [08abe8d2] PrettyTables v3.1.2 [6f49c342] RCall v0.14.10 [b77e0a4c] InteractiveUtils v1.11.0 [8f399da3] Libdl v1.11.0 [37e2e46d] LinearAlgebra v1.12.0 Info Packages marked with ⌃ have new versions available and may be upgradable.