📅 Weather Data
Several weather station formats (with more to come) can be read and transformed into a DataFrame.
ECA dataset
Data from the European Climate Assessment & Dataset project are available at this link as zip archives of all stations per variable, and at this link for custom manual queries. I asked them about an API to automatically extract a specific file, but they answered that none is currently available. I tried unzip-http but could not get it working with the ECA website[1].
using StochasticWeatherGenerators, DataFrames, Dates
collect_data_ECA(33, Date(1956), Date(2019, 12, 31), "https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/master/weather_files/ECA_blend_rr/RR_", portion_valid_data=1, skipto=22, header=21, url=true)[1:10,:]
Row | STAID | SOUID | DATE | RR | Q_RR |
---|---|---|---|---|---|
 | Int64 | Int64 | Date | Int64 | Int64 |
1 | 33 | 105 | 1956-01-01 | 23 | 0 |
2 | 33 | 105 | 1956-01-02 | 1 | 0 |
3 | 33 | 105 | 1956-01-03 | 0 | 0 |
4 | 33 | 105 | 1956-01-04 | 0 | 0 |
5 | 33 | 105 | 1956-01-05 | 0 | 0 |
6 | 33 | 105 | 1956-01-06 | 21 | 0 |
7 | 33 | 105 | 1956-01-07 | 3 | 0 |
8 | 33 | 105 | 1956-01-08 | 25 | 0 |
9 | 33 | 105 | 1956-01-09 | 20 | 0 |
10 | 33 | 105 | 1956-01-10 | 0 | 0 |
StochasticWeatherGenerators.collect_data_ECA — Function
collect_data_ECA(STAID::Integer, path::String, var::String="RR"; skipto=19, header = 18)
path gives the path where all data files are stored in a vector.
collect_data_ECA(STAID, date_start::Date, date_end::Date, path::String, var::String="RR"; portion_valid_data=1, skipto=19, header = 18, return_nothing = true)
- path gives the path where all data files are stored in a vector.
- Filter the DataFrame s.t. date_start ≤ :DATE ≤ date_end.
- var = "RR", "TX" etc.
- portion_valid_data is the portion of valid data we are ok with. If we don't want any missing values, set it to 1.
- skipto and header are for csv files with meta information/comments at the beginning of the file. See CSV.jl.
- return_nothing: if true, the function returns nothing if the file does not exist or does not have enough valid data.
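To make the role of skipto and header more concrete, here is a minimal sketch using only the standard library. It assumes the CSV.jl convention that header is the line number holding the column names and skipto is the line number where the data start. The helper read_with_header and the sample text are hypothetical illustrations, not the package's parser and not a real ECA file.

```julia
using Dates

# Illustrative file content (NOT a real ECA file): a few metadata lines,
# then a header line, then the data lines.
raw = """
EUROPEAN CLIMATE ASSESSMENT & DATASET (illustrative metadata line)
some other comment line

STAID, SOUID,    DATE,   RR, Q_RR
   33,   105,19560101,   23,    0
   33,   105,19560102,    1,    0
"""

# Hypothetical helper: `header` is the line holding the column names,
# `skipto` is the first line of data (same meaning as in CSV.jl).
function read_with_header(text; header, skipto)
    lines = split(chomp(text), '\n')
    colnames = Symbol.(strip.(split(lines[header], ',')))
    rows = [strip.(split(l, ',')) for l in lines[skipto:end]]
    return colnames, rows
end

colnames, rows = read_with_header(raw; header = 4, skipto = 5)

# The DATE column of this sample is written as yyyymmdd
dates = [Date(r[3], dateformat"yyyymmdd") for r in rows]
```

With the values skipto=22 and header=21 used in the example call above, the first 20 lines of the file would be treated as comments in the same way.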
Météo France
Météo France has its own version of this data, accessible through an API on the Data.gouv.fr website. This package provides a simple command to extract the data of one station (given its STAtionID) from the API.
collect_data_MeteoFrance(34154001)[1:10,:] # Montpellier Airport
Row | __id | STAID | STANAME | LAT | LON | ALTI | DATE | RR | QRR | TN | QTN | HTN | QHTN | TX | QTX | HTX | QHTX | TM | QTM | TNTXM | QTNTXM | TAMPLI | QTAMPLI | TNSOL | QTNSOL | TN50 | QTN50 | DG | QDG | FFM | QFFM | FF2M | QFF2M | FXY | QFXY | DXY | QDXY | HXY | QHXY | FXI | QFXI | DXI | QDXI | HXI | QHXI | FXI2 | QFXI2 | DXI2 | QDXI2 | HXI2 | QHXI2 | FXI3S | QFXI3S | DXI3S | QDXI3S | HXI3S | QHXI3S | DRR | QDRR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
 | Int64 | Int64 | String31 | Float64 | Float64 | Int64 | Date | Float64 | Int64 | Float64 | Int64 | Int64 | Int64 | Float64 | Int64 | Int64 | Int64 | Float64 | String1 | Float64 | Int64 | Float64 | Int64 | Float64 | Int64 | Float64 | Int64 | Int64? | Int64? | Float64 | String1 | Missing | Missing | Float64 | String1 | Int64 | Int64 | Int64 | Int64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Missing | Missing | Missing | Missing | Missing | Missing | Float64? | String1? | Missing | Missing | Int64? | Int64? | Int64? | Int64? |
1 | 5688 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-01 | 0.0 | 1 | 8.8 | 1 | 636 | 9 | 15.5 | 1 | 1351 | 9 | 10.8 | t | 12.2 | 1 | 6.7 | 1 | 5.8 | 9 | 7.7 | 9 | 0 | 9 | 4.5 | t | missing | missing | 9.6 | t | 290 | 1 | 11 | 9 | 13.9 | 1 | 290 | 1 | 100 | 9 | missing | missing | missing | missing | missing | missing | 13.0 | t | missing | missing | 100 | 9 | 0 | 9 |
2 | 5689 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-02 | 0.0 | 1 | 2.6 | 1 | 323 | 9 | 12.3 | 1 | 1358 | 9 | 7.7 | t | 7.5 | 1 | 9.7 | 1 | -0.7 | 9 | 1.0 | 9 | 0 | 9 | 3.1 | t | missing | missing | 5.2 | t | 10 | 1 | 1724 | 9 | 7.4 | 1 | 360 | 1 | 1724 | 9 | missing | missing | missing | missing | missing | missing | 7.2 | t | missing | missing | 1724 | 9 | 0 | 9 |
3 | 5690 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-03 | 0.0 | 1 | 4.0 | 1 | 453 | 9 | 17.9 | 1 | 1440 | 9 | 11.5 | t | 11.0 | 1 | 13.9 | 1 | 1.0 | 9 | 1.9 | 9 | 0 | 9 | 3.7 | t | missing | missing | 10.3 | t | 280 | 9 | 2031 | 9 | 14.9 | 1 | 290 | 9 | 2018 | 9 | missing | missing | missing | missing | missing | missing | 13.9 | t | missing | missing | 2029 | 9 | 0 | 9 |
4 | 5691 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-04 | 0.0 | 1 | 8.5 | 1 | 710 | 9 | 17.0 | 1 | 1158 | 9 | 12.7 | t | 12.8 | 1 | 8.5 | 1 | 6.0 | 9 | 6.9 | 9 | 0 | 9 | 3.9 | t | missing | missing | 9.3 | t | 290 | 1 | 20 | 9 | 12.0 | 1 | 300 | 1 | 3 | 9 | missing | missing | missing | missing | missing | missing | 11.1 | t | missing | missing | 10 | 9 | 66 | 9 |
5 | 5692 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-05 | 0.0 | 1 | 10.3 | 1 | 109 | 9 | 14.7 | 1 | 1447 | 9 | 10.8 | t | 12.5 | 1 | 4.4 | 1 | 7.9 | 9 | 9.1 | 9 | 0 | 9 | 3.8 | t | missing | missing | 7.1 | t | 10 | 1 | 205 | 9 | 10.6 | 1 | 20 | 1 | 225 | 9 | missing | missing | missing | missing | missing | missing | 10.1 | t | missing | missing | 225 | 9 | 0 | 9 |
6 | 5693 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-06 | 0.0 | 1 | 7.1 | 1 | 659 | 9 | 13.7 | 1 | 1220 | 9 | 10.2 | t | 10.4 | 1 | 6.6 | 1 | 3.6 | 9 | 6.0 | 9 | 0 | 9 | 7.4 | t | missing | missing | 12.2 | t | 320 | 1 | 1254 | 9 | 17.6 | 1 | 310 | 1 | 1248 | 9 | missing | missing | missing | missing | missing | missing | 16.0 | t | missing | missing | 1137 | 9 | 0 | 9 |
7 | 5694 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-07 | 0.0 | 1 | 7.4 | 1 | 630 | 9 | 11.4 | 1 | 1254 | 9 | 8.6 | t | 9.4 | 1 | 4.0 | 1 | 5.5 | 9 | 6.4 | 9 | 0 | 9 | 9.6 | t | missing | missing | 15.2 | t | 320 | 1 | 1218 | 9 | 21.7 | 1 | 320 | 1 | 1214 | 9 | missing | missing | missing | missing | missing | missing | 19.8 | t | missing | missing | 1213 | 9 | 0 | 9 |
8 | 5695 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-08 | 0.0 | 1 | 2.9 | 1 | 1800 | 9 | 8.8 | 1 | 1153 | 9 | 5.5 | t | 5.9 | 1 | 5.9 | 1 | 0.5 | 9 | 1.5 | 9 | 31 | 9 | 5.4 | t | missing | missing | 9.5 | t | 330 | 1 | 1017 | 9 | 13.9 | 1 | 330 | 1 | 1034 | 9 | missing | missing | missing | missing | missing | missing | 12.9 | t | missing | missing | 1113 | 9 | 0 | 9 |
9 | 5696 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-09 | 0.0 | 1 | -4.1 | 1 | 416 | 9 | 9.5 | 1 | 1324 | 9 | 1.8 | t | 2.7 | 1 | 13.6 | 1 | -8.9 | 9 | -6.5 | 9 | 722 | 9 | 2.0 | t | missing | missing | 3.5 | t | 360 | 1 | 1844 | 9 | 5.5 | 1 | 350 | 1 | 1104 | 9 | missing | missing | missing | missing | missing | missing | 5.1 | t | missing | missing | 1110 | 9 | 0 | 9 |
10 | 5697 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-10 | 44.9 | 1 | -2.8 | 1 | 210 | 9 | 8.6 | 1 | 1448 | 9 | 4.2 | t | 2.9 | 1 | 11.4 | 1 | -7.4 | 9 | -5.3 | 9 | 302 | 9 | 4.5 | t | missing | missing | 8.2 | t | 70 | 9 | 1929 | 9 | 13.1 | 1 | 70 | 9 | 1925 | 9 | missing | missing | missing | missing | missing | missing | 11.3 | t | missing | missing | 1925 | 9 | 1079 | 9 |
As it is rather new, this DataGov/MeteoFrance API may change in the future and break this function. The function is currently not fully working, so one may have to call the DataGov API directly.
StochasticWeatherGenerators.collect_data_MeteoFrance — Function
collect_data_MeteoFrance(STAID; show_warning=false, impute_missing=[], period="1950-2021", variables = "all")
Given a STAID (station ID given by Météo France), it returns a DataFrame with data in period and for the variables.
- STAID can be an integer or a string.
- Options for period are "1846-1949", "1950-2021", "2022-2023".
- Options for variables are "all", "RR-T-Wind", "others".
- impute_missing expects a vector of column name(s) in which to impute missing values with Impute.Interpolate, e.g. impute_missing=[:TX].
- show_warning controls warnings about missing data: false for no column, true for all variable columns, or a vector of selected columns, e.g. show_warning = [:TX].
The data is available through the French Data.gouv.fr website API. Data may be updated without notice. See the following two links for information on the "RR-T-Wind" and "others" variables (in French):
- https://object.files.data.gouv.fr/meteofrance/data/synchroftp/BASE/QUOT/QdescriptifchampsRR-T-Vent.csv
- https://object.files.data.gouv.fr/meteofrance/data/synchroftp/BASE/QUOT/Qdescriptifchampsautres-parametres.csv
Or see the SICLIMA website for information (in French) about the computation and conversion of some weather variables and indices.
StochasticWeatherGenerators.download_data_MeteoFrance — Function
download_data_MeteoFrance(STAID, period = "2024-2025", variables = "all")
This function is not really working anymore, as the API changed in 2024.
- Options for period are "1846-1949", "1950-2021", "2022-2023".
- Options for variables are "all", "RR-T-Wind", "others".
The data is available through the French Data.gouv.fr website API. Data may be updated without notice. In particular, the path to the data may change.
INRAE
The INRAE CLIMATIK platform (Delannoy et al., 2022) (https://agroclim.inrae.fr/climatik/, in French), managed by the AgroClim laboratory of Avignon, France, provides data from weather stations. However, its API is not open access.
StochasticWeatherGenerators.collect_data_INRAE — Function
collect_data_INRAE(station_path::String; show_warning=false, impute_missing=[])
Read INRAE-formatted weather station data from a file and transform it to match ECA standard naming conventions.
- impute_missing expects a vector of column name(s) in which to impute missing values with Impute.Interpolate, e.g. impute_missing=[:TX].
- show_warning controls warnings about missing data: false for no column, true for all variable columns, or a vector of selected columns, e.g. show_warning = [:TX].
Others
Data manipulation
StochasticWeatherGenerators.clean_data — Function
clean_data(df::DataFrame; show_warning=false, impute_missing=[])
Impute missing values and show warnings about missing values. It assumes that the first two columns are not numeric.
- impute_missing expects a vector of column name(s) in which to impute missing values with Impute.Interpolate, e.g. impute_missing=[:TX].
- show_warning controls warnings about missing data: false for no column, true for all variable columns, or a vector of selected columns, e.g. show_warning = [:TX].
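As an illustration of what imputing by interpolation means for a column such as TX, here is a minimal sketch in plain Julia. It is not the Impute.jl implementation: interpolate_missing is a hypothetical helper that fills interior gaps linearly between the nearest known values and, by assumption, leaves leading or trailing missing values untouched.

```julia
# Hypothetical helper, not Impute.Interpolate itself: linear interpolation
# of missing values located between two known values.
function interpolate_missing(x::AbstractVector)
    # Work on a copy so the input vector is left unchanged
    y = Union{Missing,Float64}[v for v in x]
    known = findall(!ismissing, y)
    # Walk over consecutive pairs of known positions (a, b)
    for (a, b) in zip(known[1:end-1], known[2:end])
        b - a > 1 || continue          # no gap between a and b
        for i in a+1:b-1
            y[i] = y[a] + (y[b] - y[a]) * (i - a) / (b - a)
        end
    end
    return y
end

tx = [10.0, missing, missing, 16.0, missing]
filled = interpolate_missing(tx)
```

In this sketch the two interior gaps are filled on the straight line between 10.0 and 16.0, while the last value stays missing because there is no later observation to interpolate towards.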
StochasticWeatherGenerators.select_in_range_df — Function
select_in_range_df(datas, start_Date, interval_Date, [portion])
Select stations based on data availability over a date range and on quality (portion of valid data). The input is a vector (array) of DataFrame (one for each station, for example) or a Dict of DataFrame. If 0 < portion ≤ 1 is specified, it allows some portion of the data to be missing.
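The idea of keeping only stations with a sufficient portion of valid data can be sketched as follows. This is an assumption-laden illustration, not the package's code: valid_portion, the stations Dict of named tuples, and the threshold value are all hypothetical, and the sketch assumes one observation per day.

```julia
using Dates

# Hypothetical helper: fraction of days in [date_start, date_end]
# that have a non-missing observation (assumes one row per day).
function valid_portion(dates, obs, date_start::Date, date_end::Date)
    n_expected = Dates.value(date_end - date_start) + 1   # days in the closed interval
    n_valid = count(i -> date_start <= dates[i] <= date_end && !ismissing(obs[i]),
                    eachindex(dates))
    return n_valid / n_expected
end

days = collect(Date(1956, 1, 1):Day(1):Date(1956, 1, 10))

# Illustrative stations: "A" is complete, "B" has 3 missing days out of 10
stations = Dict(
    "A" => (dates = days, RR = [23, 1, 0, 0, 0, 21, 3, 25, 20, 0]),
    "B" => (dates = days, RR = [5, missing, missing, 0, 0, 2, missing, 0, 1, 0]),
)

portion = 0.9   # accept at most 10% of missing data
kept = sort([name for (name, s) in stations
             if valid_portion(s.dates, s.RR, days[1], days[end]) >= portion])
```

Here only station "A" passes the threshold. Setting the portion to 1 would require a complete record over the whole interval.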
StochasticWeatherGenerators.shortname — Function
shortname(name::String)
Experimental function that returns only the most relevant part of a station name.
long_name = "TOULOUSE-BLAGNAC"
shortname(long_name) # "TOULOUSE"
References
- Delannoy, D.; Maury, O. and Décome, J. (2022). CLIMATIK: système d’information pour les données du réseau agroclimatique INRAE.
- [1] I don't remember exactly, in fact.