📅 Weather Data
Several weather station formats (with more to come) can be read and converted to a DataFrame.
ECA dataset
Data from the European Climate Assessment & Dataset project are available at this link as a zip of all stations per variable, and at this link for custom manual queries. I asked them about an API to automatically extract a specific file directly, but they answered that it is not currently available. I tried unzip-http but could not get it working with the ECA website[1].
using StochasticWeatherGenerators, DataFrames, Dates
collect_data_ECA(33, Date(1956), Date(2019, 12, 31), "https://raw.githubusercontent.com/dmetivie/StochasticWeatherGenerators.jl/master/weather_files/ECA_blend_rr/RR_", portion_valid_data=1, skipto=22, header=21, url=true)[1:10,:]

| Row | STAID | SOUID | DATE | RR | Q_RR |
|---|---|---|---|---|---|
| | Int64 | Int64 | Date | Int64 | Int64 |
| 1 | 33 | 105 | 1956-01-01 | 23 | 0 |
| 2 | 33 | 105 | 1956-01-02 | 1 | 0 |
| 3 | 33 | 105 | 1956-01-03 | 0 | 0 |
| 4 | 33 | 105 | 1956-01-04 | 0 | 0 |
| 5 | 33 | 105 | 1956-01-05 | 0 | 0 |
| 6 | 33 | 105 | 1956-01-06 | 21 | 0 |
| 7 | 33 | 105 | 1956-01-07 | 3 | 0 |
| 8 | 33 | 105 | 1956-01-08 | 25 | 0 |
| 9 | 33 | 105 | 1956-01-09 | 20 | 0 |
| 10 | 33 | 105 | 1956-01-10 | 0 | 0 |
StochasticWeatherGenerators.collect_data_ECA — Function
collect_data_ECA(STAID::Integer, path::String, var::String="RR"; skipto=19, header = 18)
path gives the path where all data files are stored in a vector
collect_data_ECA(STAID, date_start::Date, date_end::Date, path::String, var::String="RR"; portion_valid_data=1, skipto=19, header = 18, return_nothing = true)
- path gives the path where all data files are stored in a vector
- Filter the DataFrame s.t. date_start ≤ :DATE ≤ date_end
- var = "RR", "TX" etc.
- portion_valid_data is the portion of valid data we are ok with. If we don't want any missing, set it to 1.
- skipto and header are for csv files with meta information/comments at the beginning of the file. See CSV.jl.
- return_nothing: if true, it will return nothing if the file does not exist or does not have enough valid data.
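The date filter and the portion_valid_data threshold described above can be sketched in plain Julia. This is only an illustration of the idea, not the package's implementation: valid_portion is a hypothetical helper, and it assumes that invalid observations are already encoded as missing.

```julia
using Dates

# Hypothetical helper (not part of StochasticWeatherGenerators): fraction of
# days in [date_start, date_end] that have a non-missing observation.
function valid_portion(dates::Vector{Date}, values::AbstractVector, date_start::Date, date_end::Date)
    n_expected = Dates.value(date_end - date_start) + 1  # number of days in the window
    in_range = (date_start .<= dates) .& (dates .<= date_end)
    n_valid = count(!ismissing, values[in_range])
    return n_valid / n_expected
end

# Ten days of made-up precipitation values with one gap
dates = collect(Date(1956, 1, 1):Day(1):Date(1956, 1, 10))
rr = Union{Missing, Int}[23, 1, 0, 0, missing, 21, 3, 25, 20, 0]

p = valid_portion(dates, rr, Date(1956, 1, 1), Date(1956, 1, 10))  # 9 valid days out of 10
keep_strict = p >= 1      # threshold 1: no missing value tolerated
keep_lenient = p >= 0.9   # threshold 0.9: up to 10% missing tolerated
```

With a threshold of 1 this station would be rejected, while a threshold of 0.9 would keep it.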
Météo France
Météo France has its own version of this data, accessible through an API on the Data.gouv.fr website. This package provides a simple command to extract the data of one station (given its STAtion ID, STAID) from the API.
collect_data_MeteoFrance(34154001)[1:10,:] # Montpellier Airport

| Row | __id | STAID | STANAME | LAT | LON | ALTI | DATE | RR | QRR | TN | QTN | HTN | QHTN | TX | QTX | HTX | QHTX | TM | QTM | TNTXM | QTNTXM | TAMPLI | QTAMPLI | TNSOL | QTNSOL | TN50 | QTN50 | DG | QDG | FFM | QFFM | FF2M | QFF2M | FXY | QFXY | DXY | QDXY | HXY | QHXY | FXI | QFXI | DXI | QDXI | HXI | QHXI | FXI2 | QFXI2 | DXI2 | QDXI2 | HXI2 | QHXI2 | FXI3S | QFXI3S | DXI3S | QDXI3S | HXI3S | QHXI3S | DRR | QDRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Int64 | String31 | Float64 | Float64 | Int64 | Date | Float64 | Int64 | Float64 | Int64 | Int64 | Int64 | Float64 | Int64 | Int64 | Int64 | Float64 | String1 | Float64 | Int64 | Float64 | Int64 | Float64 | Int64 | Float64 | Int64 | Int64? | Int64? | Float64 | String1 | Missing | Missing | Float64 | String1 | Int64 | Int64 | Int64 | Int64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Missing | Missing | Missing | Missing | Missing | Missing | Float64? | String1? | Missing | Missing | Int64? | Int64? | Int64? | Int64? |
| 1 | 5778 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-01 | 0.0 | 1 | 8.8 | 1 | 636 | 9 | 15.5 | 1 | 1351 | 9 | 10.8 | t | 12.2 | 1 | 6.7 | 1 | 5.8 | 9 | 7.7 | 9 | 0 | 9 | 4.5 | t | missing | missing | 9.6 | t | 290 | 1 | 11 | 9 | 13.9 | 1 | 290 | 1 | 100 | 9 | missing | missing | missing | missing | missing | missing | 13.0 | t | missing | missing | 100 | 9 | 0 | 9 |
| 2 | 5779 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-02 | 0.0 | 1 | 2.6 | 1 | 323 | 9 | 12.3 | 1 | 1358 | 9 | 7.7 | t | 7.5 | 1 | 9.7 | 1 | -0.7 | 9 | 1.0 | 9 | 0 | 9 | 3.1 | t | missing | missing | 5.2 | t | 10 | 1 | 1724 | 9 | 7.4 | 1 | 360 | 1 | 1724 | 9 | missing | missing | missing | missing | missing | missing | 7.2 | t | missing | missing | 1724 | 9 | 0 | 9 |
| 3 | 5780 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-03 | 0.0 | 1 | 4.0 | 1 | 453 | 9 | 17.9 | 1 | 1440 | 9 | 11.5 | t | 11.0 | 1 | 13.9 | 1 | 1.0 | 9 | 1.9 | 9 | 0 | 9 | 3.7 | t | missing | missing | 10.3 | t | 280 | 9 | 2031 | 9 | 14.9 | 1 | 290 | 9 | 2018 | 9 | missing | missing | missing | missing | missing | missing | 13.9 | t | missing | missing | 2029 | 9 | 0 | 9 |
| 4 | 5781 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-04 | 0.0 | 1 | 8.5 | 1 | 710 | 9 | 17.0 | 1 | 1158 | 9 | 12.7 | t | 12.8 | 1 | 8.5 | 1 | 6.0 | 9 | 6.9 | 9 | 0 | 9 | 3.9 | t | missing | missing | 9.3 | t | 290 | 1 | 20 | 9 | 12.0 | 1 | 300 | 1 | 3 | 9 | missing | missing | missing | missing | missing | missing | 11.1 | t | missing | missing | 10 | 9 | 66 | 9 |
| 5 | 5782 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-05 | 0.0 | 1 | 10.3 | 1 | 109 | 9 | 14.7 | 1 | 1447 | 9 | 10.8 | t | 12.5 | 1 | 4.4 | 1 | 7.9 | 9 | 9.1 | 9 | 0 | 9 | 3.8 | t | missing | missing | 7.1 | t | 10 | 1 | 205 | 9 | 10.6 | 1 | 20 | 1 | 225 | 9 | missing | missing | missing | missing | missing | missing | 10.1 | t | missing | missing | 225 | 9 | 0 | 9 |
| 6 | 5783 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-06 | 0.0 | 1 | 7.1 | 1 | 659 | 9 | 13.7 | 1 | 1220 | 9 | 10.2 | t | 10.4 | 1 | 6.6 | 1 | 3.6 | 9 | 6.0 | 9 | 0 | 9 | 7.4 | t | missing | missing | 12.2 | t | 320 | 1 | 1254 | 9 | 17.6 | 1 | 310 | 1 | 1248 | 9 | missing | missing | missing | missing | missing | missing | 16.0 | t | missing | missing | 1137 | 9 | 0 | 9 |
| 7 | 5784 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-07 | 0.0 | 1 | 7.4 | 1 | 630 | 9 | 11.4 | 1 | 1254 | 9 | 8.6 | t | 9.4 | 1 | 4.0 | 1 | 5.5 | 9 | 6.4 | 9 | 0 | 9 | 9.6 | t | missing | missing | 15.2 | t | 320 | 1 | 1218 | 9 | 21.7 | 1 | 320 | 1 | 1214 | 9 | missing | missing | missing | missing | missing | missing | 19.8 | t | missing | missing | 1213 | 9 | 0 | 9 |
| 8 | 5785 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-08 | 0.0 | 1 | 2.9 | 1 | 1800 | 9 | 8.8 | 1 | 1153 | 9 | 5.5 | t | 5.9 | 1 | 5.9 | 1 | 0.5 | 9 | 1.5 | 9 | 31 | 9 | 5.4 | t | missing | missing | 9.5 | t | 330 | 1 | 1017 | 9 | 13.9 | 1 | 330 | 1 | 1034 | 9 | missing | missing | missing | missing | missing | missing | 12.9 | t | missing | missing | 1113 | 9 | 0 | 9 |
| 9 | 5786 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-09 | 0.0 | 1 | -4.1 | 1 | 416 | 9 | 9.5 | 1 | 1324 | 9 | 1.8 | t | 2.7 | 1 | 13.6 | 1 | -8.9 | 9 | -6.5 | 9 | 722 | 9 | 2.0 | t | missing | missing | 3.5 | t | 360 | 1 | 1844 | 9 | 5.5 | 1 | 350 | 1 | 1104 | 9 | missing | missing | missing | missing | missing | missing | 5.1 | t | missing | missing | 1110 | 9 | 0 | 9 |
| 10 | 5787 | 34154001 | MONTPELLIER-AEROPORT | 43.5762 | 3.96467 | 1 | 2024-01-10 | 44.9 | 1 | -2.8 | 1 | 210 | 9 | 8.6 | 1 | 1448 | 9 | 4.2 | t | 2.9 | 1 | 11.4 | 1 | -7.4 | 9 | -5.3 | 9 | 302 | 9 | 4.5 | t | missing | missing | 8.2 | t | 70 | 9 | 1929 | 9 | 13.1 | 1 | 70 | 9 | 1925 | 9 | missing | missing | missing | missing | missing | missing | 11.3 | t | missing | missing | 1925 | 9 | 1079 | 9 |
Because it is rather new, this DataGov/MeteoFrance API may change in the future and break this function. It is currently not fully working; one would have to call the DataGov API directly.
StochasticWeatherGenerators.collect_data_MeteoFrance — Function
collect_data_MeteoFrance(STAID; show_warning=false, impute_missing=[], period="1950-2021", variables = "all")
Given a STAID (station ID given by Météo France), it returns a DataFrame with data in period and for the variables.
- STAID can be an integer or string.
- Options for period are "1846-1949", "1950-2021", "2022-2023"
- Options for variables are "all", "RR-T-Wind", "others"
- impute_missing expects a vector of column name(s) where to impute missing with Impute.Interpolate e.g. impute_missing=[:TX].
- show_warning in case of missing data. false for no column, true for all variables columns and for selected columns e.g. show_warning = [:TX].
The data is available through the French Data.gouv.fr website API. Data may be updated without notice. See the following two links for information on the "RR-T-Wind" and "others" variables (in French):
- https://object.files.data.gouv.fr/meteofrance/data/synchroftp/BASE/QUOT/QdescriptifchampsRR-T-Vent.csv
- https://object.files.data.gouv.fr/meteofrance/data/synchroftp/BASE/QUOT/Qdescriptifchampsautres-parametres.csv
See also the SICLIMA website, which has information (in French) about the computation and conversion of some weather variables and indices.
StochasticWeatherGenerators.download_data_MeteoFrance — Function
download_data_MeteoFrance(STAID, period = "2024-2025", variables = "all")
Function not really working anymore as the API changed in 2024.
- Options for period are "1846-1949", "1950-2021", "2022-2023"
- Options for variables are "all", "RR-T-Wind", "others"
The data is available through the French Data.gouv.fr website API. Data may be updated without notice. In particular, the path to the data may change.
INRAE
The INRAE CLIMATIK platform (Delannoy et al., 2022) (https://agroclim.inrae.fr/climatik/, in French), managed by the AgroClim laboratory in Avignon, France, provides data from weather stations. However, its API is not open access.
StochasticWeatherGenerators.collect_data_INRAE — Function
collect_data_INRAE(station_path::String; show_warning=false, impute_missing=[])
Read from a file an INRAE formatted weather station data and transform it to match ECA standard naming conventions.
- impute_missing expects a vector of column name(s) where to impute missing with Impute.Interpolate e.g. impute_missing=[:TX].
- show_warning in case of missing data. false for no column, true for all variables columns and for selected columns e.g. show_warning = [:TX].
Others
Data manipulation
StochasticWeatherGenerators.clean_data — Function
clean_data(df::DataFrame; show_warning=false, impute_missing=[])
Impute missing and show warning for missings. It assumes that the first two columns are not numeric.
- impute_missing expects a vector of column name(s) where to impute missing with Impute.Interpolate e.g. impute_missing=[:TX].
- show_warning in case of missing data. false for no column, true for all variables columns and for selected columns e.g. show_warning = [:TX].
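To give an idea of what imputing by interpolation does to a column such as TX, here is a minimal sketch in plain Julia. It assumes the interpolation is linear between the nearest known values; interpolate_missing is a hypothetical helper written for illustration, and it is neither the package's code nor the Impute.jl implementation.

```julia
# Hypothetical helper: fill interior missing values by linear interpolation
# between the nearest non-missing neighbours.
# Leading and trailing missing values are left untouched.
function interpolate_missing(x::AbstractVector{<:Union{Missing, Real}})
    y = Vector{Union{Missing, Float64}}(x)
    known = findall(!ismissing, x)  # indices of observed values
    for (a, b) in zip(known[1:end-1], known[2:end])
        for i in (a + 1):(b - 1)
            w = (i - a) / (b - a)   # relative position inside the gap
            y[i] = (1 - w) * x[a] + w * x[b]
        end
    end
    return y
end

# Made-up daily maximum temperatures with a two-day gap
tx = [15.5, missing, missing, 17.0, 14.7]
tx_filled = interpolate_missing(tx)  # gap filled on the line between 15.5 and 17.0
```

Observed values are never modified; only the gaps between two observations are filled.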
StochasticWeatherGenerators.select_in_range_df — Function
select_in_range_df(datas, start_Date, interval_Date, [portion])
Select stations with sufficient data availability over the given dates and sufficient quality (portion of valid data). Input is a vector (array) of DataFrame (one for each station, for example) or a Dict of DataFrame. If 0 < portion ≤ 1 is specified, it allows some portion of the data to be missing.
StochasticWeatherGenerators.shortname — Function
shortname(name::String)
Experimental function that returns only the most relevant part of a station name.
long_name = "TOULOUSE-BLAGNAC"
shortname(long_name) # "TOULOUSE"
References
- Delannoy, D.; Maury, O. and Décome, J. (2022). CLIMATIK: système d’information pour les données du réseau agroclimatique INRAE.
- [1] I don't remember exactly, in fact.