[ ]:
!pip install --quiet climetlab
Creating a shared dataset of GRIBs
[1]:
import climetlab as cml
Download data to the climetlab cache
[ ]:
for month in range(1, 13):  # This takes a few minutes.
    cml.load_source(
        "mars",
        param=["2t"],
        levtype="sfc",
        area=[50, -50, 20, 50],
        grid=[1, 1],
        date=f"2012-{month}",
    )
[ ]:
cml.load_source(
    "mars",
    param="msl",
    levtype="sfc",
    area=[50, -50, 20, 50],
    grid=[1, 1],
    date="2012-12-01",
);
Export the data to a shared directory
This step is optional: you can keep working from the cache if you are the only user of the data and do not mind downloading it again later. Other people should not use your cache, because:
- the climetlab cache will eventually fill up, and the data may then be deleted automatically;
- you would need to deal with permission issues;
- it makes the data difficult to share with other people.
Let us export the data to a shared directory, shared-data/temperature-for-analysis.
[4]:
# Some housekeeping
!rm -rf shared-data/temperature-for-analysis
!mkdir -p shared-data/temperature-for-analysis
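The housekeeping cell above relies on shell commands. Where a shell is not available, the same reset could be sketched in plain Python with the standard library. The helper name `reset_directory` is hypothetical and not part of climetlab.

```python
import shutil
from pathlib import Path


def reset_directory(path):
    """Remove `path` if it exists, then recreate it empty.

    Hypothetical helper: a rough equivalent of `rm -rf` followed by `mkdir -p`.
    """
    target = Path(path)
    if target.is_dir():
        shutil.rmtree(target)  # delete the directory and everything in it
    elif target.exists():
        target.unlink()  # a plain file is in the way: remove it
    target.mkdir(parents=True)  # create missing parent directories as well
    return target
```

Calling `reset_directory("shared-data/temperature-for-analysis")` would then leave an empty directory ready for the export step.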
[5]:
# Export all data from my cache that comes from mars and is not older than 1 hour
!climetlab export_cache shared-data/temperature-for-analysis --newer 1h --match mars
Copying cache entries matching 'mars' and newer than '2023-03-11 13:29:29' to shared-data/temperature-for-analysis.
100%|██████████████████████████████████████████| 13/13 [00:00<00:00, 367.98it/s]
Copied 13 cache entries to shared-data/temperature-for-analysis.
Optionally, create indexes to speed up later access to the data.
[ ]:
!climetlab index_directory shared-data/temperature-for-analysis
[ ]:
!climetlab availability shared-data/temperature-for-analysis
Using the data
[18]:
DATA = "shared-data/temperature-for-analysis"
[19]:
source = cml.load_source("indexed-directory", DATA)
[20]:
source.availability
[20]:
class=od, domain=g, expver=0001, levtype=sfc, md5_grid_section=ce1bd075c48ae7a5bf34f4e47166e942, step=0, stream=oper, time=1200, type=an
    date=20120101/to/20121231, param=2t
    date=20121201, param=msl
This is a good time to check the data: is everything there? Are any dates or parameters missing?
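One way to make the "missing dates" check concrete is to compare the dates actually present against the full expected range. The sketch below uses only the standard library. The helper `missing_dates` and the sample input are illustrative assumptions; in practice the list of present dates would have to come from the data itself, for example from the time coordinate of an xarray dataset.

```python
from datetime import date, timedelta


def missing_dates(present, first, last):
    """Return the days in [first, last] that are absent from `present`, in order."""
    present = set(present)
    n_days = (last - first).days + 1
    expected = (first + timedelta(days=i) for i in range(n_days))
    return [d for d in expected if d not in present]


# Illustrative input: every day of 2012 except two that are dropped on purpose.
all_2012 = [date(2012, 1, 1) + timedelta(days=i) for i in range(366)]
dropped = {date(2012, 2, 29), date(2012, 7, 14)}
present = [d for d in all_2012 if d not in dropped]

gaps = missing_dates(present, date(2012, 1, 1), date(2012, 12, 31))
print(gaps)  # the two dropped days
```

An empty result means that every day in the requested range is covered.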
The data is ready to be used as a NumPy array, a TensorFlow dataset or an xarray object.
[11]:
source.sel(param="msl").to_numpy().mean()
[11]:
101725.47522756307
[22]:
cml.load_source("indexed-directory", DATA, param="msl").to_numpy().mean()
[22]:
101725.47522756307
[23]:
temp = source.sel(param="2t").order_by("date")
temp.to_tfdataset()
[23]:
<PrefetchDataset element_spec=TensorSpec(shape=<unknown>, dtype=tf.float32, name=None)>
[24]:
temp.to_xarray()
[24]:
<xarray.Dataset>
Dimensions:     (number: 1, time: 366, step: 1, surface: 1, latitude: 31, longitude: 101)
Coordinates:
  * number      (number) int64 0
  * time        (time) datetime64[ns] 2012-01-01T12:00:00 ... 2012-12-31T12:0...
  * step        (step) timedelta64[ns] 00:00:00
  * surface     (surface) float64 0.0
  * latitude    (latitude) float64 50.0 49.0 48.0 47.0 ... 23.0 22.0 21.0 20.0
  * longitude   (longitude) float64 -50.0 -49.0 -48.0 -47.0 ... 48.0 49.0 50.0
    valid_time  (time, step) datetime64[ns] 2012-01-01T12:00:00 ... 2012-12-3...
Data variables:
    t2m         (number, time, step, surface, latitude, longitude) float32 ...
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2023-03-11T14:35 GRIB to CDM+CF via cfgrib-0.9.1...
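The dimensions reported for the dataset (latitude: 31, longitude: 101, time: 366) can be cross-checked against the request parameters used for the download. The short calculation below is a sketch of that arithmetic. It assumes the `area` list is ordered north, west, south, east and that both bounds are included in the grid; `n_points` is a hypothetical helper.

```python
from datetime import date


def n_points(start, stop, step):
    """Number of grid points from start to stop (both included), `step` degrees apart."""
    return round(abs(stop - start) / step) + 1


# Values from the request: area=[50, -50, 20, 50], grid=[1, 1]
north, west, south, east = 50, -50, 20, 50  # assumed order: N, W, S, E
dlon, dlat = 1, 1

n_lat = n_points(north, south, dlat)
n_lon = n_points(west, east, dlon)
n_days = (date(2012, 12, 31) - date(2012, 1, 1)).days + 1  # 2012 is a leap year

print(n_lat, n_lon, n_days)     # 31 101 366
print(n_lat * n_lon)            # 3131 points per field
print(n_lat * n_lon * n_days)   # 1145946 values in total
```

If the dataset dimensions differ from these numbers, some fields are probably missing or the request used a different area or grid.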
[25]:
# Note: the output below is wrong. It still lists msl although only 2t was selected (not implemented yet).
temp.availability
[25]:
class=od, domain=g, expver=0001, levtype=sfc, md5_grid_section=ce1bd075c48ae7a5bf34f4e47166e942, step=0, stream=oper, time=1200, type=an
    date=20120101/to/20121231, param=2t
    date=20121201, param=msl