Datasets plugins
A Dataset is a Python class that provides a curated set of data with specific helper functions. CliMetLab has built-in example datasets for demo purposes. See usage details in Dataset (User guide) and implementation in Dataset (Dev guide). Datasets are added with a pip plugin or YAML files.
Simple datasets using YAML files
Simple datasets rely on an existing built-in data source and cannot be parametrised by users. An example is a single file downloadable from a URL.
---
dataset:
  source: url
  args:
    url: http://download.ecmwf.int/test-data/metview/gallery/temp.bufr
  metadata:
    documentation: Sample BUFR file containing TEMP messages
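To make the structure of the YAML description above concrete, here is a minimal sketch that mirrors it as a plain Python dictionary and extracts the source name and its arguments. The `resolve` helper and the `spec` variable are hypothetical and exist only for illustration; this is not CliMetLab's own loading code.

```python
# Hypothetical in-memory form of the YAML dataset description.
spec = {
    "dataset": {
        "source": "url",
        "args": {
            "url": "http://download.ecmwf.int/test-data/metview/gallery/temp.bufr"
        },
        "metadata": {
            "documentation": "Sample BUFR file containing TEMP messages"
        },
    }
}


def resolve(spec):
    """Return (source_name, source_args) from a dataset description.

    Illustrative helper only; CliMetLab's real loader may differ.
    """
    dataset = spec["dataset"]
    # "args" is treated as optional here; this is an assumption of the sketch.
    return dataset["source"], dict(dataset.get("args", {}))


source_name, source_args = resolve(spec)
```

The point of the sketch is that a simple dataset carries no code of its own: it only names a built-in data source and fixes its arguments, which is why users cannot parametrise it.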
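Complex datasets using pip plugin
See https://github.com/ecmwf/climetlab-demo-dataset for a complete example. The plugin package registers its dataset class under the climetlab.datasets entry point group when calling setuptools.setup: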
import setuptools

# Register the DemoDataset class under the "climetlab.datasets" entry point group.
setuptools.setup(
    name="climetlab-demo-dataset",
    version="0.0.1",
    description="Example climetlab external dataset plugin",
    entry_points={
        "climetlab.datasets": [
            "demo-dataset = climetlab_demo_dataset:DemoDataset"
        ]
    },
)
See CliMetLab plugin mechanism.
See an example notebook using an external plugin.
See the Python documentation on plugins.
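As a rough illustration of what the entry point registration achieves, the sketch below lists whatever is registered under the `climetlab.datasets` group in the current environment, using only the standard library. The `list_dataset_plugins` helper is hypothetical, and this is not necessarily how CliMetLab itself discovers plugins.

```python
from importlib.metadata import entry_points


def list_dataset_plugins(group="climetlab.datasets"):
    """Map entry point names to their "module:object" targets for a group.

    Illustrative helper; returns an empty dict when no plugin is installed.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


plugins = list_dataset_plugins()
for name, target in sorted(plugins.items()):
    print(f"{name} -> {target}")
```

If a plugin package such as the demo one is installed with pip, its entry point name and target should appear in this mapping; with no plugin installed, the loop prints nothing.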
Automatic generation of a pip package
To simplify creating a Dataset plugin, a cookiecutter template is available. For a simple dataset, you can instead use a YAML file and rely only on the code provided by CliMetLab or other plugins.
pip install cookiecutter
cookiecutter https://github.com/ecmwf-lab/climetlab-cookiecutter/dataset