Caching
Warning
This part of CliMetLab is still a work in progress. Documentation and code behaviour will change.
Purpose
CliMetLab stores most remotely accessed data in a local cache. Running cml.load_dataset or cml.load_source again uses the cached data instead of downloading it again.
When the cache is full, cached data is deleted according to the cache policy
(i.e. the oldest data is deleted first).
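The oldest-first policy described above can be illustrated with a minimal sketch. This is not CliMetLab's implementation: the CacheEntry type and the select_entries_to_delete function are hypothetical names, and treating "oldest" as "least recently accessed" is an assumption.

```python
from typing import List, NamedTuple


class CacheEntry(NamedTuple):
    """Hypothetical description of one cached file."""
    path: str           # name of the cached file
    size: int           # size in bytes
    last_access: float  # time of last use, in seconds since the epoch


def select_entries_to_delete(entries: List[CacheEntry], max_size: int) -> List[str]:
    """Return paths to delete, oldest first, until the total size fits in max_size."""
    total = sum(entry.size for entry in entries)
    to_delete = []
    # Assumption: "oldest" means least recently accessed.
    for entry in sorted(entries, key=lambda e: e.last_access):
        if total <= max_size:
            break
        to_delete.append(entry.path)
        total -= entry.size
    return to_delete


entries = [
    CacheEntry("a.grib", 400, last_access=100.0),
    CacheEntry("b.nc", 300, last_access=300.0),
    CacheEntry("c.grib", 500, last_access=200.0),
]
print(select_entries_to_delete(entries, max_size=800))  # ['a.grib']
```

Deletion stops as soon as the remaining entries fit within the limit, so recently used data stays available for as long as possible.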
CliMetLab cache configuration is managed through the CliMetLab Settings.
Warning
The CliMetLab cache is intended to be used by a single user. Sharing a cache between multiple users is not recommended. Downloading a local copy of data to a shared disk so that multiple users can work with it is a different use case and should be supported through mirrors. Feedback and feature requests are welcome.
Cache location
The cache location is defined by the cache-directory setting. Its default value depends on your system:

- /tmp/climetlab-$USER for Linux,
- C:\\Users\\$USER\\AppData\\Local\\Temp\\climetlab-$USER for Windows,
- /tmp/.../climetlab-$USER for macOS.

The cache location can be read and modified either with a shell command or from Python.
Note
It is recommended to restart your Jupyter kernels after changing the cache location.
From a shell with the climetlab command:

    # Find the current cache directory
    $ climetlab settings cache-directory
    /tmp/climetlab-$USER

    # Change the value of the setting
    $ climetlab settings cache-directory /big-disk/climetlab-cache

    # Cache directory has been modified
    $ climetlab settings cache-directory
    /big-disk/climetlab-cache

From a Python notebook or Python script:

    >>> import climetlab as cml
    >>> cml.settings.get("cache-directory")  # Find the current cache directory
    /tmp/climetlab-$USER
    >>> # Change the value of the setting
    >>> cml.settings.set("cache-directory", "/big-disk/climetlab-cache")

    # Python kernel restarted
    >>> import climetlab as cml
    >>> cml.settings.get("cache-directory")  # Cache directory has been modified
    /big-disk/climetlab-cache

More generally, the CliMetLab settings can be read, modified, or reset to their default values using the climetlab command or from Python; see the Settings documentation.
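The system-dependent defaults listed in this section all follow the pattern "temporary directory plus climetlab-$USER". A minimal sketch of how such a default could be derived with the standard library is shown below. The function name default_cache_directory is hypothetical, and the assumption that the default is built from the system temporary directory and the login name is inferred from the documented paths, not taken from the CliMetLab source.

```python
import getpass
import os
import tempfile
from typing import Optional


def default_cache_directory(tmpdir: Optional[str] = None,
                            user: Optional[str] = None) -> str:
    """Build a per-user cache path inside the system temporary directory.

    Both arguments are optional overrides, mainly useful for testing.
    """
    if tmpdir is None:
        tmpdir = tempfile.gettempdir()  # platform-specific temporary directory
    if user is None:
        user = getpass.getuser()        # login name of the current user
    return os.path.join(tmpdir, f"climetlab-{user}")


# Explicit values make the result independent of the machine running the code.
print(default_cache_directory(tmpdir="/tmp", user="alice"))
```

Calling default_cache_directory() without arguments returns the path for the current user and platform, which can be compared with the value reported by the cache-directory setting.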
Cache limits
- Maximum-cache-size
  The maximum-cache-size setting ensures that CliMetLab does not use too much disk space. Its value sets the maximum disk space used by the CliMetLab cache. When the cache disk usage goes above this limit, CliMetLab triggers its cache cleaning mechanism before downloading additional data. The value of maximum-cache-size is absolute (such as “10G”, “10M”, “1K”).
- Maximum-cache-disk-usage
  The maximum-cache-disk-usage setting ensures that CliMetLab does not fill your disk. Its value sets the maximum disk usage of the filesystem containing the cache directory. When the disk usage goes above this limit, CliMetLab triggers its cache cleaning mechanism before downloading additional data. The value of maximum-cache-disk-usage is relative (such as “90%” or “100%”).
Warning
If your disk is filled by another application, CliMetLab will happily delete its cached data to make room for the other application as soon as it has a chance.
Note
When tweaking the cache settings, it is recommended to set maximum-cache-size to a value below the user disk quota (if applicable) and maximum-cache-disk-usage to None.
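To make the interaction between the two limits concrete, here is a minimal sketch of how an absolute limit ("10G") and a relative limit ("90%") could be interpreted and checked. The helper names parse_size, parse_percent and needs_cleaning are hypothetical, and the use of binary multiples (1K = 1024 bytes) is an assumption; CliMetLab's own parsing and cleaning logic may differ.

```python
from typing import Optional

# Assumption: binary multiples; CliMetLab may interpret the units differently.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value: Optional[str]) -> Optional[int]:
    """Convert an absolute size such as "10G" to bytes; None means no limit."""
    if value is None:
        return None
    value = value.strip().upper()
    if value and value[-1] in _UNITS:
        return int(float(value[:-1]) * _UNITS[value[-1]])
    return int(value)


def parse_percent(value: Optional[str]) -> Optional[float]:
    """Convert a relative limit such as "90%" to a fraction; None means no limit."""
    if value is None:
        return None
    return float(value.strip().rstrip("%")) / 100.0


def needs_cleaning(cache_bytes: int, disk_used: int, disk_total: int,
                   max_size: Optional[str] = None,
                   max_disk_usage: Optional[str] = None) -> bool:
    """Return True if either configured limit is exceeded."""
    size_limit = parse_size(max_size)
    usage_limit = parse_percent(max_disk_usage)
    if size_limit is not None and cache_bytes > size_limit:
        return True
    if usage_limit is not None and disk_used / disk_total > usage_limit:
        return True
    return False


# A 2 GiB cache on a disk that is 95% full: only the relative limit is exceeded.
print(needs_cleaning(2 * 1024**3, disk_used=95, disk_total=100,
                     max_size="10G", max_disk_usage="90%"))  # True
```

With max_disk_usage left as None, only the absolute limit is checked, which matches the recommendation in the note above for users working under a disk quota. In a real check, the disk figures could come from shutil.disk_usage on the cache directory.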
Caching settings default values
Name | Default | Description |
---|---|---|
cache-directory | '/tmp/climetlab-docs' | Directory where the downloaded files are cached, where ${USER} is the user id. See Caching for more information. |
maximum-cache-disk-usage | '90%' | Disk usage threshold after which CliMetLab expires older cached entries (% of the full disk capacity). See Caching for more information. |
maximum-cache-size | None | Maximum disk space used by the CliMetLab cache (ex: 100G or 2T). |
Other CliMetLab settings can be found here.