modelrunner.storage.base module
Base classes for managing hierarchical data storage.
The storage classes provide a low-level abstraction for storing data in a hierarchical format and should thus not be used directly. Instead, the user typically interacts with StorageGroup objects, e.g., as returned by open_storage().
The role of StorageBase is to ensure access rights and provide an interface that can be specified easily by subclasses to provide new storage formats. In contrast, the interface of StorageGroup is more user-friendly and provides additional convenience methods.
The main structure of the storage is a hierarchical tree of groups, which can contain other groups or specific data items. Currently, items can be either arrays or arbitrary objects, which are serialized transparently. Moreover, each group and each item can have attributes: a mapping with string keys and arbitrary values, which are also serialized transparently. Note that keys with double underscores are reserved for internal use and should thus not be used.
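The tree structure described above can be illustrated with a small, self-contained sketch. This is not modelrunner's implementation: the names `Node`, `check_attr_key`, and `lookup` are hypothetical, and the rule that any key containing `__` is rejected is one assumed reading of "keys with double underscores".

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


def check_attr_key(key: str) -> str:
    """Reject keys containing a double underscore (assumed meaning of 'reserved')."""
    if "__" in key:
        raise KeyError(f"attribute key {key!r} is reserved for internal use")
    return key


@dataclass
class Node:
    """Hypothetical tree node: a group (has children) or an item (has data)."""

    attrs: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)
    data: Any = None

    def set_attr(self, key: str, value: Any) -> None:
        # Attributes are a mapping from string keys to arbitrary values
        self.attrs[check_attr_key(key)] = value


def lookup(root: Node, loc: list) -> Node:
    """Walk the tree along a location given as a list of names."""
    node = root
    for name in loc:
        node = node.children[name]
    return node


# Build a tiny tree: root group -> group "run_1" -> item "result"
root = Node()
root.children["run_1"] = Node()
root.children["run_1"].children["result"] = Node(data=[1.0, 2.0, 3.0])
root.children["run_1"].set_attr("temperature", 300)

item = lookup(root, ["run_1", "result"])
```

Locations are lists of names, mirroring the `loc (list of str)` parameters documented below.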
- class StorageBase(*, mode='read')[source]
Bases:
object
base class for storing data
- Parameters:
mode (str or AccessMode) – The file mode with which the storage is accessed. Determines allowed operations.
- close()[source]
closes the storage, potentially writing data to a persistent place
- Return type:
None
- property codec: Codec
A codec used to encode binary data
- Type:
Codec
- create_dynamic_array(loc, shape, *, dtype=<class 'float'>, record_array=False, attrs=None, cls=None)[source]
creates a dynamic array of flexible size
- Parameters:
loc (list of str) – The location in the storage where the dynamic array is created
shape (tuple of int) – The shape of the individual arrays. A singular axis is prepended to the shape, which can then be extended subsequently.
dtype (DTypeLike) – The data type of the array to be written
record_array (bool) – Flag indicating whether the array is of type recarray
attrs (dict, optional) – Attributes stored with the array
cls (type) – A class associated with this array
- Return type:
None
- create_group(loc, *, attrs=None, cls=None)[source]
create a new group at a particular location
- Parameters:
- Returns:
The reference of the new group
- Return type:
StorageGroup
- default_codec = Pickle(protocol=5)
the default codec used for encoding binary data
- Type:
numcodecs.Codec
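The default codec's name suggests pickle-based serialization with protocol 5. As a rough sketch of what such an encoding round trip looks like, the example below uses only Python's standard `pickle` module. It does not call numcodecs or modelrunner, and the object being stored is made up for illustration.

```python
import pickle

# Hypothetical object to store; any picklable Python object would do
obj = {"name": "run_1", "values": [1, 2, 3]}

# Encode to bytes using pickle protocol 5 (available since Python 3.8)
encoded = pickle.dumps(obj, protocol=5)

# Decode the bytes back into an equivalent Python object
decoded = pickle.loads(encoded)
```

Because pickle can execute arbitrary code on loading, data encoded this way should only be read from trusted sources.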
- ensure_group(loc)[source]
ensures that a group exists in the storage
If the group is not already in the storage, it is created (recursively).
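The recursive creation described here can be sketched with plain nested dictionaries. This is only an illustration of the idea, assuming each group is a `dict` keyed by name. It is a free function that mirrors the method's name, not the library's code.

```python
def ensure_group(tree: dict, loc: list) -> dict:
    """Return the group at `loc`, creating missing intermediate groups."""
    group = tree
    for name in loc:
        # setdefault creates the sub-group only if it does not exist yet
        group = group.setdefault(name, {})
    return group


tree = {}
inner = ensure_group(tree, ["a", "b", "c"])  # creates a, a/b, and a/b/c
inner["x"] = 1

# A second call finds the existing group and leaves its content untouched
again = ensure_group(tree, ["a", "b", "c"])
```

Calling the function repeatedly is harmless, which is the property that makes an "ensure" operation convenient before writing.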
- mode: AccessMode
access mode
- Type:
AccessMode
- read_array(loc, *, out=None, index=None)[source]
read an array from a particular location
- Parameters:
- Returns:
An array containing the data. Identical to out if specified.
- Return type:
- write_array(loc, arr, *, attrs=None, cls=None)[source]
write an array to a particular location
- Parameters:
loc (list of str) – The location in the storage where the array is written
arr (ndarray) – The array that will be written
attrs (dict, optional) – Attributes stored with the array
cls (type) – A class associated with this array. The class will be used to re-create the object when this array is later accessed. If no class is supplied, a generic modelrunner.storage.utils.Array will be returned.
- Return type:
None
- write_object(loc, obj, *, attrs=None, cls=None)[source]
write an object to a particular location
- Parameters:
loc (list of str) – The location in the storage where the object is written
obj (Any) – The object that will be written
attrs (dict, optional) – Attributes stored with the object
cls (type) – A class associated with this object. The class will be used to re-create the object when this object is later accessed. If no class is supplied, a generic Python object will be returned.
- Return type:
None