modelrunner.storage.base module

Base classes for managing hierarchical data storage

The storage classes provide a low-level abstraction for storing data in a hierarchical format and should thus not be used directly. Instead, users typically interact with StorageGroup objects, e.g., those returned by open_storage().

The role of StorageBase is to enforce access rights and to provide an interface that subclasses can implement easily to support new storage formats. In contrast, the interface of StorageGroup is more user-friendly and provides additional convenience methods.

The storage is organized as a hierarchical tree of groups, each of which can contain other groups or specific data items. Currently, items can be either arrays or arbitrary objects, which are serialized transparently. Moreover, each group and each item can have attributes: a mapping with string keys and arbitrary values, which are also serialized transparently. Note that keys with double underscores are reserved for internal use and should thus not be used.
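The tree structure described above can be pictured with plain dictionaries. The following is a minimal sketch, not the modelrunner implementation: the helper names new_group and check_key, the dictionary layout, and the rule that any key containing "__" is rejected are assumptions made for illustration.

```python
RESERVED_MARKER = "__"  # assumption: any key containing "__" counts as reserved


def new_group() -> dict:
    """Return an empty toy group holding child groups, items, and attributes."""
    return {"groups": {}, "items": {}, "attrs": {}}


def check_key(key: str) -> str:
    """Return the key unchanged, or raise KeyError if it looks reserved."""
    if RESERVED_MARKER in key:
        raise KeyError(f"key {key!r} is reserved for internal use")
    return key


root = new_group()

# a group nested inside the root group
root["groups"][check_key("simulation")] = new_group()
sim = root["groups"]["simulation"]

# a data item that carries its own attributes
sim["items"][check_key("result")] = {"data": [1.0, 2.0, 3.0], "attrs": {"unit": "m"}}

# attributes attached to the group itself
sim["attrs"][check_key("comment")] = "first run"
```

Groups and items both carry an attrs mapping here, mirroring the statement that attributes can be attached at either level.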

class StorageBase(*, mode='read')[source]

Bases: object

base class for storing data

Parameters:

mode (str or AccessMode) – The file mode with which the storage is accessed. Determines allowed operations.

property can_update: bool

indicates whether the storage supports updating items

Type:

bool

close()[source]

closes the storage, potentially writing data to a persistent place

Return type:

None

property closed: bool

determines whether the storage has been closed

Type:

bool

property codec: Codec

A codec used to encode binary data

Type:

Codec

create_dynamic_array(loc, shape, *, dtype=<class 'float'>, record_array=False, attrs=None, cls=None)[source]

creates a dynamic array of flexible size

Parameters:
  • loc (list of str) – The location in the storage where the dynamic array is created

  • shape (tuple of int) – The shape of the individual arrays. A singular axis is prepended to the shape; the array can subsequently be extended along this axis.

  • dtype (DTypeLike) – The data type of the array to be written

  • record_array (bool) – Flag indicating whether the array is of type recarray

  • attrs (dict, optional) – Attributes stored with the array

  • cls (type) – A class associated with this array

Return type:

None

create_group(loc, *, attrs=None, cls=None)[source]

create a new group at a particular location

Parameters:
  • loc (list of str) – The location in the storage where the group will be created

  • attrs (dict, optional) – Attributes stored with the group

  • cls (type) – A class associated with this group. The class will be used to re-create the object when this group is later accessed directly.

Returns:

The reference of the new group

Return type:

StorageGroup

default_codec = Pickle(protocol=5)

the default codec used for encoding binary data

Type:

numcodecs.Codec
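To illustrate what encoding with pickle protocol 5 amounts to, here is a small round trip. It uses the standard-library pickle module instead of numcodecs to stay dependency-free, so it is only an approximation of what the default codec presumably does internally; the payload is made up for illustration.

```python
import pickle

# an arbitrary Python object standing in for data handed to the storage
payload = {"name": "run-1", "values": [1, 2, 3], "nested": {"ok": True}}

# encode to bytes using pickle protocol 5 (available since Python 3.8)
encoded = pickle.dumps(payload, protocol=5)

# decode the bytes back into an equal object
decoded = pickle.loads(encoded)
```

Because pickle can execute arbitrary code while decoding, data encoded this way should only be loaded from trusted sources.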

ensure_group(loc)[source]

ensures that a group exists in the storage

If the group is not already in the storage, it is created (recursively).

Parameters:

loc (list of str) – The group location in the storage

Return type:

None
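The recursive creation of missing groups can be sketched with nested dictionaries. This toy function only mirrors the described behavior; the extra tree argument and the dictionary representation are assumptions and do not reflect how a concrete storage backend works.

```python
def ensure_group(tree: dict, loc: list) -> None:
    """Create every missing group along `loc` in the nested-dict `tree`."""
    node = tree
    for name in loc:
        # setdefault keeps an existing group and creates an empty one otherwise
        node = node.setdefault(name, {})


storage: dict = {}
ensure_group(storage, ["a", "b", "c"])  # creates "a", "a/b", and "a/b/c"
ensure_group(storage, ["a", "b"])       # already present, nothing changes
ensure_group(storage, ["a", "x"])       # only "a/x" is new
```

Calling the function repeatedly with the same location is harmless, which is the point of an "ensure" operation.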

extend_dynamic_array(loc, arr)[source]

extend a previously created dynamic array

Parameters:
  • loc (list of str) – The location in the storage where the dynamic array is located

  • arr (array) – The array that will be appended to the dynamic array

Return type:

None
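The interplay of create_dynamic_array() and extend_dynamic_array() can be pictured as follows: every appended array must have the shape fixed at creation, and the leading axis counts how many arrays have been appended. ToyDynamicArray and shape_of are hypothetical helpers built on nested lists; the toy starts with an empty leading axis, which is an assumption and may differ from the actual backends.

```python
def shape_of(data) -> tuple:
    """Return the shape of a regularly nested list (a scalar has shape ())."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


class ToyDynamicArray:
    """Hypothetical stand-in for a dynamic array stored in a backend."""

    def __init__(self, shape):
        self.item_shape = tuple(shape)  # shape of each individual array
        self.rows = []                  # one entry per appended array

    @property
    def shape(self) -> tuple:
        # the prepended axis grows with every call to extend()
        return (len(self.rows),) + self.item_shape

    def extend(self, arr) -> None:
        """Append one array after checking that its shape matches."""
        if shape_of(arr) != self.item_shape:
            raise ValueError(f"expected shape {self.item_shape}, got {shape_of(arr)}")
        self.rows.append(arr)


dyn = ToyDynamicArray(shape=(2, 3))
dyn.extend([[1, 2, 3], [4, 5, 6]])
dyn.extend([[7, 8, 9], [10, 11, 12]])
```

After the two calls, dyn.shape evaluates to (2, 2, 3): two appended arrays of shape (2, 3).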

extensions: list[str] = []

all file extensions supported by this storage

Type:

list of str

flush()[source]

write (cached) data to storage

Return type:

None

abstract is_group(loc)[source]

determine whether the location is a group

Parameters:

loc (sequence of str) – A list of strings determining the location in the storage

Returns:

True if the location is a group

Return type:

bool

abstract keys(loc)[source]

return all sub-items defined at a given location

Parameters:

loc (sequence of str) – A list of strings determining the location in the storage

Returns:

a list of all items defined at this location

Return type:

list
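Since is_group() and keys() are abstract, each storage format has to supply them. The sketch below shows the general pattern with Python's abc module. ToyStorageBase and ToyMemoryStorage are hypothetical classes written for this example; they are not part of modelrunner, and the real StorageBase likely requires further methods.

```python
from abc import ABC, abstractmethod


class ToyStorageBase(ABC):
    """Hypothetical base class declaring the two abstract methods."""

    @abstractmethod
    def is_group(self, loc) -> bool:
        """Return True if the location is a group."""

    @abstractmethod
    def keys(self, loc) -> list:
        """Return all sub-items defined at the location."""


class ToyMemoryStorage(ToyStorageBase):
    """Toy backend that treats nested dicts as groups and other values as items."""

    def __init__(self, tree: dict):
        self._tree = tree

    def _get(self, loc):
        # walk down the tree following the sequence of names in loc
        node = self._tree
        for name in loc:
            node = node[name]
        return node

    def is_group(self, loc) -> bool:
        return isinstance(self._get(loc), dict)

    def keys(self, loc) -> list:
        return list(self._get(loc))


storage = ToyMemoryStorage({"run": {"data": [1, 2, 3], "sub": {}}})
```

Here storage.is_group(["run"]) is True, storage.is_group(["run", "data"]) is False, and storage.keys(["run"]) lists "data" and "sub". Instantiating ToyStorageBase directly raises TypeError because its abstract methods are not implemented.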

mode: AccessMode

access mode

Type:

AccessMode

read_array(loc, *, out=None, index=None)[source]

read an array from a particular location

Parameters:
  • loc (list of str) – The location in the storage where the array is read

  • out (array) – An array to which the results are written

  • index (int, optional) – An index denoting the subarray that will be read

Returns:

An array containing the data. Identical to out if specified.

Return type:

ndarray

read_attrs(loc)[source]

read attributes associated with a particular location

Parameters:

loc (list of str) – The location in the storage where the attributes are read

Returns:

A copy of the attributes at this location

Return type:

dict

read_object(loc)[source]

read an object from a particular location

Parameters:

loc (list of str) – The location in the storage from which the object is read

Returns:

The object that has been read from the storage

Return type:

Any

write_array(loc, arr, *, attrs=None, cls=None)[source]

write an array to a particular location

Parameters:
  • loc (list of str) – The location in the storage where the array is written

  • arr (ndarray) – The array that will be written

  • attrs (dict, optional) – Attributes stored with the array

  • cls (type) – A class associated with this array. The class will be used to re-create the object when this array is later accessed. If no class is supplied, a generic ~modelrunner.storage.utils.Array will be returned.

Return type:

None

write_attrs(loc, attrs)[source]

write attributes to a particular location

Parameters:
  • loc (list of str) – The location in the storage where the attributes are written

  • attrs (dict) – The attributes to be added to this location

Return type:

None

write_object(loc, obj, *, attrs=None, cls=None)[source]

write an object to a particular location

Parameters:
  • loc (list of str) – The location in the storage where the object is written

  • obj (Any) – The object that will be written

  • attrs (dict, optional) – Attributes stored with the object

  • cls (type) – A class associated with this object. The class will be used to re-create the object when this object is later accessed. If no class is supplied, a generic python object will be returned.

Return type:

None