xdas.virtual.VirtualSource

class xdas.virtual.VirtualSource(path_or_dataset, name=None, shape=None, dtype=None, maxshape=None)

A lazy array object pointing toward a netCDF4/HDF5 file.

At creation, the array corresponds to an entire dataset in the file. It can then be sliced to indicate which regions should be used. A sliced VirtualSource can eventually be assigned to a VirtualLayout.

Best practice is to pass it an h5py.Dataset obtained from an h5py.File. Otherwise, the exact filename, dataset name, shape and dtype must be passed.

The data can be accessed using numpy.asarray or the __array__ special method.

Parameters:
  • path_or_dataset (str or h5py.Dataset) – The path to a file, or an h5py dataset. If a dataset is given, no other parameters are allowed, as the relevant values are taken from the dataset instead.

  • name (str, optional) – The name of the source dataset within the file.

  • shape (tuple of int, optional) – A tuple giving the shape of the dataset.

  • dtype (dtype or str, optional) – Numpy dtype or string.

  • maxshape (tuple or int or None, optional) – The source dataset is resizable up to this shape. Use None for axes you want to be unlimited.

vsource

The underlying sliced virtual source.

Type:

h5py.VirtualSource

shape

The shape of the source.

Type:

tuple of int

dtype

The dtype of the source.

Type:

dtype

ndim

The number of dimensions of the source.

Type:

int

nbytes

The number of bytes virtually linked into the source.

Type:

int

to_dataset(file_or_group, name)

Puts the source into a layout and writes it virtually into the specified HDF5 file or group with the given name.

Examples

>>> import os
>>> from tempfile import TemporaryDirectory
>>> import h5py
>>> import numpy as np
>>> from xdas.virtual import VirtualSource
>>> with TemporaryDirectory() as tmpdir:
...     shape = (2, 3, 5)
...     data = np.arange(np.prod(shape)).reshape(*shape)
...     with h5py.File(os.path.join(tmpdir, "source.h5"), "w") as file:
...         file.create_dataset("data", data.shape, data.dtype, data)
...         source = VirtualSource(file["data"])  # we both write and get source here
...     source = source[1:-1]  # the source can be sliced
...     result = np.asarray(source)
...     assert np.array_equal(result, data[1:-1])
<...>
__init__(path_or_dataset, name=None, shape=None, dtype=None, maxshape=None)

Methods

__init__(path_or_dataset[, name, shape, ...])

create_variable(file, name[, dims, dtype])

Write this virtual array into file and register it as a named variable.

to_dataset(file_or_group, name)

Write this source as an HDF5 virtual dataset in file_or_group.

Attributes

dtype

NumPy dtype of the source dataset.

empty

True if the array contains no elements.

nbytes

Total number of bytes occupied by the array elements.

ndim

Number of dimensions.

shape

Shape of the selected region of the source dataset.

size

Total number of elements.

vsource

Underlying h5py.VirtualSource with the current selection applied.