xdas.open_mfdatatree

xdas.open_mfdatatree(paths, dim='first', tolerance=None, squeeze=False, engine=None, verbose=False, parallel=None, **kwargs)

Open a directory tree structure as a data collection.

The tree structure is described by a path descriptor, provided as a string containing placeholders. Two flavours of placeholder can be used:

  • {field}: this level of the tree will behave as a dict. It will use the directory/file names as keys.

  • [field]: this level of the tree will behave as a list. The directory/file names are ignored (as if the placeholder were replaced by a * wildcard) and the files are gathered and combined as if using open_mfdataarray.

Several dict placeholders with different names can be provided. They must be followed by one or more list placeholders, all of which must share the same name. The resulting data collection is a nesting of dicts down to the lowest level, which is a list of data arrays.
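
To make the placeholder rules above concrete, here is a minimal sketch that groups file paths into nested dicts ending in a list. It is not how xdas implements this function: the helpers `descriptor_to_regex` and `group_paths` and the sample paths are hypothetical, the sketch only sorts and groups path strings (it does not read or concatenate any data), and it assumes the descriptor contains at least one dict placeholder and that dict names are distinct.

```python
import re

# Matches a dict placeholder "{name}" or a list placeholder "[name]".
PLACEHOLDER = re.compile(r"\{(\w+)\}|\[(\w+)\]")


def descriptor_to_regex(descriptor):
    """Return (compiled regex, dict field names) for a path descriptor."""
    pattern, fields, pos = "", [], 0
    for match in PLACEHOLDER.finditer(descriptor):
        # Literal text between placeholders must match exactly.
        pattern += re.escape(descriptor[pos:match.start()])
        dict_name, _list_name = match.groups()
        if dict_name is not None:
            # Dict level: capture the directory/file name as a key.
            fields.append(dict_name)
            pattern += f"(?P<{dict_name}>[^/]+)"
        else:
            # List level: the name is ignored, like a "*" wildcard.
            pattern += "[^/]+"
        pos = match.end()
    pattern += re.escape(descriptor[pos:])
    return re.compile(pattern + r"\Z"), fields


def group_paths(descriptor, paths):
    """Group matching paths into nested dicts whose leaves are lists."""
    regex, fields = descriptor_to_regex(descriptor)
    if not fields:
        raise ValueError("this sketch expects at least one dict placeholder")
    tree = {}
    for path in sorted(paths):
        match = regex.match(path)
        if match is None:
            continue  # path does not fit the descriptor
        node = tree
        for name in fields[:-1]:
            node = node.setdefault(match[name], {})
        node.setdefault(match[fields[-1]], []).append(path)
    return tree


# Hypothetical paths, used only for illustration.
paths = [
    "/data/CCN/N/acq_a/proc/0001.h5",
    "/data/CCN/N/acq_b/proc/0001.h5",
    "/data/SER/S/acq_a/proc/0001.h5",
    "/data/SER/S/acq_a/proc/0002.h5",
]
descriptor = "/data/{node}/{cable}/[acquisition]/proc/[acquisition].h5"
tree = group_paths(descriptor, paths)
print(tree)
```

In this sketch the `{node}` and `{cable}` levels become dict keys, while everything matched by the `[acquisition]` placeholders ends up in a single list per cable.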

Parameters:
  • paths (str) – The path descriptor.

  • dim (str, optional) – The dimension along which the data arrays are concatenated. Defaults to “first”.

  • tolerance (float or timedelta64, optional) – During concatenation, the tolerance within which the end of a file is considered continuous with the beginning of the following one. For time coordinates, numeric values are interpreted as seconds. Defaults to zero tolerance.

  • squeeze (bool, optional) – Whether to return a DataArray instead of a DataCollection if the combination results in a data collection containing a unique data array.

  • engine (str or callable, optional) – The type of file to open or a read function. Defaults to the xdas netcdf format.

  • parallel (bool or int, optional) – Whether to use multiprocessing to fetch file metadata. If False or 1, runs in single-process mode. If an integer, uses that many processes. If True, uses as many processes as there are available cores. If None, uses the global xdas configuration. Defaults to None.

  • verbose (bool, optional) – Whether to display a progress bar. Defaults to False.

  • **kwargs – Additional keyword arguments to be passed to the read function.

Returns:

The collected data.

Return type:

DataCollection
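
The rules stated for the parallel parameter can be summarized as a small decision function. The following is only a sketch of those rules as written above, not the xdas implementation: `resolve_n_workers` is a hypothetical helper, and its `default` argument is an assumed stand-in for the global xdas configuration.

```python
import os


def resolve_n_workers(parallel, default=False):
    """Map a `parallel` argument to a number of worker processes."""
    if parallel is None:
        # Assumption: `default` plays the role of the global configuration.
        parallel = default
    # Check booleans by identity first, since bool is a subclass of int.
    if parallel is True:
        return os.cpu_count() or 1  # as many processes as available cores
    if parallel is False:
        return 1  # single-process mode
    n = int(parallel)
    if n < 1:
        raise ValueError("parallel must be a bool or a positive integer")
    return n


print(resolve_n_workers(False))            # single process
print(resolve_n_workers(4))                # four processes
print(resolve_n_workers(None, default=2))  # falls back to the default
```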

Examples

>>> import xdas as xd
>>> paths = "/data/{node}/{cable}/[acquisition]/proc/[acquisition].h5"
>>> xd.open_mfdatatree(paths, engine="asn")
Node:
  CCN:
    Cable:
      N:
        Acquisition:
          0: <xdas.DataArray (time: ..., distance: ...)>
          1: <xdas.DataArray (time: ..., distance: ...)>
  SER:
    Cable:
      N:
        Acquisition:
          0: <xdas.DataArray (time: ..., distance: ...)>
      S:
        Acquisition:
          0: <xdas.DataArray (time: ..., distance: ...)>
          1: <xdas.DataArray (time: ..., distance: ...)>
          2: <xdas.DataArray (time: ..., distance: ...)>