Interpolated Coordinates
Coordinate
Because DAS data are generally sampled at a constant rate and spatial resolution, storing the
coordinate value of every index as a dense array is inefficient. Instead, xdas stores
coordinates following the CF convention, through the
xdas.Coordinate object. With this method, only a few tie points are kept, and intermediate
values are retrieved by linear interpolation. A discontinuity is marked by two
tie points at consecutive indices, as illustrated below:
The resulting coordinate vector is sparse but contains all the information necessary to exactly recover the original, dense coordinate vector.
Creating a Coordinate
The xdas.Coordinate constructor takes tie_indices and tie_values as inputs.
The code below corresponds to the example illustrated in the figure above:
import xdas as xd
coord = xd.Coordinate(
{
"tie_indices": [0, 9, 19, 20, 29],
"tie_values": [0.0, 90.0, 190.0, 400.0, 490.0]
}
)
coord
0.000 to 490.000
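To make the encoding concrete, the following pure-Python sketch rebuilds the dense vector from the tie points above by linear interpolation. It is an illustration only, not how xdas implements it; the helper name `densify` is hypothetical, and the sketch assumes the first tie index is 0.

```python
def densify(tie_indices, tie_values):
    """Expand tie points into the dense coordinate vector they encode.

    Hypothetical helper for illustration; assumes tie_indices[0] == 0.
    """
    dense = [float(tie_values[0])]
    pairs = list(zip(tie_indices, tie_values))
    # Walk over each pair of neighbouring tie points and interpolate linearly.
    for (i0, v0), (i1, v1) in zip(pairs, pairs[1:]):
        for i in range(i0 + 1, i1 + 1):
            dense.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    return dense


dense = densify([0, 9, 19, 20, 29], [0.0, 90.0, 190.0, 400.0, 490.0])
print(len(dense))    # 30
print(dense[18:22])  # [180.0, 190.0, 400.0, 410.0]
```

The jump from 190.0 to 400.0 between indices 19 and 20 is the discontinuity encoded by the two tie points at consecutive indices. Five tie points are enough to describe all 30 samples.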
The resulting object behaves like a numpy.ndarray. Indexing and
selecting work out of the box. Note that when slicing with a step greater than 1, the tie points can be shifted slightly.
coord = coord[1:-3:2]
coord
10.000 to 450.000
A major advantage of xdas.Coordinate is that it enables label-based selection.
For instance, to retrieve the index of a value, the to_index() method can be used:
coord.to_index(430.0)
np.int64(11)
Warning
Label-based selection requires tie_values to be strictly increasing.
In other words, there must not be any overlap. Small overlaps can be removed by
simplifying the coordinate with a tolerance large enough that the overlapping tie points
disappear.
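The requirement in the warning follows from how a label lookup can work: the piecewise-linear mapping from index to value is inverted, which is only unambiguous when the values increase monotonically. The sketch below is a minimal illustration under that assumption, not the xdas implementation; `value_to_index` is a hypothetical name, and rounding to the nearest index is a choice made here. It uses the original, unsliced coordinate, so the indices differ from the sliced example above.

```python
from bisect import bisect_right


def value_to_index(tie_indices, tie_values, value):
    """Return the nearest integer index for `value` (hypothetical helper).

    Requires strictly increasing tie_values: bisect needs sorted input,
    and the division below needs v1 != v0.
    """
    if not tie_values[0] <= value <= tie_values[-1]:
        raise KeyError(value)
    # Locate the segment [k - 1, k] of tie points that brackets the value.
    k = min(bisect_right(tie_values, value), len(tie_values) - 1)
    i0, i1 = tie_indices[k - 1], tie_indices[k]
    v0, v1 = tie_values[k - 1], tie_values[k]
    # Invert the linear interpolation on that segment.
    return round(i0 + (value - v0) * (i1 - i0) / (v1 - v0))


tie_indices = [0, 9, 19, 20, 29]
tie_values = [0.0, 90.0, 190.0, 400.0, 490.0]
print(value_to_index(tie_indices, tie_values, 430.0))  # 23
```

If two segments overlapped in value, the same label would map to two different indices and the binary search would no longer be valid.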
Gaps and Overlaps
Gaps and overlaps can be identified from the tie point positions and extracted with:
coord.get_discontinuities()
| | start_index | end_index | start_value | end_value | delta | type |
|---|---|---|---|---|---|---|
| 0 | 10 | 11 | 410.0 | 430.0 | 20.0 | gap |
While gaps represent missing data and are not problematic, overlaps usually arise from labeling errors and should be corrected.
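The idea of reading discontinuities off the tie points can be sketched in a few lines: any pair of tie points at consecutive indices marks a jump, and the sign of the jump separates gaps from overlaps. This is a simplified illustration applied to the original coordinate from the figure; the function name `find_discontinuities` and the exact definition of `delta` as the difference between the two tie values are assumptions, not a description of xdas internals.

```python
def find_discontinuities(tie_indices, tie_values):
    """List jumps marked by tie points at consecutive indices (hypothetical)."""
    records = []
    pairs = list(zip(tie_indices, tie_values))
    for (i0, v0), (i1, v1) in zip(pairs, pairs[1:]):
        if i1 - i0 != 1:
            continue  # an ordinary interpolated segment, not a jump
        delta = v1 - v0
        records.append(
            {
                "start_index": i0,
                "end_index": i1,
                "start_value": v0,
                "end_value": v1,
                "delta": delta,
                # A forward jump is missing data; a backward jump is an overlap.
                "type": "gap" if delta > 0 else "overlap",
            }
        )
    return records


found = find_discontinuities([0, 9, 19, 20, 29], [0.0, 90.0, 190.0, 400.0, 490.0])
print(found[0]["type"], found[0]["delta"])  # gap 210.0
```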
Using the simplify() method, the coordinate can be simplified with controlled
accuracy using the Ramer–Douglas–Peucker algorithm. In this example, the second
tie point does not provide useful information and is safely discarded.
coord = coord.simplify(tolerance=0.0)
coord
10.000 to 450.000
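For intuition, here is a minimal sketch of a Ramer–Douglas–Peucker-style simplification of tie points. It measures the vertical (value) error at each tie point rather than the perpendicular distance of the classic geometric formulation; whether xdas uses exactly this error measure is an assumption, and the function name `simplify_tie_points` is hypothetical. It is applied to the original coordinate from the figure.

```python
def simplify_tie_points(tie_indices, tie_values, tolerance):
    """Drop tie points that interpolation reproduces within `tolerance`."""
    n = len(tie_indices)
    keep = [False] * n
    keep[0] = keep[-1] = True  # the end points are always kept
    stack = [(0, n - 1)]
    while stack:
        first, last = stack.pop()
        if last - first < 2:
            continue  # no interior tie point to test
        x0, y0 = tie_indices[first], tie_values[first]
        x1, y1 = tie_indices[last], tie_values[last]
        worst, worst_err = first + 1, -1.0
        for k in range(first + 1, last):
            predicted = y0 + (y1 - y0) * (tie_indices[k] - x0) / (x1 - x0)
            err = abs(tie_values[k] - predicted)
            if err > worst_err:
                worst, worst_err = k, err
        if worst_err > tolerance:
            # The worst point carries information: keep it and recurse.
            keep[worst] = True
            stack.append((first, worst))
            stack.append((worst, last))
    kept_indices = [i for i, flag in zip(tie_indices, keep) if flag]
    kept_values = [v for v, flag in zip(tie_values, keep) if flag]
    return kept_indices, kept_values


simplified = simplify_tie_points(
    [0, 9, 19, 20, 29], [0.0, 90.0, 190.0, 400.0, 490.0], tolerance=0.0
)
print(simplified)  # ([0, 19, 20, 29], [0.0, 190.0, 400.0, 490.0])
```

With a tolerance of zero, only the tie point at index 9 is removed, because it lies exactly on the line between its neighbours. A larger tolerance trades accuracy for fewer tie points.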
Temporal Coordinates
The main use of coordinates in xdas is to deal with long time series. By default,
xdas uses the "datetime64[us]" dtype. Microseconds are used because, to perform
interpolation, xdas converts datetime64 values to POSIX floats, which cannot safely
represent timestamps at a finer resolution.
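The precision limit can be checked with the standard library alone. The sketch below measures the spacing between adjacent 64-bit floats near a present-day POSIX timestamp; it illustrates the general floating-point constraint and makes no claim about xdas internals.

```python
import math
from datetime import datetime, timezone

# POSIX timestamp of 2023-01-01T00:00:00 UTC as a 64-bit float.
t = datetime(2023, 1, 1, tzinfo=timezone.utc).timestamp()

# Distance to the next representable float at this magnitude, in seconds.
spacing = math.ulp(t)
print(spacing)  # about 2.4e-07 s, i.e. roughly a quarter of a microsecond
```

Since the spacing is just below one microsecond, microsecond timestamps survive the conversion while nanosecond timestamps would not.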
import numpy as np
coord = xd.Coordinate(
{
"tie_indices": [0, 3600 * 100],
"tie_values": [
np.datetime64("2023-01-01T00:00:00"),
np.datetime64("2023-01-01T01:00:00"),
],
}
)
coord.to_index(slice("2023-01-01T00:10:00", "2023-01-01T00:20:00"))
slice(np.int64(60000), np.int64(120001), None)
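The slice bounds above can be reproduced by hand, which helps when checking a selection. The sketch below applies the same linear mapping between the two tie points using only the standard library; `time_to_index` is a hypothetical helper, and treating the end of the slice as inclusive is inferred from the output shown above.

```python
from datetime import datetime

# Tie points of the example: 100 samples per second over one hour.
t0, t1 = datetime(2023, 1, 1, 0, 0, 0), datetime(2023, 1, 1, 1, 0, 0)
i0, i1 = 0, 3600 * 100


def time_to_index(t):
    """Map a timestamp to the nearest sample index (hypothetical helper)."""
    return round(i0 + (t - t0) * (i1 - i0) / (t1 - t0))


start = time_to_index(datetime(2023, 1, 1, 0, 10, 0))
# Add one so that the sample at the end label is included in the slice.
stop = time_to_index(datetime(2023, 1, 1, 0, 20, 0)) + 1
print(slice(start, stop))  # slice(60000, 120001, None)
```

Ten minutes at 100 samples per second is 60000 samples, and twenty minutes is 120000, so the selection spans indices 60000 to 120000 inclusive.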