PySPI v1.0.2
New SPI - Gromov-Wasserstein Distance (GWτ)
This minor patch update introduces a new distance-based SPI, GWτ (called gwtau in pyspi). An in-depth tutorial for incorporating new SPIs into the existing pyspi framework, using gwtau as a prototypical example, is now available in the documentation.
What is it?
Based on the algorithm proposed by Kravtsova et al. (2023), GWτ is a new distance measure for comparing time series data, especially suited for biological applications. It works by representing each time series as a metric space and computing the distances from the start of each time series to every point. These distance distributions are then compared using the Wasserstein distance, which finds the optimal way to match the distances between two time series, making it robust to shifts and perturbations. The "tau" in GWτ emphasises that this distance measure is based on comparing the distributions of distances from the root (i.e., the starting point) to all other points in each time series, which is analogous to comparing the branch lengths in two tree-like structures. GWτ can be computed efficiently and is scalable.
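The idea described above can be illustrated with a minimal sketch. This is not the pyspi implementation of gwtau. It assumes each series is embedded as the planar curve (t, x_t), takes the distance from the root to each point as the cumulative arc length along that curve, and compares the two distance distributions with SciPy's one-dimensional wasserstein_distance. The function names root_distances and gwtau_sketch are hypothetical, and the paper and pyspi may differ in the embedding, the normalisation, and the order of the Wasserstein metric.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def root_distances(x):
    """Distances from the first point (the root) to every point of a series.

    Assumption: the series is embedded as the curve (t, x_t) with unit time
    steps, and distance is measured along the curve (cumulative arc length).
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    traj = np.column_stack([t, x])
    # Euclidean length of each segment between consecutive points.
    steps = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    # The root is at distance 0 from itself.
    return np.concatenate([[0.0], np.cumsum(steps)])


def gwtau_sketch(x, y):
    """Illustrative GWτ-like distance (hypothetical helper, not pyspi's gwtau).

    Compares the two root-distance distributions with the 1-Wasserstein
    distance from SciPy.
    """
    return wasserstein_distance(root_distances(x), root_distances(y))


t = np.linspace(0, 6, 50)
x = np.sin(t)
y = np.cos(t)
print(gwtau_sketch(x, y))
```

Under these assumptions, adding a constant offset to a series leaves its segment lengths unchanged, so gwtau_sketch(x, x + 5.0) is zero up to floating-point error, which illustrates the robustness to shifts mentioned above.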
How can I use it?
Currently, both the default SPI set (subset = all) and the fast subset (subset = fast) include gwtau. This means you do not have to do anything, unless you would like to compute gwtau in isolation. Simply instantiate the calculator object and compute SPIs as usual. You can access the matrix of pairwise interactions for gwtau using its identifier in the results table:
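from pyspi.calculator import Calculator

calc = Calculator(dataset=...)
calc.compute()  # compute every SPI in the selected subset
gwtau_results = calc.table['gwtau']  # pairwise interaction matrix for gwtau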
For technical details about the specific implementation of gwtau, such as theoretical properties of this distance measure, see the original paper by Kravtsova et al. (2023). You can also find the original implementation of the algorithm in MATLAB in this GitHub repository.