reverse matrix profile computation #717
Replies: 2 comments 1 reply
-
@pavlexander Maybe I'm failing to see your point but how the matrix profile is aligned is purely a convention that is consistent with the original matrix profile publications (not by us). For example, if you have:
The first distance,
Again, maybe I've misunderstood your question, so please clarify where appropriate. I'm not sure what you mean by "reverse matrix profile".
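To make the alignment convention concrete, here is a minimal brute-force sketch in plain NumPy. It is not STUMPY's algorithm; the helper name `naive_matrix_profile`, the random data, and the exclusion zone of `ceil(m / 4)` are assumptions made for illustration. It shows that entry `i` of the profile describes the window `T[i:i + m]`, so the final entry already covers the last data point.

```python
import numpy as np

def naive_matrix_profile(T, m):
    """Brute-force z-normalized matrix profile (illustrative, O(n^2 * m))."""
    T = np.asarray(T, dtype=float)
    k = len(T) - m + 1                      # number of length-m subsequences
    subs = np.array([T[i:i + m] for i in range(k)])
    subs = (subs - subs.mean(axis=1, keepdims=True)) / subs.std(axis=1, keepdims=True)
    # Pairwise Euclidean distances between z-normalized subsequences.
    D = np.linalg.norm(subs[:, None, :] - subs[None, :, :], axis=2)
    excl = int(np.ceil(m / 4))              # assumed exclusion zone around each index
    for i in range(k):
        D[i, max(0, i - excl):i + excl + 1] = np.inf   # ignore trivial matches
    return D.min(axis=1), D.argmin(axis=1)

rng = np.random.default_rng(0)
T = rng.normal(size=50)
m = 5
P, I = naive_matrix_profile(T, m)

last_start = len(P) - 1
print(len(T), len(P))                      # 50 data points give 46 profile entries
print(last_start, last_start + m - 1)      # last window starts at 45 and ends at 49
```

The profile is `m - 1` entries shorter than the series only because each entry is labeled by the start of its window; the window itself extends to the end of the data.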
-
You are right. I have just run a test:
import numpy as np
import stumpy
from stumpy import config
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from matplotlib.patches import Rectangle
repeating_data = [219.14, 219.25, 219.15, 219.07, 219.28]
older_data = [218.24, 218.79, 218.69, 218.84, 219.21, 219.0, 218.93, 219.01, 219.0,
218.31, 218.89, 218.21, 218.47, 218.7, 218.52, 218.24, 218.03, 218.03, 218.04, 218.04, 218.21, 218.25,
218.54, 218.44, 218.29]
newer_data = [218.29, 218.42, 218.29, 218.51, 218.87, 218.56, 218.67, 218.58, 218.58, 218.79, 218.81, 218.81,
218.78, 218.84, 218.82, 218.82, 218.69, 218.75, 218.73, 218.98]
data = []
data += older_data
data += repeating_data
data += newer_data
data += repeating_data
m = 5
data = np.array(data)
mp = stumpy.stump(data, m)
motif_idx = np.argsort(mp[:, 0])[0] # motif index
nearest_neighbor_idx = mp[motif_idx, 1] # nearest neighbor
dist = mp[motif_idx, 0] # distance
print('\nPlotting candles ds with motifs')
fig, axs = plt.subplots(2, sharex=True, gridspec_kw={'hspace': 0})
axs[0].plot(data)
axs[0].set_ylabel('Steam Flow')
axs[1].axvline(x=motif_idx, linestyle="dashed")
axs[1].axvline(x=nearest_neighbor_idx, linestyle="dashed")
axs[0].axvline(x=motif_idx, linestyle="dashed")
axs[0].axvline(x=nearest_neighbor_idx, linestyle="dashed")
axs[1].set_xlabel('Time')
axs[1].set_ylabel('Matrix Profile')
axs[1].plot(mp[:, 0])
plt.show()
[Plot: the time series (top) and its matrix profile (bottom), with dashed vertical lines at the motif and its nearest neighbor]
In the example above, the repeating pattern is placed at the end of the time series (possibly streaming data). I can see that it is immediately matched with the previous records. I was under the impression that the matching will only happen after
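The same observation can be reproduced point by point. The sketch below is plain NumPy, not STUMPY's incremental `stumpi` algorithm; the helper `last_window_match`, the synthetic noise, and the exclusion zone of `ceil(m / 4)` are assumptions for illustration. It appends the repeating pattern one value at a time and, after each append, searches the earlier data for the best match to the newest window.

```python
import numpy as np

def last_window_match(T, m):
    """Best earlier match (distance, start index) for the newest window T[-m:]."""
    T = np.asarray(T, dtype=float)
    z = lambda x: (x - x.mean()) / x.std()  # z-normalize one window
    query = z(T[-m:])
    last_start = len(T) - m
    excl = int(np.ceil(m / 4))              # assumed exclusion zone
    best_d, best_i = np.inf, -1
    for i in range(last_start - excl):      # skip windows too close to the query
        d = np.linalg.norm(query - z(T[i:i + m]))
        if d < best_d:
            best_d, best_i = d, i
    return best_d, best_i

rng = np.random.default_rng(0)
pattern = [219.14, 219.25, 219.15, 219.07, 219.28]
noise = lambda size: list(218.5 + 0.3 * rng.normal(size=size))
T = noise(30) + pattern + noise(20)         # history with the pattern at index 30
m = len(pattern)

for t in pattern:                           # the pattern arrives again, one point at a time
    T.append(t)
    d, i = last_window_match(T, m)
    print(f"n={len(T)} newest window starts at {len(T) - m}, "
          f"best match at {i}, distance={d:.3f}")
```

On the iteration where the final pattern value arrives, the newest window equals the earlier copy, so the match at index 30 has distance zero. No additional `m` points are needed after that.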
-
Hi,
Introduction to the problem
Currently, whenever a `matrix profile` is created, generation always starts from index zero, so for all data in the range from `len(n)-m` to `len(n)` there is no matrix profile available (where `n` is the number of records in the time series). This can be further confirmed by multiple sources in the documentation, such as https://stumpy.readthedocs.io/en/latest/Tutorial_STUMPY_Basics.html#Find-a-Motif-Using-STUMP
The short conclusion: data at the start of the sequence takes precedence over later time series data.
The problem
Consider that we have streaming time series data.
As mentioned in the documentation, we could use `stumpi` with its `update()` method to append new data to an existing time series.
From my perspective, it does not make sense to use this function for real-time data analysis because of the lag explained in the introduction.
So if you use a window size `m` equal to 1 day of data, this will essentially be the delay/lag. The data that we append always ends up at the end of the data set, but we don't see any patterns at the current time `t` until the next `m` data points are added.
Are my assumptions correct so far?
Question
What I would like to know is whether it's possible to reverse the matrix profile computation, so it always starts with the last index and goes backwards.
I know that I could invert the "static/frozen" data for a one-time computation; however, I don't know how to deal with real-time data (with `stumpi`) when patterns are needed ASAP.
TL;DR: Here is what I expect to see instead. Is it possible? :)
It would be nice if this use-case was covered on the time series data analysis page (link above).
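If the goal is only to see each profile value drawn at the end of its window rather than at the start, one possible approach is to re-index the result instead of reversing the computation. This is a minimal sketch; `align_to_window_end` is a hypothetical helper, not part of STUMPY, and the profile values are stand-ins.

```python
import numpy as np

def align_to_window_end(P, m):
    """Shift a start-aligned profile so entry i lands at index i + m - 1.

    The first m - 1 positions have no complete window yet, so they are NaN.
    """
    P = np.asarray(P, dtype=float)
    return np.concatenate([np.full(m - 1, np.nan), P])

n, m = 20, 5
P = np.arange(n - m + 1, dtype=float)       # stand-in for n - m + 1 profile values
P_end = align_to_window_end(P, m)

print(len(P), len(P_end))                   # 16 start-aligned values, 20 end-aligned
print(P_end[-1] == P[-1])                   # newest data point carries the newest value
```

Plotted this way, the gap of `m - 1` missing values moves from the end of the series to the beginning, and the most recent data point always has a profile value.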