This is exactly the same question as the Stack Overflow question "Pandas timeseries resampling and interpolating together", except that I am wondering whether it can be solved with tempo instead of pandas.
We want to resample to every minute. Notice that 00:04:00 is missing, so interpolation is needed. BUT we want to use the fact that 2 seconds earlier (at 00:03:58) the value was 58, and 60 seconds after 00:04:00 it is 60. The two known points are 62 seconds apart, so the value at 00:04:00 should be 58+((2/62)*2) = 58.064516.
So we do not want to first resample (e.g. with mean) into 1-minute buckets and then interpolate between the buckets. Instead, we want to find the "correct" value at every minute mark by interpolating the values we actually have.
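To make the arithmetic concrete, here is a minimal sketch of the linear interpolation in plain Python. The helper name `interpolate_at` is hypothetical, the date 2016-09-01 is an assumption borrowed from the valve example, and the 00:05:00 timestamp is inferred from the 62-second gap described above.

```python
from datetime import datetime


def interpolate_at(t, t0, v0, t1, v1):
    """Linearly interpolate between (t0, v0) and (t1, v1) at time t."""
    span = (t1 - t0).total_seconds()      # seconds between the known points
    elapsed = (t - t0).total_seconds()    # seconds from the first point to t
    return v0 + (v1 - v0) * elapsed / span


# Assumed timestamps: the date is illustrative, only the clock times matter.
before = datetime(2016, 9, 1, 0, 3, 58)   # value 58
after = datetime(2016, 9, 1, 0, 5, 0)     # value 60
target = datetime(2016, 9, 1, 0, 4, 0)    # the missing minute mark

value = interpolate_at(target, before, 58.0, after, 60.0)
print(round(value, 6))  # 58.064516
```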
A common use case for this is sensors that only fire on a state change, e.g. an indicator of whether a valve is open or closed. So you have e.g.
tstamp val
0 2016-09-01 00:00:00 0
1 2016-09-01 00:01:02 1
Then it is important that at 2016-09-01 00:01:00 the correct value is 0 (the valve is closed).
This case could also be solved if the resample function had a "last" method that used the last value before the current bucket. I still add it to this issue because I think a solution here would also make interpolation with ffill give the right answer, and is perhaps more general.
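For comparison, the valve case can be sketched in pandas (not tempo) by forward-filling onto a regular grid. This is only an illustration of the desired result; the variable names and the two-minute grid are assumptions, and the two events are the ones from the example above.

```python
import pandas as pd

# The two state-change events from the valve example.
events = pd.Series(
    [0, 1],
    index=pd.to_datetime(["2016-09-01 00:00:00", "2016-09-01 00:01:02"]),
)

# Assumed regular grid: one point per minute.
grid = pd.date_range("2016-09-01 00:00:00", "2016-09-01 00:02:00", freq="1min")

# For each grid point, take the last event at or before that point.
state = events.reindex(grid, method="ffill")
print(state)
```

With this, the value at 00:01:00 is 0 (valve still closed), and it only becomes 1 at the first grid point after 00:01:02.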
I'm looking at the same issue as described here.
I've solved this in pandas through re-indexing onto a time-index with both old time-stamps as well as new "regular" timestamps.
Then interpolate to fill the new regular time-stamps. Last, remove the original datapoints, leaving only the points at regular intervals.
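The steps above can be sketched in pandas roughly as follows. This is a minimal sketch, not the exact code used in the thread: the sample data contains only the two points mentioned in the question, and the date and variable names are assumptions.

```python
import pandas as pd

# Irregular observations (illustrative: only the two points named in the question).
raw = pd.Series(
    [58.0, 60.0],
    index=pd.to_datetime(["2016-09-01 00:03:58", "2016-09-01 00:05:00"]),
)

# New "regular" timestamps, one per minute.
grid = pd.date_range("2016-09-01 00:04:00", "2016-09-01 00:05:00", freq="1min")

# 1. Re-index onto the union of old and new timestamps (new ones become NaN).
combined = raw.reindex(raw.index.union(grid))

# 2. Interpolate, weighting by the actual time distance between points.
filled = combined.interpolate(method="time")

# 3. Keep only the regular timestamps, dropping the original data points.
regular = filled.reindex(grid)
print(regular)
```

Using `method="time"` matters here: the default linear method treats rows as equally spaced and would ignore the 2-second versus 60-second distances.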
I have not been able to reproduce this methodology in Tempo, because tsdf.interpolate requires a prior resampling and cannot work on a generic index of NaN values.
I appreciate the challenges in building a distributable framework, but I wonder how hard it would be to implement this?