Segment at negative timestamp breaks #127

Open
jamieforth wants to merge 4 commits into main

Conversation

jamieforth (Contributor)

When dejittering, also take into account negative timestamps when
determining segment boundaries.

@agricolab (Member) commented Dec 8, 2024

Thanks. If I understand this correctly, this would catch streams where the timestamp progression is negative (i.e. the sampling rate was negative), not necessarily streams with negative timestamps, right?

Because with a simple diff, negative timestamps would work:
np.diff([-4, -2, -1, 0, 1, 2]) -> array([2, 1, 1, 1, 1])

but reversed time wouldn't:
np.diff([2, 1, 0, -1, -2, -4]) -> array([-1, -1, -1, -1, -2])

nor would glitches:
np.diff([-2, -3, -2, -1, 0, 1, 2]) -> array([-1, 1, 1, 1, 1, 1])

unless we take the abs as you suggest.

Is that what you wanted? Or should we handle glitches, but throw an error for negative time?
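
For illustration only, here is a minimal sketch of what an abs-based check would do; the function name and the 2-second threshold are made up, and this is not pyxdf's actual _detect_breaks:

```python
import numpy as np

def detect_breaks_abs(timestamps, threshold_seconds=2.0):
    """Sketch: indices at which a new segment would start, judged by the
    magnitude of the interval rather than its signed value."""
    diffs = np.diff(timestamps)
    return np.where(np.abs(diffs) > threshold_seconds)[0] + 1

forward = [0, 1, 2, 10, 11, 12]   # 8 s gap between 2 and 10
backward = forward[::-1]          # same gap, but the diff is -8

print(detect_breaks_abs(forward))   # [3] -- caught with or without abs
print(detect_breaks_abs(backward))  # [3] -- only caught because of abs
```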

@agricolab agricolab self-assigned this Dec 8, 2024
@agricolab agricolab self-requested a review December 8, 2024 03:38
@agricolab (Member) left a comment

I did a refactoring and added tests (see https://github.com/agricolab/xdf-Python/tree/pr/fix/expose-segments) to see whether it would cause any breaking changes. It passes the new tests, and this implementation now additionally works for negative time and glitches, too.

* Segment when timestamp intervals exceed threshold (previous
  behaviour)

* Always segment at non-positive timestamp intervals (irrespective of
  threshold)

* Additional tests for non-monotonicity
@jamieforth (Contributor, Author)

> If I understand this correctly, this would catch streams where the timestamp progression is negative (i.e. the sampling rate was negative), not necessarily streams with negative timestamps, right?

Exactly, negative time intervals would have been a better way to put it, sorry.

> unless we take the abs as you suggest.

> Is that what you wanted? Or should we handle glitches, but throw an error for negative time?

Thank you for the tests, this really helped clarify things.

I don't see a problem with handling negative time or negative intervals per se; it is then up to the user to decide what to do with them. However, I now think there might be a better way to do it.

So I retract the abs suggestion in favour of explicitly handling non-positive time intervals. The original logic for positive time intervals then remains the same.

The strategy I've gone for here is to always segment at a non-positive time interval – regardless of the thresholds (i.e. segments are by definition monotonic).

I can't really imagine a use case for non-monotonic segments. But if you think it's too risky a breaking change, perhaps something like _detect_breaks(monotonic=False) could preserve the existing default behaviour (in conjunction with abs).
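
A rough sketch of that idea; the function name, the threshold value, and the default of the monotonic kwarg are hypothetical, not the actual pyxdf code:

```python
import numpy as np

def detect_breaks_sketch(timestamps, threshold_seconds=2.0, monotonic=True):
    """Sketch: break wherever the interval exceeds the threshold and, when
    monotonic=True, also wherever the interval is non-positive, so every
    segment is strictly increasing by construction."""
    diffs = np.diff(timestamps)
    breaks = diffs > threshold_seconds
    if monotonic:
        breaks |= diffs <= 0
    return np.where(breaks)[0] + 1

# A small backwards jump is below the threshold, but still starts a segment.
print(detect_breaks_sketch([0, 1, 2, 1.5, 2.5, 3.5]))  # [3]
```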

* Revert to existing dejitter segmentation logic

* Handles cases where data is recorded out-of-order
@jamieforth (Contributor, Author)

Another rethink... An underlying problem to all of the above is data being recorded out of order (which I don't think is necessarily a problem in itself, just something more likely to happen on busy/WiFi networks, etc.).

So ensuring the time-series data is sorted prior to synchronisation and dejittering avoids the problem!
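
A minimal sketch of the sorting idea (this is just the concept, not the PR's actual diff; the arrays are made-up data):

```python
import numpy as np

# Made-up out-of-order chunk: the samples at t = 3 and 4 arrived late
# (e.g. over WiFi) and were written after those at t = 5 and 6.
timestamps = np.array([0.0, 1.0, 2.0, 5.0, 6.0, 3.0, 4.0])
timeseries = np.arange(len(timestamps)).reshape(-1, 1)  # dummy samples

# Sort both arrays by timestamp before clock sync / dejittering, keeping
# every sample paired with its own timestamp.
order = np.argsort(timestamps, kind="stable")
timestamps = timestamps[order]
timeseries = timeseries[order]

print(timestamps)  # [0. 1. 2. 3. 4. 5. 6.]
```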

@agricolab (Member) commented Dec 9, 2024

Thanks for the effort and the additional work.

I agree that non-monotonicity is clearly indicative of a segmentation. It could stem from network issues, but also from clock resets or a bad implementation of the LSL outlet (some apps use custom timestamps).

Now, practically, e.g. [0, 1, 2, 5, 6, 3, 4] would currently only catch the second segment [5, 6] but not the last [3, 4].

Sorting this prior to segmentation could be an option, but I would not make it the default, for the following reasons.

First, unsorted (i.e. non-monotonic) timestamps would in principle be caught as multiple segments once we catch negative diffs in _detect_breaks. The difference between sorting before versus after segmentation is that, because streams are dejittered segment-wise, they are dejittered differently. Therefore, sorted-first and sorted-last will share the identical time series, but the respective timestamps will be different. That might or might not be what the user wants, which means we need to be able to set the option.

Second, pyxdf's policy is to keep hands off the time series. We discussed this a lot during the PR for syncing the stamps of different streams. The idea is to be hands-off unless the user explicitly tells pyxdf to manipulate the time series (and not just the timestamps).

In conclusion, my suggestion would be to catch b_breaks = np.logical_or(diffs < 0, diffs > np.max((threshold_seconds, threshold_samples * stream.tdiff))) in _detect_breaks and make sorting before segmentation an optional kwarg to load_xdf. I think it is sensible to warn the user, though, if the stream has non-monotonic timestamps. Many users ignore timestamps anyway, and wouldn't notice without an explicit warning being printed.
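
A small numeric sketch of that condition, applied to the [0, 1, 2, 5, 6, 3, 4] example above; the threshold values and tdiff below are stand-ins for the corresponding stream parameters:

```python
import numpy as np

threshold_seconds = 1.0
threshold_samples = 500
tdiff = 1 / 512          # stand-in for stream.tdiff (nominal sample interval)

timestamps = np.array([0, 1, 2, 5, 6, 3, 4], dtype=float)
diffs = np.diff(timestamps)

b_breaks = np.logical_or(
    diffs < 0,
    diffs > np.max((threshold_seconds, threshold_samples * tdiff)),
)
print(np.where(b_breaks)[0] + 1)  # [3 5] -> segments [0,1,2], [5,6], [3,4]
```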

Additionally, I think that non-monotonic timestamps might be linked to clock resets, so we should be aware that this issue is also handled by _clock_sync.
