
Implement sync_timestamps #28

Closed
wants to merge 12 commits into from

Conversation


@agricolab agricolab commented Dec 4, 2018

Resamples all streams (using interpolation for numeric values and shifting for strings) so that their timestamps are sample-wise in sync with the fastest stream.
Example
streams, fileheader = load_xdf(filename, sync_timestamps='linear')
This might help address #24 and mne-tools/mne-python#5180
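For illustration, here is a minimal sketch of the underlying idea, not the PR's actual code: a slower numeric stream is linearly interpolated onto the timestamps of the fastest stream (all names and values below are made up).

```python
import numpy as np

# Hypothetical example: a 100 Hz stream (the fastest) and a 10 Hz stream
fast_ts = np.arange(0.0, 1.0, 0.01)      # timestamps of the fastest stream
slow_ts = np.arange(0.0, 1.0, 0.1)       # timestamps of a slower stream
slow_data = np.sin(2 * np.pi * slow_ts)  # example signal on the slow stream

# Linear interpolation resamples the slow stream onto the fast timeline,
# so both streams end up sample-wise in sync
synced = np.interp(fast_ts, slow_ts, slow_data)
```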

Allows to resample (using interpolation or shifting) all streams to have their timestamps sample-wise in sync with the fastest stream
The review comment below refers to this diff context:

    'srate. I select one at random for syncing timestamps.')

    fastest_stream = snames[srates.index(max_fs)]
    new_timestamps = stamps[srates.index(max_fs)]
Collaborator


Streams are not guaranteed to start and stop at the same time, i.e. the first sample of the fastest stream might not come until several minutes into a recording. What to do then?
I think we can extrapolate timestamps (but not data!) assuming a constant interval.
So maybe for each stream we need to create a new timestamp vector that uses new_timestamps for the overlapping periods, and then extrapolates timestamps for the non-overlapping periods assuming even intervals.

For data processing that requires all sources to have the same number of samples, we could then have a separate option to trim all streams to the smallest overlapping region.
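The extrapolation idea above can be sketched as follows (hypothetical function and variable names, not the PR's implementation): extend a stream's timestamp vector at a constant interval so that it covers the earliest and latest timestamp of any stream.

```python
import numpy as np

def extrapolate_timestamps(timestamps, t_min, t_max):
    """Extend `timestamps` at a constant interval to cover [t_min, t_max].
    Only timestamps are extrapolated, never data."""
    interval = np.median(np.diff(timestamps))
    # timestamps before the first recorded sample, stepping backwards
    before = np.arange(timestamps[0] - interval, t_min - interval, -interval)[::-1]
    # timestamps after the last recorded sample
    after = np.arange(timestamps[-1] + interval, t_max + interval, interval)
    return np.concatenate((before, timestamps, after))

ts = np.arange(5.0, 10.0, 0.5)                # stream only recorded from t=5 s
full = extrapolate_timestamps(ts, 0.0, 12.0)  # cover 0..12 s at the same rate
```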

Author


Good points. I implemented it so that new_timestamps is now based on an extrapolation to the earliest and latest timestamp of any stream, using the timestamps of the fastest stream as seed. An alternative would have been to generate a completely new timestamp vector based only on the start and end points and an arbitrary sampling rate, but I wanted to reduce the amount of interpolation necessary and not implement downsampling. Additionally, I added a function to limit all streams to overlapping time periods. The latter is supposed to work independently of whether timestamps have been synced before.
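The overlap-limiting function mentioned here could look roughly like this (a sketch with hypothetical names and a simplified stream layout, not the PR's code): trim every stream to the period where all streams overlap.

```python
import numpy as np

def limit_to_overlap(streams):
    """Trim each stream (dict with 'ts' and 'data' arrays) to the time
    window covered by every stream."""
    start = max(s['ts'][0] for s in streams)   # latest start of any stream
    stop = min(s['ts'][-1] for s in streams)   # earliest stop of any stream
    trimmed = []
    for s in streams:
        mask = (s['ts'] >= start) & (s['ts'] <= stop)
        trimmed.append({'ts': s['ts'][mask], 'data': s['data'][mask]})
    return trimmed

a = {'ts': np.arange(0.0, 10.0, 0.5), 'data': np.zeros(20)}
b = {'ts': np.arange(2.0, 8.0, 1.0), 'data': np.ones(6)}
a_trim, b_trim = limit_to_overlap([a, b])  # both limited to 2..7 s
```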

Author


I expanded interpolation to also deal with integer values. If a channel's format is any integer type, interpolation now enforces integers as output with np.around. I also improved the documentation, added unit tests to be run with pytest, and fixed a couple of bugs detected by the tests. This feels more or less finished, and I am open to feedback.
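The integer handling described here amounts to rounding the interpolated values back to the channel's integer dtype; the comment names np.around, but the surrounding variable names below are made up for illustration.

```python
import numpy as np

old_ts = np.array([0.0, 0.5, 1.0])
counts = np.array([0, 10, 20])          # an integer-typed channel
new_ts = np.arange(0.0, 1.0, 0.25)      # target timestamps after syncing

# Plain interpolation yields floats ...
interp = np.interp(new_ts, old_ts, counts)
# ... so round and cast back to the channel's original integer dtype
as_int = np.around(interp).astype(counts.dtype)
```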

The timestamps of the fastest sampled stream are extrapolated to include the earliest and latest timestamp of any stream
Additional argument which either limits all streams to overlapping recordings, or not
Using pytest
Fix bug in pyxdf where I forgot to remove a return statement used for development
for streams with unsynced timestamps
Fix a bug caused by floating point imprecision
Fix bug with marker streams where sampling rate is not set after upsampling
Add docstring for _sync_timestamps
Add docstring for _limit_streams_to_overlap
@cboulay
Collaborator

cboulay commented Jan 10, 2019

This all looks good. I'm sorry I haven't had time to test. It's on my TODO list.

I'd prefer if pyxdf didn't have a strict dependency on scipy for a single rarely-used function. Can you put the scipy import inside the function that needs it? I'm OK with the occasional user encountering an import error. I'd wager that most users that want to use the interpolation function will already have scipy installed and then for everyone else there's one less dependency.

Or, if there's an argument to be made for including scipy as a dependency in setup.py, then I'm all ears.

Check dependencies for scipy/numpy and run at least minimal
interpolation if scipy is not installed
@agricolab
Author

agricolab commented Jan 12, 2019

Hey, no sweat. Thanks for taking the time at all.

I agree that fewer dependencies are better. At first, I just moved the scipy import into _sync_timestamps. Then I thought: numpy is already a hard dependency for pyxdf and supports a linear interpolation method. I therefore decided to outsource interpolation to a subfunction, which uses scipy if installed and falls back to numpy if not. I ran the tests in an environment without scipy, and they pass.

Obviously, the latter approach is a little more complex, so feel free to stick with the simpler 2a60647 instead.
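The fallback pattern described above can be sketched like this (hypothetical function name, not the PR's exact code): try scipy.interpolate for the full range of interpolation kinds, and fall back to numpy's linear interpolation when scipy is not installed.

```python
import numpy as np

def interpolate(new_ts, old_ts, values, kind='linear'):
    """Interpolate `values` from `old_ts` onto `new_ts`.
    Uses scipy if available; otherwise falls back to numpy (linear only)."""
    try:
        from scipy.interpolate import interp1d
        f = interp1d(old_ts, values, kind=kind, fill_value='extrapolate')
        return f(new_ts)
    except ImportError:
        if kind != 'linear':
            raise ValueError('only linear interpolation is supported without scipy')
        return np.interp(new_ts, old_ts, values)
```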

@AdeuAndreu

Hello guys, I followed this thread because I was trying to synchronize two different LSL streams of data.
I am not using xdf, though. I checked your implementation @agricolab and realised that I was doing the interpolation the same way, so I wanted to give you my feedback on something I thought of doing differently. I noticed that you were using 'linear' interpolation by default.

I think the right interpolation should be 'nearest' and not linear, because linear will modify the signal: it creates middle points between timepoints that do not exist in the interpolated signal. This might not look like a big deal in the temporal domain, but in the frequency domain it could have a different representation.

I attached two pictures of the same signal interpolated with linear and nearest methods:
[image: interpolation_linear]
[image: interpolation_nearest]

I hope it could be useful! Cheers!

@agricolab
Author


Hey @AdeuAndreu, thanks for checking.

Yeah, good analysis, and I fully agree. Interpolation is tricky. I'd say the most adequate interpolation method depends a lot on the signal. If the sampling rates of the two streams are similar, I'd agree that "nearest" makes more sense. If one stream has a very low sampling rate compared to the fastest, "nearest" interpolation can introduce edges, and linear might be superior. Cubic interpolation is smoother, but is not convex and does not preserve monotonicity, etc. In the end, that was why I wanted to allow controlling the kind of interpolation with arguments.

I am open to changing the default, though. Currently linear is good as it can be done with numpy only, but probably something like https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.PchipInterpolator.html#scipy.interpolate.PchipInterpolator is a good default. Maybe I'll find the time to add a monotonic cubic spline and/or more options.

Maybe we should also expand the documentation or add a discussion to the wiki.
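The trade-off discussed here can be made concrete with a small comparison (illustrative values only): linear creates intermediate values that never occurred in the signal, nearest holds the closest sample, and PCHIP gives a smooth but monotonicity-preserving cubic.

```python
import numpy as np
from scipy.interpolate import interp1d, PchipInterpolator

old_ts = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([0.0, 1.0, 1.0, 0.0])
new_ts = np.array([0.4, 1.6, 2.6])      # query points between samples

# linear invents intermediate values, nearest snaps to existing samples,
# pchip is smooth while preserving monotonicity between knots
linear = interp1d(old_ts, values, kind='linear')(new_ts)
nearest = interp1d(old_ts, values, kind='nearest')(new_ts)
pchip = PchipInterpolator(old_ts, values)(new_ts)
```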

@cboulay
Collaborator

cboulay commented Mar 28, 2019

@agricolab
Do you mind moving this PR to https://github.com/xdf-modules/xdf-Python?
(Unfortunately I can't transfer issues between organizations)

@agricolab
Author

Is there a way to transfer between repositories? xdf-python does not show as an option when changing the base, and googling didn't help either. It seems I have to clone xdf-python, add the changes and create a new pull request. That will lose the comments on this pull request, but I guess I can link to them in the new one. So, unless you have a better idea, I will do that ASAP.

@cboulay
Collaborator

cboulay commented Mar 28, 2019

Unfortunately we can’t transfer issues or PRs across organizations. Linking to the old PR is the best we can do.

@agricolab
Author

Done in xdf-modules/pyxdf#1
