Implement sync_timestamps #28
Conversation
Allows resampling (using interpolation or shifting) of all streams so that their timestamps are sample-wise in sync with the fastest stream.
'srate. I select one at random for syncing timestamps.')
fastest_stream = snames[srates.index(max_fs)]
new_timestamps = stamps[srates.index(max_fs)]
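For context, a minimal self-contained sketch of the selection step quoted above might look like this; the stream names, sampling rates, and timestamp vectors are invented for illustration, only `fastest_stream` and `new_timestamps` mirror the diff:

```python
import numpy as np

# Hypothetical per-stream metadata: names, nominal sampling rates, and
# timestamp vectors (a marker stream has srate 0.0 by LSL convention).
snames = ["markers", "eeg", "gaze"]
srates = [0.0, 500.0, 60.0]
stamps = [np.array([0.5, 2.1]),
          np.linspace(0.0, 1.0, 500, endpoint=False),
          np.linspace(0.0, 1.0, 60, endpoint=False)]

max_fs = max(srates)  # fastest nominal sampling rate
fastest_stream = snames[srates.index(max_fs)]       # -> "eeg"
new_timestamps = stamps[srates.index(max_fs)]       # its timestamps become the target grid
```

If several streams share `max_fs`, `list.index` simply returns the first match, which is the "select one at random" behavior the warning string refers to.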
Streams are not guaranteed to start and stop at the same time, i.e. the first sample of the fastest stream might not come until several minutes into a recording. What to do then?
I think we can extrapolate timestamps (but not data!) assuming a constant interval.
So maybe for each stream we need to create a new timestamp vector that uses new_timestamps for the overlapping periods, and then extrapolates timestamps for the non-overlapping periods assuming even intervals.
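The constant-interval extrapolation proposed above can be sketched as follows; `extrapolate_timestamps` is a hypothetical helper for illustration, not pyxdf's actual implementation, and note that only timestamps are extrapolated, never data:

```python
import numpy as np

def extrapolate_timestamps(ts, t_start, t_stop):
    """Extend a timestamp vector to cover [t_start, t_stop], assuming a
    constant sampling interval estimated from the existing stamps."""
    step = np.median(np.diff(ts))
    n_before = max(0, int(np.ceil((ts[0] - t_start) / step)))
    n_after = max(0, int(np.ceil((t_stop - ts[-1]) / step)))
    before = ts[0] - step * np.arange(n_before, 0, -1)   # even steps backwards
    after = ts[-1] + step * np.arange(1, n_after + 1)    # even steps forwards
    return np.concatenate([before, ts, after])

# The fastest stream starts late (t=100) and stops early (t=103);
# extend its timestamps to span the whole recording [97.5, 105.2].
ts = np.array([100.0, 101.0, 102.0, 103.0])
new_ts = extrapolate_timestamps(ts, 97.5, 105.2)  # 97.0, 98.0, ..., 106.0
```

The overlapping period keeps the original stamps untouched; the non-overlapping periods get evenly spaced extrapolated stamps, exactly as the comment suggests.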
For data processing streams that require all sources to have the same number of samples, we could then have a separate option to trim all streams to smallest overlapping regions.
Good points. I implemented it so that new_timestamps is now based on an extrapolation to the earliest and latest timestamp of any stream, using the timestamps of the fastest stream as seed. An alternative would have been to generate a completely new timestamp vector based only on the start point, end point, and an arbitrary sampling rate, but I wanted to reduce the amount of interpolation necessary and not implement downsampling. Additionally, I added a function to limit all streams to overlapping time periods. The latter is supposed to work independently of whether timestamps have been synced before.
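The overlap-limiting behavior described here could look roughly like the sketch below; the function name and signature are assumptions for illustration, not pyxdf's API:

```python
import numpy as np

def limit_to_overlap(stamps_list, data_list):
    """Trim every stream to the time window where all streams overlap."""
    t_start = max(ts[0] for ts in stamps_list)   # latest start of any stream
    t_stop = min(ts[-1] for ts in stamps_list)   # earliest stop of any stream
    out_stamps, out_data = [], []
    for ts, data in zip(stamps_list, data_list):
        mask = (ts >= t_start) & (ts <= t_stop)  # samples inside the window
        out_stamps.append(ts[mask])
        out_data.append(data[mask])
    return out_stamps, out_data

a_ts = np.array([0.0, 1.0, 2.0, 3.0, 4.0])       # stream A: 0..4 s
b_ts = np.array([1.5, 2.5, 3.5])                 # stream B: 1.5..3.5 s
a = np.arange(5.0)
b = np.array([10.0, 11.0, 12.0])
(sa, sb), (da, db) = limit_to_overlap([a_ts, b_ts], [a, b])
# overlap window is [1.5, 3.5]: stream A keeps samples at 2.0 and 3.0
```

Because trimming only compares timestamps, it works whether or not the streams were resampled beforehand, matching the independence noted above.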
I expanded interpolation to also deal with integer values. If a channel's format is any integer type, interpolation now enforces integers as output with np.around. I also improved the documentation, added unit tests to be run with pytest, and fixed a couple of bugs detected by the tests. It feels more or less finished, and I am open to feedback.
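The integer-enforcement step can be illustrated with plain numpy; the arrays here are invented for the example:

```python
import numpy as np

# An integer-typed channel interpolated onto new timestamps yields floats,
# so round and cast back to the channel's declared integer dtype.
x_old = np.array([0.0, 1.0, 2.0, 3.0])
y_old = np.array([0, 2, 4, 6], dtype=np.int32)   # integer channel data
x_new = np.array([0.5, 1.5, 2.5])

y_interp = np.interp(x_new, x_old, y_old)        # float64: [1., 3., 5.]
y_int = np.around(y_interp).astype(y_old.dtype)  # back to int32: [1, 3, 5]
```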
The timestamps of the fastest-sampled stream are extrapolated to include the earliest and latest timestamp of any stream
Add an argument which either limits all streams to overlapping recordings, or not
Use pytest; fix a bug in pyxdf where I forgot to remove a return statement used for development
Fix a bug for streams with unsynced timestamps caused by floating-point imprecision
Fix a bug with marker streams where the sampling rate is not set after upsampling
Add docstrings for _sync_timestamps and _limit_streams_to_overlap
This all looks good. I'm sorry I haven't had time to test; it's on my TODO list. I'd prefer if pyxdf didn't have a strict dependency on scipy for a single rarely used function. Can you put the scipy import inside the function that needs it? I'm OK with the occasional user encountering an import error. I'd wager that most users who want to use the interpolation function will already have scipy installed, and for everyone else there's one less dependency. Or, if there's an argument to be made for including scipy as a dependency in setup.py, then I'm all ears.
Check dependencies for scipy/numpy and run at least a minimal interpolation if scipy is not installed
Hey, no sweat. Thanks for taking the time at all. I agree that fewer dependencies is better. At first, I just moved scipy into _sync_timestamps. Then I thought: numpy is already a hard dependency for pyxdf, and it supports a linear interpolation method. I therefore decided to outsource interpolation to a subfunction, which uses scipy if installed and falls back to numpy if not. I ran the tests in an environment without scipy, and they pass. Obviously, the latter approach is a little more complex, so feel free to stick with the simpler approach.
Hello guys, I followed this thread because I was trying to synchronize two different LSL data streams. I think the right interpolation should be 'nearest' and not linear, because linear modifies the signal: it creates middle points between timepoints that do not exist in the interpolated signal. This might not look like a big deal in the temporal domain, but in the frequency domain it can have a different representation. I attached two pictures of the same signal interpolated with the linear and nearest methods. I hope it is useful! Cheers!
Hey @AdeuAndreu, thanks for checking. Yeah, good analysis, and I fully agree: interpolation is tricky. I'd say the most adequate interpolation method depends a lot on the signal. If the sampling rates of the two streams are similar, I'd agree that "nearest" makes more sense. If one stream has a very low sampling rate compared to the fastest, "nearest" interpolation can introduce edges, and linear might be superior. Cubic interpolation is smoother, but it is not convex and does not preserve monotonicity, etc. In the end, that is why I wanted to allow controlling the kind of interpolation with arguments. I am open to changing the default, though. Currently, linear is good because it can be done with numpy only, but probably something like https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.PchipInterpolator.html#scipy.interpolate.PchipInterpolator is a good default. Maybe I'll find the time to add a monotonic cubic spline and/or more options. Maybe we should also expand the documentation or add a discussion to the wiki.
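The trade-off discussed here can be seen with a small numpy-only example (the signal values are invented): linear interpolation creates in-between values that never occur in the original signal, while nearest keeps only original sample values but repeats them:

```python
import numpy as np

x_old = np.array([0.0, 1.0, 2.0])
y_old = np.array([0.0, 10.0, 0.0])   # original signal samples
x_new = np.array([0.25, 0.75, 1.5])  # target timestamps

# Linear: new in-between amplitudes appear.
linear = np.interp(x_new, x_old, y_old)            # [2.5, 7.5, 5.0]

# Nearest: pick the value of the closest original timestamp.
idx = np.abs(x_new[:, None] - x_old[None, :]).argmin(axis=1)
nearest = y_old[idx]                               # [0.0, 10.0, 10.0]
```

Nearest preserves the amplitude distribution but introduces step edges when the rates differ a lot, which is the spectral caveat mentioned above.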
@agricolab |
Is there a way to transfer between repositories? xdf-python does not show up as an option when changing the base, and googling didn't help either. It seems I have to clone xdf-python, add the changes, and create a new pull request. That will lose the comments on this pull request, but I guess I can link to them in the new one. So, unless you have a better idea, I will do that ASAP.
Unfortunately we can’t transfer issues or PRs across organizations. Linking to the old PR is the best we can do.
Done in xdf-modules/pyxdf#1
Allows resampling (using interpolation for numeric values and shifting for strings) of all streams so that their timestamps are sample-wise in sync with the fastest stream.
Example
streams, fileheader = load_xdf(filename, sync_timestamps='linear')
This might help address #24 and mne-tools/mne-python#5180