Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify error handling in function spike_contrast (spike_train_synchrony.py) #626

Conversation

ManuelCiba
Copy link

Description:

This pull request suggests an enhancement for the spike_contrast function. Currently, the function raises a TypeError for input spike trains containing no more than 1 spike. The proposed modification replaces the TypeError with a warning to improve the user experience during batch analysis.

More precisely, the function will return a complete data structure 'trace' with all bin_size values, even if the synchrony calculation was not possible. The synchrony values will be 'nan'. This will lead to consistent data structures among full spike trains and empty spike trains.

Changes:

  • Replaced TypeError with a warning in the spike_train_synchrony.py function spike_contrast.
  • Removed the test case in the test_spike_train_synchrony.py function test_invalid_data.

Testing:

  • spike_contrast has been tested.
  • Changes to test_spike_train_synchrony were made directly on GitHub, so no local tests have been run yet.

Replace TypeError with a warning for cases where all input spike trains contain no more than 1 spike.

This change ensures that empty spike trains return a complete data structure (trace object) for batch analysis, improving consistency and usability.
Remove the test of "All input spiketrains contain no more than 1 spike" as the ValueError was replaced by a warning.
@pep8speaks
Copy link

pep8speaks commented Apr 4, 2024

Hello @ManuelCiba! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 197:5: E722 do not use bare 'except'
Line 198:80: E501 line too long (115 > 79 characters)
Line 199:80: E501 line too long (93 > 79 characters)
Line 199:94: W291 trailing whitespace
Line 200:80: E501 line too long (97 > 79 characters)
Line 200:98: W291 trailing whitespace
Line 202:80: E501 line too long (115 > 79 characters)
Line 202:116: W291 trailing whitespace

Line 92:1: W293 blank line contains whitespace
Line 93:80: E501 line too long (91 > 79 characters)
Line 93:92: W291 trailing whitespace
Line 95:9: E265 block comment should start with '# '

Comment last updated at 2024-04-17 12:54:48 UTC

@Moritz-Alexander-Kern
Copy link
Member

Moritz-Alexander-Kern commented Apr 5, 2024

Hello @ManuelCiba
thank you for reaching out to us.

I ran the following minimal example, with the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:

import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast

# Create list of spiektrains, where each spiketrain contains only one spike
spiketrains = [
    neo.SpikeTrain([10] * ms, t_stop=1000 * ms),
    neo.SpikeTrain([20] * ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
    print(f"Spike train no. {idx}: {spiketrain}")

# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
    spiketrains, return_trace=True
)

print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")

which yields:

Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
  warnings.warn("All input spiketrains contain no more than 1 spike.")
Synchrony: 0.5
Contrast: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25]
Active Spiketrains: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.6666666666666667, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
Synchrony: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.33333333333333337, 0.375, 0.375, 0.375, 0.375, 0.0, 0.0, 0.0]
Binsize: [500.         450.         405.         364.5        328.05
 295.245      265.7205     239.14845    215.233605   193.7102445
 174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
 102.94556605  92.65100944  83.3859085   75.04731765  67.54258588
  60.7883273   54.70949457  49.23854511  44.3146906   39.88322154
  35.89489938  32.30540945  29.0748685   26.16738165  23.55064349
  21.19557914  19.07602122  17.1684191   15.45157719  13.90641947

Did I get the described scenario correctly with this minimal example? I assume batch processing refers to processing a number of lists of spike trains. i.e. list1 = [st1, st2, st3, ...], list2 = [st1, st2, st3, ...] in this case the process will be interrupted if one call e.g. spike_contrast(list1) raises an error.

Is this the desired output?, or should the value of synchrony be NaN, more precisely be of type numpy.nan ? (Currently it raises a warning and returns 0.5)

The current implementation (Elephant v1.0.0) raises an error for the above script:

Spike train no. 0: [10.] ms
Spike train no. 1: [20.] ms
Traceback (most recent call last):
elephant/build/test_issue_#626.py", line 14, in <module>
    synchrony, spike_contrast_trace = spike_contrast(
elephant/elephant/spike_train_synchrony.py", line 193, in spike_contrast
    isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
ValueError: min() arg is an empty sequence

I was expecting: ValueError: All input spiketrains contain no more than 1 spike.

PS:
Just out of curiosity, are you the author of "Spike-contrast: A novel time scale independent and multivariate measure of spike train synchrony" ? Please correct me if I get this wrong.

@ManuelCiba
Copy link
Author

Hello Moritz,

Thank you for the quick reply and for testing the code!

You are right, I'm the author. Recently, I switched from Matlab to Python, so I'm finally using Elephant for my data analysis :)

Regarding batch analysis: I have, for example, 100 recordings, each containing 60 parallel recorded spike trains. For each of the 100 recordings, I want to calculate the synchrony, having all 60 spike trains in one list. Some of the 100 recordings do not contain any spikes. However, I want to save the synchrony curve (trace.synchrony) also for these empty files, in order to generate a consistent data structure that can be used for machine learning, etc.

Of course, it is also possible for the user to handle this case in their script. But I noticed that it is more convenient to receive a result from the function (since the number of bins depends on t_start and t_stop).

Regarding your example: A synchrony value of 0.5 is mathematically correct considering how Spike-contrast is defined (since both spikes are within the first bin). This particular spike train could be considered as non-stationary since all the activity is just in the beginning. In the paper, I mentioned that non-stationary spike trains will lead to unreliable synchrony values. So, we could leave it to the responsibility of the user to ensure "good" spike trains(?).

@Moritz-Alexander-Kern
Copy link
Member

Hello @ManuelCiba ,

Thank you for your insights, now I understand more.

I didn't consider the stationarity of spike trains, I was wondering about this line:

isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)

For me the distinction between stationary and non-stationary spike trains is not obvious in this line, especially when dealing with cases where there are only one or two spikes per spiketrain. Given this ambiguity, I agree, to leave it to the user to ensure good spike trains, only the "obvious cases" are handled by returning NaN?
Here NaN serves as a convenient way to handle cases where numerical values cannot be represented or computed in a meaningful way, providing a clear indication of such situations in the results.

So in conclusion, we could proceed with returning NaN for empty spike trains, as there is no sensible definition of synchrony in such cases? and raise a Warning?
For spike trains containing one spike, we could continue providing a value, but still raise a warning? or also return NaN and raise a Warning?

By adopting this approach, we maintain clarity in cases where synchrony cannot be computed due to empty spike trains while providing results for cases where spikes are present. This aligns with the preference to leave the responsibility of ensuring "nice" spike trains to the user. Currently I don't see an obvious way to handle non-stationary spike trains which lead to unreliable synchrony values. Maybe we could add a section to the documentation explaining/warning about this?

Looking forward to hear your thoughts on this.

@Moritz-Alexander-Kern
Copy link
Member

For completion: Here is an example with empty spike trains using the code on branch ManuelCiba:modify_error_handling_in_spike-contrast:

import neo
from quantities import ms
from elephant.spike_train_synchrony import spike_contrast

# Create list of spiektrains, where each spiketrain contains only one spike
spiketrains = [
    neo.SpikeTrain([]*ms, t_stop=1000 * ms),
    neo.SpikeTrain([]*ms, t_stop=1000 * ms),
]
for idx, spiketrain in enumerate(spiketrains):
    print(f"Spike train no. {idx}: {spiketrain}")

# Run spike_contrast from elephant.spike_train_synchrony
synchrony, spike_contrast_trace = spike_contrast(
    spiketrains, return_trace=True
)


print(f"Synchrony: {synchrony}")
print(f"Contrast: {spike_contrast_trace.contrast}")
print(f"Active Spiketrains: {spike_contrast_trace.active_spiketrains}")
print(f"Synchrony: {spike_contrast_trace.synchrony}")
print(f"Binsize: {spike_contrast_trace.bin_size}")

Output:

Spike train no. 0: [] ms
Spike train no. 1: [] ms
/elephant/elephant/spike_train_synchrony.py:201: UserWarning: All input spiketrains contain no more than 1 spike.
  warnings.warn("All input spiketrains contain no more than 1 spike.")
/elephant/elephant/spike_train_synchrony.py:224: RuntimeWarning: invalid value encountered in scalar divide
  active_st = (np.sum(n_k * theta_k) / np.sum(theta_k) - 1) / (
/elephant/elephant/spike_train_synchrony.py:226: RuntimeWarning: invalid value encountered in scalar divide
  contrast = np.sum(np.abs(np.diff(theta_k))) / (2 * n_spikes_total)
Synchrony: nan
Contrast: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Active Spiketrains: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Synchrony: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Binsize: [500.         450.         405.         364.5        328.05
 295.245      265.7205     239.14845    215.233605   193.7102445
 174.33922005 156.90529805 141.21476824 127.09329142 114.38396227
 102.94556605  92.65100944  83.3859085   75.04731765  67.54258588
  60.7883273   54.70949457  49.23854511  44.3146906   39.88322154
  35.89489938  32.30540945  29.0748685   26.16738165  23.55064349
  21.19557914  19.07602122  17.1684191   15.45157719  13.90641947
  12.51577752  11.26419977  10.1377798 ] ms

Comment on lines 195 to 203
try:
isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
except TypeError:
raise ValueError("All input spiketrains contain no more than 1 spike.")
except:
# Do not raise an error but just a warning and continue the code which will return nan-values as synchrony.
# This is useful when running spike-contrast in a batch script, where it is necessary
# to get a complete data structure (trace for all bin-sizes) also for empty spike trains.
warnings.warn("All input spiketrains contain no more than 1 spike.")
isi_min = 0 # continue with isi_min = 0, causing to return spike-contrast with nan as synchrony values but
bin_min = max(isi_min / 2, min_bin)
Copy link
Member

@Moritz-Alexander-Kern Moritz-Alexander-Kern Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
try:
isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
except TypeError:
raise ValueError("All input spiketrains contain no more than 1 spike.")
except:
# Do not raise an error but just a warning and continue the code which will return nan-values as synchrony.
# This is useful when running spike-contrast in a batch script, where it is necessary
# to get a complete data structure (trace for all bin-sizes) also for empty spike trains.
warnings.warn("All input spiketrains contain no more than 1 spike.")
isi_min = 0 # continue with isi_min = 0, causing to return spike-contrast with nan as synchrony values but
bin_min = max(isi_min / 2, min_bin)
if n_spikes_total == 0 :
warnings.warn("All input spike trains contain no spikes.")
isi_min = 0
else:
isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)
bin_min = max(isi_min / 2, min_bin)

Depending on whether or not to handle the case with only 1 spike per spike train, we could drop the try except block, by handling empty spike trains this way?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @Moritz-Alexander-Kern

Thank you for your suggestions!
The result for the empty spike trains, looks good!

Regarding "stationarity" in this line:

isi_min = min(np.diff(st).min() for st in spiketrains if len(st) > 1)

You are right, (non-)stationarity is not related how we should handle the case of a single spike. It just explains, why we can get different synchrony values even if we just have one spike per spike train.
For example, these two spike trains (non-stationary)

spiketrains = [
    neo.SpikeTrain([1.0]*ms, t_stop=1000 * ms),
    neo.SpikeTrain([1.2]*ms, t_stop=1000 * ms),
]

will lead to another synchrony value compared to these two spike trains (stationary):

spiketrains = [
    neo.SpikeTrain([250.0]*ms, t_stop=1000 * ms),
    neo.SpikeTrain([750.0]*ms, t_stop=1000 * ms),
]

Since it is possible to obtain a synchrony value even with just a single spike per spike train, I would suggest, to handle this case the same way as you suggested for the empty spiketrain by setting isi_min = 0 plus the warning (If I'm not wrong, we could use my original suggestion with the try-except block?).

Just to explain, what happens if we set isi_min = 0:
Spike-contrast calculates a synchrony for many different bin sizes ("bin size sweep") and at the end selects the optimum as the final synchrony.
The smallest ISI from all spiketrains is used to define the lower limit for the bin-size sweep. If isi_min=0, the lower limit will be set to min_bin:

bin_min = max(isi_min / 2, min_bin)

So, even in the case of just one spike per spike train, the algorithm can calculate a synchrony.

@coveralls
Copy link
Collaborator

coveralls commented Apr 17, 2024

Coverage Status

coverage: 88.239% (-0.03%) from 88.264%
when pulling 47db610 on ManuelCiba:modify_error_handling_in_spike-contrast
into a4e601e on NeuralEnsemble:master.

@Moritz-Alexander-Kern
Copy link
Member

Feel free to reopen at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants