Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hotfix: make timestamp in msg_prefix for metadata regex compile optional #940

Closed
wants to merge 1 commit into from

Conversation

SiQube
Copy link
Member

@SiQube SiQube commented Feb 10, 2025

as raised in #939 some people might want to extract the timestamp of a custom metadata type. I currently don't know if this would be possible. this PR provides a patch which fixes the problem raised in #939 while maintaining all other properties of compile_patterns.

I've additionally added a test which reproduces the problem. the code change of interest is in src/pymovements/utils/parsing.py.

feel free to reject this proposed change with a strategy to pass the new test (test_extract_timestamp_for_metadata) in tests/unit/gaze/io/asc_test.py and comment the solution on #939. additionally, we should double check documentation s.t. people can easily extract custom (meta)data from ascii files.

resolves #939

Copy link

codecov bot commented Feb 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (87f0780) to head (8d3230c).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #940   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           74        74           
  Lines         3448      3448           
  Branches       621       621           
=========================================
  Hits          3448      3448           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@SiQube SiQube force-pushed the custom-metadata-timestamp branch from a1e017e to ed25251 Compare February 10, 2025 10:54
@SiQube SiQube force-pushed the custom-metadata-timestamp branch from ed25251 to 8d3230c Compare February 10, 2025 10:59
@SiQube SiQube changed the title make timestamp in msg_prefix for metadata regex compile optional hotfix: make timestamp in msg_prefix for metadata regex compile optional Feb 10, 2025
@SiQube SiQube requested a review from saeub February 10, 2025 11:10
@dkrako
Copy link
Contributor

dkrako commented Feb 11, 2025

@SiQube Thanks for taking care of #939 !

The metadata_patterns argument is intended to be used with patterns that are only expected to be parsed once per file. This issue is related to #907

I guess that this part of our code renders your workaround unusable for use cases where the messages have multiple occurrences witch changing values:

elif compiled_metadata_patterns:
for pattern_dict in compiled_metadata_patterns.copy():
if match := pattern_dict['pattern'].match(line):
if 'value' in pattern_dict:
metadata[pattern_dict['key']] = pattern_dict['value']
else:
metadata.update(match.groupdict())
# each metadata pattern should only match once
compiled_metadata_patterns.remove(pattern_dict)

If we take something like the message MSG 2055307 TRIALID TRIAL 0 MOVIESTART, your test pattern (?P<movie_start>\d+).*?(?P<movie_id>\d+)\s+MOVIESTART would match multiple times and reassign the values to the metadata field, losing the prior values in the process.

So I guess this is change is not harmful and can be still useful as a workaround to get the timestamp of a metadata pattern.

In case of multiple matches per file this would only contain the last matched values. I would rather propose an additional message_patterns argument to parse messages, or extend the patterns argument to unify message patterns, metadata patterns, and sample column patterns as written in #939 (comment)

@SiQube SiQube closed this Feb 12, 2025
@SiQube
Copy link
Member Author

SiQube commented Feb 12, 2025

I closed this, let's work on #939 instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problem with additional metadata when parsing eyelink data
2 participants