-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: change shape of filelist list data instead of re-reading it #515
Conversation
Review changes with SemanticDiff. Analyzed 6 of 8 files. Overall, the semantic diff is 12% smaller than the GitHub diff.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #515 +/- ##
==========================================
+ Coverage 74.40% 74.48% +0.07%
==========================================
Files 45 45
Lines 3004 3029 +25
Branches 479 491 +12
==========================================
+ Hits 2235 2256 +21
- Misses 676 679 +3
- Partials 93 94 +1 ☔ View full report in Codecov by Sentry. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -684,6 +685,10 @@ def reload_filelist_data_as_dict(state): | |||
state.get(StepNames.data_has_header_line_step, "yes") == "yes" | |||
), | |||
) | |||
state["filelist_data"] = [] | |||
for row in state["filelist_data_list"][1:]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is dropped in state["filelist_data_list"][0]
?
Why are we dropping it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first element of state["filelist_data_list"]
is the header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah forgot about the header. Got it.
I guess I got confused with ilne 679 which gets the headers.
a751931
to
b83714c
Compare
I ran some tests. List of characters is much shorter now without the duplicates. ex: NEW: ( sdte note I manually added ':' for moh )
Previous version:
Training worked with no issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good to me except the pre-commit part in git workflows which I don't feel confident about.
b83714c
to
9004aad
Compare
for the e2e config, we don't want to require contact information declarations on each submodel
this allows for any processes that take place on the filelist list data (i.e. normalization) to be preserved
fixes #504
PR Goal?
Ensures that we are outputting a set of symbols without duplicates
Fixes?
#504
Feedback sought?
sanity. code clarity
Priority?
medium
Tests added?
tests updated
How to test?
You could try re-running on your dataset to see how the config changes
Confidence?
medium
Version change?
none. this isn't a breaking change. no version change needed really.
Related PRs?