-
Notifications
You must be signed in to change notification settings - Fork 4
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - just a small request to doc how test data was generated.
1 1:7:A:C 0.0 7 C A | ||
1 1:8:A:C 0.0 8 C A | ||
1 1:9:A:C 0.0 9 C A | ||
1 1:1:G:CGCGCG 0.0 1 CGCGCG G |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you document how you generated this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm how about this:
- I'll add some notes for now saying this was created using software in a separate environment (hail)
- Create an issue to document this properly and include the associated code that created it
- Wait to see where the multi-repo conversation goes (https://github.com/pystatgen/sgkit/issues/65#issuecomment-670049733)
- Add this to the
validation
folder if we merge repos since it is essentially the same process as the one I used in the REGENIE and HWE PRs
That sound good? I'd rather not bootstrap another validation
-like concept with all the associated CI changes if I can help it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomwhite does @eric-czech's proposal sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes (sorry forgot to reply) - sounds great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add some notes for now saying this was created using software in a separate environment (hail)
@tomwhite can I get a sign off on those updates when you get a chance? |
#12
This also updates the test data to contain alleles and sample id strings that are longer to ensure that inferred data types are correct.
The function
_max_str_len
should probably be a central utility, but we can address that in https://github.com/pystatgen/sgkit/issues/90.