I'm trying to filter the data to get the Group 0 set, but I'm getting slightly different sequence counts from those in the paper.
Q1: I assume that Processed_K50_dG_datasets/Tsuboyama2023_Dataset2_Dataset3_20230416.csv is the same as the "Dataset 1 and Dataset 2" file referenced in the paper (and that this comes from K50_dG_Dataset1_Dataset2)?
I ask because this processed file has 776,299 lines, whereas Tsuboyama2023_Dataset1_20230416.csv has 1,841,286 lines.
Q2: How do I filter this file to get the Group 0 variants? I'm trying to reproduce the number of sequences from Table S1 (586,938 total sequences: 434,556 singles and 152,382 doubles).
I tried using the Single list CSV file, filtering for DMS_group == G0, and filtering out low-confidence values from the ddG_ML_float column.
But then I get:
607,839 total instead of 586,938 in Table S1
159,051 doubles instead of 152,382 in Table S1
Could you please let me know what I've missed? I assume there's another step of filtering I haven't done.
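The filtering described above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact script used. The column names DMS_group and ddG_ML_float come from the post. The mutation column name (mut_type), the ":" separator for double mutants, and the idea that low-confidence values are stored as non-numeric placeholders are all assumptions that may not match the actual file.

```python
import csv
import math

# Counts quoted from Table S1 in the post above.
TABLE_S1 = {"total": 586_938, "singles": 434_556, "doubles": 152_382}


def is_confident(value):
    """Assumption: low-confidence ddG entries are non-numeric placeholders."""
    try:
        return not math.isnan(float(value))
    except (TypeError, ValueError):
        return False


def count_group0(rows, mut_col="mut_type"):
    """Count singles and doubles among G0 rows with a usable ddG value.

    `rows` is any iterable of dicts, e.g. a csv.DictReader.
    `mut_col` is a hypothetical column name for the mutation string.
    """
    singles = doubles = 0
    for row in rows:
        if row.get("DMS_group") != "G0":
            continue
        if not is_confident(row.get("ddG_ML_float")):
            continue
        # Assumption: doubles are two substitutions joined by ":".
        if ":" in (row.get(mut_col) or ""):
            doubles += 1
        else:
            singles += 1
    return {"total": singles + doubles, "singles": singles, "doubles": doubles}


def excess_over_table(counts):
    """Difference between observed counts and Table S1, per category."""
    return {key: counts[key] - TABLE_S1[key] for key in counts}


# Usage sketch (path taken from the post):
# with open("Processed_K50_dG_datasets/"
#           "Tsuboyama2023_Dataset2_Dataset3_20230416.csv", newline="") as f:
#     print(excess_over_table(count_group0(csv.DictReader(f))))
```

Applied to the counts reported above, `excess_over_table` gives an excess of 20,901 sequences in total and 6,669 doubles, which is the gap an additional filtering step would need to account for.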