Hi there,
Thanks for the resources!
I noticed there seem to be some differences in how these files were annotated:
'https://marks.hms.harvard.edu/proteingym/clinical_ProteinGym_substitutions.zip'
'https://marks.hms.harvard.edu/proteingym/clinical_ProteinGym_indels.zip'
The substitutions file uses one kind of ID (RefSeq, e.g. NM_152486.4) and the indels file uses Ensembl IDs (e.g. ENST00000263574.5).
This adds an extra step of mapping between ID types.
Columns affected by this include:
Gene
HGVSc
HGVSp (missing in indels file)
Symbol (missing from substitutions file)
protein (missing from indels file)
If possible, could you update the files to ensure more consistent annotation? I find the Ensembl IDs especially useful for mapping onto other resources, and for comparing results within the same transcripts between substitutions and indels.
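To illustrate the extra mapping step, here is a minimal sketch of what a user currently has to do to bring both files onto one ID type. The function names and the `refseq_to_ensembl` lookup table are hypothetical; the table would have to come from an external resource, and no real RefSeq-to-Ensembl pairs are implied here. Dropping the version suffix before the lookup is a simplifying assumption.

```python
import re

# Matches accessions such as NM_152486.4 (RefSeq) or ENST00000263574.5 (Ensembl),
# capturing the base accession and the optional ".N" version suffix separately.
_VERSIONED = re.compile(r"^(?P<base>[A-Za-z]+_?\d+)(?:\.(?P<version>\d+))?$")


def split_version(transcript_id):
    """Return (base_accession, version), with version None if it is absent."""
    match = _VERSIONED.match(transcript_id.strip())
    if match is None:
        raise ValueError(f"Unrecognised transcript ID: {transcript_id!r}")
    version = match.group("version")
    return match.group("base"), int(version) if version is not None else None


def id_source(transcript_id):
    """Classify an ID as 'ensembl', 'refseq', or 'unknown' from its prefix."""
    base, _ = split_version(transcript_id)
    if base.startswith("ENST"):
        return "ensembl"
    if base.startswith(("NM_", "XM_", "NR_", "XR_")):
        return "refseq"
    return "unknown"


def to_ensembl(transcript_id, refseq_to_ensembl):
    """Map a transcript ID to an unversioned Ensembl ID.

    refseq_to_ensembl is a user-supplied dict (hypothetical) keyed by
    unversioned RefSeq accession. Returns None when no mapping is known.
    """
    base, _ = split_version(transcript_id)
    if id_source(transcript_id) == "ensembl":
        return base  # already an Ensembl ID; only the version is dropped
    return refseq_to_ensembl.get(base)
```

With consistent annotation in both files, the lookup table and the prefix check would not be needed at all.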
Thanks,
Brian