Hello,
Firstly, congratulations on your excellent work and the publication in Nature! I found your research very insightful and it has sparked my interest in calculating homeology.
I am currently trying to understand the homeology function as used in your Jupyter notebook. Specifically, I am looking at the following line of code:
hm = GxG::homeology(ref = ref, gr = win + 100, stride = 2, pad = 9, verbose = TRUE, rc = FALSE)
In this function call, stride is set to 2 and pad is set to 9. However, the Methods of the Nature paper state:
For every position within a 200-bp window around each break end pair, a 41-bp bin centred at the base was queried for the corresponding hg19 reference sequence. All pairs of 41-bp bins within each junction-associated 200-bp window were then aligned to one another to construct a 200-by-200 matrix of Levenshtein edit distances.
From this description, I would have expected a stride of 1 (one bin per position) and a pad of 20 (20 bp on either side of the centre base, giving a 41-bp bin). I am confused by the discrepancy between the parameters used in the notebook and those described in the paper.
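To make my reading concrete, here is a minimal Python sketch of how I understand the paper's description. It is not the GxG implementation. The function names are hypothetical. It also assumes that pad means the number of bases added on each side of the centre base, and that stride means the step between consecutive bin centres. Either assumption may be wrong, and that is exactly what I am asking about.

```python
def bin_width(pad: int) -> int:
    """Width of a bin extending `pad` bases on each side of its centre base (assumed meaning of pad)."""
    return 2 * pad + 1


def bin_centres(window_len: int, stride: int) -> list[int]:
    """Offsets of bin centres within a window, one every `stride` bases (assumed meaning of stride)."""
    return list(range(0, window_len, stride))


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def distance_matrix(seq: str, window_start: int, window_len: int,
                    pad: int, stride: int) -> list[list[int]]:
    """Pairwise edit distances between all bins in the window.

    `seq` must contain at least `pad` bases of flank on both sides of the window.
    """
    centres = [window_start + c for c in bin_centres(window_len, stride)]
    bins = [seq[c - pad:c + pad + 1] for c in centres]
    return [[levenshtein(x, y) for y in bins] for x in bins]


# Under these assumptions:
print(bin_width(20), len(bin_centres(200, 1)))  # paper:    41-bp bins, 200 centres
print(bin_width(9), len(bin_centres(200, 2)))   # notebook: 19-bp bins, 100 centres
```

Under this reading, the notebook call would produce a 100-by-100 matrix of 19-bp bins rather than the 200-by-200 matrix of 41-bp bins described in the paper.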
Could you please help me understand why these numbers are different? Is there something I am misunderstanding about the stride and pad parameters?
Thank you in advance for your help and for your contributions to the field!