Backwards #275
at 300 generations ago, there were three extant genomes
from which the samples inherited, and the inherited segments are
as listed here.
Note that this does not mean that "node 2 was laive 300 generations ago"
Typo: "laive" should be "alive".
(clearly, as node 2 represetns an extant, sampled genome),
but rather that there are no other ancestral genomes recorded explicitly
in the tree sequence that lie on the path along
which node 2 has inherited it's genome.
Typo: "it's" should be "its".
Nice. I like this. Note that there are a few examples of iterating up and down the graph at https://tskit.dev/tutorials/args.html#graph-traversal, but I don't actually do anything with the traversals, so your examples are better. Also note that some of this stuff might link in to tskit-dev/tskit#2869, and there are some suggestions there of things you might want to calculate. One thing that is much easier to do than with the tree-by-tree approach is to find all the descendant samples of a particular ancestral node (or alternatively, all the internal nodes that are ancestors of a particular sample).
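A minimal sketch of the "descendant samples of an ancestral node" calculation mentioned above, written against plain Python data rather than a real tree sequence. The `EDGES` list and the `descendant_samples` helper are hypothetical; with tskit the `(left, right, parent, child)` tuples could be read from `ts.edges()`. Note the simplifying assumption: genomic intervals are ignored, so a sample counts as a descendant if it inherits from the node anywhere along the genome.

```python
from collections import defaultdict

# Hypothetical toy data: (left, right, parent, child), sorted by parent time.
EDGES = [
    (0, 10, 3, 0),
    (0, 5, 3, 1),
    (5, 10, 4, 1),
    (0, 10, 5, 2),
    (0, 10, 5, 3),
    (0, 10, 5, 4),
]
SAMPLES = [0, 1, 2]


def descendant_samples(node, edges, samples):
    """Return the samples reachable by following edges downwards from `node`."""
    # Build a parent -> children lookup from the edge list.
    children = defaultdict(set)
    for _left, _right, parent, child in edges:
        children[parent].add(child)

    samples = set(samples)
    found = set()
    seen = {node}
    stack = [node]
    while stack:
        u = stack.pop()
        if u in samples:
            found.add(u)
        for c in children.get(u, ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return found


print(descendant_samples(3, EDGES, SAMPLES))
```

Reversing the lookup (child to parents) and walking upwards in the same way would give the internal nodes that are ancestors of a particular sample.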
I think it's great, a really helpful addition. I've a few minor take-or-leave implementation suggestions.
```{code-cell} ipython3
for e in ts.edges():
    t = ts.node(e.parent).time
Maybe use `ts.nodes_time[e.parent]` so that this is more easily translatable to numba?
from which the samples inherited, and the inherited segments are
as listed here.
Note that this does not mean that "node 2 was laive 300 generations ago"
(clearly, as node 2 represetns an extant, sampled genome),
Suggested change:
- (clearly, as node 2 represetns an extant, sampled genome),
+ (clearly, as node 2 represents an extant, sampled genome),
Here is a data structure for a list of segments with labels:

```{code-cell} ipython3
class LabelSegmentList:
It might be worth considering inheriting from `collections.abc.MutableSequence` here, as this would give you all the dunder methods that you're implementing. I think these might be a bit scary to non-Python people, and are a bit of a distraction from the main point.
I guess you'd need to inherit from `list` then to get the actual storage. Maybe that's OK?
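A minimal sketch of what the `collections.abc.MutableSequence` suggestion could look like. The `LabelSegment` fields (`left`, `right`, `label`) are assumptions made for illustration, not the PR's actual class. One way to get the storage without inheriting from `list` is to wrap an internal list and implement only the five abstract methods; `append`, `extend`, `pop`, iteration and `in` then come from the mixin.

```python
from collections.abc import MutableSequence
from dataclasses import dataclass


@dataclass
class LabelSegment:
    # Hypothetical fields: a half-open interval [left, right) plus a label.
    left: float
    right: float
    label: int


class LabelSegmentList(MutableSequence):
    """A list of labelled segments, storing its items in a private list."""

    def __init__(self, segments=()):
        self._segments = list(segments)

    # The five abstract methods required by MutableSequence.
    def __getitem__(self, index):
        return self._segments[index]

    def __setitem__(self, index, value):
        self._segments[index] = value

    def __delitem__(self, index):
        del self._segments[index]

    def __len__(self):
        return len(self._segments)

    def insert(self, index, value):
        self._segments.insert(index, value)


segs = LabelSegmentList()
segs.append(LabelSegment(0, 5, label=1))  # append() is provided by the mixin
segs.extend([LabelSegment(5, 10, label=2)])
print(len(segs), [s.label for s in segs])
```

Inheriting directly from `list` would also work and is shorter, at the cost of exposing every `list` method whether or not it makes sense for segments.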
Now, edges in the EdgeTable are sorted by parent time,
so if we iterate through the edges in order, we move back in time.
So, we can use this to see the state of the process at, say,
500 generations in the past:
Typo: 500 should be 300.
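A minimal sketch of the backwards-in-time iteration under discussion, using hypothetical toy data in place of a tree sequence. The function name `state_at` and all values are assumptions for illustration; with tskit the inputs could come from `ts.nodes_time`, `ts.edges()`, `ts.samples()` and `ts.sequence_length`. It relies on the edges being sorted by parent time, and it looks up times by array index (as suggested for numba) rather than through node objects.

```python
# Hypothetical toy data. Node times are indexed by node ID.
NODES_TIME = [0.0, 0.0, 0.0, 100.0, 200.0, 500.0]
# (left, right, parent, child), sorted by the time of the parent.
EDGES = [
    (0, 10, 3, 0),
    (0, 5, 3, 1),
    (5, 10, 4, 1),
    (0, 10, 5, 2),
    (0, 10, 5, 3),
    (0, 10, 5, 4),
]


def state_at(time, nodes_time, edges, samples, sequence_length):
    """Map each genome extant at `time` to the (left, right, sample)
    segments that the samples inherit from it."""
    state = {u: [(0, sequence_length, u)] for u in samples}
    for left, right, parent, child in edges:
        if nodes_time[parent] > time:
            break  # all remaining edges are older than `time`
        kept = []
        for seg_left, seg_right, sample in state.get(child, []):
            lo, hi = max(left, seg_left), min(right, seg_right)
            if lo < hi:
                # The overlapping part moves up to the parent ...
                state.setdefault(parent, []).append((lo, hi, sample))
                # ... and any flanking remainder stays with the child.
                if seg_left < lo:
                    kept.append((seg_left, lo, sample))
                if hi < seg_right:
                    kept.append((hi, seg_right, sample))
            else:
                kept.append((seg_left, seg_right, sample))
        state[child] = kept
    # Drop nodes that no longer carry any ancestral material.
    return {u: segs for u, segs in state.items() if segs}


result = state_at(300, NODES_TIME, EDGES, samples=[0, 1, 2], sequence_length=10)
for node, segments in sorted(result.items()):
    print(node, segments)
```

In this toy data three genomes (nodes 2, 3 and 4) carry ancestral material at 300 generations ago. Node 2 appears only because no older ancestor of it has been reached yet, not because that sample was alive at that time.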
In working out an algorithm that needs to move back through time, it seemed helpful to do a simple explainer on iterating back in time, i.e., taking the haplotype view instead of the tree-by-tree view.
This is a draft of that. Suggestions for nicer Python or fun examples welcome! So far it's not demonstrating anything that you couldn't do tree-by-tree.