Currently, for a circular sequence seq, seq[0:0] or seq[1:1] return the linearised version of that sequence. That makes sense, but it is problematic in certain cases. Imagine a function that returns the first x nucleotides after a given base: it would give an unexpected result for x == 0, because seq[0:0+0] would return the entire sequence.
Knowing this behaviour, it is possible to add a special case for x == 0 to the function, but that is still not great.
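To make the surprise concrete, here is a minimal sketch using a plain string as a stand-in for a circular sequence. The helper names (wrap_slice, first_x_after, first_x_after_guarded) are hypothetical and are not pydna functions; wrap_slice only imitates the behaviour described above, where start == stop returns the whole linearised circle.

```python
def wrap_slice(seq: str, start: int, stop: int) -> str:
    """Toy model of the circular slicing described above (not pydna code)."""
    if start < stop:
        return seq[start:stop]
    # start >= stop: wrap around the origin.
    # When start == stop this returns the full circle, linearised at start.
    return seq[start:] + seq[:stop]


def first_x_after(seq: str, pos: int, x: int) -> str:
    """Naive helper: the x nucleotides starting at pos, using modulo indexes."""
    n = len(seq)
    return wrap_slice(seq, pos % n, (pos + x) % n)


def first_x_after_guarded(seq: str, pos: int, x: int) -> str:
    """Same helper with the special case for x == 0 that the issue mentions."""
    if x == 0:
        return ""
    return first_x_after(seq, pos, x)


seq = "GATTACAGCT"  # length 10, treated as circular
print(first_x_after(seq, 7, 5))          # wraps around the origin
print(first_x_after(seq, 0, 0))          # whole sequence, not an empty string
print(first_x_after_guarded(seq, 0, 0))  # empty string
```

In this toy model, x == 0 and x == len(seq) produce the same slice indexes, which is why the caller has to handle the empty case separately.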
Possible alternative
We could support slicing of circular molecules with indexes larger than the length of the sequence. For instance, what is now written as seq[7:2] for a sequence of length 10 could be written as seq[7:12]. This is equivalent to the behaviour of a circular string, and would potentially allow interesting functionality, such as getting more than a full circle. In the previous example of a sequence of length 10, seq[1:15] could return more than one loop.
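The proposed semantics can be sketched on a plain string. The function name circular_slice is an assumption made for illustration, not an existing pydna function, and this sketch makes the simplest choice of returning an empty string whenever stop <= start. A real implementation that keeps seq[7:2] working would need to decide how to treat that case.

```python
def circular_slice(seq: str, start: int, stop: int) -> str:
    """Slice a circular string; indexes past len(seq) wrap around the origin.

    Hypothetical sketch of the proposal, assuming non-negative indexes.
    """
    n = len(seq)
    if n == 0 or stop <= start:
        # Zero-length (or reversed) request: nothing to return.
        return ""
    # Walk from start to stop, mapping each index back onto the circle.
    return "".join(seq[i % n] for i in range(start, stop))


seq = "GATTACAGCT"  # length 10, treated as circular
print(circular_slice(seq, 7, 12))  # crosses the origin
print(circular_slice(seq, 1, 15))  # 14 bases, more than one full loop
print(circular_slice(seq, 0, 0))   # empty string
```

With these semantics the length of the result is always stop - start, so a helper that takes seq[pos:pos + x] needs no special case for x == 0.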
The problem
When we discussed this the other day, we said that a lot of pydna functions use modulo operations to avoid expressions like seq[7:12] and use seq[7:2] instead, so this change may break some code, even if both syntaxes are still supported. The tests would most likely pick up the errors introduced, but it is a breaking change for other users, so it should be postponed until a major release.
I think you're right in that having a special exception for 0:0 would be complicated, and even further I think maybe similar situations could come up for other indices anyway? e.g. maybe I want [7:7+x]. Maybe a way to enable this would just be to make a second accessor method and leave the slicing behavior as it is.
sequence[0:0] can still return a linearized version of the plasmid, but a second method e.g. sequence.get(wrap=False)[0:0], could return an empty Dseq object (or raise an error?)
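One way this second accessor could look is sketched below on a toy class. ToyCircularSeq, _NoWrapView and the get(wrap=...) signature are all hypothetical and only follow the idea in this comment; they are not part of pydna, and plain strings stand in for Dseq objects.

```python
class _NoWrapView:
    """View whose slices never wrap around the origin (hypothetical helper)."""

    def __init__(self, data: str):
        self._data = data

    def __getitem__(self, key: slice) -> str:
        # Ordinary Python slicing: [0:0] is empty, [7:2] is empty.
        return self._data[key]


class ToyCircularSeq:
    """Toy circular sequence; only simple slices with step 1 are handled."""

    def __init__(self, data: str):
        self.data = data

    def __getitem__(self, key: slice) -> str:
        start, stop, _ = key.indices(len(self.data))
        if start < stop:
            return self.data[start:stop]
        # Existing behaviour is kept: start >= stop wraps around the origin.
        return self.data[start:] + self.data[:stop]

    def get(self, wrap: bool = True):
        """Return self (wrapping slices) or a non-wrapping view."""
        return self if wrap else _NoWrapView(self.data)


s = ToyCircularSeq("GATTACAGCT")
print(s[0:0])                  # full linearised sequence, as today
print(s.get(wrap=False)[0:0])  # empty string
```

The sketch returns an empty string for the non-wrapping case; raising an error instead, as floated above, would be a one-line change in _NoWrapView.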
Hi @JamesBagley, I had pretty much abandoned this issue, but you are right that an alternative method to slice sequences is probably the way to go. Not sure when this would happen, though!
Hi all, I am working on a rewrite of the Dseq class. I'll soon have something that passes the Dseq and Dseqrecord tests. The new Dseq class is much closer to the Bio.Seq class, and I think that the slicing could and should be reworked in order to reduce surprises for the user.
A followup to #161