[Biopython] Bio.Motif search_pwm
Michiel de Hoon
mjldehoon at yahoo.com
Wed Aug 1 05:14:52 UTC 2012
Hi everybody,
I was using the search_pwm method in Bio.Motif (which btw is very useful, thanks Bartek) to search for motif instances on both strands of a sequence. If the motif starts at index position and is located on the forward strand, this function returns +position; if it is located on the reverse strand, it returns -position. Since +0 and -0 are the same integer, for position==0 we cannot deduce from the sign whether the motif is located on the forward or on the reverse strand.
How about using Python-style negative indices to indicate the strand? For example, +20 means that the motif is located at [20:20+motif_length] on the forward strand, while -20 means that the motif is located at [-20:-20+motif_length].
Alternatively, we could return the strand explicitly.
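To make the two options concrete, here is a minimal sketch in plain Python. It is not the Bio.Motif code: encode_hit and decode_hit are hypothetical helper names, and I am assuming that a reverse-strand hit is encoded by subtracting the sequence length from its start, so that it is always negative.

```python
def encode_hit(start, strand, seq_len):
    """Encode a hit as a Python-style index (assumed convention, see above)."""
    if strand == "+":
        return start            # forward strand: 0 .. seq_len - 1
    return start - seq_len      # reverse strand: -seq_len .. -1, never 0


def decode_hit(index, seq_len):
    """Recover (start, strand) from an encoded index."""
    if index >= 0:
        return index, "+"
    return index + seq_len, "-"


def explicit_hit(start, strand):
    """Alternative: return the strand explicitly next to the position."""
    return (start, strand)


seq_len = 100
forward = encode_hit(0, "+", seq_len)
reverse = encode_hit(0, "-", seq_len)
# Position 0 no longer collapses to the same value on both strands.
assert forward != reverse
assert decode_hit(forward, seq_len) == (0, "+")
assert decode_hit(reverse, seq_len) == (0, "-")
```

One caveat with slicing on negative indices: if -20+motif_length works out to 0, then seq[-20:0] is an empty slice in Python, so the end of the slice would need special handling in that case.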
In the same function, I wish we could get rid of this line:
sequence=sequence.tostring().upper()
since this assumes that sequence is a Biopython Seq object, and not a plain string. We could either use str(sequence) instead of sequence.tostring() to cover both cases, or have the Seq class inherit from strings (which we have been discussing for some time; see https://redmine.open-bio.org/issues/2351).
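As a rough illustration of the str(sequence) option: the sketch below uses a hypothetical normalize_sequence helper and a stand-in SeqLike class (neither is part of Biopython) to show that str() accepts both a plain string and an object that defines __str__, which is what I am assuming a Seq object does.

```python
def normalize_sequence(sequence):
    """Return the sequence as an upper-case plain string.

    str() works for plain strings and for any object defining __str__,
    so no tostring() method is required.
    """
    return str(sequence).upper()


class SeqLike:
    """Stand-in for a Seq-like object; only __str__ is assumed here."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data


assert normalize_sequence("acgt") == "ACGT"
assert normalize_sequence(SeqLike("acgt")) == "ACGT"
```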
Best,
-Michiel.