[Biopython] sequence coordinate mapping
Peter
biopython at maubp.freeserve.co.uk
Fri Jun 18 14:19:59 EDT 2010
On Fri, Jun 18, 2010 at 7:00 PM, Reece Hart <reece at berkeley.edu> wrote:
> Thanks, all, for feedback. I'm still digesting some of the previous
> comments. For the purposes of discussion, I've attached the crude
> (pre-crude, even) implementation that I mentioned.
Thanks.
> Caveats/ToDos:
> * The interface is sufficient for my needs, but for a large number of CDS
> subfeatures, it might make sense to change the implementation to use an
> index rather than a linear search.
It looks like the core idea you are using is the same - loop over the exons
(subfeatures) to keep track of where you are.
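To make the shared idea concrete, here is a rough sketch in plain Python. It is neither the attached implementation nor Biopython code. The function names and the choice to represent exons as 0-based, half-open (start, end) tuples on the forward strand are assumptions made only for this example. The first function is the linear scan over exons; the other two show one possible form of the index mentioned above, using cumulative exon lengths and a binary search.

```python
from bisect import bisect_right
from itertools import accumulate


def cds_to_genomic(exons, cds_pos):
    """Map a 0-based CDS offset to a genomic coordinate by linear scan.

    `exons` is a list of (start, end) tuples, 0-based and half-open,
    in the order they are read (an assumption for this sketch).
    """
    if cds_pos < 0:
        raise IndexError("CDS position must be non-negative")
    offset = cds_pos
    for start, end in exons:
        length = end - start
        if offset < length:
            # The position falls inside this exon.
            return start + offset
        # Otherwise skip this exon and keep track of where we are.
        offset -= length
    raise IndexError("CDS position %d is beyond the last exon" % cds_pos)


def build_index(exons):
    """Return the cumulative CDS offset at which each exon begins.

    The final entry is the total CDS length. Hypothetical helper.
    """
    lengths = [end - start for start, end in exons]
    return [0] + list(accumulate(lengths))


def cds_to_genomic_indexed(exons, index, cds_pos):
    """Same mapping as cds_to_genomic, using binary search on the index."""
    if not 0 <= cds_pos < index[-1]:
        raise IndexError("CDS position %d is out of range" % cds_pos)
    # Find the last exon whose starting CDS offset is <= cds_pos.
    i = bisect_right(index, cds_pos) - 1
    return exons[i][0] + (cds_pos - index[i])
```

The index costs one pass over the exons to build, after which each lookup is logarithmic in the number of exons, so it only pays off when many positions are mapped against the same feature.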
> * I ignore strand for the moment.
That makes life a bit more fun! I haven't tested my code on mixed-strand
features yet (e.g. some crazy tRNA annotation I've seen).
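As a rough illustration of what strand handling would involve, here is a plain-Python sketch. It is an assumption-laden example, not tested Biopython code: the function name is hypothetical, each part is a (start, end, strand) tuple with 0-based, half-open coordinates, and the parts are assumed to be listed in the order they are read, which real annotation may not guarantee. On the reverse strand the offset counts down from the end of the part instead of up from the start.

```python
def cds_to_genomic_stranded(parts, cds_pos):
    """Map a 0-based CDS offset to (genomic position, strand).

    `parts` is a list of (start, end, strand) tuples, 0-based and
    half-open, with strand +1 or -1, assumed to be in reading order.
    """
    if cds_pos < 0:
        raise IndexError("CDS position must be non-negative")
    offset = cds_pos
    for start, end, strand in parts:
        length = end - start
        if offset < length:
            if strand == -1:
                # Reverse strand: the first base read is at end - 1.
                return end - 1 - offset, strand
            return start + offset, strand
        offset -= length
    raise IndexError("CDS position %d is beyond the last part" % cds_pos)
```

Because each part carries its own strand, the same loop also covers a mixed-strand feature, provided the parts really are in reading order.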
> * I don't use SeqFeature.AbstractPosition and friends.
Unfortunately they crop up in lots of real-world GenBank/EMBL files,
so anything we add to the SeqFeature object has to cope with them.
Formats like GFF3 avoid this, of course.
Peter