[Biopython-dev] SeqFeature and FeatureLocation objects (was Bio.GFF)
Peter Cock
p.j.a.cock at googlemail.com
Tue Apr 21 09:55:23 EDT 2009
> I have also been thinking about how I would (re)design the SeqFeature
> and FeatureLocation objects. In particular I would want to put the
> strand as part of the same object as the location, and also any
> join-locations. I would still want to cope with fuzzy locations, but
> make the non-fuzzy approximations more prominent in comparison. Also,
> I really don't like the way joins are currently stored as more
> SeqFeatures in the sub_features list (plus this kind of blocks
> alternative usage for child/parent nesting that might be nice for GFF
> files).
>
> The prime use case to keep in mind is taking a feature location (even
> a join), and using this to extract that region of nucleotides from the
> parent sequence (i.e. a Seq object or a SeqRecord object, as now both
> can be sliced).
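To make that first use case concrete, here is a minimal sketch in plain Python, not existing Biopython code. The names `Span` and `extract` are hypothetical, the parent sequence is an ordinary string, and I am assuming the parts of a join are listed in the order they should be concatenated, each carrying its own strand.

```python
from dataclasses import dataclass
from typing import List

# Translation table for complementing unambiguous DNA letters.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a plain DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Span:
    """Hypothetical location part: Python-style start/end plus strand."""
    start: int       # zero-based, inclusive
    end: int         # exclusive, as in a Python slice
    strand: int = 1  # +1 for forward, -1 for reverse


def extract(parent: str, parts: List[Span]) -> str:
    """Slice each part out of the parent and concatenate the pieces.

    Assumes the parts are already in the order they should be joined.
    """
    pieces = []
    for part in parts:
        piece = parent[part.start:part.end]
        if part.strand == -1:
            piece = reverse_complement(piece)
        pieces.append(piece)
    return "".join(pieces)
```

A single location is just a one-element list, so `extract("AAACCCGGGTTT", [Span(0, 3), Span(6, 9)])` covers the join case and the simple case with the same code path.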
I forgot to mention the second major use case I'm concerned about,
which is recovering the GenBank/EMBL style location string. I have
looked at this in the past, by adding methods to the FeatureLocation
and all the Position objects, but it is complicated by the fact that
the Position objects don't know whether they are at the start or the
end (and for start positions we need to add one to convert from
Python's zero-based counting). This is the main thing blocking
Bio.SeqIO from writing GenBank (or EMBL) files with their features
included.
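
As a rough illustration of the off-by-one issue, here is a sketch in plain Python rather than anything in Biopython. The function names are hypothetical, fuzzy positions are ignored, and each part is assumed to be a (start, end, strand) tuple using Python slice coordinates. Because the function sees both ends at once, it knows to add one to the start only.

```python
from typing import List, Tuple

# Hypothetical part type: (start, end, strand) in Python slice coordinates.
Part = Tuple[int, int, int]


def span_to_string(start: int, end: int, strand: int = 1) -> str:
    """Render one exact span in GenBank/EMBL style.

    The start is zero-based, so add one; the end is exclusive in
    Python, which already equals the one-based inclusive end.
    """
    if end == start + 1:
        text = str(end)  # single base, written as one number
    else:
        text = f"{start + 1}..{end}"
    if strand == -1:
        text = f"complement({text})"
    return text


def parts_to_string(parts: List[Part]) -> str:
    """Render one or more parts, wrapping several in join(...)."""
    rendered = [span_to_string(start, end, strand)
                for start, end, strand in parts]
    if len(rendered) == 1:
        return rendered[0]
    return "join(" + ",".join(rendered) + ")"
```

For example, `parts_to_string([(0, 10, 1), (19, 30, 1)])` gives `join(1..10,20..30)`.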
Peter