[Biopython-dev] [Biopython] Filtering SeqRecord feature list / nested SeqFeatures

Peter biopython at maubp.freeserve.co.uk
Mon Aug 31 13:15:42 UTC 2009


On Mon, Aug 31, 2009 at 1:54 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> There are other downsides to using nested SubFeatures.
>> It will probably require a lot of reworking of the GenBank
>> output because of how composite features like joins are
>> currently stored, and I haven't even looked at the BioSQL
>> side of things. You may have looked at that already
>> though, so I may just be worrying about nothing.
>
> Agreed. My thought was to prototype this with GFF and then
> think further about GenBank features. Initially, I just want to
> get the GFF parsing documented and in the Biopython
> repository, and then the BioSQL storage would be a logical
> next step.

If (as Michiel and I suggested) your GFF parser returns some
generic object (e.g. a GFF record class, or a tuple of basic
Python types including a dictionary of annotations), then yes,
that can be checked in without side effects.
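
To make the "generic object" idea concrete, here is a rough sketch of
what such a return value might look like. The names GFFRecord and
parse_gff_line are hypothetical and not part of Biopython; the sketch
assumes GFF3-style input with nine tab-separated columns and
key=value attributes, and it leaves the coordinates exactly as they
appear in the file.

```python
from collections import namedtuple
from urllib.parse import unquote

# Hypothetical record type for illustration only; not a Biopython class.
GFFRecord = namedtuple(
    "GFFRecord",
    "seqid source type start end score strand phase attributes",
)


def parse_gff_line(line):
    """Turn one GFF3 feature line into a GFFRecord of basic Python types."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attr_text = cols

    # Column 9 becomes a plain dict mapping each key to a list of values.
    attributes = {}
    if attr_text != ".":
        for pair in attr_text.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]

    return GFFRecord(
        seqid,
        source,
        ftype,
        int(start),  # coordinates kept as given in the file
        int(end),
        None if score == "." else float(score),
        None if strand == "." else strand,
        None if phase == "." else int(phase),
        attributes,
    )
```

Because the result holds only strings, numbers and a dict, it does not
touch SeqRecord, SeqFeature, SeqIO or BioSQL at all; any conversion to
those objects could be layered on top later.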

However, if your code goes straight to SeqRecord and
SeqFeature objects, we are going to have to deal with
how BioSQL and the existing SeqIO output code will
react (e.g. the GenBank output).

Peter


