[Biopython-dev] [Biopython - Bug #3311] (New) GFF parser fails to intelligently break lines

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Sat Oct 29 02:00:07 UTC 2011

Issue #3311 has been reported by gahoo lee.

Bug #3311: GFF parser fails to intelligently break lines

Author: gahoo lee
Status: New
Priority: Normal
Target version: 

Moved from "BioStar":http://biostar.stackexchange.com/questions/13651/gff-parsing-in-python-is-not-so-perfect

I used BCBio.GFF to parse "chr01.gff3":ftp://ftp.plantbiology.msu.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_6.1/chr01.dir/chr01.gff3 and "all.gff3":ftp://ftp.plantbiology.msu.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_6.1/all.dir/all.gff3 , but things didn't work out as I expected. Here's the code:

@from BCBio import GFF
limits = dict(gff_type = ["gene","mRNA","CDS"])
gff_handle = open('chr01.gff3')
for rec in GFF.parse(gff_handle,target_lines=1000,limit_info=limits):
    #Chromosome seq level
    for gene_feature in rec.features:
        #gene level
        for mRNA_feature in gene_feature.sub_features:
            #mRNA level
            print mRNA_feature.type
            print mRNA_feature.qualifiers['Alias']@

And I got:

@Traceback (most recent call last):
  File "R:\Untitled 1.py", line 14, in <module>
    print mRNA_feature.qualifiers['Alias']
KeyError: 'Alias'@

The 'type' printed is "CDS", which is not correct. When parsing without


everything is OK. But parsing all.gff3 runs into the same problem. Maybe all.gff3 is too large to parse.

The problem might be that the parser does not recognise the entry boundary correctly.
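
To make the suspected failure mode concrete, here is a minimal sketch that does not use BCBio.GFF at all. It assumes a naive reader that cuts the file every @target_lines@ feature lines, which is not necessarily what the parser does internally. It reports child features whose Parent was declared in an earlier chunk. The function name @find_split_children@ and the example records are hypothetical, made up for illustration.

```python
def find_split_children(gff_lines, target_lines):
    """Report child features separated from their Parent by naive chunking.

    Assumption: the input is cut every `target_lines` feature lines.
    This only illustrates the suspected problem; it is not the
    BCBio.GFF implementation.
    Returns a list of (line_number, parent_id) tuples.
    """
    def attributes(column9):
        # Split the GFF3 attribute column ("key=value;key=value") into a dict.
        attrs = {}
        for pair in column9.strip().split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key] = value
        return attrs

    id_chunk = {}      # feature ID -> index of the chunk that declared it
    split = []
    feature_count = 0
    for line_number, line in enumerate(gff_lines, 1):
        if not line.strip() or line.startswith("#"):
            continue   # skip blank lines, comments and directives
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue   # not a feature line
        chunk = feature_count // target_lines
        feature_count += 1
        attrs = attributes(columns[8])
        if "ID" in attrs:
            id_chunk.setdefault(attrs["ID"], chunk)
        for parent in attrs.get("Parent", "").split(","):
            # Only parents already seen are checked; forward references are ignored.
            if parent in id_chunk and id_chunk[parent] != chunk:
                split.append((line_number, parent))
    return split


# Invented example records (not taken from chr01.gff3).
example = [
    "##gff-version 3",
    "Chr1\tMSU\tgene\t1\t900\t.\t+\t.\tID=g1;Name=g1",
    "Chr1\tMSU\tmRNA\t1\t900\t.\t+\t.\tID=g1.1;Parent=g1;Alias=demo",
    "Chr1\tMSU\tCDS\t1\t300\t.\t+\t0\tID=g1.1.cds1;Parent=g1.1",
    "Chr1\tMSU\tCDS\t400\t900\t.\t+\t0\tID=g1.1.cds2;Parent=g1.1",
]
```

With a chunk size of 3, the second CDS falls into a new chunk without its mRNA, so a reader working chunk by chunk could not attach it to the right parent. With a chunk size of 10, nothing is split. Running the same check on chr01.gff3 with @target_lines=1000@ might show whether the boundaries line up with the features that come out wrong.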

