[Biopython] GFF parsing

John Reid j.reid at mail.cryst.bbk.ac.uk
Fri Feb 26 14:01:19 UTC 2010



Brad Chapman wrote:
> There are two different ways to limit the parsing to sections of the
> file at once: either limit by the number of lines or by features you
> are interested in. I added some text to the documentation examples
> on the wiki to try and help explain the usage. Could you give it a
> look now that it's better explained and see if this is helpful?
This looks helpful.
> 
> Alternatively, there could be something especially hard about the
> GFF file in particular you are using. If you are still having issues
> and could pass along the code and file you are parsing, I can take
> a deeper look.
For my purposes the Python csv module is doing the job. I would prefer 
to use a proper GFF parser, but for the moment your parser is taking 100 
seconds to parse a 40 MB file and the csv reader is doing it in about 10 
seconds. Do you think this is reasonable, or do you want to take a closer 
look?
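
To make the comparison concrete, here is a minimal sketch of what reading
GFF with the csv module can look like. It is illustrative only and not
necessarily the exact code behind the timings above: the function name
read_gff_rows, the GFF_COLUMNS field names and the in-memory example line
are assumptions made for the sketch. It treats the file as nine
tab-separated columns and skips '#' lines.

```python
import csv
import io

# Column names are an assumption for this sketch, following the usual
# nine-column GFF layout.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]


def read_gff_rows(handle):
    """Yield one dict per feature line of a tab-delimited GFF handle."""
    # Drop blank lines and '#' comment/directive lines before csv sees them.
    data_lines = (line for line in handle
                  if line.strip() and not line.startswith("#"))
    # QUOTE_NONE: GFF has no CSV-style quoting, so quote characters are literal.
    reader = csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(GFF_COLUMNS):
            continue  # this sketch simply ignores malformed lines
        record = dict(zip(GFF_COLUMNS, row))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        yield record


# Hypothetical two-line input standing in for a real file.
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\texample\tgene\t100\t200\t.\t+\t.\tID=gene1\n"
)
rows = list(read_gff_rows(example))
```

Note that a reader like this leaves the attributes column as a raw string
and builds no parent/child feature structure, so it does less work per
line than a full GFF parser, which may account for part of the gap.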

> 
> Thanks for the feedback. It's really helpful and we are currently trying
> to work through use cases and designing an API for accessing GFF in the
> most intuitive way.
Thank you too for the quick response.

John.



