[Biopython-dev] Bio.GFF and Brad's code

Brad Chapman chapmanb at 50mail.com
Wed Dec 2 12:57:44 UTC 2009


Hi Peter;

> Brad has some GFF parsing code he has been working on, which
> would be nice to merge into Biopython at some point. See:
> 
> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005700.html
> 
> As we started to discuss earlier this year, we need to think about
> what to do with the existing (old) Bio.GFF module. It was written
> by Michael Hoffman back in 2002 and accesses MySQL General
> Feature Format (GFF) databases created with BioPerl.
> 
> I've been looking at the old Bio.GFF code, and there are a lot of
> redundant things like its own GenBank/EMBL location parsing,
> plus its own location objects and its own Feature objects (rather
> than reusing Bio.SeqFeature which should have sufficed).

I'm ambivalent on deprecating GFF. Agreed that some of it is not
well integrated with the rest of Biopython, with the
Location/LocationFromString code being the most duplicated. It's too
bad features were reimplemented as well. Is Michael around at all?

> I want to suggest we deprecate Michael Hoffman's Bio.GFF module
> in Biopython 1.53 (I'm hoping we can do this next month, Dec 2009).
> Depending on how soon Brad's code is ready to be merged (which I
> am assuming could be Biopython 1.54, spring 2010), we can perhaps
> accelerate removal of the old module.

The current structure of the GFF code does not require removing what
is currently there. It needs a couple of lines in __init__.py to
expose the useful classes at the top level:

from GFFParser import GFFParser, DiscoGFFParser, GFFExaminer
from GFFOutput import GFF3Writer

and we'd need to move the MySQLdb check to the Connection class so
it's only needed if you are actually using the database code.

So these can happen in parallel. Ideally, I'd like to get the GFF
stuff in sooner rather than later. The main item on my todo list is
finishing the documentation, with the stubs here:

http://biopython.org/wiki/GFF_Parsing

If I crank that out what do we think about putting it in with the
__init__.py modifications I suggested?

Brad


