[Biopython-dev] New: Uniprot XML parser

Eric Talevich eric.talevich at gmail.com
Thu Jan 14 20:03:35 UTC 2010


On Thu, Jan 14, 2010 at 1:57 PM, Andrea Pierleoni
<andrea at biocomp.unibo.it> wrote:
> Hi Everyone,
> I've been using Biopython a lot over the last couple of years, and it has
> been very useful to me. So now it's my turn to contribute and be helpful to
> someone else.
> I wrote a parser for the UniProt XML format that is reasonably fast (8000
> entries/min on a mainstream Core 2 Duo PC). The main improvements over the
> current SwissProt flat file parser are deeper parsing of comment fields
> and a SeqRecord containing features.
>
> The parser is based on the ElementTree library and was successfully tested
> on the complete SwissProt database (v57.12). Thus I think it is ready to
> be released.

Have you tried using this with Python 2.4? The ElementTree module
wasn't added to the standard library until Python 2.5, so a simple
"from xml.etree import ElementTree" may need some additional
protection. It's also nice to let the user use a third-party
implementation of ElementTree if they're stuck on Py2.4.

An example of this is at the top of Bio.Phylo.PhyloXMLIO -- not
pretty, but functional:
http://github.com/biopython/biopython/blob/master/Bio/Phylo/PhyloXMLIO.py
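
For illustration, a guarded import along these lines would cover the cases
above. This is only a rough sketch, not the actual code in PhyloXMLIO: the
order of the fallbacks is my own choice, and the third-party module names
(cElementTree, elementtree) assume the standalone packages from effbot.org
are what a Py2.4 user would have installed.

```python
# Guarded import: try the standard library first, then third-party fallbacks.
try:
    # C implementation bundled with Python 2.5+ (absent on some later versions)
    from xml.etree import cElementTree as ElementTree
except ImportError:
    try:
        # Pure-Python implementation in the standard library (Python 2.5+)
        from xml.etree import ElementTree
    except ImportError:
        try:
            # Assumed: standalone third-party C implementation for Python 2.4
            import cElementTree as ElementTree
        except ImportError:
            try:
                # Assumed: standalone third-party pure-Python package
                from elementtree import ElementTree
            except ImportError:
                raise ImportError(
                    "No ElementTree implementation found; on Python 2.4, "
                    "install the third-party elementtree package."
                )

# Toy fragment for demonstration only; real UniProt XML is namespaced
# and far richer than this made-up entry.
root = ElementTree.fromstring("<entry><accession>P12345</accession></entry>")
accession = root.findtext("accession")
```

Whichever branch succeeds, the rest of the parser only ever refers to the
name ElementTree, so the fallback stays confined to the top of the module.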

-Eric


