[Biopython-dev] Planning for Biopython 1.54

Peter Cock p.j.a.cock at googlemail.com
Thu Mar 11 17:31:08 UTC 2010


On Thu, Mar 11, 2010 at 5:11 PM, Andrea Pierleoni
<andrea at biocomp.unibo.it> wrote:
> What about the UniProt XML format parser?
> The code is functional, and was reviewed, but it would be nice to have some
> beta testing.
> The only remaining "issue" is where to save the comment fields.
> The actual implementation will work for biosql schema, and store most
> of the data in the comment fields.
>
> Andrea

Hi Andrea,

Your UniProt XML parser was one of the things I thought we should
delay until after getting Biopython 1.54 out the door, but I would
expect it to be included in Biopython 1.55.

There are at least two remaining issues, (1) where to save the comment
fields, and (2) what to call the format in SeqIO. Both of these should
ideally be run by BioPerl and EMBOSS on the openbio-l mailing list to
ensure the OBF projects which use simple strings for file formats are
consistent. Would you like me to start a discussion there regarding
the format name? e.g. Should it be "uniprot", "uniprot-xml", or maybe
even "uniprotxml". Personally, "uniprot" seems fine provided this is
going to be the primary file format for UniProt records in the short
to medium term.

Also I don't think any of the current Biopython developers have sat
down to review the code. As the Bio.SeqIO maintainer, I will do this,
but right now I think getting Biopython 1.54 out should be
prioritised. From a very quick look just now, the recent merging of
the SFF support to the trunk will require a few tweaks in
test_SeqIO.py (e.g. an empty file is not a valid SFF file, nor is it
valid UniProt XML). Also including a UniProt XML file in
test_BioSQL_SeqIO.py would be worthwhile.

Regards,

Peter


