[Biopython-dev] Uniprot XML parser on TrEmbl
biopython at maubp.freeserve.co.uk
Wed Nov 24 13:03:03 EST 2010
I *think* I have fixed the problem with empty names in the UniProt XML
format, without affecting the unit tests, but I don't have the 62GB free to
unpack uniprot_trembl.xml.gz to try it out.
Would you be able to retest the trunk code on that please?
I also changed the handling of the organism host (where present)
in both the UniProt and SwissProt parsers to be more consistent.
I've checked uniprot_sprot.dat still parses, but haven't tried the
much bigger uniprot_trembl.dat from uniprot_trembl.dat.gz - so
again, would you be able to retest the "swiss" text parser too?
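For anyone else short on disk space, here is a rough sketch of how both files might be retested without unpacking them, by streaming the gzipped file straight into the parser. The helper names count_records and retest are made up for illustration and are not part of Biopython. I am assuming that the "uniprot-xml" parser accepts a binary handle and the "swiss" parser a text handle, which may differ between Biopython and Python versions.

```python
import gzip


def count_records(path, parse, fmt, mode="rt"):
    """Stream a gzipped file through a parser and count the records.

    `parse` is any callable taking (handle, format) and returning an
    iterator of records, for example Bio.SeqIO.parse.
    """
    count = 0
    # gzip.open decompresses on the fly, so the unpacked file never
    # needs to exist on disk.
    with gzip.open(path, mode) as handle:
        for _record in parse(handle, fmt):
            count += 1
    return count


def retest(xml_gz="uniprot_trembl.xml.gz", dat_gz="uniprot_trembl.dat.gz"):
    """Hypothetical driver using the file names mentioned in this thread."""
    from Bio import SeqIO  # requires Biopython to be installed

    # Assumption: the XML parser is given a binary handle, the text
    # ("swiss") parser a text handle.
    print("uniprot-xml:", count_records(xml_gz, SeqIO.parse, "uniprot-xml", mode="rb"))
    print("swiss:", count_records(dat_gz, SeqIO.parse, "swiss"))
```

Any record the parser cannot handle should raise an exception part way through the loop, so the traceback plus the count reached so far would help locate the offending entry.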
P.S. Did you get any reply from UniProt about the apparent error in
the Q2LEH1 record within uniprot_trembl.xml.gz?