[Biopython-dev] [BioPython] blast text vs XML, was: Need help parsing Blastoutput

Peter (BioPython-dev) biopython-dev at maubp.freeserve.co.uk
Thu Apr 20 12:34:25 UTC 2006


Peter wrote:
>>According to my notes, I was getting lists for the following with the 
>>plain text output, which are now integers using the XML parser:
>> 
>>hsp.gaps
>>hsp.positives
>>hsp.identities

Thanks for confirming that.

Michiel De Hoon wrote:
> Actually, I like the XML parser output a bit better, but we can change it to
> the text parser's output if preferred.

I agree that the XML parser output is much simpler.  However, my gut 
instinct is to preserve the old behaviour so that anyone with an old 
script can simply swap the parser from plain text to XML and have 
everything else "just work".
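
Without that, an old script has to cope with both shapes itself.  A 
rough sketch of such a shim, assuming the plain text parser gives a 
list or tuple whose first element is the count and the XML parser gives 
a bare integer (the helper name as_count is made up for illustration and 
is not part of Biopython):

```python
def as_count(value):
    """Return an integer count from either parser's representation.

    Assumption: the plain text parser yields a list/tuple whose first
    element is the count, while the XML parser yields a bare integer.
    """
    if isinstance(value, (list, tuple)):
        first = value[0]
        # A missing value is passed through unchanged rather than guessed at.
        return None if first is None else int(first)
    return int(value)
```

A script could then write as_count(hsp.identities) and not care which 
parser produced the record.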

> Do you know of any other inconsistencies between the parsers?

No - but unless someone sits down with a pair of matched files (the same 
search saved as plain text and as XML) and compares the resulting data 
structures, we don't know for sure.
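
That comparison could be scripted rather than done by eye.  A minimal 
sketch, using a stand-in class and made-up values in place of real 
parser output (diff_attributes and FakeHSP are hypothetical names, and 
the attribute list would need to cover everything the parsers set):

```python
def diff_attributes(a, b, names):
    """Yield (name, value_a, value_b) for each attribute that differs.

    A difference in type counts too, so (45, 50) vs 45 is reported.
    """
    for name in names:
        va = getattr(a, name, None)
        vb = getattr(b, name, None)
        if type(va) is not type(vb) or va != vb:
            yield name, va, vb


class FakeHSP:
    """Stand-in for a parsed HSP; real objects would come from the parsers."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


# Illustrative values only, not taken from a real BLAST run.
text_hsp = FakeHSP(identities=(45, 50), positives=(47, 50), score=100.0)
xml_hsp = FakeHSP(identities=45, positives=47, score=100.0)

differences = list(
    diff_attributes(text_hsp, xml_hsp, ["identities", "positives", "score"])
)
```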

> If not, I suggest raising a deprecation warning with the text-based Blast
> parser, so users won't waste time trying to figure out why it doesn't work.

Not a bad idea.
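
Something along these lines would do, using the standard warnings 
module (parse_text_output is a placeholder name standing in for the 
text parser's entry point, not the real function):

```python
import warnings


def parse_text_output(handle):
    """Hypothetical entry point standing in for the plain text parser."""
    warnings.warn(
        "The plain text BLAST parser is deprecated; "
        "please switch to XML output and the XML parser.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's line
    )
    return handle.read()  # placeholder for the real parsing work
```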

In addition, it would be nice if the text parser could also check the 
first line to see if it's actually XML output and issue a helpful error 
message.

Or maybe even handle this transparently for the user with just a warning 
message?
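
A sketch of the first-line check, assuming XML output begins with an 
XML declaration ("<?xml").  The function names are invented for 
illustration, and the return value just names which parser to use 
rather than calling the real ones:

```python
import warnings


def looks_like_xml(text):
    """Return True if the first non-blank line starts with an XML declaration."""
    for line in text.splitlines():
        if line.strip():
            return line.lstrip().startswith("<?xml")
    return False  # empty input is not treated as XML


def choose_parser(text):
    """Return "xml" or "text", warning when XML was handed to the text parser."""
    if looks_like_xml(text):
        warnings.warn("Input looks like BLAST XML output; using the XML parser.")
        return "xml"
    return "text"
```

Raising an exception instead of calling warnings.warn would give the 
"helpful error message" variant.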

At some point we should also make the blast commands in 
Bio/Blast/NCBIStandalone.py default to XML output (as I did with the 
rpsblast support, using the -m 7 command line option).
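
To illustrate the idea only (this is not the signature used in 
NCBIStandalone.py; blastall_args and its xml parameter are made up 
here), defaulting to XML amounts to adding -m 7 unless the caller opts 
out:

```python
def blastall_args(program, database, infile, xml=True):
    """Build an argument list for the blastall command line.

    Hypothetical helper: XML output (-m 7) is on by default and the
    caller passes xml=False to get blastall's own default format.
    """
    args = ["blastall", "-p", program, "-d", database, "-i", infile]
    if xml:
        args += ["-m", "7"]  # -m 7 requests XML output
    return args
```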

Peter



