[Biopython-dev] [BioPython] blast text vs XML, was: Need help parsing Blastoutput
Peter (BioPython-dev)
biopython-dev at maubp.freeserve.co.uk
Thu Apr 20 12:34:25 UTC 2006
Peter wrote:
>>According to my notes, I was getting lists for the following with the
>>plain text output, which are now integers using the XML parser:
>>
>>hsp.gaps
>>hsp.positives
>>hsp.identities
Thanks for confirming that.
Michiel De Hoon wrote:
> Actually, I like the XML parser output a bit better, but we can change it to
> the text parser's output if preferred.
I agree that the XML parser output is much simpler. However, my gut
instinct is to preserve the old behaviour so that anyone with an old
script can simply swap the parser from plain text to XML and have
everything else "just work".
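
As a rough illustration of the compatibility problem, an old script could be made tolerant of both forms with a small helper. This is only a sketch: the name `as_count` is hypothetical, and it assumes that when the value is a list or tuple the count is its first element, which this thread does not confirm.

```python
def as_count(value):
    """Return an integer count from either an int or a list/tuple.

    Assumption (not confirmed in this thread): when the value is a
    sequence, the count of interest is its first element.
    """
    if isinstance(value, (list, tuple)):
        return int(value[0])
    return int(value)
```

A script would then use `as_count(hsp.identities)` rather than reading the attribute directly, whichever parser produced the record.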
> Do you know of any other inconsistencies between the parsers?
No - but unless someone sits down with a pair of matching output files
(plain text and XML from the same search) and compares the resulting
data structures, we don't know for sure.
> If not, I suggest raising a deprecation warning with the text-based Blast
> parser, so users won't waste time trying to figure out why it doesn't work.
Not a bad idea.
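
Something along these lines, using the standard `warnings` module. The function `parse_plain_text` is a hypothetical stand-in rather than the real parser entry point, and the message wording is only a suggestion.

```python
import warnings


def parse_plain_text(handle):
    """Hypothetical stand-in for the plain-text BLAST parser entry point."""
    warnings.warn(
        "The plain-text BLAST parser is deprecated; "
        "please consider using the XML parser instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller's line
    )
    # Real parsing would happen here; this sketch just returns the raw text.
    return handle.read()
```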
In addition, it would be nice if the text parser could also check the
first line to see if it's actually XML output and issue a helpful error
message.
Or maybe even handle this transparently for the user with just a warning
message?
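
A minimal sketch of that check, assuming the XML output begins with an `<?xml` declaration and that the handle is seekable so it can be rewound after peeking. The names `looks_like_xml` and `choose_parser` are hypothetical, not existing Biopython functions.

```python
import warnings


def looks_like_xml(first_line):
    """Return True if the first line looks like an XML declaration."""
    return first_line.lstrip().startswith("<?xml")


def choose_parser(handle):
    """Peek at the first line and return "xml" or "text".

    Assumption: the handle supports seek(0), so it can be rewound
    before the chosen parser reads it.
    """
    first_line = handle.readline()
    handle.seek(0)
    if looks_like_xml(first_line):
        warnings.warn(
            "Input looks like BLAST XML output; using the XML parser instead."
        )
        return "xml"
    return "text"
```

The caller would then dispatch to whichever parser the returned string names, so the user only sees a warning instead of a confusing parse failure.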
At some point we should also change the default parameters for the blast
commands in Bio/Blast/NCBIStandalone.py to default to XML output (as I
did with the rpsblast support, using the -m 7 command line option).
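
To illustrate what defaulting to XML might look like, here is a sketch that builds an argument list for the legacy `blastall` tool. The helper `build_blastall_args` is hypothetical and is not the NCBIStandalone API; only the `-m 7` option comes from the note above, while `-p`, `-d` and `-i` are the usual blastall program, database and query flags.

```python
def build_blastall_args(program, database, query, align_view=7):
    """Build a blastall argument list (hypothetical helper).

    align_view defaults to 7, i.e. XML output via the -m option.
    """
    return [
        "blastall",
        "-p", program,          # e.g. "blastn" or "blastp"
        "-d", database,         # database name
        "-i", query,            # query file
        "-m", str(align_view),  # alignment view; 7 requests XML
    ]
```

Callers who still want plain text could pass `align_view=0` explicitly.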
Peter