[Biopython-dev] [BioPython] Need help parsing Blastoutput

Peter (BioPython Dev) biopython-dev at maubp.freeserve.co.uk
Wed Apr 19 14:46:10 EDT 2006


Michiel De Hoon wrote:
> Peter wrote:
> 
>>Have you noticed that there are some slight differences between the XML 
>>parser and the text parser results (single values versus lists with one 
>>entry)?
>>
>>i.e. As it stands, the XML parser is not quite a drop in replacement for 
>>existing code.
> 
> 
> No, I was not aware of that. Can you give an example where the two parsers
> give a different result?
> 
> --Michiel.

As I recall, it wasn't different data, just a slightly different format...

I've just been trying to get a matched pair of both plain text and XML 
output to demonstrate this.

The online qblast "Text" output appears to differ slightly from what the 
current parser expects.  For standalone blast I only have RPS-BLAST 
databases on my local machine, and the text output from RPS-BLAST is 
very different and cannot be parsed by the current Standalone Blast parser.

If anyone has a matched set of Blast output files which BioPython can 
parse and could email them to me, that would be great.  I might even 
turn them into a short addition to the test suite.

i.e. same data, in both the XML and plain text formats.  Maybe blastp or 
blastn output?

According to my notes, I was getting lists for the following with the 
plain text output, which are now integers using the XML parser:

hsp.gaps
hsp.positives
hsp.identities

The list behaviour may have been my own fault, as that code was written 
to use my modified standalone NCBI parser for use with RPS-BLAST...
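
For anyone whose code has to cope with both shapes in the meantime, 
here is a minimal sketch of a workaround.  It assumes the only 
difference is a one-entry list versus a bare value, as described above. 
The names as_scalar, normalise_hsp and FakeHSP are hypothetical and 
made up for illustration; none of them are part of BioPython.

```python
def as_scalar(value):
    """Return value unchanged, or its only entry if it is a one-item list/tuple."""
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            # Anything other than exactly one entry is not the case described here.
            raise ValueError("expected exactly one entry, got %d" % len(value))
        return value[0]
    return value


class FakeHSP(object):
    """Hypothetical stand-in for an HSP record (NOT the BioPython class)."""

    def __init__(self, identities, positives, gaps):
        self.identities = identities
        self.positives = positives
        self.gaps = gaps


def normalise_hsp(hsp):
    """Unwrap the three attributes in place so they are plain values."""
    for name in ("identities", "positives", "gaps"):
        setattr(hsp, name, as_scalar(getattr(hsp, name)))
    return hsp


# Usage: the same downstream code can then treat both parsers' results alike.
from_text = normalise_hsp(FakeHSP([45], [48], [2]))  # list-style values
from_xml = normalise_hsp(FakeHSP(45, 48, 2))         # integer-style values
```

Raising on anything other than exactly one entry is deliberate: if the 
two parsers differ in some other way, it is better to find out than to 
silently pick a value.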

Peter


