[Biopython-dev] [BioPython] Need help parsing Blastoutput
Peter (BioPython Dev)
biopython-dev at maubp.freeserve.co.uk
Wed Apr 19 18:46:10 UTC 2006
Michiel De Hoon wrote:
> Peter wrote:
>
>>Have you noticed that there are some slight differences between the XML
>>parser and the text parser results (single values versus lists with one
>>entry)?
>>
>>i.e. As it stands, the XML parser is not quite a drop in replacement for
>>existing code.
>
>
> No, I was not aware of that. Can you give an example where the two parsers
> give a different result?
>
> --Michiel.
As I recall, it wasn't different data, just a slightly different format...
I've just been trying to get a matched pair of both plain text and XML
output to demonstrate this.
The online qblast "Text" output appears to be slightly different from what
the current parser is expecting. For standalone blast I only have RPS-BLAST
databases on my local machine, and the text output from RPS-BLAST is
very different and cannot be parsed by the current Standalone Blast parser.
If anyone has a matched set of Blast output files which BioPython can
parse, i.e. the same data in both the XML and plain text formats (maybe
blastp or blastn output?), it would be great if they could email it to me.
I might even turn it into a short addition to the test suite.
According to my notes, I was getting lists for the following with the
plain text output, which are now integers using the XML parser:
hsp.gaps
hsp.positives
hsp.identities
The list behaviour may have been my own fault, as that code was written
to use my modified standalone NCBI parser for use with RPS-BLAST...
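Until the two parsers agree, calling code could cope with either form by
normalising the three attributes before use. The following is only a rough
sketch under my own assumptions: as_scalar, normalise_hsp and FakeHSP are
hypothetical names invented for illustration (they are not part of
BioPython), and it assumes the list form always holds exactly one entry,
as described above.

```python
def as_scalar(value):
    """Return value unchanged, or its only element if it is a one-entry list/tuple."""
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            # More than one entry means the assumption above is wrong,
            # so fail loudly rather than silently dropping data.
            raise ValueError("expected exactly one entry, got %d" % len(value))
        return value[0]
    return value


def normalise_hsp(hsp, names=("gaps", "positives", "identities")):
    """Convert the named attributes of an HSP-like object to plain values in place."""
    for name in names:
        setattr(hsp, name, as_scalar(getattr(hsp, name)))
    return hsp


class FakeHSP(object):
    """Hypothetical stand-in for a parsed HSP; NOT the BioPython class."""

    def __init__(self, gaps, positives, identities):
        self.gaps = gaps
        self.positives = positives
        self.identities = identities


# Made-up numbers, purely to show the two shapes being reconciled.
text_style = normalise_hsp(FakeHSP([2], [40], [35]))  # lists with one entry
xml_style = normalise_hsp(FakeHSP(2, 40, 35))         # plain integers
```

After normalising, both objects carry plain integers, so downstream code
need not care which parser produced them.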
Peter