[BioPython] Problem parsing Blast XML output from different sources

Steffi Gebauer-Jung gebauer-jung at ice.mpg.de
Thu Oct 5 10:30:36 UTC 2006


Hello,

Because the output of blastall 2.2.14 could not be parsed by the
Bio.Blast.NCBIStandalone parser,
I tried switching to the recommended Bio.Blast.NCBIXML parser.

In doing so, I found that the XML output of the locally installed
standalone blastall (2.2.14)
differs from the web XML output.

For BlastN HSPs on Plus/Minus strands, the XML gives
query_frame/hit_frame 1 / -1 as usual.
But the query and hit positions and sequences are reversed in direction
(they would match frames -1/1).

Because the Bio.Blast.Record returned by the NCBIXML parser provides only
frames, sequences, and start positions,
it is not possible to be sure of picking the right data
without knowing the source of the XML file.
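
As a possible workaround, the end positions can be read from the raw XML
directly. The following is only a minimal sketch using the standard library
rather than the Biopython parser. The helper name hsp_coordinates and the
SAMPLE fragment are made up for illustration; the Hsp_* element names are
the ones used in the BLAST XML output.

```python
import xml.etree.ElementTree as ET

# Made-up minimal fragment for illustration only;
# real BLAST XML output contains many more elements.
SAMPLE = """<Hit_hsps>
  <Hsp>
    <Hsp_query-from>1</Hsp_query-from>
    <Hsp_query-to>20</Hsp_query-to>
    <Hsp_hit-from>120</Hsp_hit-from>
    <Hsp_hit-to>101</Hsp_hit-to>
    <Hsp_query-frame>1</Hsp_query-frame>
    <Hsp_hit-frame>-1</Hsp_hit-frame>
  </Hsp>
</Hit_hsps>"""

FIELDS = ("query-from", "query-to", "hit-from", "hit-to",
          "query-frame", "hit-frame")


def hsp_coordinates(xml_text):
    """Return one dict of integer positions and frames per <Hsp> element."""
    root = ET.fromstring(xml_text)
    results = []
    for hsp in root.iter("Hsp"):
        # int() raises TypeError if an expected element is missing.
        results.append(
            {name: int(hsp.findtext("Hsp_" + name)) for name in FIELDS}
        )
    return results


for coords in hsp_coordinates(SAMPLE):
    print(coords)
```

For a file on disk, ET.parse(path).getroot() could be used in place of
ET.fromstring.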

This is clearly a problem in Blast itself.
But because the returned record object lacks end positions,
it becomes a problem for users of the parser too.

Could somebody try to confirm this difference in the XML Blast
output
with his/her own examples/installation?
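
To make such a comparison easier, here is a rough sketch of a check that
could be run on the from/to positions and frames of each HSP. It assumes
that ascending positions should go with a positive frame and descending
positions with a negative frame, and that frames are nonzero (as for
BlastN); the function names are hypothetical and this has not been tested
against real output.

```python
def direction(start, end):
    """Return 1 if positions ascend, -1 if they descend, 0 if equal."""
    return (end > start) - (end < start)


def frames_agree(start, end, frame):
    """True if the direction of the positions has the same sign as the frame.

    Assumes a nonzero frame. A single-base HSP (start == end) has no
    direction, so it is reported as agreeing.
    """
    d = direction(start, end)
    return d == 0 or (d > 0) == (frame > 0)


# Positions that match frames 1 / -1:
print(frames_agree(1, 20, 1), frames_agree(120, 101, -1))
# Positions reversed in direction while the frames stay 1 / -1:
print(frames_agree(20, 1, 1), frames_agree(101, 120, -1))
```

An HSP for which both the query and the hit check fail would correspond to
the reversed case described above.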

Thanks, Steffi





