[BioPython] Problem parsing Blast XML output from different sources
Steffi Gebauer-Jung
gebauer-jung at ice.mpg.de
Thu Oct 5 10:30:36 UTC 2006
Hello,
Because the output of blastall 2.2.14 could not be parsed by the
Bio.Blast.NCBIStandalone parser,
I tried to switch to the recommended Bio.Blast.NCBIXML parser.
In doing so I found that the XML output of the locally installed
standalone blastall (2.2.14)
differs from the web XML output.
For BlastN HSPs on Plus/Minus strands, the XML gives
query_frame/hit_frame 1 / -1 as usual.
But the query and hit positions and sequences are switched in direction
(which would match frames -1/1).
As the Bio.Blast.Record returned by the NCBIXML parser only gives
frames, sequences and start positions, it is not possible
(without knowing the source of the XML file)
to be sure of finding the right data.
This is clearly a problem in Blast itself.
But because the end positions are missing from the returned record
object, it becomes a problem for users of the parser too.
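
As a possible workaround, the start and end positions can be read straight
from the XML file, bypassing the parser. The following is only a minimal
sketch using Python's standard xml.etree.ElementTree. It assumes the usual
BLAST XML element names (Hsp_query-from, Hsp_query-to, Hsp_hit-from,
Hsp_hit-to, Hsp_query-frame, Hsp_hit-frame). The function names
hsp_coordinates and runs_backwards are hypothetical helpers, and the sample
document is made up for illustration; it is not real blastall or web output.

```python
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only; not real blastall or web output.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>20</Hsp_query-to>
              <Hsp_hit-from>120</Hsp_hit-from>
              <Hsp_hit-to>101</Hsp_hit-to>
              <Hsp_query-frame>1</Hsp_query-frame>
              <Hsp_hit-frame>-1</Hsp_hit-frame>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

# Suffixes of the <Hsp_...> child elements we want to inspect.
FIELDS = ("query-from", "query-to", "hit-from", "hit-to",
          "query-frame", "hit-frame")


def hsp_coordinates(xml_text):
    """Return one dict per <Hsp> with positions and frames as ints.

    A field that is absent from the XML is stored as None.
    """
    root = ET.fromstring(xml_text)
    results = []
    for hsp in root.iter("Hsp"):
        row = {}
        for name in FIELDS:
            text = hsp.findtext("Hsp_" + name)
            row[name] = int(text) if text is not None else None
        results.append(row)
    return results


def runs_backwards(start, end):
    """True if the coordinates are listed in descending order."""
    return start > end


for hsp in hsp_coordinates(SAMPLE_XML):
    print(hsp)
    print("query listed in reverse:",
          runs_backwards(hsp["query-from"], hsp["query-to"]))
    print("hit listed in reverse:",
          runs_backwards(hsp["hit-from"], hsp["hit-to"]))
```

Running this on one XML file from the standalone blastall and one from the
web (read the file contents and pass them to hsp_coordinates) should show
which of query or hit is listed in reverse in each source.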
Could somebody try to confirm this difference in the Blast XML output
with his/her own examples/installation?
Thanks, Steffi