[Biopython] Legacy blastn XML outfile parsing is slow. What XML parser is actually used?

Martin Mokrejs mmokrejs at fold.natur.cuni.cz
Tue Sep 25 11:26:46 UTC 2012


Peter Cock wrote:

> Currently the HSP object in SearchIO uses hit_start,
> hit_end, query_start and query_end - but also note
> that we're using Python counting.

Ah, thanks for the reminder. Yes, this is exactly why I wasn't very happy to re-implement
my code right now to use SearchIO, but I forgot to say so. I have already fixed all the
off-by-one issues in my code, so that it uses zero-based counting in some places and
1-based counting in others (where a human is reading the output text files/tables). These
adjustments are scattered throughout the program (I think), and this will probably be the
major stopper for me. ;) Things might break for me all over the place.
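
For illustration only, the kind of conversion being discussed can be sketched as below. It
assumes "Python counting" means a zero-based, half-open range (as used for slicing) and that
the human-readable tables use 1-based, inclusive coordinates. The helper names are
hypothetical and are not part of Biopython.

```python
def to_one_based(start, end):
    """Convert a zero-based, half-open range to a 1-based, inclusive range.

    Only the start shifts: positions [0, 10) in Python counting are
    positions 1..10 when a human counts from one.
    """
    return start + 1, end


def to_zero_based(start, end):
    """Convert a 1-based, inclusive range to a zero-based, half-open range."""
    return start - 1, end


# A 10-residue match at the very beginning of a sequence:
py_start, py_end = to_zero_based(1, 10)
length = py_end - py_start  # half-open ranges give the length directly
```

Keeping such conversions in one pair of functions, rather than scattered `+ 1` / `- 1`
tweaks, would make a later switch of coordinate convention less risky.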

I am not saying this is a good idea, but really, having NCBIXML call cElementTree
internally would be more appealing to me (instead of the current Python-based expat parser
calls).
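
As a rough sketch of what ElementTree-based reading of BLAST XML could look like, the
snippet below streams the file with `iterparse` and yields one tuple per HSP. This is not
how NCBIXML is implemented; `iter_hsps` and the `SAMPLE` fragment are made up for
illustration, and only a handful of the standard BLAST XML tags are read. On Python 2 the
import would be `xml.etree.cElementTree`; on Python 3, `xml.etree.ElementTree` uses the C
accelerator automatically when it is available.

```python
import io
import xml.etree.ElementTree as ET


def iter_hsps(handle):
    """Yield (hit_id, query_from, query_to, hit_from, hit_to) for each HSP.

    Coordinates are returned exactly as stored in the XML
    (1-based, inclusive); no conversion to Python counting is done.
    """
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag != "Hit":
            continue
        hit_id = elem.findtext("Hit_id")
        for hsp in elem.iter("Hsp"):
            yield (
                hit_id,
                int(hsp.findtext("Hsp_query-from")),
                int(hsp.findtext("Hsp_query-to")),
                int(hsp.findtext("Hsp_hit-from")),
                int(hsp.findtext("Hsp_hit-to")),
            )
        # Drop the children of the finished <Hit> to keep memory use low.
        elem.clear()


# Hypothetical, heavily trimmed fragment; a real BLAST XML file has many more fields.
SAMPLE = b"""<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>hit1</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>50</Hsp_query-to>
              <Hsp_hit-from>101</Hsp_hit-from>
              <Hsp_hit-to>150</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

hsps = list(iter_hsps(io.BytesIO(SAMPLE)))
```

Whether this is actually faster than the existing parser on a given file would need to be
measured; the sketch only shows the shape of the API.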


