[Biopython] Legacy blastn XML outfile parsing is slow. What XML parser is actually used?
Martin Mokrejs
mmokrejs at fold.natur.cuni.cz
Tue Sep 25 07:26:46 EDT 2012
Peter Cock wrote:
> Currently the HSP object in SearchIO uses hit_start,
> hit_end, query_start and query_end - but also note
> that we're using Python counting.
Ah, thanks for the reminder. Yes, this is exactly why I wasn't very happy to re-implement
my code right now to use SearchIO, but I forgot to say so. I have already fixed all the
off-by-one tweaks in my code, which uses zero-based counting in some places and 1-based
counting in others (where a human reads the output text files/tables). These tweaks are
scattered throughout the program (I think), and this will probably be the major stopper for me.
;) Things might break for me all over the place.
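
To illustrate the off-by-one issue, here is a minimal sketch of converting between the two conventions. It assumes "Python counting" means zero-based, half-open intervals (as in slicing) and that the human-readable tables use 1-based, inclusive coordinates. The helper names are hypothetical and not part of Biopython.

```python
def to_one_based(start, end):
    """Convert a zero-based, half-open interval to 1-based, inclusive.

    Hypothetical helper: only the start shifts, the end value is unchanged.
    """
    return start + 1, end


def to_zero_based(start, end):
    """Convert a 1-based, inclusive interval to zero-based, half-open."""
    return start - 1, end


# The first ten residues: Python slice [0:10] corresponds to positions 1..10.
print(to_one_based(0, 10))
print(to_zero_based(1, 10))
```

Keeping the conversion in one pair of functions, rather than scattered `+ 1` / `- 1` tweaks, would make it easier to switch coordinate conventions later.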
I am not saying this is a good idea, but really, providing cElementTree calls from within
NCBIXML would be more appealing to me (instead of the current Python-based expat parser
calls).
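
As a rough sketch of what ElementTree-based parsing of legacy BLAST XML could look like (this is not how NCBIXML is implemented), the following uses `iterparse` from the standard library. In Python 3, `xml.etree.ElementTree` uses the C accelerator automatically; `cElementTree` was the Python 2 name. The function `iter_hsps` and the tiny sample document are made up for illustration; the tag names are the ones from the legacy BLAST XML output.

```python
import io
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed sample in the legacy BLAST XML layout.
SAMPLE = b"""<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>gi|1</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>50</Hsp_query-to>
              <Hsp_hit-from>101</Hsp_hit-from>
              <Hsp_hit-to>150</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def iter_hsps(handle):
    """Yield (hit_id, query_from, query_to, hit_from, hit_to) for each HSP.

    Hypothetical helper. Coordinates are returned exactly as written in the
    XML file (1-based), with no conversion to Python counting.
    """
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == "Hit":
            hit_id = elem.findtext("Hit_id")
            for hsp in elem.iter("Hsp"):
                yield (
                    hit_id,
                    int(hsp.findtext("Hsp_query-from")),
                    int(hsp.findtext("Hsp_query-to")),
                    int(hsp.findtext("Hsp_hit-from")),
                    int(hsp.findtext("Hsp_hit-to")),
                )
            # Drop the finished <Hit> subtree so memory stays bounded.
            elem.clear()


for record in iter_hsps(io.BytesIO(SAMPLE)):
    print(record)
```

Clearing each `Hit` element once it has been processed is what keeps memory use flat on large output files; whether this is actually faster than the current parser would need to be measured.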