[Biopython-dev] Slight modification to BlastXML parser for AB-BLAST input

Peter Cock p.j.a.cock at googlemail.com
Thu Dec 13 17:09:59 UTC 2012


On Thu, Dec 13, 2012 at 4:14 PM, Wibowo Arindrarto <bow at bow.web.id> wrote:
>> Presently, SearchIO can't parse AB-BLAST XML output
>> for multiple queries as the AB-BLAST output is just a concatenation of
>> multiple single queries. Each query contains the <?xml version  ...> section
>> at the beginning and causes ElementTree to error during iteration. To get
>> around this I have been piping the AB-BLAST output and parsing it into a
>> more NCBI-BLAST form.
>
> Hmm... it is a problem if AB-BLAST concatenates outputs like that. It
> makes the XML invalid, though, so I'm not sure if we should change
> the parser to tolerate this. What are the other differences?

The older NCBI BLAST tools had this bug as well, and as a result
our NCBIXML parser has a hack to cope with it. It might be worth
applying the same kind of fix to the SearchIO BLAST XML parser if
it would help with both AB-BLAST and any older NCBI XML files.
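
To illustrate the general idea (this is only a rough sketch, not the
actual NCBIXML code), one way to cope with concatenated output is to
split the text at each XML declaration and parse the pieces one by
one. The names split_concatenated_xml and parse_all are made up for
this example, and it assumes the whole output fits in memory and that
"<?xml" only ever appears as a declaration, never inside the data:

```python
import re
import xml.etree.ElementTree as ET

# Matches the start of each XML declaration, e.g. <?xml version="1.0"?>
_XML_DECL = re.compile(r"<\?xml\s")


def split_concatenated_xml(text):
    """Yield each XML document found in a string of concatenated documents.

    Hypothetical helper for illustration; assumes "<?xml" never occurs
    inside the data itself (for example in a comment or CDATA section).
    """
    starts = [m.start() for m in _XML_DECL.finditer(text)]
    if not starts:
        # No declaration at all: treat the whole input as one document.
        if text.strip():
            yield text
        return
    # Each document runs from its declaration to the next one (or the end).
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        chunk = text[begin:end].strip()
        if chunk:
            yield chunk


def parse_all(text):
    """Parse every concatenated document and return the root elements."""
    return [ET.fromstring(doc) for doc in split_concatenated_xml(text)]
```

Each chunk is then a well-formed document on its own, so ElementTree
no longer sees a second declaration in the middle of the input. A real
parser would more likely do this lazily on a file handle rather than
reading everything into one string.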

Peter


