[Biopython] Problems parsing with PSIBlastParser
Peter
biopython at maubp.freeserve.co.uk
Tue Nov 3 08:32:55 EST 2009
On Tue, Nov 3, 2009 at 1:16 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> We had the same problem w/ the BioPerl XML parser and ended up preprocessing
> the data into separate XML files, carrying over the relevant information
> into each file (yes, there is a better way, but it essentially involves a
> redesign of the XML parser and related objects).
>
> BTW, the same thing happens if one runs multiple queries in the same file.
> All the individual XML reports are in one single XML file, and information
> relevant to all reports is found only in the first report. I think this
> has been known for a while. I've repeatedly tried contacting NCBI but
> haven't had a response re: this problem.
>
> chris
Hi Chris,
Old versions of blastall also used to produce concatenated XML files for
multiple queries, but from about 2.2.14 they started (ab)using the iteration
fields, originally meant for PSI-BLAST output, to hold multiple queries.
There was some discussion of this on Biopython bugs 1933 and 1970;
Biopython *should* cope with either.
Apparently blastpgp was left producing concatenated XML files.
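
For anyone who wants to try the preprocessing route Chris describes, here is
a minimal sketch of splitting a concatenated file into separate XML documents.
It assumes each document starts with its own "<?xml" declaration; the function
name is just for illustration, and it does not carry any shared information
over from the first report into the later ones.

```python
import re

# Assumption: every concatenated document begins with an XML declaration.
XML_DECL = re.compile(r"<\?xml\b")


def split_concatenated_xml(text):
    """Return a list of strings, one per XML document found in text."""
    starts = [m.start() for m in XML_DECL.finditer(text)]
    if not starts:
        # No declaration found: treat non-empty input as a single document.
        return [text] if text.strip() else []
    # Each document runs from its declaration to the next one (or to the end).
    # Anything before the first declaration is dropped.
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]
```

Each returned string could then be written to its own file or handed to an
XML parser one at a time.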
The upshot of this is that multi-query BLASTP etc. XML files look just like
single-query, multi-round PSI-BLAST XML files. This makes it tricky to have
a single BLAST XML parser that automatically treats the two differently.
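
To see what the iteration fields hold in a given file, something like the
following sketch lists each iteration's number and, where present, its query
description. It uses only the standard library; the tag names are taken from
the NCBI BLAST XML format, where Iteration_query-def is optional and may be
missing. The function name is made up for this example, and this only shows
the structure; it does not decide which kind of file you have.

```python
import xml.etree.ElementTree as ET


def summarize_iterations(xml_text):
    """Return (iter_num, query_def) for each Iteration in one BLAST XML document.

    query_def is None when the optional Iteration_query-def tag is absent.
    """
    root = ET.fromstring(xml_text)  # root element is expected to be <BlastOutput>
    summary = []
    for iteration in root.findall("./BlastOutput_iterations/Iteration"):
        iter_num = iteration.findtext("Iteration_iter-num")
        query_def = iteration.findtext("Iteration_query-def")
        summary.append((iter_num, query_def))
    return summary
```

Both a multi-query file and a multi-round PSI-BLAST file come back from this
as a list with more than one entry.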
Does that fit with your experience?
Peter