[BioPython] NCBIXML for multiple queries

Michael Anthony Maibaum mike at maibaum.org
Thu Jan 19 06:59:07 EST 2006


On 16 Jan 2006, at 21:08, Michael Anthony Maibaum wrote:

>
> On 16 Jan 2006, at 15:50, David Weisman wrote:
>
>> Hello,
>>
>> I tried using NCBIXML parsing on a local blast run, in which the input
>> had multiple query sequences.  Blastall writes multiple xml documents to
>> the output file, and the SAX parser threw a SAXParseException on the
>> second <?xml...> declaration, complaining of junk after the document
>> element.

--snip--

> I've been meaning to check whether this is fixed in CVS, and to file a
> bug if not, but haven't got around to it yet.


FWIW, I tried to file a bug with a patch, but Bugzilla appears to have
taken a dislike to me. Hopefully someone with CVS access can have a look
at the patch I sent to biopython-dev. In the meantime, in case anyone
else wants it, I've attached the patch to this message.


NCBIStandalone chunks multiple searches based on the string 'BLAST',
which works fine for text output but doesn't work for XML. The attached
patch adds '<?xml ' as an extra option for chunking the output. I've
tested this on a fair amount of blastpgp output but not on output from
any other program, although I don't know of a reason why it wouldn't
work.

If users are encouraged to use the XML output mode, it may be better to
put the '<?xml ' string first rather than last in the sequence of
options.
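
For anyone who wants to work around this without patching, here is a
rough sketch of the same chunking idea done outside Biopython. It is not
the attached patch and does not use NCBIStandalone; the function name
split_xml_documents and the tiny sample documents are made up for
illustration, and it assumes each '<?xml ' declaration starts at the
beginning of a line. Each chunk is parsed with the standard library's
ElementTree purely to show that the pieces are well-formed on their own.

```python
import io
import xml.etree.ElementTree as ET

XML_MARKER = "<?xml "  # same string the patch uses to detect a new document


def split_xml_documents(handle, marker=XML_MARKER):
    """Yield each XML document from a handle holding several concatenated ones.

    Hypothetical helper: assumes every declaration begins a line.
    """
    chunk = []
    for line in handle:
        # A new declaration means the previous document is complete.
        if line.startswith(marker) and chunk:
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    if chunk:
        yield "".join(chunk)


# Two stand-in documents, NOT real BLAST output.
sample = (
    '<?xml version="1.0"?>\n<BlastOutput><query>q1</query></BlastOutput>\n'
    '<?xml version="1.0"?>\n<BlastOutput><query>q2</query></BlastOutput>\n'
)

# Parse each chunk separately instead of feeding the whole file to one parser.
queries = [
    ET.fromstring(doc).findtext("query")
    for doc in split_xml_documents(io.StringIO(sample))
]
```

With real output you would pass the open results file instead of the
StringIO object and hand each chunk to whichever parser you are using.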


-------------- next part --------------
A non-text attachment was scrubbed...
Name: NCBIStandalone.patch
Type: application/octet-stream
Size: 572 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython/attachments/20060119/e7f5c630/NCBIStandalone.obj

