[BioPython] XML Parser problem

Christof Winter winter at biotec.tu-dresden.de
Mon Dec 11 15:44:24 UTC 2006


Dear Alper:

The error you get is probably caused by XML output that is not well-formed (several XML 
documents concatenated in one file), which older versions of BLAST produce. On my Debian 
Linux system, blastall 2.2.10 produces such XML files, whereas blastall 2.2.13 no longer does.
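
To check whether a given output file is affected, one can count how many XML declarations it contains. The following is only a minimal sketch using plain Python: the function name count_xml_declarations is made up for illustration, and it assumes that every embedded document starts with "<?xml" at the beginning of a line, as in the Firefox error quoted below.

```python
def count_xml_declarations(lines):
    """Count the lines that open a new XML document.

    Assumption: each document begins with an '<?xml' declaration
    at the very start of a line.
    """
    return sum(1 for line in lines if line.startswith("<?xml"))


# Example usage (the file name is only a placeholder):
# with open("combinedblastfile.xml") as handle:
#     print(count_xml_declarations(handle))
```

A count greater than 1 means the file holds more than one XML document, so a parser that expects a single document will stop at the second declaration.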

A workaround was included in the NCBIStandalone Iterator class by Michael Anthony Maibaum:
http://portal.open-bio.org/pipermail/biopython/2006-January/002889.html

The following code should work:

from Bio.Blast import NCBIXML, NCBIStandalone

blast_results = open(filename)
iterator = NCBIStandalone.Iterator(blast_results, NCBIXML.BlastParser())

for record in iterator:
    # do something with each record here
    pass

blast_results.close()


The tutorial at http://www.biopython.org/DIST/docs/tutorial/Tutorial.pdf (page 21) still 
lists code that uses b_record = b_parser.parse(blast_out), which raises this error when the 
file consists of several XML documents.
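
To illustrate why a single parse call fails and what handling the documents one at a time looks like, here is a rough sketch that uses only the Python standard library. It is not the Biopython implementation: split_xml_documents is a hypothetical helper, and it assumes that each document starts with "<?xml" at the beginning of a line.

```python
import xml.etree.ElementTree as ET


def split_xml_documents(lines):
    """Yield each concatenated XML document as a single string.

    Assumption: a new document starts at every line that begins
    with an '<?xml' declaration.
    """
    current = []
    for line in lines:
        if line.startswith("<?xml") and current:
            # A new declaration closes the document collected so far.
            yield "".join(current)
            current = []
        # Skip blank lines that appear before a document has started.
        if current or line.strip():
            current.append(line)
    if current:
        yield "".join(current)


# Example usage (the file name is only a placeholder):
# with open("combinedblastfile.xml") as handle:
#     for document in split_xml_documents(handle):
#         root = ET.fromstring(document)  # each piece parses on its own
#         print(root.tag)
```

Each yielded string is a complete document, so it can be handed to any XML parser separately instead of feeding the whole file in one go.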

Hope that helps,
cheers,

Christof


alper soyler wrote:
> Dear all,
> 
> I ran blastall with option -m7 to save the results as XML. However, when I opened the XML file with Firefox, it gave the following error message:
> 
> XML Parsing Error: junk after document element
> Location: file:///home/alper/Desktop/genes/combinedblastfile.xml
> Line Number 38, Column 1:
> <?xml version="1.0"?>
> ^
> But it can be opened with a text editor. When I tried to parse the results with Biopython, it also gave the errors below. I do not understand the reason. I would be very glad if you could help me. Thank you in advance.
> Traceback (most recent call last):
>   File "XMLBlastParser.py", line 13, in ?
>     b_record = b_parser.parse(blast_out)
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
>     self._parser.parse(handler)
>   File "/usr/lib/python2.4/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "/usr/lib/python2.4/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse
>     self.feed(buffer)
>   File "/usr/lib/python2.4/site-packages/_xmlplus/sax/expatreader.py", line 220, in feed
>     self._err_handler.fatalError(exc)
>   File "/usr/lib/python2.4/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
>     raise exception
> xml.sax._exceptions.SAXParseException: /home/alper/Desktop/genes/combinedblastfile.xml:38:0: junk after document element
> 
> 
> Alper Soyler
> Dept. of Food Engineering
> Middle East Technical University,Turkey
> Tel:+90312 2105625
> Fax:+90312 2102767
> http://www.metu.edu.tr/~soyler


-- 
Christof Winter
Bioinformatics Group
TU Dresden
Tatzberg 47-51
01307 Dresden, Germany


