[BioPython] BLAST XML problem?

Peter biopython at maubp.freeserve.co.uk
Wed Jan 11 13:39:46 EST 2006


>> It may be the NCBI's fault for producing a bad XML file...
> 
> Yeah, well, I still have to deal with it :(  In any case, why is this 
> cropping up now? Schäffer has been in NCBI for years...

My guess is that until now most BioPython users have parsed the plain text 
output from BLAST rather than the XML.

> The file is available at http://iddo-friedberg.org/biopy_bad_blast.xml
> 
> in case anyone wants to have a look-see.

The first line of the XML file could (should?) define an encoding, e.g.

<?xml version="1.0" encoding="utf-8"?>

or:

<?xml version="1.0" encoding="ISO-8859-1"?>

Instead it's just:

<?xml version="1.0"?>
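
To illustrate why the missing declaration matters: with no encoding given, an
XML parser assumes UTF-8, and a lone ISO-8859-1 byte such as 0xE4 (the "ä" in
Schäffer) is not valid UTF-8. Here is a minimal sketch using Python's standard
xml.etree.ElementTree rather than the BioPython parser. The fragment is made up
for illustration and is not copied from the actual file, and I am assuming the
file's bytes are ISO-8859-1:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "Schäffer" encoded as ISO-8859-1 (ä is the single byte 0xE4).
body = "<Hit_def>Sch\u00e4ffer</Hit_def>".encode("iso-8859-1")

def parses(document):
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

no_encoding = b'<?xml version="1.0"?>' + body
latin1 = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

print(parses(no_encoding))  # parser assumes UTF-8 and rejects byte 0xE4
print(parses(latin1))       # declared encoding matches the bytes
```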

Short-term solutions, all of which I have just tried and got to work:

(1) Edit the offending character by hand (as you did)
(2) Specify encoding="ISO-8859-1" by editing the first line by hand
(3) Convert the file to Unicode (doubles the size)
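
Options (2) and (3) could also be scripted instead of done by hand. This is
only a rough sketch, not anything in BioPython: the function names are made up,
it assumes the file really is ISO-8859-1 and starts with exactly the bare
declaration shown above, and for option (3) it transcodes to UTF-8, which is a
different conversion from the one I tried and does not double the size.

```python
import re

# Matches a bare declaration (no encoding attribute) at the very start of the data.
_BARE_DECL = re.compile(rb'^<\?xml version="1\.0"\s*\?>')

def declare_encoding(data, encoding="ISO-8859-1"):
    """Option (2): add an encoding to a bare <?xml version="1.0"?> declaration.

    Data that does not start with the bare declaration is returned unchanged.
    """
    new_decl = b'<?xml version="1.0" encoding="' + encoding.encode("ascii") + b'"?>'
    return _BARE_DECL.sub(lambda match: new_decl, data, count=1)

def to_utf8(data, source_encoding="ISO-8859-1"):
    """Variant of option (3): transcode the bytes to UTF-8 and declare it."""
    text = data.decode(source_encoding)
    return declare_encoding(text.encode("utf-8"), "utf-8")

# Hypothetical sample standing in for the contents of the BLAST XML file.
sample = b'<?xml version="1.0"?>\n<a>Sch\xe4ffer</a>'
print(declare_encoding(sample))
print(to_utf8(sample))
```

In practice you would read the file in binary mode, pass the bytes through one
of these functions, and write the result to a new file before parsing it.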

BTW - Are you getting the file from standalone blast, or the NCBI website?

Unless a local XML expert steps up, would you like to contact the NCBI 
on this issue?

Peter


