[BioPython] BLAST XML problem?
Peter
biopython at maubp.freeserve.co.uk
Wed Jan 11 13:39:46 EST 2006
>> It may be the NCBI's fault for producing a bad XML file...
>
> Yeah, well, I still have to deal with it :( In any case, why is this
> cropping up now? Schäffer has been in NCBI for years...
I would guess it's because BioPython users have mostly parsed the plain text
output from BLAST, rather than the XML.
> The file is available at http://iddo-friedberg.org/biopy_bad_blast.xml
>
> in case anyone wants to have a look-see.
The first line of the XML file could (should?) define an encoding, e.g.
<?xml version="1.0" encoding="utf-8"?>
or:
<?xml version="1.0" encoding="ISO-8859-1"?>
Instead it's just:
<?xml version="1.0"?>
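To illustrate why the missing encoding matters: with no encoding declared, an XML parser assumes UTF-8, so a single ISO-8859-1 byte (such as the "ä" in Schäffer) makes the file ill-formed. Here is a minimal sketch using only Python's standard library. The <author> element is an invented stand-in, not the real BLAST output, and I am assuming the offending character is stored as ISO-8859-1:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "Schäffer" stored as ISO-8859-1, where ä is the single byte 0xE4
body = "<author>Schäffer</author>".encode("iso-8859-1")

def parses(data):
    """Return True if data is well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Same bytes, different first line
no_encoding = b'<?xml version="1.0"?>' + body
with_encoding = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

print("no encoding declared:", parses(no_encoding))
print("ISO-8859-1 declared: ", parses(with_encoding))
```

The first case should fail because 0xE4 followed by plain ASCII is not a valid UTF-8 sequence; the second should parse once the parser is told how to read the bytes.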
Short term solutions which I have just tried and got to work:
(1) Edit the offending character by hand (as you did)
(2) Specify encoding="ISO-8859-1" by editing the first line by hand
(3) Convert the file to Unicode (doubles the size)
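The second fix above could also be done in code rather than by hand. This is only a rough sketch: declare_encoding is a hypothetical helper (not part of BioPython), and it assumes the file really is ISO-8859-1. It rewrites the XML declaration and leaves every other byte untouched:

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration at the very start of the data
_DECL = re.compile(rb'^<\?xml[^>]*\?>')

def declare_encoding(data, encoding="ISO-8859-1"):
    """Return data with an XML declaration that names `encoding`.

    Hypothetical helper: the content bytes are not re-encoded,
    only the declaration is replaced (or added if missing).
    """
    new_decl = b'<?xml version="1.0" encoding="' + encoding.encode("ascii") + b'"?>'
    if _DECL.match(data):
        # Use a function so the replacement is taken literally
        return _DECL.sub(lambda m: new_decl, data, count=1)
    return new_decl + b"\n" + data

# Invented example input: a declaration without encoding, plus one ISO-8859-1 byte
raw = b'<?xml version="1.0"?>\n<author>Sch\xe4ffer</author>'
fixed = declare_encoding(raw)
print(ET.fromstring(fixed).text)
```

For a real file you would read it in binary mode, pass the bytes through the helper, and write the result to a new file before handing it to the parser.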
BTW - Are you getting the file from standalone blast, or the NCBI website?
Unless a local XML expert steps up, would you like to contact the NCBI
on this issue?
Peter