[BioPython] BLAST XML problem?
Iddo Friedberg
idoerg at burnham.org
Wed Jan 11 12:08:01 EST 2006
Peter wrote:
> Iddo Friedberg wrote:
>
>> Slight correction to my previous email: using biopython from CVS, and
>> python 2.3 as you can see from the stack dump
>>
>> Iddo Friedberg wrote:
>>
>>> Not sure what we're doing wrong here...
>>>
>>> Using the cookbook example, biopython 1.41, python 2.2 (our Zope
>>> needs that Python version, sorry):
>>>
>>> from Bio.Blast import NCBIXML
>>>
>>> b_parser = NCBIXML.BlastParser()
>>> b_record = b_parser.parse(blast_out)
>>>
>>>
>>> Breaks on "Alejandro Schäffer", in the XML <BlastOutput_reference>
>>> tag. The ä seems to cause the error. Replace it with a regular "a"
>>> and everything is hunky-dory.
>>
>
> Is the lower-case a with umlaut in the XML file as ä, or using an
> encoding like &auml; or &#228; instead? (ampersand characters, aka
> character entities)
It's a literal ä, not a character entity.
>
> Also, what character set does the blast_out XML file claim to be in?
> And does that fit with the inclusion of an a-umlaut as a character?
I haven't the foggiest... :)
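
One way to answer the character-set question is to read the XML declaration and then try decoding the file with whatever it names. The sketch below is only illustrative: it is written for a current Python 3 rather than the 2.2/2.3 mentioned in this thread, the helper names and the sample bytes are made up, and it ignores byte-order marks and UTF-16. It assumes that a missing encoding attribute means UTF-8, which is the XML default.

```python
import re


def declared_encoding(raw: bytes) -> str:
    """Return the encoding named in the XML declaration, or UTF-8 if none is given."""
    match = re.match(rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return match.group(1).decode("ascii") if match else "utf-8"


def first_undecodable_byte(raw: bytes, encoding: str):
    """Return (offset, byte) for the first byte invalid in `encoding`, else None."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as err:
        return err.start, raw[err.start:err.start + 1]
    return None


# Hypothetical stand-in for the BLAST output: no encoding declared,
# and the a-umlaut stored as the single Latin-1 byte 0xE4.
sample = (
    b'<?xml version="1.0"?>\n'
    b"<BlastOutput_reference>Alejandro Sch\xe4ffer</BlastOutput_reference>"
)

encoding = declared_encoding(sample)
problem = first_undecodable_byte(sample, encoding)
print(encoding, problem)
```

If the second value is not None, the file contains a byte that does not fit the encoding it claims, which would be consistent with a parser rejecting it.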
>
> It may be the NCBI's fault for producing a bad XML file...
>
Yeah, well, I still have to deal with it :( In any case, why is this
cropping up now? Schäffer has been at NCBI for years...
The file is available at http://iddo-friedberg.org/biopy_bad_blast.xml
in case anyone wants to have a look-see.
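
As a possible stopgap for dealing with such a file, the bytes could be rewritten as pure ASCII before parsing, turning every non-ASCII character into a numeric character reference. This is a rough sketch under a big assumption: that the stray bytes are Latin-1, which nothing in this thread confirms. The function name and sample are hypothetical, the code targets a current Python 3, and the standard library's ElementTree stands in for the BLAST parser purely to show that the result is well-formed.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode XML bytes as ASCII, using &#NNN; references for everything else.

    ASSUMPTION: `source_encoding` is the real encoding of `raw`.
    A wrong guess yields wrong characters, not an error.
    """
    text = raw.decode(source_encoding)
    return text.encode("ascii", errors="xmlcharrefreplace")


# Hypothetical stand-in for the problem file (a-umlaut as the byte 0xE4).
sample = (
    b'<?xml version="1.0"?>\n'
    b"<BlastOutput_reference>Alejandro Sch\xe4ffer</BlastOutput_reference>"
)

cleaned = to_ascii_xml(sample)
root = ET.fromstring(cleaned)  # parses, since the input is now plain ASCII
print(cleaned)
print(root.text)
```

The cleaned bytes could then be written to a temporary file or wrapped in a file-like object and handed to the parser in place of the original output.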
Thanks,
Iddo
--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org