[BioPython] NCBIXML error

Christof Winter winter at biotec.tu-dresden.de
Thu May 8 08:47:38 UTC 2008


gbastian at pasteur.fr wrote:
> Dear all,
> 
> I have been using a script to blast sequences for days without a
> problem, then, after 2/3 hours it started giving me this error
> and never worked again...did they change xml blast format?
> 
> this is the error:
> 
> File "ppinvestigator.py", line 918, in ?
>     pdbs.find_homologous_seqs(int_list)
>   File "ppinvestigator.py", line 122, in find_homologous_seqs
>     data = search_seq(self.sequences[chain][0], interactor_list)
>   File
> "/home/giacomotion/Desktop/VU-PROJECT/PPI_PDBS/PPINVESTIGATOR/tools.py",
> line 32, in search_seq
>     blast_record = blast_records.next()
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 576,
> in parse
>     expat_parser.Parse(text, False)
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 98,
> in endElement
>     eval("self.%s()" % method)
>   File "<string>", line 0, in ?
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 216,
> in _end_BlastOutput_version
>     self._header.date = self._value.split()[2][1:-1]
> IndexError: list index out of range

[...]

> this is the xml that I get:
> 
> <?xml version="1.0"?>
> <!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
> "NCBI_BlastOutput.dtd">
> <BlastOutput>
>   <BlastOutput_program>blastp</BlastOutput_program>
>   <BlastOutput_version>BLASTP 2.2.18+</BlastOutput_version>
>   <BlastOutput_reference>Altschul, Stephen F., Thomas L. Madden, Alejandro

[...]

It seems they did change the format. When I run BLAST locally, the XML output says

<BlastOutput_version>blastp 2.2.18 [Mar-02-2008]</BlastOutput_version>

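The line

self._header.date = self._value.split()[2][1:-1]

works in that case, because the third whitespace-separated field is the
bracketed date. It chokes on your

<BlastOutput_version>BLASTP 2.2.18+</BlastOutput_version>

because "BLASTP 2.2.18+".split() yields only two elements, so index 2 raises
the IndexError.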
Should be easy to fix, shouldn't it?
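
One possible fix, as a minimal sketch only: treat the date as optional and
read it only when a third, bracketed field is actually there. The function
name parse_blast_version is hypothetical (it is not part of
Bio.Blast.NCBIXML), and I am assuming the date, when present, is always the
third field and is wrapped in square brackets.

```python
def parse_blast_version(value):
    """Split a BlastOutput_version string into (application, version, date).

    Hypothetical helper, not Biopython API. The date is None when the
    string has no bracketed third field, e.g. "BLASTP 2.2.18+".
    """
    parts = value.split()
    application = parts[0] if len(parts) > 0 else None
    version = parts[1] if len(parts) > 1 else None
    date = None
    # Only strip the brackets if a third field exists and looks like "[...]".
    if len(parts) > 2 and parts[2].startswith("[") and parts[2].endswith("]"):
        date = parts[2][1:-1]
    return application, version, date


if __name__ == "__main__":
    for text in ("blastp 2.2.18 [Mar-02-2008]", "BLASTP 2.2.18+"):
        print(parse_blast_version(text))
```

With this, the old-style string still gives the date "Mar-02-2008", and the
new-style string gives None for the date instead of raising.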

Christof