[BioPython] NCBIXML error

gbastian at pasteur.fr gbastian at pasteur.fr
Wed May 7 21:29:26 UTC 2008


Dear all,

I have been using a script to BLAST sequences for days without a
problem. Then, after 2-3 hours of running, it started giving me the
error below and has not worked since. Did they change the BLAST XML
format?

This is the error:

  File "ppinvestigator.py", line 918, in ?
    pdbs.find_homologous_seqs(int_list)
  File "ppinvestigator.py", line 122, in find_homologous_seqs
    data = search_seq(self.sequences[chain][0], interactor_list)
  File "/home/giacomotion/Desktop/VU-PROJECT/PPI_PDBS/PPINVESTIGATOR/tools.py", line 32, in search_seq
    blast_record = blast_records.next()
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 576, in parse
    expat_parser.Parse(text, False)
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 98, in endElement
    eval("self.%s()" % method)
  File "<string>", line 0, in ?
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 216, in _end_BlastOutput_version
    self._header.date = self._value.split()[2][1:-1]
IndexError: list index out of range
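
As far as I can tell, the failing line can be reproduced outside Biopython. The following is only a minimal sketch: it assumes that the text of the BlastOutput_version element in the XML I get ("BLASTP 2.2.18+") is what ends up in self._value, and the variable names are mine, not part of NCBIXML.

```python
# Text of <BlastOutput_version> as it appears in the XML returned to me.
version_text = "BLASTP 2.2.18+"

tokens = version_text.split()
print(tokens)  # ['BLASTP', '2.2.18+']

# Same expression as the failing line in the traceback: take the third
# whitespace-separated token and strip its first and last characters.
try:
    date = tokens[2][1:-1]
except IndexError as err:
    date = None
    error_message = str(err)
    print("IndexError:", err)
```

With only two tokens there is no index 2, so the expression raises the same IndexError. A version string carrying a third, bracketed token would not fail here, which would fit the script having worked until the output changed, but that is my guess and not something I have confirmed.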


This is my script:

    #Launch the blastp search
    result_handle = NCBIWWW.qblast('blastp', 'nr', sequence,
                                   hitlist_size=10, perc_ident=10,
                                   alignments=10, descriptions=10,
                                   entrez_query='"Saccharomyces cerevisiae" [Organism]')

    #Handle the result file
    blast_results = result_handle.read()
    output_filename = 'tmp_blast.xml'
    save_file = open(output_filename,'w')
    save_file.write(blast_results)
    save_file.close()
    result_handle2 = open(output_filename, 'r')
    blast_records = NCBIXML.parse(result_handle2)


    #Initialize the record storage (a list)
    record_storage = []

    #Iterate on the blast_handle file (only one iteration if one blastp search)
    blast_record = blast_records.next()


This is the XML that I get (truncated):

<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
"NCBI_BlastOutput.dtd">
<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_version>BLASTP 2.2.18+</BlastOutput_version>
  <BlastOutput_reference>Altschul, Stephen F., Thomas L. Madden, Alejandro
A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997),
 &quot;Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs&quot;, Nucleic Acids Res.
25:3389-3402.</BlastOutput_reference>
  <BlastOutput_db>nr</BlastOutput_db>
  <BlastOutput_query-ID>4678</BlastOutput_query-ID>
  <BlastOutput_query-def>unnamed protein product</BlastOutput_query-def>
  <BlastOutput_query-len>280</BlastOutput_query-len>
  <BlastOutput_param>
    <Parameters>
      <Parameters_matrix>BLOSUM62</Parameters_matrix>
      <Parameters_expect>10</Parameters_expect>
      <Parameters_gap-open>11</Parameters_gap-open>
      <Parameters_gap-extend>1</Parameters_gap-extend>
    </Parameters>
  </BlastOutput_param>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-ID>4678</Iteration_query-ID>
      <Iteration_query-def>unnamed protein product</Iteration_query-def>
      <Iteration_query-len>280</Iteration_query-len>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>gi|151567870|pdb|2PM9|B</Hit_id>
          <Hit_def>Chain B, Crystal Structure Of Yeast Sec1331 VERTEX
ELEMENT OF THE Copii Vesicular Coat</Hit_def>
          <Hit_accession>2PM9-B</Hit_accession>
          <Hit_len>297</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_num>1</Hsp_num>
              <Hsp_bit-score>577.785</Hsp_bit-score>
              <Hsp_score>1488</Hsp_score>
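
To check a saved result file before handing it to NCBIXML.parse, something like the following could be used. It is only a sketch under my own assumptions: version_field and has_date_token are hypothetical helper names (not Biopython functions), and the regular expression assumes the version element sits on a single line, as it does in the output above.

```python
import re


def version_field(xml_text):
    """Return the text of the first <BlastOutput_version> element, or None."""
    match = re.search(
        r"<BlastOutput_version>(.*?)</BlastOutput_version>", xml_text
    )
    return match.group(1) if match else None


def has_date_token(version_text):
    """True if the field has a third whitespace-separated token,
    which is where the failing parser line looks for the date."""
    return len(version_text.split()) >= 3


# First lines of the XML shown above, used here as sample input.
sample = """<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_version>BLASTP 2.2.18+</BlastOutput_version>
"""

field = version_field(sample)
print(field)
print(has_date_token(field))
```

For the real file, the same check would be version_field(open('tmp_blast.xml').read()). If has_date_token returns False, the parser line shown in the traceback cannot succeed on that file.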




Thanks for any suggestion,

Giacomo