[Biopython] Problems parsing with PSIBlastParser

Miguel Ortiz Lombardia ibdeno at gmail.com
Mon Oct 12 04:11:38 EDT 2009


Dear list members,

I have a problem with NCBIStandalone.PSIBlastParser, which I need to
use instead of NCBIXML because the latter lacks some record
properties that I need.

My code worked until recently (about three months ago), and it now
seems that something has changed in the latest Biopython (1.52-1,
installed via Fink on an Intel Mac running OS X 10.5.8). I get the
same problem whether I use Python 2.5 or 2.6.

Here follows the relevant part of the code:

####

     blast_out, error_info = NCBIStandalone.blastpgp(
         blastcmd='/usr/local/blast-2.2.18/bin/blastpgp',
         database='/opt/BlastDBs/' + db,
         infile=file,
         npasses=passes,
         program='blastpgp',
         descriptions='500',
         alignments='1000',
         align_view='0',
         matrix_outfile=outbase + '.' + db + '.' + str(passes) + '.pssm')

     b_parser = NCBIStandalone.PSIBlastParser()

     b_record = b_parser.parse(blast_out)

####

And this is the error that I now get:

####

   File "/Users/mol/bin/lpbl.py", line 64, in doblast
     b_record = b_parser.parse(blast_out)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 777, in parse
     self._scanner.feed(handle, self._consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 97, in feed
     self._scan_rounds(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 234, in _scan_rounds
     self._scan_alignments(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 376, in _scan_alignments
     self._scan_pairwise_alignments(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 386, in _scan_pairwise_alignments
     self._scan_one_pairwise_alignment(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 398, in _scan_one_pairwise_alignment
     self._scan_hsp(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 433, in _scan_hsp
     self._scan_hsp_alignment(uhandle, consumer)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 464, in _scan_hsp_alignment
     read_and_call(uhandle, consumer.query, start='Query')
   File "/sw/lib/python2.6/site-packages/Bio/ParserSupport.py", line 303, in read_and_call
     method(line)
   File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 1138, in query
     raise ValueError("I could not find the query in line\n%s" % line)
ValueError: I could not find the query in line
Query: 0    -

####

Now, the interesting thing is that if I run blastpgp directly and
capture the output in a file, that file never contains a line such as:

Query: 0    -

Actually, if I modify my code so it reads this output file, the  
PSIBlastParser processes it without error.
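
One way to narrow this down would be to save the text that the parser
receives from the pipe and compare it line by line with the file
written by the direct run. The following is only a sketch using the
standard library; the helper name first_difference is hypothetical
and the sample lines are made up for illustration:

```python
from itertools import zip_longest


def first_difference(lines_a, lines_b):
    """Return (line_number, a, b) for the first differing line, or None.

    A missing line (one input shorter than the other) is reported as None.
    """
    pairs = zip_longest(lines_a, lines_b)
    for number, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return number, a, b
    return None


# Made-up sample data standing in for the two captured outputs.
piped = ["header", "Query: 0    -"]
saved = ["header", "Query: 1    MKVLA 5"]
print(first_difference(piped, saved))
```

The first differing line should show where the piped output starts to
diverge from the saved one.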

I have found that something may have changed in NCBIStandalone  
recently, namely, this bit:

     _query_re = re.compile(r"Query(:?) \s*(\d+)\s*(.+) (\d+)")
     def query(self, line):
         m = self._query_re.search(line)
         if m is None:
             raise ValueError("I could not find the query in line\n%s" % line)
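
To see why the offending line is rejected, the pattern above can be
tried on its own, outside Biopython. This is a minimal sketch: the
regular expression is copied from the snippet above, while the helper
name parse_query_line, the tuple it returns, and the well-formed
sample line are my own assumptions for illustration:

```python
import re

# Pattern copied verbatim from the NCBIStandalone snippet quoted above.
query_re = re.compile(r"Query(:?) \s*(\d+)\s*(.+) (\d+)")


def parse_query_line(line):
    """Return (start, sequence, end) if the line matches, else None."""
    m = query_re.search(line)
    if m is None:
        return None
    return int(m.group(2)), m.group(3), int(m.group(4))


# The line from the traceback: a start coordinate of 0, a lone "-",
# and no end coordinate, so the trailing " (\d+)" cannot match.
print(parse_query_line("Query: 0    -"))

# A made-up, well-formed alignment line for comparison.
print(parse_query_line("Query: 1    MKVLA 5"))
```

The pattern requires some sequence text followed by a space and an end
coordinate, and "Query: 0    -" has neither, which is consistent with
the ValueError in the traceback.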

Does anyone have a clue?

Thank you!


-- Miguel


