[Biopython-dev] [Bug 2927] New: Problem parsing PSI-BLAST plain text output with NCBIStandalone.PSIBlastParser
bugzilla-daemon at portal.open-bio.org
Tue Oct 13 02:58:37 EDT 2009
http://bugzilla.open-bio.org/show_bug.cgi?id=2927
Summary: Problem parsing PSI-BLAST plain text output with
NCBIStandalone.PSIBlastParser
Product: Biopython
Version: 1.52
Platform: Macintosh
OS/Version: Mac OS
Status: NEW
Severity: blocker
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: ibdeno at gmail.com
This is a problem with NCBIStandalone.PSIBlastParser, which I need to use
instead of NCBIXML because the latter lacks some record properties that I
need.
My code worked until about three months ago, and something seems to have
changed in the latest Biopython (1.52-1, installed via fink on an Intel Mac
running OS X 10.5.8). I get the same problem regardless of whether I use
Python 2.5 or 2.6, and with both blastpgp 2.2.18 and 2.2.22.
Here follows the relevant part of the code:
####
from Bio.Blast import NCBIStandalone

# db, file, passes and outbase are set elsewhere in the script (not shown)
blast_out, error_info = NCBIStandalone.blastpgp(
    blastcmd='/usr/local/blast-2.2.18/bin/blastpgp',
    database='/opt/BlastDBs/' + db,
    infile=file,
    npasses=passes,
    program='blastpgp',
    descriptions='500',
    alignments='1000',
    align_view='0',
    matrix_outfile=outbase + '.' + db + '.' + str(passes) + '.pssm')
# Parse the plain text output handle returned by blastpgp
b_parser = NCBIStandalone.PSIBlastParser()
b_record = b_parser.parse(blast_out)
####
And this is the error that I now get:
####
  File "/Users/mol/bin/lpbl.py", line 64, in doblast
    b_record = b_parser.parse(blast_out)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 777, in parse
    self._scanner.feed(handle, self._consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 97, in feed
    self._scan_rounds(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 234, in _scan_rounds
    self._scan_alignments(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 376, in _scan_alignments
    self._scan_pairwise_alignments(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 386, in _scan_pairwise_alignments
    self._scan_one_pairwise_alignment(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 398, in _scan_one_pairwise_alignment
    self._scan_hsp(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 433, in _scan_hsp
    self._scan_hsp_alignment(uhandle, consumer)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 464, in _scan_hsp_alignment
    read_and_call(uhandle, consumer.query, start='Query')
  File "/sw/lib/python2.6/site-packages/Bio/ParserSupport.py", line 303, in read_and_call
    method(line)
  File "/sw/lib/python2.6/site-packages/Bio/Blast/NCBIStandalone.py", line 1138, in query
    raise ValueError("I could not find the query in line\n%s" % line)
ValueError: I could not find the query in line
Query: 0 -
####
Now, the interesting thing is that if I run blastpgp directly and capture the
output in a file, that file never contains a line such as:
Query: 0 -
In fact, if I modify my code to read this output file, PSIBlastParser
processes it without error.
Not sure if this is relevant, but I have found that something may have changed
in NCBIStandalone recently, namely, this bit:
_query_re = re.compile(r"Query(:?) \s*(\d+)\s*(.+) (\d+)")
def query(self, line):
    m = self._query_re.search(line)
    if m is None:
        raise ValueError("I could not find the query in line\n%s" % line)
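To make the failure easier to reproduce outside Biopython, here is a minimal sketch that applies the same pattern to the line from the traceback. The pattern expects a start coordinate, some sequence text, and an end coordinate, whereas the reported line has only one number followed by a dash, so it cannot match. The helper name unparseable_query_lines and the first sample line are hypothetical, made up for illustration; only the pattern and the "Query: 0 -" line come from this report.

```python
import re

# Pattern copied verbatim from the snippet quoted in this report.
QUERY_RE = re.compile(r"Query(:?) \s*(\d+)\s*(.+) (\d+)")

# Assumption: alignment lines start with "Query: " or "Query ",
# which skips header lines such as "Query= ...".
ALIGNMENT_LINE_RE = re.compile(r"Query:? ")


def unparseable_query_lines(lines):
    """Return (line_number, line) pairs for alignment lines the pattern rejects.

    Hypothetical helper, not part of Biopython.
    """
    bad = []
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if ALIGNMENT_LINE_RE.match(line) and QUERY_RE.search(line) is None:
            bad.append((number, line))
    return bad


# The first line is an invented example of a well-formed alignment line;
# the second is the line shown in the traceback.
sample = [
    "Query: 1    MKVLAAGIVG 10",
    "Query: 0 -",
]
bad_lines = unparseable_query_lines(sample)
```

Passing an open file handle (or any iterable of lines) from a saved blastpgp run to the helper should list any alignment lines that would trigger the ValueError, which may help compare the captured file against the output handle returned by NCBIStandalone.blastpgp.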
I will post log files in plain text and xml after submitting this bug report.