[BioPython] Help a clueless newbie with BLAST queries?
bp
dubious@2xtreme.net (bp)
Fri, 1 Jun 2001 12:37:36 -0700
I got tired of manually sending my queries through the webpage...
The biopython project gives me the perfect excuse to finally break down and
learn Python, so I'm now trying to whip up a script to send a series of
fasta sequences (piped in from stdin) to ncbi and get the results for me.
I'm running into two problems though. Forgive me for posting the source
code, but there's not much to it - I'll append it to this message.
The two problems I'm having are:
1)The parser (b_parser) craps out with an "unexpected end of stream"
2)I can't seem to get the optional settings to work (I'm trying to limit
the number of 'hits' returned from the BLAST search). Setting alignment=10
and descriptions=10 in the NCBIWWW.blast instantiation seems to be ignored,
and trying to set an "expect" value causes the script to die with "no PID
found".
Getting advice from someone that actually knows what they're doing seems
to be my only hope now :-) I'm using the 1.00a release of biopython.
Thanks!
Here's what it looks like when I run it:
-------------------------------------------------------------------
bash-2.04$ cat sequences.seq | python blasttest.py Opened file for reading
Defined f_iterator
read first f_record entering While loop
Getting BLAST results
Saving results to my_blast.out
copying the file handle to save data with
writing from handle copy to output file
closing my_blast.out
creating b_parser
reading b_record from b_parser analysis of b_results
Traceback (most recent call last): File "blasttest.py", line 29, in ?
b_record = b_parser.parse(b_results) File
"/usr/lib/python2.0/site-packages/Bio/Blast/NCBIWWW.py", line 45, in parse
self._scanner.feed(handle, self._consumer) File
"/usr/lib/python2.0/site-packages/Bio/Blast/NCBIWWW.py", line 94, in feed
contains='BLAST') File
"/usr/lib/python2.0/site-packages/Bio/ParserSupport.py", line 187, in
read_and_call_until line = safe_readline(uhandle) File
"/usr/lib/python2.0/site-packages/Bio/ParserSupport.py", line 263, in
safe_readline raise SyntaxError, "Unexpected end of stream." SyntaxError:
Unexpected end of stream.
bash-2.04$
-------------------------------------------------------------------
Here's the script I've got so far
(borrowing heavily from the tutorial):
--------------------------------------------------------------------
from Bio import Fasta
from Bio.Blast import NCBIWWW
import copy
import sys
#file_for_blast = open('sequences.seq', 'r')
file_for_blast=sys.stdin
print 'Opened file for reading\n'
f_iterator = Fasta.Iterator(file_for_blast)
print 'Defined f_iterator\n'
f_record = f_iterator.next()
print 'read first f_record'
while f_record is not None:
print 'entering While loop\n'
print 'Getting BLAST results\n'
b_results = NCBIWWW.blast('blastn', 'nr', f_record,
descriptions='10',alignments='10')
print 'Saving results to my_blast.out\n'
save_file = open('my_blast.out', 'w')
#copy the handle so we can write it out and still have something for parsing
print 'copying the file handle to save data with\n'
save_handle = copy.deepcopy(b_results)
print 'writing from handle copy to output file\n'
save_file.write(save_handle.read())
print 'closing my_blast.out\n'
save_file.close()
print 'creating b_parser\n'
b_parser = NCBIWWW.BlastParser()
print 'reading b_record from b_parser analysis of b_results\n'
b_record = b_parser.parse(b_results)
E_VALUE_THRESH = 0.0
print 'threshold set to '+E_VALUE_THRESH+' and beginning iteration through
alignments\n'
for alignment in b_record.alignments:
for hsp in alignment.hsps:
if hsp.expect < E_VALUE_THRESH:
print '****Alignment****'
print 'sequence:', alignment.title
print 'length:', alignment.length
print 'e value:', hsp.expect
print hsp.query[0:75] + '...'
print hsp.match[0:75] + '...'
print hsp.sbjct[0:75] + '...'
print 'reading next record from f_iterator'
f_record = f_iterator.next()
else:
print 'no more f_records. Done.'
--------------------------------------------------------------------------------------
--
"Given the pace of technology, I propose we leave math to the machines
and go play outside." - Calvin