[BioPython] BLAST in a python generator
Kael Fischer
kael at sonic.net
Tue Oct 26 18:56:08 EDT 2004
Regarding threads and Biopython.
I've been experimenting with keeping BLAST running as a separate process,
driven by a Python generator, and calling the generator's .next() method
whenever I want to run the next query. The query is held in a StringIO
buffer that the generator can read.
The idea is that the overhead of reading the database needn't be repeated
for each query, since the running BLAST process is part of the generator's
state.
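As a rough illustration of the idea only, here is a minimal sketch in current Python 3. It is not the BLAST code: the child process is a hypothetical stand-in that simply echoes each input line in upper case, and the names CHILD and line_pipe are made up for this example. What it shows is the pattern of a generator that starts a process once, keeps it as part of its state, and consumes a StringIO buffer on every next() call.

```python
import io
import subprocess
import sys

# Hypothetical stand-in for a long-running tool: it echoes every input
# line back in upper case and flushes after each line.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def line_pipe(in_buf):
    """Generator that keeps one child process alive across queries.

    Each next() consumes the lines currently in in_buf, sends them to
    the child, and yields the list of replies (one reply per line).
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    )
    try:
        while True:
            # Read and then empty the shared buffer.
            in_buf.seek(0)
            queries = in_buf.read().splitlines()
            in_buf.seek(0)
            in_buf.truncate()
            replies = []
            for query in queries:
                proc.stdin.write(query + "\n")
                proc.stdin.flush()
                # The stand-in answers with exactly one line per query,
                # so a blocking readline() is enough here.
                replies.append(proc.stdout.readline().rstrip("\n"))
            yield replies
    finally:
        # Runs when the generator is closed or garbage collected.
        proc.stdin.close()
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    buf = io.StringIO()
    pipe = line_pipe(buf)
    buf.write("acgt\n")
    print(next(pipe))
    buf.write("ttga\nccat\n")
    print(next(pipe))
    pipe.close()
```

The sketch works only because the stand-in has a fixed reply format (one line out per line in). A real tool with variable-length output needs some other way to tell where a result ends.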
This is the first time I've written a generator. Unfortunately, I don't
seem to be able to get _all_ of the output of a BLAST record: most of the
time my select loops return only part of the result. The code below is one of
several schemes I've tried.
For those interested in the idea, here is some of the code:
(no Biopython in this snippet)
import os
from select import select
from tempfile import NamedTemporaryFile

# genomeDB, formatdb_exe and blast_exe are module-level settings
# that are not shown in this snippet.

def BLASTpipe(inBuf, blastDB = genomeDB):
    """Generator for a BLAST process.

    inBuf is a StringIO buffer that contains one or
    more query sequences.
    .next() processes the query(s) in inBuf.  inBuf is consumed
    and a tuple of the output and error strings is returned.
    """
    # Format DB, if necessary
    if not os.access(blastDB + '.nhr', os.R_OK) \
       or not os.access(blastDB + '.nin', os.R_OK) \
       or not os.access(blastDB + '.nsq', os.R_OK):
        # db is not formatted
        tmpDbFile = NamedTemporaryFile()
        userDbFile = file(blastDB, 'r')
        tmpDbFile.write(userDbFile.read())
        userDbFile.close()
        tmpDbFile.flush()
        blastDB = tmpDbFile.name
        # format db
        os.system('%s -pF -l /dev/null -i%s' % (formatdb_exe, blastDB))

    blast_in, blast_out, blast_err = os.popen3(blast_exe + \
        ' -p blastn -d %s ' % (blastDB), 't', 1)

    while True:
        outString = ''
        errString = ''
        # send the buffered query to BLAST, then empty the buffer
        inBuf.seek(0)
        inQuery = inBuf.read()
        blast_in.write(inQuery)
        inBuf.seek(0)
        inBuf.truncate()
        # collect output until nothing arrives for 0.5 s
        readyReaders, undef, undef = select([blast_out, blast_err], [], [], 0.5)
        while readyReaders != []:
            if blast_out in readyReaders:
                outString = blast_out.read(1)
                while blast_out in select([blast_out], [], [], 0.5)[0]:
                    outString += blast_out.read(1)
            if blast_err in readyReaders:
                errString = blast_err.read(1)
                while blast_err in select([blast_err], [], [], 0.5)[0]:
                    errString += blast_err.read(1)
            readyReaders, undef, undef = select([blast_out], [], [], 0.5)
        yield outString, errString
# end
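One possible contributor to the partial results, offered as a guess rather than a diagnosis: a select() timeout only says that nothing arrived within 0.5 s, not that the record is complete, so any pause in the output longer than the timeout ends the read early. The sketch below, in current Python 3 on a POSIX system, contrasts the two stopping rules. The child process is a hypothetical stand-in that writes in two bursts, and the "//END" marker is invented for the example; BLAST output has no such marker, so a real version would need some other end-of-record test.

```python
import os
import select
import subprocess
import sys
import time

# Hypothetical stand-in for a slow tool: it prints a record in two
# bursts with a pause in between, then a made-up end marker.
CHILD = (
    "import sys, time\n"
    "sys.stdout.write('part one\\n'); sys.stdout.flush()\n"
    "time.sleep(0.3)\n"
    "sys.stdout.write('part two\\n//END\\n'); sys.stdout.flush()\n"
)


def run_child():
    """Start the stand-in process with its stdout connected to a pipe."""
    return subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)


def read_until_quiet(fd, timeout):
    """Read until the pipe is silent for `timeout` seconds.

    This is the fragile rule: a pause longer than `timeout` is
    mistaken for the end of the output.
    """
    chunks = []
    while select.select([fd], [], [], timeout)[0]:
        data = os.read(fd, 4096)
        if not data:  # EOF
            break
        chunks.append(data)
    return b"".join(chunks)


def read_until_marker(fd, marker, deadline=10.0):
    """Read until `marker` is seen, EOF is hit, or `deadline` seconds pass."""
    buf = b""
    end = time.monotonic() + deadline
    while marker not in buf:
        remaining = end - time.monotonic()
        if remaining <= 0:
            break
        if select.select([fd], [], [], remaining)[0]:
            data = os.read(fd, 4096)
            if not data:  # EOF
                break
            buf += data
    return buf


if __name__ == "__main__":
    proc = run_child()
    print(read_until_quiet(proc.stdout.fileno(), 0.05))
    proc.stdout.close()
    proc.wait()

    proc = run_child()
    print(read_until_marker(proc.stdout.fileno(), b"//END"))
    proc.stdout.close()
    proc.wait()
```

With the short timeout the first reader gives up during the pause and never sees the second burst, while the marker-based reader waits for the whole record. The overall deadline is there so that a missing marker cannot block the caller forever.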
Comments?
Kael
--
Kael Fischer, Ph.D.
DeRisi Lab, University of California San Francisco
Desk: 415-514-4320
kael at derisilab.ucsf.edu