[BioPython] blastall does not flush buffers due to biopython buffering?

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Sun May 11 22:48:41 UTC 2008


Hi,
  when I try to use Bio/Blast/NCBIStandalone, blast sometimes hangs and
sometimes works (tested from a Unix shell and via Apache mod_python).
I see the blastall process in the list of system processes, and attaching
strace(1) to it shows that it printed some lines of the result output but
then somehow does not continue to write out its buffers (you know that the
summary stats come at the end of the blast output ...;). I believe that is
because the consuming process has not yet read the output already written.
Effectively, blastall gets blocked by biopython.


This is what I see in the stack trace of a killed process:

    print ''.join(_error_info.readlines())
  File "/usr/lib/python2.5/site-packages/Bio/File.py", line 37, in readlines
    lines = self._saved + self._handle.readlines(*args,**keywds)
KeyboardInterrupt
$



Currently, CVS contains:

def blastall(blastcmd, program, database, infile, align_view='7', **keywds):
    """blastall(blastcmd, program, database, infile, align_view='7', **keywds)
    -> read, error Undohandles
...
    w, r, e = os.popen3(" ".join([blastcmd] + params))
    w.close()
    return File.UndoHandle(r), File.UndoHandle(e)



I have not yet studied Bio/File.py, but let me say that running just the
following works fine for me:

>>> import os
>>> w, r, e = os.popen3('/usr/bin/blastall -p blastn -d /home/mmokrejs/a.fa -i /tmp/bl_FCOri7fa -m 0 -S 1 -e 1000 -W 4 -E 1 -G 1')
>>> print ''.join(r.readlines())
BLASTN 2.2.18 [Mar-02-2008]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
[...] --> WORKS
>>> print ''.join(e.readlines())
>>>


I have found that I have to read from blastall's STDOUT first, and only
afterwards may I try to read from its STDERR. Otherwise readline() or
readlines() blocks in the same way, even though the os.popen3() approach
works otherwise.
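
A minimal sketch of one way to avoid depending on the read order, assuming
the standard subprocess module (available since Python 2.4) can be used in
place of os.popen3(). Popen.communicate() drains STDOUT and STDERR together,
so the child cannot stall on one full pipe while the parent waits on the
other. The helper name run_and_collect is hypothetical and is not part of
Biopython:

```python
import subprocess


def run_and_collect(argv):
    """Run argv and return (stdout_data, stderr_data, returncode).

    Hypothetical helper for illustration only; not Biopython code.
    """
    child = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,   # mirrors the w handle from os.popen3()
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() closes the child's stdin (no input is given), reads both
    # output pipes until EOF and then waits for the process to exit.
    out, err = child.communicate()
    return out, err, child.returncode
```

It would be called with the command as a list, e.g.
run_and_collect(['/usr/bin/blastall', '-p', 'blastn', ...]), keeping the
remaining arguments as in the session above. Note that the whole output is
held in memory, which may not suit very large BLAST reports.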


Is there a way to ensure that no output is buffered in Python? Something like
'man perlopentut' would be helpful. ;-) Why is File.UndoHandle() used here
at all?

Thanks for any clarification,
Martin


