[BioPython] blastall does not flush buffers due to biopython buffering?

Michiel de Hoon mjldehoon at yahoo.com
Mon May 12 02:11:39 UTC 2008


Can you show an example script that causes the UndoHandle to block? Just to understand better what is going on.
On a related note, the UndoHandle works by saving all lines that were read. Particularly for large Blast files, that is not what one would like to do. So if there is no strong reason for returning an UndoHandle, I'd be in favor of simply returning the handle directly.

--Michiel.

Martin MOKREJŠ <mmokrejs at ribosome.natur.cuni.cz> wrote: Hi,
  when I try to run blast through Bio/Blast/NCBIStandalone, the process
sometimes hangs and sometimes works (tested from a Unix shell and via Apache
mod_python). I can see the blastall process in the list of system processes,
and attaching strace(1) to it shows that it has printed part of the result
output but somehow does not continue writing out its buffers (you know that
the summary stats come at the end of blast output ...;). I believe that is
because the consuming process has not yet read the output already written.
Effectively, blastall gets blocked by biopython.


I see in the stacktrace of a killed process:

    print ''.join(_error_info.readlines())
  File "/usr/lib/python2.5/site-packages/Bio/File.py", line 37, in readlines
    lines = self._saved + self._handle.readlines(*args,**keywds)
KeyboardInterrupt
$



Currently, CVS contains:

def blastall(blastcmd, program, database, infile, align_view='7', **keywds):
    """blastall(blastcmd, program, database, infile, align_view='7', **keywds)
    -> read, error Undohandles
...
    w, r, e = os.popen3(" ".join([blastcmd] + params))
    w.close()
    return File.UndoHandle(r), File.UndoHandle(e)



I have not studied Bio/File.py yet, but let me say that running just the
following works fine for me:

>>> import os
>>> w, r, e = os.popen3('/usr/bin/blastall -p blastn -d /home/mmokrejs/a.fa -i /tmp/bl_FCOri7fa -m 0 -S 1 -e 1000 -W 4 -E 1 -G 1')
>>> print ''.join(r.readlines())
BLASTN 2.2.18 [Mar-02-2008]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
[...] --> WORKS
>>> print ''.join(e.readlines())
>>>


I have found that I first have to read from blastall's STDOUT, and only
afterwards may I try to read from its STDERR. Otherwise, readline() or
readlines() blocks in the same way, although the os.popen3() approach
works otherwise.
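
The ordering dependence described above is what one would expect if one of
the two pipes fills up: a child process blocks as soon as the pipe it is
writing to is full, while a readlines() call on the other pipe waits for an
end-of-file that never arrives. A minimal sketch of one way to avoid this,
assuming the standard subprocess module is used instead of os.popen3(), is to
let communicate() drain both streams together. The helper name
run_and_collect and the stand-in child command are hypothetical and are not
part of Biopython; the sketch is written in current Python syntax and holds
the whole output in memory.

```python
import subprocess
import sys


def run_and_collect(argv):
    """Run a command and return (returncode, stdout_text, stderr_text).

    Hypothetical helper, not part of Biopython. communicate() reads the
    child's stdout and stderr together, so the child cannot stall on a
    full pipe while the parent is waiting on the other one.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # the child gets no input, like w.close()
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,    # return text rather than bytes
    )
    out, err = proc.communicate()   # drains both pipes until EOF, then waits
    return proc.returncode, out, err


if __name__ == "__main__":
    # Stand-in child that writes a large stdout and a short stderr message.
    # A real call would pass the blastall argument list here instead.
    child_code = (
        "import sys\n"
        "sys.stdout.write('x' * 200000)\n"
        "sys.stderr.write('warning\\n')\n"
    )
    code, out, err = run_and_collect([sys.executable, "-c", child_code])
    print(code, len(out), err.strip())
```

Because everything is collected in memory, this only suits output of
moderate size; for very large Blast reports, redirecting stderr to a file
and reading stdout incrementally may be the better trade-off.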


Is there a way to ensure no output is buffered in Python? Something like
'man perlopentut' would be helpful. ;-) Why is File.UndoHandle() used
here at all?

Thanks for clarification,
Martin


       