[BioPython] Blastall problem w/ restrict_gi

Roger Barrette rwbarrette at gmail.com
Wed Jun 13 14:32:56 EDT 2007


Hi Peter,

Thank you for the response.  To answer your questions: I am using Python
2.5 with Biopython 1.43 on Windows XP.  The problem occurs when I run
blastall from Python through the NCBIStandalone module.  Without the
"restrict_gi" option, I get alignment results in the .xml file.  With the
"restrict_gi" option, I get an empty .xml result file.  As you suggested, I
checked the error file, and it does list an error.  The contents of the
.err file are:

[NULL_Caption] ERROR: gi|90968860|gb|DQ443515.1|: Unable to open file
.\/BLAST/DATAout/A10241.txt
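
The message says blastall could not open the gi list at the path it was
given.  A minimal sketch for checking how that path resolves from the
script's working directory follows.  The helper name describe_path is
hypothetical, and this only diagnoses the path; it does not establish the
cause of the error.

```python
import os


def describe_path(path):
    """Return facts about how `path` resolves from the current directory."""
    absolute = os.path.abspath(path)
    return {
        "cwd": os.getcwd(),          # directory relative paths are resolved against
        "given": path,               # the path exactly as passed in
        "absolute": absolute,        # what the relative path expands to
        "exists": os.path.isfile(absolute),
    }


if __name__ == "__main__":
    # Path copied verbatim from the error message above.
    info = describe_path(r".\/BLAST/DATAout/A10241.txt")
    for key in ("cwd", "given", "absolute", "exists"):
        print("%s: %s" % (key, info[key]))
```

If "exists" is False inside the script but the file opens from the c:\
prompt, the two are probably starting from different working directories.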

This is odd because when I run the command from the c:\ prompt:
 c:\>/BLAST/blastall.exe -p tblastx -d /BLAST/DATAout/VirDBX -i
/BLAST/sequencesXX.fasta -m 7 -l .\/BLAST/DATAout/A10241.txt

it works fine, and I get the alignment results with no error.
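
For comparison, the same command line can be issued from Python without
going through NCBIStandalone.  This is only a sketch using the standard
subprocess module, not Biopython's own code.  The function names
build_blastall_command and run_blastall are hypothetical, and the flags are
the ones that appear in the command above.

```python
import subprocess


def build_blastall_command(blastall_exe, program, database, query, gi_list=None):
    """Build the blastall argument list; -m 7 requests XML output."""
    command = [blastall_exe, "-p", program, "-d", database,
               "-i", query, "-m", "7"]
    if gi_list is not None:
        # -l restricts the search to the GIs listed in this file.
        command.extend(["-l", gi_list])
    return command


def run_blastall(command):
    """Run the command and return (stdout, stderr) as bytes."""
    process = subprocess.Popen(command,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    return process.communicate()


if __name__ == "__main__":
    # Paths taken from the original call; the command is printed, not run.
    command = build_blastall_command("/BLAST/blastall.exe", "tblastx",
                                     "/BLAST/DATAout/VirDBX",
                                     "/BLAST/sequencesXX.fasta",
                                     gi_list="/BLAST/DATAout/10241.gid.txt")
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting, so what blastall
receives can be compared directly with what was typed at the prompt.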

Because the xml file is empty when I get this error running blast from the
Python script, there is nothing to parse.  When I run without the
"restrict_gi" option, from either the command prompt or my Python script, I
get results in the xml file and they parse fine.  Any suggestions on how to
fix this problem would be greatly appreciated.  Thanks
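
A quick way to tell an empty result file apart from one that has content
but no hits is sketched below, using only the standard library.  The
function name count_hits is hypothetical, and the sketch assumes the XML
output marks each hit with a <Hit> element.

```python
import os
import xml.etree.ElementTree as ET


def count_hits(xml_path):
    """Return the number of <Hit> elements, or 0 if the file is empty."""
    if os.path.getsize(xml_path) == 0:
        # An empty file is not valid XML, so do not hand it to the parser.
        return 0
    tree = ET.parse(xml_path)
    return sum(1 for _ in tree.iter("Hit"))
```

Calling count_hits("my_blast.xml") on the file saved from result_handle
would return 0 for the empty output described here, without raising a
parse error.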

-Roger


> Roger Barrette wrote:
> > Hello, I'm new to the list, and relatively new at Python.
>
> Hi Roger, and welcome to the list!
>
> > I need to run local blast using tblastx, but I have to limit my
> > searches to subsets of my local database.  To do this I have gi lists
> >  (*.gid.txt file) obtained from NCBI, to define my subsets.  To run
> > this blast, I'm using the following command to run blastall in my
> > script:
> >
> > result_handle, error_info =
> > NCBIStandalone.blastall("/BLAST/blastall.exe", "tblastx",
> > "/BLAST/DATAout/VirDBX", "/BLAST/sequencesXX.fasta", "7",
> > restrict_gi="/BLAST/DATAout/10241.gid.txt")
> >
> > When I include the restrict_gi keyword and option, I get no results
> > back when I run this through python.
>
> Could you be a little more specific about what goes wrong? Also are you
> using Windows, what version of Biopython and what version of Python?
>
> Have you looked at the contents of both result_handle AND error_info?
> You say you get no results back (is result_handle blank?), so
> checking error_info would be a good idea.  Try something like this...
>
> save_file = open("my_blast.xml", "w")
> save_file.write(result_handle.read())
> save_file.close()
>
> save_file = open("my_blast.err", "w")
> save_file.write(error_info.read())
> save_file.close()
>
> > I went into NCBIStandalone and modified it to print out the command
> > that is supposed to be passed through the os.popen3() command, which
> >  is:
> >
> > /BLAST/blastall.exe -p tblastx -d /BLAST/DATAout/VirDBX -i
> > /BLAST/sequencesXX.fasta -m 7 -l /BLAST/DATAout/10241.gid.txt
> >
> > When I copy this string directly into the windows command line, I get
> > results, and it works fine, but it doesn't work when called through
> > Python. It does work in Python, however, if I don't include the
> > "restrict_gi" option.  Can anyone suggest a modification to the
> > Blastall function or how I call blast from my script that may fix
> > this problem?
>
> Have you tried running this command at the command line, and redirecting
> the output to a file (e.g. test.xml) and then getting Biopython to parse
> that file?
>
> i.e. This should tell us if there is a problem parsing the XML output,
> or a problem in calling standalone blast.
>
> Peter
>
>

