[BioPython] Blastall problem w/ restrict_gi
Roger Barrette
rwbarrette at gmail.com
Wed Jun 13 14:32:56 EDT 2007
Hi Peter,
Thank you for the response. To answer your questions: I am using Python
2.5 with Biopython 1.43 on Windows XP. The problem occurs when I run
blastall from Python through the NCBIStandalone module. Without the
"restrict_gi" option, I get alignment results in the .xml file. With the
"restrict_gi" option, I get an empty .xml result file. Following your
suggestion, I checked the error file, and it does list an error. The
contents of the .err file are:
[NULL_Caption] ERROR: gi|90968860|gb|DQ443515.1|: Unable to open file
.\/BLAST/DATAout/A10241.txt
This is odd, because when I run the command from the c:\ prompt:
c:\>/BLAST/blastall.exe -p tblastx -d /BLAST/DATAout/VirDBX -i
/BLAST/sequencesXX.fasta -m 7 -l .\/BLAST/DATAout/A10241.txt
it works fine: I get the alignment results and no error.
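
One thing that may be worth checking (an assumption, not something confirmed in this thread): the path in the error message starts with ".\", so it is resolved relative to the current working directory of whichever process opens it. The sketch below uses the standard ntpath module to show how the same string resolves from the c:\ prompt and from a different directory. The helper name resolve_windows_path and the script directory C:\Python25\scripts are hypothetical, chosen only for illustration.

```python
import ntpath  # Windows path rules, usable on any platform


def resolve_windows_path(cwd, path):
    """Return the location a Windows path refers to for a process
    whose current working directory is `cwd`."""
    return ntpath.normpath(ntpath.join(cwd, path))


# Path exactly as it appears in the .err file
gi_list = ".\\/BLAST/DATAout/A10241.txt"

# Working directory when the command is typed at the c:\ prompt
from_prompt = resolve_windows_path("C:\\", gi_list)

# HYPOTHETICAL working directory for the Python script (an assumption)
from_script = resolve_windows_path("C:\\Python25\\scripts", gi_list)

print(from_prompt)
print(from_script)
```

If the two printed locations differ on the real system, passing an absolute path for the GI list (for example, one built with os.path.abspath) would remove the dependence on the working directory.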
Because the XML file is empty when this error occurs, there is nothing to
parse when I run BLAST from the Python script with "restrict_gi". Without
the "restrict_gi" option, I get results in the XML file from both the
command prompt and my Python script, and they parse correctly. Any
suggestions on how to fix this problem would be greatly appreciated. Thanks
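
For reference, a minimal sketch of the check described above: run a command, capture standard output and standard error separately, and only hand the output to a parser when it is non-empty. It uses the standard subprocess module rather than NCBIStandalone, and the names run_and_capture and has_results are hypothetical. The demo command is a stand-in that only writes an error message, so the sketch runs without BLAST installed; a real run would use the blastall argument list instead.

```python
import subprocess
import sys


def run_and_capture(args, cwd=None):
    """Run a command and return (stdout_text, stderr_text).

    `cwd` sets the working directory of the child process, which
    matters if any argument is a relative path.
    """
    process = subprocess.Popen(
        args,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # return text rather than bytes
    )
    out_text, err_text = process.communicate()
    return out_text, err_text


def has_results(out_text):
    """True if the command produced any non-whitespace output."""
    return bool(out_text.strip())


# Stand-in command (an assumption): it writes only to standard error,
# mimicking a run that fails before producing any XML.
demo_args = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('ERROR: Unable to open file')",
]
out_text, err_text = run_and_capture(demo_args)

if has_results(out_text):
    print("Output is ready to be parsed")
else:
    print("Nothing to parse; standard error said: " + err_text)
```

Reporting the standard error text whenever the output is empty makes a failure like the one above visible right away, instead of surfacing later as an empty file.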
-Roger
> Roger Barrette wrote:
> > Hello, I'm new to the list, and relatively new at Python.
>
> Hi Roger, and welcome to the list!
>
> > I need to run local blast using tblastx, but I have to limit my
> > searches to subsets of my local database. To do this I have gi lists
> > (*.gid.txt file) obtained from NCBI, to define my subsets. To run
> > this blast, I'm using the following command to run blastall in my
> > script:
> >
> > result_handle, error_info =
> > NCBIStandalone.blastall("/BLAST/blastall.exe", "tblastx",
> > "/BLAST/DATAout/VirDBX", "/BLAST/sequencesXX.fasta", "7",
> > restrict_gi="/BLAST/DATAout/10241.gid.txt")
> >
> > When I include the restrict_gi keyword and option, I get no results
> > back when I run this through python.
>
> Could you be a little more specific about what goes wrong? Also are you
> using Windows, what version of Biopython and what version of Python?
>
> Have you looked at the contents of both result_handle AND error_info?
> You say you get no results back (is result_handle blank?), so
> checking error_info would be a good idea. Try something like this...
>
> save_file = open("my_blast.xml", "w")
> save_file.write(result_handle.read())
> save_file.close()
>
> save_file = open("my_blast.err", "w")
> save_file.write(error_info.read())
> save_file.close()
>
> > I went into NCBIStandalone and modified it to print out the command
> > that is supposed to be passed through the os.popen3() command, which
> > is:
> >
> > /BLAST/blastall.exe -p tblastx -d /BLAST/DATAout/VirDBX -i
> > /BLAST/sequencesXX.fasta -m 7 -l /BLAST/DATAout/10241.gid.txt
> >
> > When I copy this string directly into the windows command line, I get
> > results, and it works fine, but it doesn't work when called through
> > python. It does work in Python, however, if I don't include the
> > "restrict_gi" option. Can anyone suggest a modification to the
> > Blastall function or how I call blast from my script that may fix
> > this problem?
>
> Have you tried running this command at the command line, and redirecting
> the output to a file (e.g. test.xml) and then getting Biopython to parse
> that file?
>
> i.e. This should tell us if there is a problem parsing the XML output,
> or a problem in calling standalone blast.
>
> Peter
>
>