[BioPython] megablast

Peter (BioPython List) biopython at maubp.freeserve.co.uk
Fri Aug 25 09:29:35 EDT 2006


meric ovacik wrote:
> I would like to use Biopython to run megaBLAST searches instead of BLAST.
> I'll appreciate any help!
> best regards

Hi Meric

Do you want to use the online or standalone version of megablast?

According to this page, you can use the -D option to control the
output format of the standalone version of megablast:

http://www.ncbi.nlm.nih.gov/blast/docs/megablast.html

I would expect -D 2 to give traditional plain text BLAST (blastn) 
output, which BioPython might be able to read (there are often slight 
variations in the exact text formatting between different versions of 
blast, so fingers crossed).
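
As a rough sketch of how the standalone tool could be driven from Python 
(not tested against a real installation): the -D option is the one 
described on the page above, while the -i (query), -d (database) and 
-o (output file) options, the file names, and the function names are my 
assumptions based on the usual legacy NCBI command line conventions, so 
check them against your own megablast version.

```python
import subprocess


def build_megablast_command(query_path, database, output_path, output_mode=2):
    """Return the argument list for a standalone megablast run.

    output_mode is passed to -D (2 = traditional text, 3 = tab separated).
    The -i, -d and -o flags are assumed, not taken from the original post.
    """
    return [
        "megablast",
        "-i", query_path,        # query sequences (assumed flag)
        "-d", database,          # database name (assumed flag)
        "-o", output_path,       # output file (assumed flag)
        "-D", str(output_mode),  # output format
    ]


def run_megablast(query_path, database, output_path, output_mode=2):
    """Run megablast and raise CalledProcessError if it exits with an error."""
    cmd = build_megablast_command(query_path, database, output_path, output_mode)
    subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to 
print the command and try it by hand first.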

Alternatively, using the standalone argument -D 3 should give simple tab 
separated data lines, which is easily read in and dealt with, e.g. 
something like this

with open("mode3output.txt") as input_file:
    for line in input_file:
        if line.startswith("#"):
            # Header/comment line, ignore
            continue
        # Data line: split the tab-separated fields
        parts = line.rstrip("\n").split("\t")
        print("Query id = %s" % parts[0])
        ...

That code was based on what the online tool will give as its "plain 
text" output.  You could probably write your own code to request a 
megablast search in this format, or try to get the existing BioPython 
online blast code to do it for you.
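
If you want more than the query id, the tab separated lines can be 
turned into named fields.  The following is only a sketch: it assumes 
the usual twelve column layout of tabular BLAST output (query id, 
subject id, % identity, alignment length, mismatches, gap openings, 
query start/end, subject start/end, e-value, bit score), and the names 
TabularHit and parse_tabular_line are made up for this example.  Check 
the "# Fields:" header line of your own output before relying on it.

```python
from collections import namedtuple

# Assumed column order of twelve-field tabular BLAST output.
TabularHit = namedtuple("TabularHit", [
    "query_id", "subject_id", "percent_identity", "alignment_length",
    "mismatches", "gap_openings", "query_start", "query_end",
    "subject_start", "subject_end", "evalue", "bit_score",
])


def parse_tabular_line(line):
    """Return a TabularHit, or None for comment and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split("\t")
    if len(parts) != 12:
        raise ValueError("expected 12 tab-separated fields, got %d" % len(parts))
    return TabularHit(
        parts[0], parts[1], float(parts[2]),
        int(parts[3]), int(parts[4]), int(parts[5]),
        int(parts[6]), int(parts[7]), int(parts[8]), int(parts[9]),
        float(parts[10]), float(parts[11]),
    )
```

Raising an error on an unexpected number of columns is deliberate, as 
it should flag a format difference between blast versions early rather 
than silently mislabelling fields.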

Also, it looks like the online version will produce XML, which at first 
glance looks like the same sort of output produced by normal blast.  So 
again, BioPython should be able to parse that.
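
BioPython's own XML parser (Bio.Blast.NCBIXML) would be the natural 
thing to try first.  Purely to show what is in the file, here is a 
sketch using only the Python standard library.  It assumes the element 
names from the NCBI BLAST XML format (Hit, Hit_id, Hit_def, Hsp, 
Hsp_evalue, Hsp_bit-score); the function name iter_hsps is made up for 
this example, and I have not checked it against real megablast output.

```python
import xml.etree.ElementTree as ET


def iter_hsps(source):
    """Yield (hit_id, hit_def, evalue, bit_score) for each HSP.

    source is a filename or an open file object holding BLAST XML.
    Element names are assumed from the NCBI BLAST XML format.
    """
    root = ET.parse(source).getroot()
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        hit_def = hit.findtext("Hit_def")
        for hsp in hit.iter("Hsp"):
            evalue = float(hsp.findtext("Hsp_evalue"))
            bit_score = float(hsp.findtext("Hsp_bit-score"))
            yield hit_id, hit_def, evalue, bit_score
```

This reads the whole file into memory, which should be fine for small 
result sets but may not suit very large searches.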

Note that I personally use standalone blast, and don't have much 
experience using the online version via BioPython.

Peter

