[Biopython] Biopython local blastn query

Ara Kooser ghashsnaga at gmail.com
Tue Jul 30 16:10:20 UTC 2013


Peter,

  Thanks for catching that! I missed that one. I also needed to upgrade to
biopython 1.62b which I did. I still get one short sequence coming through.


*General question*
Hopefully one last question from me on this project. Can I query multiple
blast databased in a single command? I have all the nt.xx downloaded and
need to query each one to look for all my sequences.

Thanks!
ara


Here is the current code. Once I get this cleaned up I will push it over to
a github repo in case anyone wants it.

########################################################################################
#I want to read in one sequence at a time from a fasta file and blast it
against a local
#blast db.

from Bio.Blast.Applications import NcbiblastnCommandline
from Bio.Blast import NCBIXML
from Bio import SeqIO
from Bio import Seq
from Bio.SeqRecord import SeqRecord

nt = "/Users/arakooser/blast/db/nt.00"
#Where the database is located at
file_out = open("metadata_genus.level.csv","w+")

#Contains all the data my boss wants on the sequences
file_in = open("clear_genus_level.fasta")

#The main fasta file that needs to be blasted

fas_rec = SeqIO.parse(file_in,"fasta")
#Parses the main fasta file

for first_seq in fas_rec:
#Hopefully grabs the first sequence
#Takes that sequence from standard in and sumbits it to the blast
commandline and spits
#out an xml

    result = NcbiblastnCommandline(task="megablast",query="-", db=nt,
evalue=0.001,
                                   outfmt=5,
perc_identity=100,out="temp.xml",
                                   max_hsps_per_subject=1, num_alignments=1)
    stdout, stderr = result(stdin=first_seq.format("fasta"))
    #    print result

#Reading in the xml file.
#

    record = open("temp.xml")
    blast_record = NCBIXML.read(record)
    record.close()
    #print blast_record

    for alignment in blast_record.alignments:

        for hsp in alignment.hsps:

                title_element = alignment.title.split()

                print  title_element[1]+" "+title_element[2]+","+"
"+alignment.accession\
                  +","+" "+str(alignment.length)

                file_out.write(title_element[1]+" "+title_element[2]+","+"
"\
                               +alignment.accession+","+"
"+str(alignment.length)+","+\
                               " "+hsp.sbjct+"\n")


On Tue, Jul 30, 2013 at 9:36 AM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> On Tue, Jul 30, 2013 at 4:32 PM, Ara Kooser <ghashsnaga at gmail.com> wrote:
> > Ivan,
> >
> >  Thanks! I found the blastn documentation!! This looks like what I want.
> >
> > I am running blast 2.2.26. I am getting an error with those parameters.
> >
> > I entered the parameters as:
> > max_hsps_per_subject=1, num_alignments=1 in the NcbiblastnCommandline
> line
> >
> >
> > Error:
> > Aras-MacBook-Air:CEM Genus arakooser$ python meta_data_local.py
> >   File "meta_data_local.py", line 30
> >     -out="temp.xml", max_hsps_per_subject=1, num_alignments=1)
> > SyntaxError: keyword can't be an expression
> >
> > I think this means I am not using the correct keyword.
> >
> > ara
>
> Python function argument names can't have minus signs in them,
> check the -out bit which should probably just be out.
>
> Peter
>



-- 
Quis hic locus, quae regio, quae mundi plaga. Ubi sum. Sub ortu solis an
sub cardine glacialis ursae.

Geoscience website: http://www.tattooedscience.org/



More information about the Biopython mailing list