[Biopython] Biopython local blastn query

Ara Kooser ghashsnaga at gmail.com
Tue Jul 30 17:44:21 UTC 2013


Here is what I did with everyone's suggestions that got things working:

    result = NcbiblastnCommandline(task="megablast", query="-", db="nt",
                                   outfmt=5, perc_identity=100,
                                   out="temp.xml", max_target_seqs=1)
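
For reference, NcbiblastnCommandline only builds a command line; nothing
runs until the resulting object is called. Below is a rough sketch of the
equivalent blastn invocation using only the Python standard library. The
helper names build_blastn_args and run_blastn and the optional threads
parameter are made up for illustration. It assumes blastn is on the PATH
and can find the "nt" alias the same way the call above does. Whether
-num_threads helps with the run time is untested here.

```python
import subprocess


def build_blastn_args(db="nt", out="temp.xml", threads=None):
    """Return a blastn argument list mirroring the call above (hypothetical helper)."""
    args = [
        "blastn",
        "-task", "megablast",
        "-query", "-",             # read the query FASTA from stdin
        "-db", db,                 # alias name, not the path to nt.nal
        "-outfmt", "5",            # XML output
        "-perc_identity", "100",
        "-max_target_seqs", "1",
        "-out", out,
    ]
    if threads is not None:
        # Assumption: more threads may shorten the search; not measured here.
        args += ["-num_threads", str(threads)]
    return args


def run_blastn(fasta_text, **kwargs):
    """Run blastn once, passing every query record in fasta_text via stdin."""
    return subprocess.run(build_blastn_args(**kwargs), input=fasta_text,
                          text=True, capture_output=True, check=True)
```

Calling run_blastn with a string that holds all six FASTA records would
start blastn once rather than once per sequence.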


The big thing I am noticing is that this is incredibly slow. Currently I am
blasting 4 databases with 6 query sequences.

Is there a way to speed this up?

I started a run at 11:38 and the first returned hit came across at 11:41. It
looks like it's about 2-3 minutes per sequence.

ara


On Tue, Jul 30, 2013 at 11:08 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Tue, Jul 30, 2013 at 6:02 PM, Ara Kooser <ghashsnaga at gmail.com> wrote:
> > This will sound like a silly question. I found the nt.nal file that lists
> > all the databases. How do I call the alias from Biopython?
> >
> > I thought it would be something like this:
> >
> > nt = "/Users/arakooser/blast/db/nt.nal"
> >
> > result = NcbiblastnCommandline(task="megablast", query="-", db=nt,
> >                                outfmt=5, perc_identity=100,
> >                                out="temp.xml", max_hsps_per_subject=1,
> >                                num_alignments=1)
> >
> > But that throws an error letting me know that nothing was returned.
> >
> > ara
>
> Just as a string in quotes, "nt",
>
> NcbiblastnCommandline(task="megablast", query="-", db="nt", ...)
>
> Peter
>



-- 
Quis hic locus, quae regio, quae mundi plaga. Ubi sum. Sub ortu solis an
sub cardine glacialis ursae.

Geoscience website: http://www.tattooedscience.org/


