[Biopython] send SeqIO.parse to NcbiblastnCommandline

Matthew MacManes macmanes at gmail.com
Wed Nov 2 16:21:44 UTC 2011


Hi All,

I am trying to take a large FASTA file and send its sequences one by one
to NcbiblastnCommandline, writing each result to its own file named after
the query ID. So far I have:

MUSDATABASE='/media/hd/blastdb/mouse.rna'

from Bio import SeqIO
from Bio.Blast.Applications import NcbiblastnCommandline
for seq_record in SeqIO.parse("test1.fa", "fasta"):
    cl = NcbiblastnCommandline(cmd="/home/matthew/ncbi-blast/bin/blastn",
                               query=seq_record.seq,
                               db=MUSDATABASE, evalue=0.0000000001,
                               outfmt="'10 qseqid qseq sseqid sseq bitscore'",
                               out=seq_record.id,
                               max_target_seqs=1,
                               num_threads=15)
    print(cl)
    stdout, stderr = cl()


This seems like a promising approach, but the issue is that the query
argument expects a file name, not the sequence itself.  According to the
BLAST+ manual, blastn can read a sequence from standard input via query="-",
but I cannot get this to work: blastn does not pick up the sequence.
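
For reference, the standard-input route described above can be sketched with
the standard library alone. This is only a sketch under assumptions, not a
tested solution: it assumes blastn is on the PATH (or that its path is passed
in), and the helper names fasta_text, blastn_args and run_blastn are
hypothetical, not part of Biopython or BLAST+. Passing the arguments as a
list avoids the shell, so the outfmt string needs no inner quotes.

```python
import subprocess


def fasta_text(seq_id, sequence, width=60):
    """Format one record as FASTA text, the form blastn reads from stdin."""
    lines = [">" + seq_id]
    for i in range(0, len(sequence), width):
        lines.append(sequence[i:i + width])
    return "\n".join(lines) + "\n"


def blastn_args(db, out_path, blastn="blastn"):
    """Build a blastn argument list (hypothetical helper) using '-query -'."""
    return [
        blastn,
        "-query", "-",  # "-" asks blastn to read the query from standard input
        "-db", db,
        "-evalue", "1e-10",
        "-outfmt", "10 qseqid qseq sseqid sseq bitscore",
        "-max_target_seqs", "1",
        "-num_threads", "15",
        "-out", out_path,
    ]


def run_blastn(seq_id, sequence, db, out_path, blastn="blastn"):
    """Pipe one sequence into blastn; return its exit code and stderr text."""
    proc = subprocess.run(
        blastn_args(db, out_path, blastn),
        input=fasta_text(seq_id, sequence),  # sent to the process's stdin
        universal_newlines=True,             # treat stdin/stderr as text
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return proc.returncode, proc.stderr
```

Inside the SeqIO.parse loop this would be called as
run_blastn(seq_record.id, str(seq_record.seq), MUSDATABASE, seq_record.id).
The Biopython wrapper may also accept the FASTA text directly when called,
along the lines of cl(stdin=seq_record.format("fasta")) with query="-", but
that should be checked against the installed Biopython version.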


Any pointers greatly appreciated.
Matt


