[Biopython] send SeqIO.parse to NcbiblastnCommandline
Matthew MacManes
macmanes at gmail.com
Wed Nov 2 16:21:44 UTC 2011
Hi All,
I am trying to take a large FASTA file and send its sequences one by one
to NcbiblastnCommandline, writing each result to its own file named after
the query ID. So far I have:
MUSDATABASE='/media/hd/blastdb/mouse.rna'
from Bio import SeqIO
from Bio.Blast.Applications import NcbiblastnCommandline
for seq_record in SeqIO.parse("test1.fa", "fasta"):
    cl = NcbiblastnCommandline(cmd="/home/matthew/ncbi-blast/bin/blastn",
                               query=seq_record.seq,
                               db=MUSDATABASE, evalue=0.0000000001,
                               outfmt="'10 qseqid qseq sseqid sseq bitscore'",
                               out=seq_record.id,
                               max_target_seqs=1,
                               num_threads=15)
    print cl
    stdout, stderr = cl()
This seems like a promising approach, but the problem is that the query
argument expects a file name, not the sequence itself. According to the
BLAST+ manual, blastn can read a sequence from standard input via query="-",
but I cannot get this to work: blastn does not pick up the sequence.
Any pointers greatly appreciated.
Matt
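
On the query="-" route mentioned above: a likely reason nothing reaches
blastn is that the sequence is never written to the process's standard
input. Below is a minimal sketch of one way to do that, assuming Python 3.7+
and the standard subprocess module rather than the Biopython wrapper. The
helper names (fasta_text, blastn_argv, run_with_stdin) are hypothetical and
exist only for this illustration; the blastn options are the ones from the
post. Because the arguments are passed as a list and no shell is involved,
the outfmt string needs no inner quotes.

```python
import subprocess
import sys


def fasta_text(record_id, sequence):
    """Format one sequence as FASTA text (hypothetical helper)."""
    return ">%s\n%s\n" % (record_id, sequence)


def blastn_argv(db, out_path, blastn="blastn"):
    """Build a blastn argument list that reads the query from stdin.

    The options mirror those in the post; "-query -" asks blastn to
    read the query from standard input.
    """
    return [
        blastn,
        "-query", "-",
        "-db", db,
        "-evalue", "1e-10",
        "-outfmt", "10 qseqid qseq sseqid sseq bitscore",
        "-out", out_path,
        "-max_target_seqs", "1",
        "-num_threads", "15",
    ]


def run_with_stdin(argv, text):
    """Run a command, writing `text` to its stdin; return (stdout, stderr)."""
    proc = subprocess.run(argv, input=text, capture_output=True, text=True)
    if proc.returncode != 0:
        # Surface the tool's own error message before raising.
        sys.stderr.write(proc.stderr)
        raise subprocess.CalledProcessError(proc.returncode, argv)
    return proc.stdout, proc.stderr
```

Inside the SeqIO.parse loop this could be called as
run_with_stdin(blastn_argv(MUSDATABASE, seq_record.id), seq_record.format("fasta")).
In Biopython releases that still ship Bio.Blast.Applications, the wrapper's
call also takes a stdin argument, so keeping query="-" and calling
cl(stdin=seq_record.format("fasta")) should be the equivalent; treat that as
something to verify against the installed version.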