[BioPython] Running local blast with aa
Sebastian Bassi
sbassi@asalup.org
Fri, 25 Oct 2002 12:56:08 -0300
Hello,
I have a problem that I think is a Python problem, since it works when I
run it manually with the blast executable.
I'm running local BLAST (the standalone version under WinNT).
I ran the test from the blast readme file and it went OK (the test was to
blastn an ecoli sequence).
Then I tried to blastX an ecoli nucleotide sequence against the ecoli.aa
database (I did format ecoli.aa with formatdb, of course). The
nucleotide sequence I used to test was:
>test de prot
TATGAGCATACTTTGATGGCTTTGGAGGCTGGTTGTCATGTTATGTGTGAGAAGCCTCCTGCTATGACTCCTGAGCAGGC
TCGTGAGATGTGTGATACTGCTCGTAAGTTGGGTAAGGTTTTGGCTTATGATTTTCATCATCGTTTTGCTTTGGATACTC
AGCAGTTGCGTGAGCAGGTTACTAATGGTGTTTTGGGTGAGATTTATGTTACTACTGCTCGTGCT
This is a backtranslation of part of an actual ecoli protein
(AAC74397). (It was backtranslated with Python using the bacterial table,
11.)
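As a sanity check on the query itself, the backtranslated sequence can be translated again in frame 1 and compared by eye with the AAC74397 record. The following is only a minimal sketch: the names CODON_TABLE and translate_frame1 are hypothetical helpers, not part of Biopython, and it assumes that table 11 assigns the same amino acids to codons as the standard table (differing only in the permitted start codons).

```python
# Codon-to-amino-acid mapping in NCBI order (bases T, C, A, G).
# Assumption: table 11 shares these assignments with the standard table.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

# Build the 64-entry lookup table codon -> one-letter amino acid.
CODON_TABLE = {}
index = 0
for b1 in BASES:
    for b2 in BASES:
        for b3 in BASES:
            CODON_TABLE[b1 + b2 + b3] = AMINO_ACIDS[index]
            index += 1


def translate_frame1(dna):
    """Translate a DNA string in reading frame 1; unknown codons become X."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[pos:pos + 3], "X")
                   for pos in range(0, usable, 3))


# The query sequence from this message, joined into one string.
QUERY = ("TATGAGCATACTTTGATGGCTTTGGAGGCTGGTTGTCATGTTATGTGTGAGAAGCCTCCTGCTATGACTCCTGAGCAGGC"
         "TCGTGAGATGTGTGATACTGCTCGTAAGTTGGGTAAGGTTTTGGCTTATGATTTTCATCATCGTTTTGCTTTGGATACTC"
         "AGCAGTTGCGTGAGCAGGTTACTAATGGTGTTTTGGGTGAGATTTATGTTACTACTGCTCGTGCT")

protein = translate_frame1(QUERY)
print(protein)
```

The printed string is what blastx should see in the first forward frame, so it can be checked against the protein record directly.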
When I run this blast on the NCBI site, I get the original protein as a
first hit (AAC74397). When I do it locally from the command line BLAST
it also works fine.
But the problem is that when I do it locally using Python (I mean, the
blastX against the ecoli.aa database) I get 0 hits. I'm using all
standard/default values.
Here's the output I get:
BLASTX 2.2.4 [Aug-26-2002]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= test de prot
(225 letters)
Database: //server/bioinfo/blast/data/ecoli.aa
4289 sequences; 1,358,990 total letters
***** No hits found ******
Database: //server/bioinfo/blast/data/ecoli.aa
Posted date: Oct 25, 2002 8:30 AM
Number of letters in database: 1,358,990
Number of sequences in database: 4289
If you want to see my input file:
>test de prot
TATGAGCATACTTTGATGGCTTTGGAGGCTGGTTGTCATGTTATGTGTGAGAAGCCTCCTGCTATGACTC
CTGAGCAGGCTCGTGAGATGTGTGATACTGCTCGTAAGTTGGGTAAGGTTTTGGCTTATGATTTTCATCA
TCGTTTTGCTTTGGATACTCAGCAGTTGCGTGAGCAGGTTACTAATGGTGTTTTGGGTGAGATTTATGTT
ACTACTGCTCGTGCT
And here is my little Python program (BTW, it works fine for blastn, so
I assume the program logic is OK).
from Bio.Blast import NCBIStandalone
import os
import string
import re
pathdb="//server/bioinfo/blast/data/ecoli.aa"
blastexe="//server/bioinfo/blast/blastall.exe"
pathin="D:\\projects\\bioinfo-adv\\set-cd-small\\"
filesentr="D:\\projects\\bioinfo-adv\\set-cd-small"
pathout="blastbatch\\outblast"
print "OK 1"
cont = 0
# put all the files from set-cd-complete into a list
listaentrada=os.listdir(filesentr)
for x in listaentrada:
    cont = cont + 1
    # run blastx on each input file against the ecoli.aa database
    blast_out, error_info = NCBIStandalone.blastall(blastexe, 'blastx',
                                                    pathdb, pathin+x)
    # write the raw BLAST report for this file
    salida=open(pathout+x+".txt","w")
    salida.writelines(blast_out)
    salida.close()
    print "Blast numero "+`cont`+" "+x
print "FIN"
print "OK 2 re ok"
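Since the same search works from the command line, one way to narrow this down might be to launch blastall from Python with exactly the command-line arguments used by hand, and to look at what it writes to stderr. The sketch below is only an illustration under stated assumptions: it uses the standard subprocess module (so it needs a Python version that ships it) instead of NCBIStandalone, the helper names build_blastall_cmd and run_blastall are hypothetical, and test.fasta is a placeholder input file name. The -p, -d, -i and -o options are the usual blastall options for program, database, input and output.

```python
import subprocess


def build_blastall_cmd(exe, program, database, infile, outfile=None):
    """Return the blastall argument list: -p program, -d db, -i input."""
    cmd = [exe, "-p", program, "-d", database, "-i", infile]
    if outfile is not None:
        cmd += ["-o", outfile]  # optional report file
    return cmd


def run_blastall(cmd):
    """Run the command and return (exit code, stdout bytes, stderr bytes)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err


# Paths taken from the script above; "test.fasta" is a hypothetical name.
cmd = build_blastall_cmd("//server/bioinfo/blast/blastall.exe", "blastx",
                         "//server/bioinfo/blast/data/ecoli.aa", "test.fasta")
print(" ".join(cmd))

# To actually run it (requires blastall and the database to be reachable):
# code, out, err = run_blastall(cmd)
# print(err)
```

Printing the command first makes it easy to compare, argument by argument, with the one typed by hand; anything blastall reports on stderr (the error_info handle in the script above is never read) may explain a difference between the two runs.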