[BioPython] Running local blast with aa

Carlo Bifulco carlo_bif@yahoo.com
Fri, 25 Oct 2002 20:59:06 +0200


Hi Sebastian,

I think the problem you are having comes from the fact that when your 
code does:
blast_out, error_info = NCBIStandalone.blastall(blastexe, 'blastx', 
pathdb, pathin+x)
what you actually get as blast_out is a file handle and not a string.

That's probably why
salida.writelines(blast_out)
doesn't go anywhere.

A simple fix could be replacing it with:
salida.write(blast_out.read()) # not tested !
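
To make the handle-versus-string point concrete, here is a minimal sketch that does not call BLAST at all. It assumes only that blast_out behaves like an ordinary file-like object. io.StringIO stands in for the real handle, its content is made up for illustration, and save_handle is a hypothetical helper name, not part of Biopython.

```python
import io
import os
import tempfile


def save_handle(handle, out_path):
    """Read everything from a file-like object and write it to out_path."""
    text = handle.read()  # the handle is not a string; read() returns one
    with open(out_path, "w") as out:
        out.write(text)
    return len(text)


# Stand-in for the handle returned by the BLAST wrapper (illustrative content).
fake_blast_out = io.StringIO("BLASTX 2.2.4 [Aug-26-2002]\nQuery= test de prot\n")

out_path = os.path.join(tempfile.mkdtemp(), "outblast_example.txt")
n_chars = save_handle(fake_blast_out, out_path)
print(n_chars, "characters written to", out_path)
```

Note that a handle can only be read once: after read() has consumed it, a second read() returns an empty string, so keep the text in a variable if you need it again.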

Regards,
Carlo Bifulco, MD






Sebastian Bassi wrote:
> Hello,
> 
> I have a problem that I think is a Python problem, since it works when I 
> run it manually with the blast executable.
> I'm running local BLAST (the standalone version under WinNT).
> I ran the test from the blast readme file and it went OK. (The test was to 
> blastn an ecoli sequence.)
> Then I tried to blastx an ecoli nucleotide sequence against the ecoli.aa 
> database (I formatted ecoli.aa with formatdb, of course). The 
> nucleotide sequence I used to test was:
>  >test de prot
> TATGAGCATACTTTGATGGCTTTGGAGGCTGGTTGTCATGTTATGTGTGAGAAGCCTCCTGCTATGACTCCTGAGCAGGC
> TCGTGAGATGTGTGATACTGCTCGTAAGTTGGGTAAGGTTTTGGCTTATGATTTTCATCATCGTTTTGCTTTGGATACTC
> AGCAGTTGCGTGAGCAGGTTACTAATGGTGTTTTGGGTGAGATTTATGTTACTACTGCTCGTGCT
> This is a back-translation of part of an actual ecoli protein 
> (AAC74397). (It was back-translated with Python using the bacterial 
> translation table, 11.)
> When I run this blast on the NCBI site, I get the original protein as the 
> first hit (AAC74397). When I do it locally from the command line, BLAST 
> also works fine.
> But the problem is that when I do it locally using Python (I mean, the 
> blastx against the ecoli.aa database) I get 0 hits. I'm using all 
> standard/default values.
> Here's the output I get:
> 
> BLASTX 2.2.4 [Aug-26-2002]
> 
> 
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
> programs",  Nucleic Acids Res. 25:3389-3402.
> 
> Query= test de prot
>          (225 letters)
> 
> Database: //server/bioinfo/blast/data/ecoli.aa
>            4289 sequences; 1,358,990 total letters
> 
> 
> 
>  ***** No hits found ******
> 
>   Database: //server/bioinfo/blast/data/ecoli.aa
>     Posted date:  Oct 25, 2002  8:30 AM
>   Number of letters in database: 1,358,990
>   Number of sequences in database:  4289
> 
> If you want to see my input file:
> 
>  >test de prot
> TATGAGCATACTTTGATGGCTTTGGAGGCTGGTTGTCATGTTATGTGTGAGAAGCCTCCTGCTATGACTC
> CTGAGCAGGCTCGTGAGATGTGTGATACTGCTCGTAAGTTGGGTAAGGTTTTGGCTTATGATTTTCATCA
> TCGTTTTGCTTTGGATACTCAGCAGTTGCGTGAGCAGGTTACTAATGGTGTTTTGGGTGAGATTTATGTT
> ACTACTGCTCGTGCT
> 
> And here is my little Python program (BTW, it works fine for blastn, so 
> I assume the program logic is OK).
> 
> 
> from Bio.Blast import NCBIStandalone
> import os
> import string
> import re
> 
> pathdb="//server/bioinfo/blast/data/ecoli.aa"
> blastexe="//server/bioinfo/blast/blastall.exe"
> pathin="D:\\projects\\bioinfo-adv\\set-cd-small\\"
> filesentr="D:\\projects\\bioinfo-adv\\set-cd-small"
> pathout="blastbatch\\outblast"
> 
> print "OK 1"
> cont = 0
> 
> # put all the files from set-cd-complete into a list
> 
> listaentrada=os.listdir(filesentr)
> for x in listaentrada:
>    cont = cont + 1
>    blast_out, error_info = NCBIStandalone.blastall(blastexe, 'blastx', 
> pathdb, pathin+x)
>    salida=open(pathout+x+".txt","w")
>    salida.writelines(blast_out)
>    salida.close()
>    print "Blast numero "+`cont`+" "+x
> print "FIN"
> 
> print "OK 2 re ok"
> 
> 