[BioPython] NCBIWWW.qblast with refseq by organism

Michiel Jan Laurens de Hoon mdehoon at c2b2.columbia.edu
Fri May 26 22:33:15 UTC 2006


Which gi_number do you use when you call BlastGI?

--Michiel


Denil Wickrama wrote:
> Hi,
>     I would like to BLAST a list of proteins against the refseq database and retrieve the corresponding accession numbers of the exact hits.  I get errors when I change from the nr database to the refseq database.  Also I am trying to restrict the results by organism name, but that was not successful.
>     result_handle = NCBIWWW.qblast("blastp", "nr", seq, entrez_query='"rattus norvegicus" [Organism]') 
>     result_handle = NCBIWWW.qblast("blastp", "refseq", seq, entrez_query='"rattus norvegicus" [Organism]')
> Is it possible to do refseq searches with NCBIWWW.qblast?
> What do you think I am doing wrong, any help would be appreciated.
> The code attached below.
>  
> Thanks,
> Denil Wickrama
> Bioinformatics research assistant
> Biochemistry and Biomedical Sciences
> McMaster University
>  
>  
> from Bio import Fasta
> from Bio import GenBank
> from Bio.Blast import NCBIWWW
> from Bio.Blast import NCBIStandalone
> import csv
> from Bio.SeqRecord import SeqRecord
> from sys import *
> 
> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', parser =  Fasta.RecordParser())                                   
> 
> #gi_number is sent to this function from another function
> def BlastGI(gi_number):
>     print gi_number
>     seq = ncbi_dict[gi_number]
>     # I would like to change "nr" to "refseq" but I get an error when I do so
>     # I am trying to us entrez_query='"rattus norvegicus" [Organism]' to restrict the result to those from rat
>     save_file = open('my_blast.out', 'w')
>     blast_results = result_handle.read()
> 
>     save_file.write("")
>     save_file.write(blast_results)
>     save_file.close()
>     blast_out = open('my_blast.out', 'r')
>     from Bio.Blast import NCBIXML
>     b_parser = NCBIXML.BlastParser()
>     b_record = b_parser.parse(blast_out)
>     
>     #http://biopython.org/docs/tutorial/Tutorial004.html#toc10
>     E_VALUE_THRESH = 0.001
>     hspHit = False
>     exactHit = False
>     print 'GI:', gi_number,"";
>     print 'alignments', len(b_record.alignments),"";
>     for alignment in b_record.alignments:
>         for hsp in alignment.hsps:
>             if hsp.expect < E_VALUE_THRESH:
>                 hspHit = True
>                 if hsp.sbjct == seq.sequence:
>                     #print 'sequence:', hsp.sbjct
>                     print 'title:', hits.name
>                     exactHit = True
>     if exactHit == True:
>         print "Exact HIT: True, 1, 1"
>     else:
>         print "Exact HIT: False, 0, 1"
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython


-- 
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1130 St Nicholas Avenue
New York, NY 10032



More information about the Biopython mailing list