[BioPython] NCBIWWW.qblast with refseq by organism

Denil Wickrama denilw at yahoo.com
Fri May 26 16:57:11 UTC 2006


Hi,
    I would like to BLAST a list of proteins against the refseq database and retrieve the corresponding accession numbers of the exact hits.  I get errors when I change from the nr database to the refseq database.  Also I am trying to restrict the results by organism name, but that was not successful.
    result_handle = NCBIWWW.qblast("blastp", "nr", seq, entrez_query='"rattus norvegicus" [Organism]') 
    result_handle = NCBIWWW.qblast("blastp", "refseq", seq, entrez_query='"rattus norvegicus" [Organism]')
Is it possible to do refseq searches with NCBIWWW.qblast?
What do you think I am doing wrong, any help would be appreciated.
The code attached below.
 
Thanks,
Denil Wickrama
Bioinformatics research assistant
Biochemistry and Biomedical Sciences
McMaster University
 
 
from Bio import Fasta
from Bio import GenBank
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIStandalone
import csv
from Bio.SeqRecord import SeqRecord
from sys import *

ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', parser =  Fasta.RecordParser())                                   

#gi_number is sent to this function from another function
def BlastGI(gi_number):
    print gi_number
    seq = ncbi_dict[gi_number]
    # I would like to change "nr" to "refseq" but I get an error when I do so
    # I am trying to us entrez_query='"rattus norvegicus" [Organism]' to restrict the result to those from rat
    save_file = open('my_blast.out', 'w')
    blast_results = result_handle.read()

    save_file.write("")
    save_file.write(blast_results)
    save_file.close()
    blast_out = open('my_blast.out', 'r')
    from Bio.Blast import NCBIXML
    b_parser = NCBIXML.BlastParser()
    b_record = b_parser.parse(blast_out)
    
    #http://biopython.org/docs/tutorial/Tutorial004.html#toc10
    E_VALUE_THRESH = 0.001
    hspHit = False
    exactHit = False
    print 'GI:', gi_number,"";
    print 'alignments', len(b_record.alignments),"";
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < E_VALUE_THRESH:
                hspHit = True
                if hsp.sbjct == seq.sequence:
                    #print 'sequence:', hsp.sbjct
                    print 'title:', hits.name
                    exactHit = True
    if exactHit == True:
        print "Exact HIT: True, 1, 1"
    else:
        print "Exact HIT: False, 0, 1"



More information about the Biopython mailing list