[BioPython] NCBIWWW.qblast with refseq by organism
Denil Wickrama
denilw at yahoo.com
Fri May 26 16:57:11 UTC 2006
Hi,
I would like to BLAST a list of proteins against the refseq database and retrieve the corresponding accession numbers of the exact hits. I get errors when I change from the nr database to the refseq database. Also I am trying to restrict the results by organism name, but that was not successful.
result_handle = NCBIWWW.qblast("blastp", "nr", seq, entrez_query='"rattus norvegicus" [Organism]')
result_handle = NCBIWWW.qblast("blastp", "refseq", seq, entrez_query='"rattus norvegicus" [Organism]')
Is it possible to do refseq searches with NCBIWWW.qblast?
What do you think I am doing wrong, any help would be appreciated.
The code attached below.
Thanks,
Denil Wickrama
Bioinformatics research assistant
Biochemistry and Biomedical Sciences
McMaster University
from Bio import Fasta
from Bio import GenBank
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIStandalone
import csv
from Bio.SeqRecord import SeqRecord
from sys import *
ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', parser = Fasta.RecordParser())
#gi_number is sent to this function from another function
def BlastGI(gi_number):
print gi_number
seq = ncbi_dict[gi_number]
# I would like to change "nr" to "refseq" but I get an error when I do so
# I am trying to us entrez_query='"rattus norvegicus" [Organism]' to restrict the result to those from rat
save_file = open('my_blast.out', 'w')
blast_results = result_handle.read()
save_file.write("")
save_file.write(blast_results)
save_file.close()
blast_out = open('my_blast.out', 'r')
from Bio.Blast import NCBIXML
b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(blast_out)
#http://biopython.org/docs/tutorial/Tutorial004.html#toc10
E_VALUE_THRESH = 0.001
hspHit = False
exactHit = False
print 'GI:', gi_number,"";
print 'alignments', len(b_record.alignments),"";
for alignment in b_record.alignments:
for hsp in alignment.hsps:
if hsp.expect < E_VALUE_THRESH:
hspHit = True
if hsp.sbjct == seq.sequence:
#print 'sequence:', hsp.sbjct
print 'title:', hits.name
exactHit = True
if exactHit == True:
print "Exact HIT: True, 1, 1"
else:
print "Exact HIT: False, 0, 1"
More information about the Biopython
mailing list