[BioPython] NCBIWWW.qblast with refseq by organism
Michiel Jan Laurens de Hoon
mdehoon at c2b2.columbia.edu
Fri May 26 22:33:15 UTC 2006
Which gi_number do you use when you call BlastGI?
--Michiel
Denil Wickrama wrote:
> Hi,
> I would like to BLAST a list of proteins against the refseq database and retrieve the corresponding accession numbers of the exact hits. I get errors when I change from the nr database to the refseq database. Also I am trying to restrict the results by organism name, but that was not successful.
> result_handle = NCBIWWW.qblast("blastp", "nr", seq, entrez_query='"rattus norvegicus" [Organism]')
> result_handle = NCBIWWW.qblast("blastp", "refseq", seq, entrez_query='"rattus norvegicus" [Organism]')
> Is it possible to do refseq searches with NCBIWWW.qblast?
> What do you think I am doing wrong, any help would be appreciated.
> The code attached below.
>
> Thanks,
> Denil Wickrama
> Bioinformatics research assistant
> Biochemistry and Biomedical Sciences
> McMaster University
>
>
> from Bio import Fasta
> from Bio import GenBank
> from Bio.Blast import NCBIWWW
> from Bio.Blast import NCBIStandalone
> import csv
> from Bio.SeqRecord import SeqRecord
> from sys import *
>
> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', parser = Fasta.RecordParser())
>
> #gi_number is sent to this function from another function
> def BlastGI(gi_number):
> print gi_number
> seq = ncbi_dict[gi_number]
> # I would like to change "nr" to "refseq" but I get an error when I do so
> # I am trying to us entrez_query='"rattus norvegicus" [Organism]' to restrict the result to those from rat
> save_file = open('my_blast.out', 'w')
> blast_results = result_handle.read()
>
> save_file.write("")
> save_file.write(blast_results)
> save_file.close()
> blast_out = open('my_blast.out', 'r')
> from Bio.Blast import NCBIXML
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
>
> #http://biopython.org/docs/tutorial/Tutorial004.html#toc10
> E_VALUE_THRESH = 0.001
> hspHit = False
> exactHit = False
> print 'GI:', gi_number,"";
> print 'alignments', len(b_record.alignments),"";
> for alignment in b_record.alignments:
> for hsp in alignment.hsps:
> if hsp.expect < E_VALUE_THRESH:
> hspHit = True
> if hsp.sbjct == seq.sequence:
> #print 'sequence:', hsp.sbjct
> print 'title:', hits.name
> exactHit = True
> if exactHit == True:
> print "Exact HIT: True, 1, 1"
> else:
> print "Exact HIT: False, 0, 1"
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
--
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1130 St Nicholas Avenue
New York, NY 10032
More information about the Biopython
mailing list