[BioPython] pblast in NCBI's website differs from biopython's pblast
Raul Guerra
colochera at gmail.com
Thu May 29 12:47:55 UTC 2008
Hi everyone,
I was wondering if someone has had the same problem. I am running the
following code in BioPython.
from Bio.Blast import NCBIWWW

result_handle = NCBIWWW.qblast("blastp", "nr", fastaStr,
                               entrez_query='"Chlamydomonas" [ORGN]',
                               ncbi_gi=True, matrix_name='BLOSUM62',
                               hitlist_size=50)
where fastaStr is the FASTA string for NP_012855. (When I mention
"Biopython's pBlast" below, I am referring to the code above.)
The results that I get back differ from the results of the pblast option
at http://www.ncbi.nlm.nih.gov/blast/Blast.cgi . To reproduce this, follow
the link, click on pblast, and run a search specifying only the organism
and the sequence accession number.
NCBI's website returns 2 sequences, which are what I was looking for.
Biopython, on the other hand, returns as many hits as I allow with the hit
limit, and only one of them matches a hit from NCBI's pBlast. I know that
the qblast function in NCBIWWW has many parameters:
def qblast(program, database, sequence,
           auto_format=None, composition_based_statistics=None,
           db_genetic_code=None, endpoints=None, entrez_query='(none)',
           expect=10.0, filter=None, gapcosts=None, genetic_code=None,
           hitlist_size=50, i_thresh=None, layout=None, lcase_mask=None,
           matrix_name=None, nucl_penalty=None, nucl_reward=None,
           other_advanced=None, perc_ident=None, phi_pattern=None,
           query_file=None, query_believe_defline=None, query_from=None,
           query_to=None, searchsp_eff=None, service=None, threshold=None,
           ungapped_alignment=None, word_size=None,
           alignments=500, alignment_view=None, descriptions=500,
           entrez_links_new_window=None, expect_low=None, expect_high=None,
           format_entrez_query=None, format_object=None, format_type='XML',
           ncbi_gi=None, results_file=None, show_overview=None
           ):
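To pin down exactly which hits differ between the two runs described above, the XML that qblast returns (format_type='XML' is the default in the signature) can be compared against the accessions copied from the web page. The following is only a minimal sketch using the standard library: the helper names hit_accessions and compare_hits are made up for illustration, and it assumes the usual BLAST XML layout in which every hit is a <Hit> element with a <Hit_accession> child.

```python
import xml.etree.ElementTree as ET


def hit_accessions(blast_xml):
    """Return the set of accessions found in a BLAST XML result string.

    Assumes each hit is a <Hit> element carrying a <Hit_accession> child.
    """
    root = ET.fromstring(blast_xml)
    return {hit.findtext("Hit_accession") for hit in root.iter("Hit")}


def compare_hits(blast_xml, web_accessions):
    """Split hits into shared, API-only, and web-only accession sets."""
    api = hit_accessions(blast_xml)
    web = set(web_accessions)
    return {
        "shared": api & web,
        "only_api": api - web,
        "only_web": web - api,
    }
```

With the real data, blast_xml would be result_handle.read() and web_accessions the two accessions shown on the NCBI results page.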
I also know that the pBlast on NCBI's website uses a gap cost of
"Existence: 11 Extension: 1", but I am not sure how to pass that to the
qblast function in Biopython. I do not know whether this is the cause of
the difference, but it could be that Biopython's pblast and NCBI's pblast
are running with different parameters.
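One possible way to express the web form's gap costs is through the gapcosts parameter shown in the signature. This is a sketch under an assumption, not a confirmed fix: it assumes qblast forwards gapcosts to NCBI as the existence and extension costs separated by a space ("11 1"). The helpers format_gapcosts, build_qblast_kwargs and run_blastp are hypothetical names introduced here, and whether matching the gap costs removes the discrepancy is untested.

```python
def format_gapcosts(existence, extension):
    """Return gap costs as the string 'existence extension' (assumed format)."""
    return f"{existence} {extension}"


def build_qblast_kwargs(existence=11, extension=1):
    """Collect the keyword arguments from the original call plus gap costs."""
    return {
        "entrez_query": '"Chlamydomonas" [ORGN]',
        "ncbi_gi": True,
        "matrix_name": "BLOSUM62",
        "hitlist_size": 50,
        # Assumption: "11 1" corresponds to "Existence: 11 Extension: 1".
        "gapcosts": format_gapcosts(existence, extension),
    }


def run_blastp(fasta_str):
    """Submit the search; requires Biopython and network access."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio.Blast import NCBIWWW
    return NCBIWWW.qblast("blastp", "nr", fasta_str, **build_qblast_kwargs())
```

Calling run_blastp(fastaStr) would then repeat the original search with the gap costs set explicitly, which makes it easier to rule this parameter in or out.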
Thank you for your time,
David