[BioPython] pblast in NCBI's website differs from biopython's pblast

Raul Guerra colochera at gmail.com
Thu May 29 08:47:55 EDT 2008


Hi everyone,

I was wondering if someone has had the same problem. I am running the
following code in BioPython.

        result_handle = NCBIWWW.qblast("blastp", "nr", fastaStr,
                                       entrez_query='"Chlamydomonas" [ORGN]',
                                       ncbi_gi=True, matrix_name='BLOSUM62',
                                       hitlist_size=50)

where fastaStr is the FASTA string for NP_012855. (When I mention
Biopython's pBlast, I am referring to the code above.)

The results that I get back are different from the results I get from the
pblast option at http://www.ncbi.nlm.nih.gov/blast/Blast.cgi . To reproduce
this, follow the link, click on pblast, and run a blast specifying only the
organism and the sequence accession number.

NCBI's website returns 2 sequences, which are what I was looking for.
Biopython, on the other hand, returns as many hits as I specify in the limit
(hitlist_size), and only one of them is among the hits that I get from
NCBI's pBlast. I know that the qblast function in NCBIWWW has many
parameters:

def qblast(program, database, sequence,
           auto_format=None,composition_based_statistics=None,
           db_genetic_code=None,endpoints=None,entrez_query='(none)',
           expect=10.0,filter=None,gapcosts=None,genetic_code=None,
           hitlist_size=50,i_thresh=None,layout=None,lcase_mask=None,
           matrix_name=None,nucl_penalty=None,nucl_reward=None,
           other_advanced=None,perc_ident=None,phi_pattern=None,
           query_file=None,query_believe_defline=None,query_from=None,
           query_to=None,searchsp_eff=None,service=None,threshold=None,
           ungapped_alignment=None,word_size=None,
           alignments=500,alignment_view=None,descriptions=500,
           entrez_links_new_window=None,expect_low=None,expect_high=None,
           format_entrez_query=None,format_object=None,format_type='XML',
           ncbi_gi=None,results_file=None,show_overview=None
           ):
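
Since format_type defaults to 'XML' in the signature above, the two result
sets can be compared hit by hit. The following is only a minimal sketch of
that comparison, assuming the standard BLAST XML element names (Hit,
Hit_accession, Hsp_evalue). The helper name list_hits and the accessions in
SAMPLE_XML are made up for illustration and are not real BLAST output.

```python
import xml.etree.ElementTree as ET


def list_hits(xml_text):
    """Return a list of (accession, best e-value) pairs from BLAST XML text."""
    root = ET.fromstring(xml_text)
    hits = []
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        # A hit can have several HSPs; keep the smallest (best) e-value.
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        hits.append((accession, min(evalues) if evalues else None))
    return hits


# Hypothetical, heavily trimmed XML used only to show the shape of the data.
SAMPLE_XML = """
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>HYPO_0001</Hit_accession>
          <Hit_hsps>
            <Hsp><Hsp_evalue>1e-50</Hsp_evalue></Hsp>
            <Hsp><Hsp_evalue>0.002</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>HYPO_0002</Hit_accession>
          <Hit_hsps>
            <Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""

if __name__ == "__main__":
    for accession, evalue in list_hits(SAMPLE_XML):
        print(accession, evalue)
```

With a real search, the text passed in would come from result_handle.read().
Listing accessions and e-values this way makes it easier to see which of the
Biopython hits also appear on the website and how weak the extra hits are.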

I also know that the pBlast on NCBI's website uses a gap cost of
"Existence: 11 Extension: 1". I am not sure how to translate that into the
qblast function in Biopython. I am not sure whether this is the problem, but
it could be that Biopython's pblast and NCBI's pblast use different
parameters.
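
A minimal sketch of how those gap costs might be passed follows. It assumes
that the gapcosts argument of qblast is forwarded to the GAPCOSTS field of
NCBI's URL API, which takes the existence and extension costs as two
integers separated by a space. The helper names format_gapcosts,
build_qblast_kwargs and run_blastp are hypothetical, and nothing here
confirms that matching the gap costs makes the two result sets agree.

```python
def format_gapcosts(existence, extension):
    """Format gap costs as 'existence extension', e.g. '11 1'."""
    return "%d %d" % (existence, extension)


def build_qblast_kwargs(existence=11, extension=1):
    """Collect the keyword arguments from the post, plus explicit gap costs."""
    return {
        "entrez_query": '"Chlamydomonas" [ORGN]',
        "ncbi_gi": True,
        "matrix_name": "BLOSUM62",
        "hitlist_size": 50,
        # Assumption: this string reaches NCBI as the GAPCOSTS parameter.
        "gapcosts": format_gapcosts(existence, extension),
    }


def run_blastp(fasta_str):
    """Submit the search; needs Biopython and network access."""
    # Imported here so the helpers above work without Biopython installed.
    from Bio.Blast import NCBIWWW

    return NCBIWWW.qblast("blastp", "nr", fasta_str, **build_qblast_kwargs())
```

Calling run_blastp(fastaStr) would then submit the same search as above with
"Existence: 11 Extension: 1" stated explicitly rather than left at the
server's default.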

Thank you for your time,

David

