[BioPython] qblast

Peter biopython at maubp.freeserve.co.uk
Sat Aug 18 21:38:52 UTC 2007


Hi Michael,

Here is a short script based on your example, which I have tested for 
calling qblast:

from Bio.Blast.NCBIWWW import qblast
seq_string = "TGTGATGGATATCTGCAGAATTCGCCCTTTAAACTTCAGGGTGACCAAAA" \
            + "AATCAAAATAAATGTTGAAATAATACTGGATCTCCACCACCACTAACTTC" \
            + "AAAAAATGTTGTATTAAAATTTCTATCAGTTAATAACATTGTTATAGCAC" \
            + "CCCCTAATACTGGTAATGATAATAATAATAATCATGCTGTTATAAATACA" \
            + "GCTCAAACAAATAAAGGTAACTTAAACATACTCATACCAGGTGTTCGCAT" \
            + "ATTAATAACAGTAACAATAAAATTTATTGAACCTAATATTGATGATATAC" \
            + "CAGCTAAATGTAAACTAAATATTGCACATTCTATTGAACCTCCTGAATGT" \
            + "GAAAATATACCAGATAATGGTGGATAAACAGTTCAACCTGTACCTGCCCC" \
            + "CATCTCGACTACAGATGATCAAATTAATAAAAAAAATGATGGTACTAATA" \
            + "ATCAAAAACTTATATTATTTAATCTTGGGAATGCCATATCAGGAGCTCCT" \
            + "ATCATTAAAGGTAAAAATCAATTACCAAAACCACCCATTAATGCAGGCAT" \
            + "AACCATAAAAAATATCATTATTAAAGCATGTGCTGTTATTAACACATTAT" \
            + "ATGCTTGATGATTGTAATTTAATATTACTGCACCAGCATCTGATAATTCT" \
            + "ATACGTATTAATATAGATCAAAATGTTCCTATTAAACCTGCTAAAAATGC" \
            + "AAATATTAAATATAATGTTCCAATATCTTTATGATTTGTTGACCAAGGGC" \
            + "GAATTCCAGCACACTGGCGGCCGTTACTAG"
#result_handle = qblast('blastn', 'nr', seq_string, format_type='HTML')
#output_handle = open("test.html", "w")
#output_handle.write(result_handle.read())
#output_handle.close()
result_handle = qblast('blastn', 'nr', seq_string, format_type='Text')
output_handle = open("test.txt", "w")
output_handle.write(result_handle.read())
output_handle.close()
#result_handle = qblast('blastn', 'nr', seq_string, format_type='XML')
#output_handle = open("test.xml", "w")
#output_handle.write(result_handle.read())
#output_handle.close()
print("Done")
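
The three variants in the script differ only in format_type and the output filename, so the save step could be factored into a small helper. This is only a minimal sketch: save_handle is a hypothetical name (not part of Biopython), it assumes nothing about the qblast result beyond the read() method the script already relies on, and a StringIO with placeholder text stands in for a real result handle so that nothing here contacts NCBI.

```python
import io
import os
import tempfile


def save_handle(result_handle, filename):
    """Write everything readable from result_handle to filename.

    Hypothetical helper, not part of Biopython: it only assumes the
    handle has a read() method returning text.
    """
    data = result_handle.read()
    # The with-block closes the file even if the write fails.
    with open(filename, "w") as output_handle:
        output_handle.write(data)
    return len(data)


# Stand-in for a qblast result handle (placeholder text, not real output).
fake_result = io.StringIO("placeholder text output\n")
out_path = os.path.join(tempfile.mkdtemp(), "test.txt")
chars_written = save_handle(fake_result, out_path)
```

With a real search, the call would presumably look like save_handle(qblast('blastn', 'nr', seq_string, format_type='Text'), "test.txt").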

The top hits from the script were:

gb|AY916130.1|  Epidermophyton floccosum mitochondrion, complete
gb|EF180206.1|  Penicillium confertum voucher 171.87 cytochrom...
gb|EF180399.1|  Penicillium soppii voucher IBT 14908 cytochrom...
gb|EF180398.1|  Penicillium soppii voucher IBT 3331 cytochrome...
gb|EF180397.1|  Penicillium soppii voucher IBT 18220 cytochrom...
gb|EF180396.1|  Penicillium soppii voucher 226.28 cytochrome o...
gb|EF180395.1|  Penicillium soppii voucher 144.83 cytochrome o...

The top hits I got from the online blastn web page for the same 
sequence, also against the nr database:

gb|AY129164.1|  Pythium aphanidermatum cytochrome oxidase I ge...
gb|AY561976.1|  Scopalina ruetzleri cytochrome oxidase subunit...
gb|EF468468.1|  Phytophthora sp. H-6/02 cytochrome oxidase sub...
gb|DQ832717.1|  Phytophthora sojae mitochondrion, complete genome
gb|EF468470.1|  Phytophthora sp. H-8/02 cytochrome oxidase sub...
gb|EF468469.1|  Phytophthora sp. H-7/02 cytochrome oxidase sub...
gb|AY129166.1|  Phytophthora capsici cytochrome oxidase I gene...

i.e. Very different!
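
To put a number on "very different", the two hit lists can be compared as sets of accessions. A minimal sketch, with the accessions copied by hand from the two lists above (the variable names are just for illustration):

```python
# Accessions copied from the two hit lists above.
script_hits = [
    "AY916130.1", "EF180206.1", "EF180399.1", "EF180398.1",
    "EF180397.1", "EF180396.1", "EF180395.1",
]
web_hits = [
    "AY129164.1", "AY561976.1", "EF468468.1", "DQ832717.1",
    "EF468470.1", "EF468469.1", "AY129166.1",
]

# Set intersection gives the accessions reported by both searches.
shared = set(script_hits) & set(web_hits)
print("Hits in common: %d" % len(shared))
```

For these fourteen accessions the intersection is empty, i.e. the two top-seven lists have no hit in common.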

I switched to using plain text output as it's easier to read by hand.

Both correctly understood the input query was 780 letters long.
Both claimed to be output from BLASTN 2.2.17
Both claimed to be output from the same database

There were some differences in the parameters footer, but I'm not sure 
why.  Using the script:

   Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
   GSS,environmental samples or phase 0, 1 or 2 HTGS sequences)
     Posted date:  Aug 16, 2007  6:06 PM
   Number of letters in database: -51,729,944
   Number of sequences in database:  5,751,035
Lambda     K      H
     1.37    0.711     1.31
Gapped
Lambda     K      H
     1.37    0.711     1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
...

While using the web browser:

   Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
   GSS,environmental samples or phase 0, 1 or 2 HTGS sequences)
     Posted date:  Aug 16, 2007  6:06 PM
   Number of letters in database: -51,729,944
   Number of sequences in database:  5,751,035
Lambda     K      H
    0.634    0.408    0.912
Gapped
Lambda     K      H
    0.634    0.408    0.912
Matrix: blastn matrix:2 -3
Gap Penalties: Existence: 5, Extension: 2
...
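
The two footers can also be compared mechanically rather than by eye. The following is only a sketch: footer_params is a hypothetical helper, the regular expressions are written against the excerpts quoted above rather than any documented BLAST output format, and reading the pair after "matrix:" as match reward and mismatch penalty is an assumption.

```python
import re

# Excerpts copied from the two parameter footers quoted above.
script_footer = """\
Lambda     K      H
     1.37    0.711     1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
"""
web_footer = """\
Lambda     K      H
    0.634    0.408    0.912
Matrix: blastn matrix:2 -3
Gap Penalties: Existence: 5, Extension: 2
"""


def footer_params(text):
    """Pull Lambda/K/H, the matrix pair and gap costs from a footer."""
    params = {}
    stats = re.search(r"Lambda\s+K\s+H\s*\n\s*(\S+)\s+(\S+)\s+(\S+)", text)
    if stats:
        params["lambda"], params["K"], params["H"] = map(float, stats.groups())
    # Assumed meaning: first number is the match reward, second the
    # mismatch penalty.
    matrix = re.search(r"matrix:\s*(-?\d+)\s+(-?\d+)", text)
    if matrix:
        params["reward"], params["penalty"] = map(int, matrix.groups())
    gaps = re.search(r"Existence:\s*(\d+),\s*Extension:\s*(\d+)", text)
    if gaps:
        params["gap_open"], params["gap_extend"] = map(int, gaps.groups())
    return params


script_params = footer_params(script_footer)
web_params = footer_params(web_footer)
# Names of the parameters whose values differ between the two runs.
differing = sorted(k for k in script_params if script_params[k] != web_params[k])
print(differing)
```

On these excerpts the differing entries are H, K, lambda and reward; the penalty and both gap costs agree.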


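The footers differ in the Lambda/K/H values and in the matrix line 
(1 -3 from the script, 2 -3 from the web browser).  There is something 
funny here... does this throw any light on things?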

Peter



