[Biopython] corrupted blast results

Norbert Auer norbert.auer at boku.ac.at
Tue May 14 16:27:00 UTC 2013


Hi,

I am currently having some problems with the NCBIWWW.qblast function. I used the following query to BLAST some sequences:

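from Bio.Blast import NCBIWWW

# seq_fasta holds the query sequence(s) in FASTA format
result_handle = NCBIWWW.qblast("blastn", "refseq_genomic", seq_fasta,
                               entrez_query="txid10029 [ORGN]", hitlist_size=2)
blast_results = result_handle.read()
result_handle.close()
# Write the result and close the output file so everything is flushed to disk
with open("blast.xml", "w") as save_file:
    save_file.write(blast_results)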

Previously I had no problems with this script, but today I only get corrupted (not well-formed) XML files back. On my last attempt I got a well-formed XML file, but on closer inspection I found that the alignment shown was wrong. The header says Identities = 660/661, but the alignment itself shows that this cannot be true. I ran a similar query through the web front end and got the same hit, except that the alignment was correct. It seems that there was an insertion of 3 nucleotides in the middle of the subject sequence. How could this be? I have no explanation for this behaviour.

From the NCBIWWW.qblast function:
Query      241     AAGGCAGGACTGAAGAGTGTCATTATGGGGTGAGCCTTTCAAGGTCCCTGCCACTCTCTC  300
                             |||||||||||||||||||||||||||||||||||||||||         |  |      
Sbjct  1002610  AAGGCAGGACTGAAGAGTGTCATTATGGGGTGAGCCTTTCATCAAGGTCCCTGCCACTCT  1002551

From the web front end:
Query      241     ACTCTCTTTGTGTACTTTAAAGGTGCTGTGCCCCAAACTCCTGGGACACGGAGAGAACTC  300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  1169534  ACTCTCTTTGTGTACTTTAAAGGTGCTGTGCCCCAAACTCCTGGGACACGGAGAGAACTC  1169593
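A discrepancy like the one above (an Identities count that the alignment does not support) could be checked mechanically on a saved result. The following is only a minimal sketch using the Python standard library, not Biopython's own parser. It assumes the file uses the element names of the classic BLAST XML output (Hit, Hit_id, Hsp, Hsp_identity, Hsp_qseq, Hsp_hseq); the function name check_blast_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_blast_xml(xml_text):
    """Return (hit_id, reported, recounted) for every HSP whose counts disagree.

    Raises xml.etree.ElementTree.ParseError if xml_text is not well-formed,
    which is the first symptom described in this post.
    """
    root = ET.fromstring(xml_text)  # fails here on a corrupted file
    mismatches = []
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id", default="?")
        for hsp in hit.iter("Hsp"):
            reported = int(hsp.findtext("Hsp_identity"))
            # Assumption: compare case-insensitively, in case masked regions
            # are written in lower case.
            qseq = hsp.findtext("Hsp_qseq", default="").upper()
            hseq = hsp.findtext("Hsp_hseq", default="").upper()
            # Count aligned columns where query and subject carry the same base.
            recounted = sum(1 for q, s in zip(qseq, hseq) if q == s and q != "-")
            if recounted != reported:
                mismatches.append((hit_id, reported, recounted))
    return mismatches
```

Calling check_blast_xml(open("blast.xml").read()) on the saved file would either raise a ParseError (file not well-formed) or return the HSPs whose header count does not match the aligned sequences; an empty list means the two agree.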

I was wondering whether this is an NCBI service problem (perhaps the service runs on a different server than the web front end) or a Biopython issue.

I use Biopython version 1.61.

If necessary I could attach the BLAST XML files, but they are very long.
Thanks






