[Biopython] Calculating the Hamming distance

Michiel de Hoon mjldehoon at yahoo.com
Thu Jun 27 09:08:52 UTC 2013


Hi Philipp,

Maybe the sequence alignment doesn't show up clearly in the email, but the two sequences do match very well. The Hamming distance is only 4 (i.e. 4 mismatches/insertions/deletions).
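
For readers who cannot see the alignment columns line up in the email, the count of 4 can be checked with a small sketch. The function name `count_differences` and the convention that blank columns in the RNA row are unaligned DNA (and so not counted) are assumptions made for this illustration; they are not part of Biopython.

```python
def count_differences(dna_row, rna_row):
    """Count alignment columns where the two rows differ.

    Columns where the RNA row is a space are treated as unaligned DNA
    flank and skipped; '-' gap characters count as differences.
    """
    if len(dna_row) != len(rna_row):
        raise ValueError("alignment rows must have equal length")
    return sum(1 for d, r in zip(dna_row, rna_row) if r != " " and d != r)


# The two rows of the desired alignment, copied from the original message.
dna_row = "AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGAT--TCCCGGTCAGGGAACCA-"
rna_row = " " * 34 + "GGATGATCCCGGTCAGGGAACCAA"

print(count_differences(dna_row, rna_row))  # 4
```

The four differing columns are the initial C/G mismatch, the two inserted bases (GA), and the trailing A opposite the end gap.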

Best,
-Michiel.




________________________________
 From: Philipp Schiffer <philipp.schiffer at gmail.com>
To: Michiel de Hoon <mjldehoon at yahoo.com> 
Cc: "biopython at biopython.org" <biopython at biopython.org> 
Sent: Thursday, June 27, 2013 4:47 PM
Subject: Re: [Biopython] Calculating the Hamming distance
 


Hi Michiel, 

maybe I am thick here (or lack the biological knowledge), but to me it looks as if your sequences just don't match. Thus the Bio.pairwise2 alignment is 'correct' in terms of alignment.

Cheers

Philipp


-- 
Philipp Schiffer

On Thursday, 27. June 2013 at 09:13, Michiel de Hoon wrote:
>Dear all,
>
>
>I am trying to align a small RNA sequence to a (shortish) DNA sequence.
>The alignment I am looking for is:
>
>
>
>
>AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGAT--TCCCGGTCAGGGAACCA-
>                                  GGATGATCCCGGTCAGGGAACCAA
>
>
>where the first sequence is the DNA and the second sequence is the RNA.
>The Hamming distance is 4 (the initial mismatch, the 2 insertions, and the gap at the end).
>
>
>If I try to calculate this alignment with Bio.pairwise2, I get the following if I use
>globalms(dna, rna, 0, -1, -1, -1, penalize_end_gaps=True):
>
>
>AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACC-A
>-GGAT--G--------A---------------------TCCCGGTCAGGGAACCAA
>
>
>However, if I set penalize_end_gaps to False, I get
>
>
>-----------------------AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACCA
>GGATGATCCCGGTCAGGGAACCAA------------------------------------------------------
>
>
>I guess the solution is to penalize end gaps in the DNA but not in the RNA.
>I could modify Bio.pairwise2 to allow for that possibility, but before I do so, I was wondering if there are any other ways to find the desired alignment with Biopython (preferably without using 3rd-party software).
>
>
>Thanks,
>-Michiel.

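The idea raised in the original message, charging for end gaps on one side only, can be sketched in plain Python without Bio.pairwise2. This is an illustration under stated assumptions, not the Biopython implementation: it uses unit costs for mismatches and gaps, lets the RNA start and end anywhere along the DNA for free, and charges for every RNA base left unmatched. The name `fitting_distance` is hypothetical.

```python
def fitting_distance(text, pattern):
    """Minimum edit distance between `pattern` and any substring of `text`.

    Unaligned flanks of `text` are free; every character of `pattern`
    must be matched, substituted, or paid for as an insertion.
    """
    # Row for the empty pattern: starting anywhere in the text costs nothing.
    prev = [0] * (len(text) + 1)
    for i, p in enumerate(pattern, start=1):
        # Column 0: i pattern characters consumed with no text to pair them with.
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j - 1] + (p != t),  # match or mismatch
                prev[j] + 1,             # pattern character opposite a gap
                curr[j - 1] + 1,         # text character opposite a gap
            )
        prev = curr
    # Ending anywhere in the text is free, so take the best cell in the last row.
    return min(prev)


dna = "AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACCA"
rna = "GGATGATCCCGGTCAGGGAACCAA"
print(fitting_distance(dna, rna))
```

The result cannot exceed 4, since the alignment shown in the original message is one valid placement with that cost. The sketch returns only the distance; recovering the alignment itself would need a traceback over the full matrix.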