[Biopython] bug with pairwise2 local alignments?
Jocelyne
jocelyne at gmail.com
Fri Sep 7 02:15:59 EDT 2012
Hi Peter:
I added 4 lines of code in each snippet below (there are copies of the same
code). I'm pretty sure it should fix it (there are copies of line 438-439,
with the indexes changed). Basically, the previous code allowed for
negative scores in the first row and column of the matrix, even in the case
of local alignments (in which case scores shouldn't go negative). I didn't
test it, so please make sure it works before merging.
Also, it seems that it imports _make_score_matrix_fasta from a C library
(line 851), which overload the corresponding python function, so that would
have to be fixed too.
Thanks!
Jocelyne
378 # The top and left borders of the matrices are special cases
379 # because there are no previously aligned characters. To simplify
380 # the main loop, handle these separately.
381 for i in range(lenA):
382 # Align the first residue in sequenceB to the ith residue in
383 # sequence A. This is like opening up i gaps at the beginning
384 # of sequence B.
385 score = match_fn(sequenceA[i], sequenceB[0])
386 if penalize_end_gaps:
387 score += gap_B_fn(0, i)
388 score_matrix[i][0] = score
+++ if not align_globally and score_matrix[0][i] < 0:
+++ score_matrix[i][0] = 0
389 for i in range(1, lenB):
390 score = match_fn(sequenceA[0], sequenceB[i])
391 if penalize_end_gaps:
392 score += gap_A_fn(0, i)
393 score_matrix[0][i] = score
+++ if not align_globally and score_matrix[0][i] < 0:
+++ score_matrix[0][i] = 0
461 # The top and left borders of the matrices are special cases
462 # because there are no previously aligned characters. To simplify
463 # the main loop, handle these separately.
464 for i in range(lenA):
465 # Align the first residue in sequenceB to the ith residue in
466 # sequence A. This is like opening up i gaps at the beginning
467 # of sequence B.
468 score = match_fn(sequenceA[i], sequenceB[0])
469 if penalize_end_gaps:
470 score += calc_affine_penalty(
471 i, open_B, extend_B, penalize_extend_when_opening)
472 score_matrix[i][0] = score
+++ if not align_globally and score_matrix[i][0] < 0:
+++ score_matrix[i][0] = 0
473 for i in range(1, lenB):
474 score = match_fn(sequenceA[0], sequenceB[i])
475 if penalize_end_gaps:
476 score += calc_affine_penalty(
477 i, open_A, extend_A, penalize_extend_when_opening)
478 score_matrix[0][i] = score
+++ if not align_globally and score_matrix[0][i] < 0:
+++ score_matrix[0][i] = 0
On Thu, Sep 6, 2012 at 8:59 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:
> On Thu, Sep 6, 2012 at 9:31 PM, Jocelyne <jocelyne at gmail.com> wrote:
> > Hello:
> > First, I'd like to say that I really appreciate the effort of the
> community
> > to provide us with such a nice package.
> > I found some odd scoring behavior with the pairwise2 local alignment
> (see 5
> > below). I think these 2 alignments should have the same score.
>
> Hmm. I'm not overly familiar with this bit of the code, but did
> occur to me it might be something related to this open issue:
>
> https://redmine.open-bio.org/issues/2776
>
> I was able to repeat your pairwise2.align.localms example
> and the score matrix example a Mac using the latest code
> from github, and got the same answers. So (as I suspected)
> this does not seem to be a platform specific issue.
>
> Unfortunately the original author of this code (Jeff Chang)
> isn't active with Biopython anymore - we can try emailing
> him directly, but if you're willing to look into this in more
> detail and can propose a fix, I'm happy to take a look at
> merging it.
>
> Peter
>
More information about the Biopython
mailing list