[Biopython-dev] [Bug 2550] Alphabet problems when adding sequences

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Sun Jul 27 15:59:50 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2550





------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk  2008-07-27 11:59 EST -------
Trying to fix this by changing only the contains method of the Alphabet and
AlphabetEncoder classes is nasty, and wouldn't cover situations like this:

p = Seq("PKL-PAK", Gapped(generic_protein,"-"))
q = Seq("ADKS*", HasStopCodon(generic_protein,"*"))

where you might expect something like:

p+q == Seq("PKL-PAKADKS*", HasStopCodon(Gapped(generic_protein,"-"),"*"))

Taken literally, neither of these two alphabets contains the other - but there
is a fairly obvious consensus alphabet!  I think the best solution would
require changes to the Seq object's add method to pick a consensus alphabet in
the non-simple cases where one alphabet is clearly a subset of the other.
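
To make the idea concrete, here is a rough, self-contained sketch of how a
consensus alphabet could be picked. It does not use the real Bio.Alphabet
classes: SimpleAlphabet, consensus_alphabet and _merge_symbol are hypothetical
names invented for this illustration, and an alphabet is modelled simply as a
set of base letters plus an optional gap character and stop symbol. A real
fix in the Seq add method might look quite different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimpleAlphabet:
    """Hypothetical stand-in for an alphabet (not the Bio.Alphabet API)."""
    letters: frozenset
    gap_char: Optional[str] = None
    stop_symbol: Optional[str] = None

    def all_letters(self):
        # Base letters plus whichever extra symbols are defined.
        extras = {c for c in (self.gap_char, self.stop_symbol) if c is not None}
        return self.letters | extras


def _merge_symbol(x, y, what):
    # Two different gap (or stop) characters cannot be reconciled.
    if x is not None and y is not None and x != y:
        raise ValueError(f"Incompatible {what} characters: {x!r} and {y!r}")
    return x if x is not None else y


def consensus_alphabet(a, b):
    """Return an alphabet that covers both a and b, or raise ValueError."""
    # Assumption: the base letters of one must contain those of the other.
    if a.letters <= b.letters:
        letters = b.letters
    elif b.letters <= a.letters:
        letters = a.letters
    else:
        raise ValueError("Incompatible base alphabets")
    return SimpleAlphabet(
        letters,
        gap_char=_merge_symbol(a.gap_char, b.gap_char, "gap"),
        stop_symbol=_merge_symbol(a.stop_symbol, b.stop_symbol, "stop"),
    )


PROTEIN = frozenset("ACDEFGHIKLMNPQRSTVWY")
gapped = SimpleAlphabet(PROTEIN, gap_char="-")
with_stop = SimpleAlphabet(PROTEIN, stop_symbol="*")

# Neither alphabet contains the other, yet a consensus exists.
merged = consensus_alphabet(gapped, with_stop)
```

Here merged carries both the gap character and the stop symbol, which is the
analogue of HasStopCodon(Gapped(generic_protein,"-"),"*") in the example
above, so every letter of "PKL-PAKADKS*" is valid in it.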




