[Biopython-dev] [Bug 2597] New: Enforce alphabet letters in Seq objects

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Fri Sep 26 13:06:32 EDT 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2597

           Summary: Enforce alphabet letters in Seq objects
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
 BugsThisDependsOn: 2532


If a Seq object is created with an alphabet that has a predefined set of
letters (e.g. the IUPAC alphabets), then I think Biopython should validate that
the sequence uses only those letters.

This will catch misuse of ambiguous sequences with non-ambiguous alphabets,
letters in an unexpected case, and, most importantly, any unexpected symbols
(e.g. from a parsing problem).

This will impose a performance overhead, which the user can avoid by choosing
a generic DNA/RNA/protein alphabet that does not list the expected letters.
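
To make the proposal concrete, here is a rough sketch of the kind of check I
have in mind. It is not the actual Bio.Seq or Bio.Alphabet code: the Alphabet
and CheckedSeq classes and the three alphabet instances are hypothetical
stand-ins written for this illustration, and it assumes that an alphabet with
no letters listed (letters is None) means "generic, do not validate".

```python
class Alphabet:
    """Hypothetical stand-in for an alphabet object.

    letters=None marks a generic alphabet with no predefined letter set.
    """

    def __init__(self, letters=None):
        self.letters = letters


# Letter sets follow the IUPAC DNA codes (upper case only).
unambiguous_dna = Alphabet("GATC")
ambiguous_dna = Alphabet("GATCRYWSMKHBVDN")
generic_dna = Alphabet()  # no letters listed, so nothing to check


class CheckedSeq:
    """Hypothetical Seq-like class that validates its letters on creation."""

    def __init__(self, data, alphabet):
        # Only pay the validation cost when the alphabet lists its letters.
        if alphabet.letters is not None:
            unexpected = set(data) - set(alphabet.letters)
            if unexpected:
                raise ValueError(
                    "Letters not in alphabet: %s" % "".join(sorted(unexpected))
                )
        self.data = data
        self.alphabet = alphabet
```

With this sketch, CheckedSeq("GATNACA", unambiguous_dna) and
CheckedSeq("gattaca", unambiguous_dna) would both raise ValueError, while
CheckedSeq("gattaca", generic_dna) would be accepted without any check.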

Note that we will have to resolve Bug 2532 before doing this, as currently some
parts of Biopython are misusing the uppercase-only IUPAC alphabet objects
with mixed-case sequences.

