[Biopython-dev] [Bug 2351] Make Seq more like a string, even subclass string?

bugzilla-daemon at portal.open-bio.org
Wed Oct 31 01:30:20 UTC 2007


http://bugzilla.open-bio.org/show_bug.cgi?id=2351





------- Comment #6 from mdehoon at ims.u-tokyo.ac.jp  2007-10-30 21:30 EST -------
First, let's think about what a Seq object should look like, before getting into
implementation details.

In my opinion, a Seq object is essentially a string, but with some added
functionality that is useful in biological contexts. Currently, this is
limited to specifying an alphabet. Personally, I have never used such an
alphabet, so in practice I prefer using a simple string instead of a Seq object.

However, if we extend its functionality, I think a Seq class can be useful
enough to warrant its existence in Biopython.

In short, to my mind a Seq object should have the following properties:
1) A Seq object is basically a string, so it should behave as if it were
subclassed from string.
2) As a result, functions that take a sequence as an argument, but don't need
the added features of a Seq object, should work with strings as well as Seq
objects.
3) The sequence should be mutable, so that we won't need a separate MutableSeq
class. This also implies that a Seq class cannot actually subclass string,
since strings are immutable; it can only behave like one.
4) Currently, Seq objects have an associated alphabet; SeqRecord objects have
annotations, dbxrefs, a description, features, id, and name. I think a new Seq
object should have both, so that we can avoid having both a Seq and a SeqRecord
class. Of course, some or all of these fields can remain None.
5) A Seq class should have the methods that one expects from a sequence class,
in particular complement() and reverse_complement(), and perhaps a modified
count() that can ignore case.

With respect to 3), we'd probably have to write such a Seq class in C.
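
As a rough illustration only, the interface described in points 1) to 5) could
be sketched in pure Python as below. This is not Biopython's Seq API and not
the C implementation suggested above: the names SketchSeq and
gc_content_sketch, the constructor arguments, and the ignore_case parameter
are all assumptions made up for this example, and only a few string-like
methods are shown.

```python
# Hypothetical sketch: SketchSeq and gc_content_sketch are illustrative names,
# not part of Biopython.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


class SketchSeq:
    """A mutable, string-like sequence that also carries annotation fields."""

    def __init__(self, data, alphabet=None, id=None, name=None,
                 description=None):
        # A list of single characters allows in-place item assignment.
        self._data = list(str(data))
        # Annotation fields may all remain None (point 4).
        self.alphabet = alphabet
        self.id = id
        self.name = name
        self.description = description

    def __str__(self):
        return "".join(self._data)

    def __len__(self):
        return len(self._data)

    def __eq__(self, other):
        # Compare equal to plain strings with the same letters (point 1).
        if isinstance(other, (str, SketchSeq)):
            return str(self) == str(other)
        return NotImplemented

    def __getitem__(self, index):
        if isinstance(index, slice):
            return SketchSeq("".join(self._data[index]), self.alphabet)
        return self._data[index]

    def __setitem__(self, index, value):
        # Mutability (point 3): replace a single letter or a slice.
        value = str(value)
        if isinstance(index, slice):
            self._data[index] = list(value)
        elif len(value) == 1:
            self._data[index] = value
        else:
            raise ValueError("a single position takes exactly one letter")

    def complement(self):
        return SketchSeq(str(self).translate(_COMPLEMENT), self.alphabet)

    def reverse_complement(self):
        return SketchSeq(str(self).translate(_COMPLEMENT)[::-1], self.alphabet)

    def count(self, sub, ignore_case=False):
        text, sub = str(self), str(sub)
        if ignore_case:
            text, sub = text.upper(), sub.upper()
        return text.count(sub)


def gc_content_sketch(seq):
    """Works on plain strings and SketchSeq objects alike (point 2)."""
    text = str(seq).upper()
    if not text:
        return 0.0
    return (text.count("G") + text.count("C")) / len(text)


s = SketchSeq("ATGCatgc", id="example")
s[0] = "G"                              # in-place edit, no MutableSeq needed
print(s)                                # GTGCatgc
print(s.reverse_complement())           # gcatGCAC
print(s.count("g", ignore_case=True))   # 3
print(gc_content_sketch("GGCC"), gc_content_sketch(s))  # 1.0 0.625
```

Because the sketch defines __eq__ without __hash__, its instances are
unhashable, which is the usual Python behaviour for mutable containers and one
of the trade-offs of making the sequence mutable.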

The end result would be a Seq class that actually has some benefit to the user,
that does not have to be used when a string suffices, and that avoids having
three classes (Seq, MutableSeq, SeqRecord) for essentially the same thing.




