[Biopython-dev] [Biopython - Bug #2351] Make Seq more like a string, even subclass string?

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Fri Jun 29 13:13:33 EDT 2012


Issue #2351 has been updated by Peter Cock.


I have at least a few concerns:

Does anything break if we make Seq subclass string? Is this even possible for MutableSeq? What about DBSeq (the lazy-loading sequence object in BioSQL), or UnknownSeq (which tries to avoid creating large repetitive strings in memory)? What worries me, however, is a possible dichotomy between Seq-type objects which do and don't subclass string. Another potential example is memory-efficient bit-encoded nucleotide sequences (BioJava has this). That is, there are lots of Seq-like objects where we do NOT want a big string buffer allocated in memory, and would that be required if we subclass string?
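
As a rough illustration of the memory concern (a sketch only: StrSeq and LazyUnknownSeq are hypothetical names, not Biopython classes, and the sizes assume CPython), a str subclass always carries the full character buffer, while an UnknownSeq-style object can get by with a length and a single letter:

```python
import sys


class StrSeq(str):
    """Hypothetical Seq that subclasses str (not Biopython's Seq).

    The whole sequence lives in the inherited str buffer.
    """


class LazyUnknownSeq:
    """Hypothetical UnknownSeq-like object: stores only a length and a letter."""

    def __init__(self, length, character="N"):
        self._length = length
        self._character = character

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing stays lazy: only the new length is computed.
            start, stop, step = index.indices(self._length)
            return LazyUnknownSeq(len(range(start, stop, step)), self._character)
        if not -self._length <= index < self._length:
            raise IndexError("sequence index out of range")
        return self._character

    def __str__(self):
        # The large string is only built when explicitly requested.
        return self._character * self._length


eager = StrSeq("N" * 10**6)
lazy = LazyUnknownSeq(10**6)

# sys.getsizeof reports the object itself, not attributes it refers to.
print("str subclass:", sys.getsizeof(eager), "bytes")
print("lazy object: ", sys.getsizeof(lazy), "bytes")
```

If LazyUnknownSeq had to subclass str as well, it would need some real string content at construction time, which is exactly the dichotomy in question.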

Also, for Python 3, we may want to consider subclassing the byte string rather than the (Unicode) string. However, with Python 3.3 the memory bloat problem of using Unicode even for simple ASCII strings does go away.
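
The Python 3.3 point can be checked with a small sketch like the following (assuming CPython 3.3 or later; the variable names are illustrative only). An ASCII-only str should cost roughly one byte per letter, much like bytes, and only widen when a character outside Latin-1 is present:

```python
import sys

# An ASCII-only sequence, as text and as bytes.
ascii_text = "ACGT" * 250000
ascii_bytes = ascii_text.encode("ascii")

# The same text plus one character outside Latin-1 (Greek alpha),
# which forces a wider internal representation for the whole string.
wide_text = ascii_text + "\u03b1"

print("ASCII str:  ", sys.getsizeof(ascii_text), "bytes")
print("bytes:      ", sys.getsizeof(ascii_bytes), "bytes")
print("non-Latin-1:", sys.getsizeof(wide_text), "bytes")
```

On interpreters older than 3.3 the first figure would instead depend on whether Python was built with 2-byte or 4-byte Unicode characters.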
----------------------------------------
Bug #2351: Make Seq more like a string, even subclass string?
https://redmine.open-bio.org/issues/2351

Author: Peter Cock
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: Not Applicable
URL: 


We've started talking on the mailing list about making the SeqRecord class a subclass of the Seq object, and making that a subclass of the Python string.

This bug is for holding patches - I suspect a lot of the discussion will continue on the mailing lists rather than here.

I have explicitly left the "assign to" field pointing at the dev mailing list.


