[BioPython] Sequence numbering. Moving on...

Iddo Friedberg idoerg@cc.huji.ac.il
Mon, 27 Sep 1999 11:22:10 +0200 (GMT+0200)


Hi,

Moving on with the sequence object proposal, here is what I suggest:

1) A sequence will be a base class, with subclasses for nucleotide and
peptide sequences. Only the subclasses will be used directly.

2) A method which returns a sequence will return a sequence object, not a
string.

3) I liked Andrew's compromise. I think the __call__ method should wrap
one of the others (python, seq, perl or omg; I'm partial to the seq
method myself), and that will be our wrapper. If we wish to make things
more flexible, the __call__ method may accept as its last argument the
name of the method it wraps. See the ``Alternative __call__ method'' at
the bottom.

4) I'm not very good at streamlining Python code. If anyone likes this and
has a suggestion on how to make it faster, I'd like to know.
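
For reference, here is a minimal sketch of how the four numbering
conventions in point 3 differ, written against a plain string rather than
the proposed classes. The function name subseq and its convention argument
are hypothetical, made up for this illustration only.

```python
# Hypothetical helper, not part of the proposed module: maps each
# convention name to the plain Python slice it corresponds to.
def subseq(s, start, end, convention='seq'):
    if convention == 'seq':       # 1-based, end included
        return s[start - 1:end]
    if convention == 'omg':       # 1-based, end excluded
        return s[start - 1:end - 1]
    if convention == 'perl':      # 0-based, end included
        return s[start:end + 1]
    if convention == 'python':    # 0-based, end excluded
        return s[start:end]
    raise ValueError('unknown convention: %r' % (convention,))

# The same arguments (1, 3) select a different stretch in each convention.
for name in ('seq', 'omg', 'perl', 'python'):
    print(name, subseq('ATGC', 1, 3, name))
# seq ATG
# omg AT
# perl TGC
# python TG
```

The same pair of numbers picks out four different subsequences, which is
why it matters which convention the __call__ wrapper uses by default.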


So:

>>> seq=sequence.DNASeq('ATGC')
>>> seq(1,3)
['A', 'T', 'G']

>>> seq(1,3,'python')   # using the alternative __call__ method
['T', 'G']

>>> seq(1,3).complement()
['T', 'A', 'C']
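
The peptide subclass in the code further down takes the same 1-based,
end-inclusive range. As a rough standalone sketch of that charge
calculation on a plain string (the helper name net_charge is hypothetical;
the table values are copied from the code in this post):

```python
# Charge table copied from the code in this post.
charge_table = {'E': -1.0, 'D': -1.0, 'K': 1.0, 'R': 1.0, 'H': 0.5}

def net_charge(peptide, start=1, end=None):
    # Hypothetical standalone helper: 1-based, end-inclusive range,
    # matching the seq convention.
    if end is None:
        end = len(peptide)
    # Residues missing from the table count as neutral (0.0).
    return sum(charge_table.get(res, 0.0) for res in peptide[start - 1:end])

print(net_charge('KRDEH'))        # whole peptide: 1 + 1 - 1 - 1 + 0.5 = 0.5
print(net_charge('KRDEH', 1, 2))  # residues 1..2, 'KR': 2.0
```

Using dict.get with a default stands in for the try/except KeyError in the
class version; whether it is actually faster has not been measured here.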



Iddo

PS: I use tab indentation. Set your editor to 3-space tab stops, and the
following should look fine.

------------------------------ CUT HERE ----------------------------
from UserList import UserList

# Base class for sequences
class Seq(UserList):
	def __init__(self, inseq=''):
		self.data = []
		for i in inseq:
			self.append(i)
	def __call__(self, min=None, max=None):
		# The call method wraps the seq method. Both arguments are
		# optional and default to the whole sequence.
		if min is None:
			min = 1
		if max is None:
			max = len(self)
		retSeq = self.__class__(self.seq(min,max))
		return retSeq

	def seq(self, min, max):
		# with a base of 1 and including the end
		# Negative slice notation not allowed
		assert max >= min and min >0
		return self[min-1:max]
	def omg(self, min, max):
		# with a base of 1 and excluding the end
		# Negative slice notation not allowed
		assert min>=1 and max>min
		return self[min-1:max-1]
	def perl(self, min, max):
		# with a base of 0 and including the end
		# Negative slice notation not allowed
		assert min>=0 and max>=min
		return self[min:max+1]
	def python(self, min, max):
		# with a base of 0 and excluding the end (plain Python slicing)
		return self[min:max]

compTable = {'A':'T', 'T':'A', 'G':'C', 'C':'G'}
class DNASeq(Seq):
	# Example of a nucleotide sequence subclass, with a method
	def complement(self, min=None, max=None):
		retSeq = DNASeq()
		for i in self(min,max): # __call__ wraps the seq method of the
										# base class
			retSeq.append(compTable[i])
		return retSeq

chargeTable = {'E':-1., 'D':-1., 'K':1., 'R':1., 'H':0.5}
class PEPSeq(Seq):
	# Example of an amino-acid subclass, with a method
	def charge(self, min=None, max=None):
		c=0.
		for i in self(min,max): # __call__ wraps the seq method 
										# of the base class
			try:
				c = c + chargeTable[i]
			except KeyError:
				pass
		return c



--------------------------------------------------------
Alternative __call__ method:

	# The call method can wrap any of the numbering methods; attrib
	# names the one to use and defaults to seq.
	def __call__(self, min=None, max=None, attrib='seq'):
		if min is None:
			min = 1
		if max is None:
			max = len(self)
		retSeq = getattr(self,attrib)(min,max)
		return self.__class__(retSeq)
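
A self-contained sketch of the alternative __call__ in use, under two
assumptions that are not in the code above: it runs on Python 3, where
UserList lives in the collections module, and it checks attrib against a
hypothetical CONVENTIONS tuple so that only the four numbering methods can
be wrapped. The argument names start and end replace min and max to avoid
shadowing the builtins.

```python
from collections import UserList  # assumption: Python 3 location of UserList

class Seq(UserList):
    # Hypothetical whitelist of the numbering methods __call__ may wrap.
    CONVENTIONS = ('seq', 'omg', 'perl', 'python')

    def __call__(self, start=None, end=None, attrib='seq'):
        if attrib not in self.CONVENTIONS:
            raise ValueError('unknown numbering convention: %r' % (attrib,))
        if start is None:
            start = 1
        if end is None:
            end = len(self)
        # Look the method up by name and wrap the result in our own class.
        return self.__class__(getattr(self, attrib)(start, end))

    def seq(self, start, end):       # 1-based, end included
        return self.data[start - 1:end]

    def omg(self, start, end):       # 1-based, end excluded
        return self.data[start - 1:end - 1]

    def perl(self, start, end):      # 0-based, end included
        return self.data[start:end + 1]

    def python(self, start, end):    # 0-based, end excluded
        return self.data[start:end]

class DNASeq(Seq):
    COMPLEMENT = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}

    def complement(self, start=None, end=None):
        return DNASeq([self.COMPLEMENT[base] for base in self(start, end)])

s = DNASeq('ATGC')
print(s(1, 3))               # ['A', 'T', 'G']
print(s(1, 3, 'python'))     # ['T', 'G']
print(s(1, 3).complement())  # ['T', 'A', 'C']
```

Note that the defaults (1 and len(self)) only cover the whole sequence for
the seq convention; with attrib='python' they would skip the first element,
so callers using another convention should pass both bounds explicitly.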