[BioPython] sequence proposals (long)

Jeffrey Chang jchang@SMI.Stanford.EDU
Thu, 30 Mar 2000 14:00:53 -0800 (PST)


> [Jeff]
> > For example:
> > >>> seq = GappedSequence("AT-G--C")
> > >>> seq[1:3]
> > 'TG'
> > >>> seq.gapped[1:3]
> > 'T-G'
> 
[Andrew]
> I would have it the other way around, where the default subscript
> contains the '-' and the ".ungapped" attribute yields the sequence.
> This makes it easier to compare relative positions of a sequence
> with a gapped sequence.

It looks like there are two things going on here.  In this example, one
is getting a displayable representation of the sequence, from which you
can make inferences about character lengths and such, and the other is
accessing the biological sequence.

More generally, all sequences need to support some way of getting the
biological sequence, and possibly other access methods depending on the
requirements of the class.

Maybe all sequences will need to support at least biological sequence
access, in addition to a displayable representation?  I'm beginning to
worry about sliding down a slippery slope towards large classes, though...

The disadvantage of having it the other way around is that people who
want to access the underlying biological sequence (without gap characters)
will need to do it a different way for every type of sequence.
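
To make the trade-off concrete, here is a rough sketch of the "biological
sequence by default" layout.  Everything in it apart from GappedSequence
and the .gapped attribute is a hypothetical name made up for illustration
(Sequence, GappedView, count_gc), and it assumes that .gapped is indexed
by biological position and keeps the gaps in between, which is one reading
of the 'T-G' result in the example above.  It is not a proposed
implementation.

```python
class Sequence:
    """Hypothetical base class: subscripting yields the biological sequence."""

    def __init__(self, residues):
        self._residues = residues

    def __len__(self):
        return len(self._residues)

    def __getitem__(self, index):
        return self._residues[index]


class GappedView:
    """Hypothetical display view, indexed by biological position (assumption)."""

    def __init__(self, text, gap):
        self._text = text
        # Column of each non-gap character in the displayed string.
        self._columns = [i for i, ch in enumerate(text) if ch != gap]

    def __str__(self):
        return self._text

    def __getitem__(self, index):
        if not isinstance(index, slice) or index.step is not None:
            raise TypeError("this sketch only handles simple slices")
        columns = self._columns[index]
        if not columns:
            return ""
        # Keep the gaps that fall between the first and last selected residue.
        return self._text[columns[0]:columns[-1] + 1]


class GappedSequence(Sequence):
    def __init__(self, text, gap="-"):
        # The default subscript sees only the residues, never the gaps.
        super().__init__(text.replace(gap, ""))
        self.gapped = GappedView(text, gap)


def count_gc(seq):
    """Uses only the default subscript, so any Sequence subclass works."""
    return sum(1 for base in seq[:] if base in "GC")


seq = GappedSequence("AT-G--C")
print(seq[1:3])         # TG
print(seq.gapped[1:3])  # T-G
print(count_gc(seq))    # 2
```

The point of count_gc is that it never needs to know whether it was handed
a gapped sequence or a plain one.  With the default subscript returning
gap characters instead, it would need a per-class way to reach the
residues.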

Jeff