[Biocorba-l] Part III - SeqFeature / CompositeSeqFeature
Alan Robinson
alan@ebi.ac.uk
Mon, 12 Feb 2001 18:29:26 +0000 (GMT Standard Time)
3) Currently, the 'Seq' interface has a method:
'all_SeqFeatures(in boolean sub_seqfeatures)'.
IMO, the problem with this is that if we request all SeqFeatures of a Seq,
including the sub_SeqFeatures, and then request the sub_SeqFeatures of
one of these SeqFeature objects, we end up getting some of the features
twice, either as duplicates or as equivalents! Plus we don't know which
SeqFeatures have sub-SeqFeatures until we've called this method. (At
least that's my interpretation.)
This sounds like more than enough rope for someone to hang themselves
with...
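To make the duplication concrete, here is a minimal Python sketch. The classes are hypothetical in-memory stand-ins for the IDL interfaces, not the real BioCorba bindings, and I am assuming 'all_SeqFeatures(true)' returns a flattened list. A client that fetches everything and then also descends each feature sees the exons and introns twice:

```python
class SeqFeature:
    """Hypothetical stand-in for the IDL SeqFeature interface."""

    def __init__(self, name, subs=()):
        self.name = name
        self._subs = list(subs)

    def sub_SeqFeatures(self):
        return list(self._subs)


class Seq:
    """Hypothetical stand-in for the IDL Seq interface."""

    def __init__(self, top_level):
        self._top = list(top_level)

    def all_SeqFeatures(self, sub_seqfeatures):
        if not sub_seqfeatures:
            return list(self._top)
        # Assumed behaviour: flatten the whole tree, depth-first.
        flat = []
        stack = list(reversed(self._top))
        while stack:
            feature = stack.pop()
            flat.append(feature)
            stack.extend(reversed(feature.sub_SeqFeatures()))
        return flat


exon = SeqFeature("exon")
intron = SeqFeature("intron")
gene = SeqFeature("gene", [exon, intron])
seq = Seq([gene])

# The client asks for everything, then descends each feature as well.
seen = []
for feature in seq.all_SeqFeatures(True):
    seen.append(feature.name)
    for sub in feature.sub_SeqFeatures():
        seen.append(sub.name)
```

In this local sketch the repeats are the very same objects; across an ORB they might instead be separate references to equivalent objects, which is harder still for the client to detect.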
Solution 1: Remove the 'in boolean sub_seqfeatures' parameter from all methods. It
is then the responsibility of the client to descend the structure and collect
all sub-SeqFeatures. (This is a little more work for the client, but it removes
the potential for object duplication/equivalence and confusion.)
Solution 2: A sequence feature may be either singular, or composed of
other sequence features. This sounds like SeqFeature should be modelled as
a composite to me.
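The client-side descent that Solution 1 asks for can be sketched as follows. This is Python with hypothetical stand-in classes rather than the real bindings, and it assumes every SeqFeature still keeps its 'sub_SeqFeatures()' method while 'all_SeqFeatures()' loses its boolean and returns top-level features only:

```python
class SeqFeature:
    """Hypothetical stand-in; every feature can report its sub-features."""

    def __init__(self, name, subs=()):
        self.name = name
        self._subs = list(subs)

    def sub_SeqFeatures(self):
        return list(self._subs)


class Seq:
    """Hypothetical stand-in; returns top-level features only."""

    def __init__(self, top_level):
        self._top = list(top_level)

    def all_SeqFeatures(self):
        return list(self._top)


def descend(features):
    """Yield each feature exactly once, depth-first."""
    for feature in features:
        yield feature
        yield from descend(feature.sub_SeqFeatures())


gene = SeqFeature("gene", [SeqFeature("exon"), SeqFeature("intron")])
seq = Seq([gene])
names = [f.name for f in descend(seq.all_SeqFeatures())]
```

Because there is only one way to reach each feature, nothing is returned twice.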
................
Personally, I would prefer to remove the 'boolean sub_seqfeatures'
parameter and model SeqFeatures and sub_SeqFeatures using a composite
model:
interface Seq {
    // ...
    SeqFeatureVector get_SeqFeatures();
};
The SeqFeatures returned are the top-level ones only; the client must
descend those SeqFeatures that have sub-SeqFeatures recursively.
Yes, this is more work, but it also means the client is less likely to get
into a mess with duplicate objects.
If a feature may have sub-features, then this should be modelled as a
composite:
interface SeqFeature {
    // All the normal methods, bar the current 'sub_SeqFeatures()' method.
};

interface CompositeSeqFeature : SeqFeature {
    SeqFeatureVector sub_SeqFeatures();
};
Thus for a sequence with a 'gene' feature made up of 'exons' and
'introns', the call:
my $seqFeatureVector = $seq->get_SeqFeatures();
will return a SeqFeatureVector containing a 'gene' feature which is
actually a CompositeSeqFeature object (since it has sub-SeqFeatures of
exon and intron SeqFeatures).
So, as a composite, the 'sub_SeqFeatures()' method is available on this
'gene' CompositeSeqFeature and will return the 'exons' and 'introns' as
SeqFeature objects in a SeqFeatureVector object.
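As a rough illustration of the composite model, here is a Python sketch of the gene example. The classes are hypothetical stand-ins for the proposed IDL interfaces, and 'walk' is a helper name I have made up. Only CompositeSeqFeature carries 'sub_SeqFeatures()', so the client checks the type before descending:

```python
class SeqFeature:
    """Hypothetical leaf feature: no sub_SeqFeatures() method at all."""

    def __init__(self, name):
        self.name = name


class CompositeSeqFeature(SeqFeature):
    """Hypothetical composite: a feature made up of other features."""

    def __init__(self, name, subs):
        super().__init__(name)
        self._subs = list(subs)

    def sub_SeqFeatures(self):
        return list(self._subs)


class Seq:
    """Hypothetical stand-in; get_SeqFeatures() is top-level only."""

    def __init__(self, top_level):
        self._top = list(top_level)

    def get_SeqFeatures(self):
        return list(self._top)


def walk(features, depth=0):
    """Yield (depth, feature) pairs, descending only into composites."""
    for feature in features:
        yield depth, feature
        if isinstance(feature, CompositeSeqFeature):
            yield from walk(feature.sub_SeqFeatures(), depth + 1)


gene = CompositeSeqFeature(
    "gene", [SeqFeature("exon"), SeqFeature("intron"), SeqFeature("exon")]
)
seq = Seq([gene])
outline = [(depth, f.name) for depth, f in walk(seq.get_SeqFeatures())]
```

The 'isinstance' test is a local shortcut; a CORBA client would use whatever narrowing mechanism its language binding provides instead.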
N.B. A side-effect of having the CompositeSeqFeature object is that it
would be possible to have a parameter to specify whether the order of the
sub-SeqFeatures returned in the vector is significant or not. (I cannot
decide if this is appropriate currently.)
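Purely as a sketch of what such an ordering parameter might mean in practice, here is a Python version. The 'ordered' attribute and the 'same_children' helper are my own inventions for illustration and are not part of any proposed interface. Two composites are compared position by position only when both declare their order significant:

```python
from collections import Counter


class SeqFeature:
    """Hypothetical leaf feature."""

    def __init__(self, name):
        self.name = name


class CompositeSeqFeature(SeqFeature):
    """Hypothetical composite with an 'ordered' flag (an assumption)."""

    def __init__(self, name, subs, ordered=False):
        super().__init__(name)
        self._subs = list(subs)
        self.ordered = ordered

    def sub_SeqFeatures(self):
        return list(self._subs)


def same_children(a, b):
    """Compare sub-feature names, honouring order only if both require it."""
    names_a = [f.name for f in a.sub_SeqFeatures()]
    names_b = [f.name for f in b.sub_SeqFeatures()]
    if a.ordered and b.ordered:
        return names_a == names_b
    return Counter(names_a) == Counter(names_b)


forward = [SeqFeature("exon"), SeqFeature("intron")]
backward = [SeqFeature("intron"), SeqFeature("exon")]

strict = same_children(
    CompositeSeqFeature("gene", forward, ordered=True),
    CompositeSeqFeature("gene", backward, ordered=True),
)
loose = same_children(
    CompositeSeqFeature("gene", forward),
    CompositeSeqFeature("gene", backward),
)
```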
--
============================================================
Alan J. Robinson, D.Phil. Tel:+44-(0)1223 494444
European Bioinformatics Institute Fax:+44-(0)1223 494468
EMBL Outstation - Hinxton Email: alan@ebi.ac.uk
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD, UK http://industry.ebi.ac.uk/~alan/
============================================================