[Biojava-l] FeatureHolder.containsFeature()
Matthew Pocock
matthew_pocock@yahoo.co.uk
Tue, 09 Apr 2002 17:57:03 +0100
Hi David,
This is getting into a discussion about what we mean by a feature. There
seems to be a more abstract and platonic entity floating around that is
in some way the 'ideal feature'. Then, there are projections of this
into our mortal and imperfect world of sequences, locations, feature
objects and annotations. Perhaps this could be fixed (if we could rip
everything up and start again) by having one more type of entity that
decouples hierarchy from annotation and which is in a 1-1 relationship
with those pesky feature IDs. The new data-structures (in hokey syntax)
would look like:

seq
  has:
    Set<FeatureImage> features
    SymbolList symbols
    any number of properties

FeatureImage
  hierarchy API:
    (FeatureImage or Sequence) parent
    Set<FeatureImage> features
  Location location
  any number of properties - specific to the image
  has:
    Feature

Feature
  has:
    any number of properties - defining or describing this feature

where "any number of properties" could be a mixture of get/set pairs and
annotation bundles.
For the cases we are describing, it is the Feature objects that get
compared for equality (e.g. forward/reverse strand projections of
features), probably using something similar/identical to the current
rules. Things like feature strand would go on the FeatureImage, whereas
things like blast scores would go on the Feature.
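
To make the split concrete, here is a rough Java sketch of the three
entities. None of this is existing BioJava API: the names Seq,
ImageHolder and FeatureImageDemo, the plain int min/max/strand standing
in for Location, and equality by feature ID alone are all assumptions
made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** The 'ideal feature': identity plus defining properties, no location. */
final class Feature {
    final String id; // assumption: stands in for the feature ID
    final Map<String, Object> properties = new HashMap<>(); // e.g. blast score

    Feature(String id) { this.id = id; }

    // Assumption: two Features are equal when their IDs match.
    @Override public boolean equals(Object o) {
        return o instanceof Feature && ((Feature) o).id.equals(id);
    }
    @Override public int hashCode() { return id.hashCode(); }
}

/** Anything that can hold images: a sequence or another image. */
interface ImageHolder {
    Set<FeatureImage> images();

    /** True if any image below this holder projects a Feature equal to f. */
    default boolean containsFeature(Feature f) {
        for (FeatureImage img : images()) {
            if (img.feature.equals(f) || img.containsFeature(f)) return true;
        }
        return false;
    }
}

/** One projection of a Feature onto a particular parent. */
final class FeatureImage implements ImageHolder {
    final ImageHolder parent;
    final Feature feature;
    final int min, max; // hypothetical stand-in for Location
    final int strand;   // +1 or -1: specific to the image, not the Feature
    private final Set<FeatureImage> children = new LinkedHashSet<>();

    FeatureImage(ImageHolder parent, Feature feature, int min, int max, int strand) {
        this.parent = parent;
        this.feature = feature;
        this.min = min;
        this.max = max;
        this.strand = strand;
        parent.images().add(this); // register with the parent's hierarchy
    }

    @Override public Set<FeatureImage> images() { return children; }
}

/** A sequence: symbols plus the top-level images on it. */
final class Seq implements ImageHolder {
    final String symbols; // hypothetical stand-in for SymbolList
    private final Set<FeatureImage> images = new LinkedHashSet<>();

    Seq(String symbols) { this.symbols = symbols; }

    @Override public Set<FeatureImage> images() { return images; }
}

final class FeatureImageDemo {
    public static void main(String[] args) {
        Seq orig = new Seq("AACCGGTT");
        Seq rev = new Seq("AACCGGTT"); // this string is its own reverse complement
        Feature gene = new Feature("gene-1");
        gene.properties.put("blastScore", 42.0);

        new FeatureImage(orig, gene, 2, 5, +1); // forward-strand projection
        new FeatureImage(rev, gene, 4, 7, -1);  // same feature, reverse strand

        // Both sequences report the feature, so the check is symmetric.
        System.out.println(orig.containsFeature(gene));
        System.out.println(rev.containsFeature(gene));
    }
}
```

Because containsFeature() compares the shared Feature rather than the
image, neither sequence needs to know about the other, which is the
asymmetry David describes for the reverse-complement case.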
Ah - the benefit of hindsight. If only data structures could be
simultaneously fluid and compile-time checked.
Matthew (wishing that all programming was more ontology driven)
> What about the case where you have 2 sequences one of which is a
> sub-sequence of the other, such as a chromosome and a BAC clone? Should the
> chromosome 'contain' all the features of the BAC? Is this another case for a
> different containsFeature() method?
>
> At present RevCompSeq.containsFeature() returns true for all features that
> are contained by it, and by the underlying non-RevComp sequence. That is to
> say for all features in origSeq, revSeq.containsFeature(origFeature) returns
> true. But the reverse is not true, because the original sequence of course
> knows nothing about its RevComp brother. So although both sequences actually
> have the same features (one is just a projection of the other) the original
> sequence does not know to look for features that are really the same just
> backwards. I guess this is another case for some clarity about what we mean
> when two features are equivalent.
>
> David