[Bioperl-l] Opening up deleting features from Seq objects again

David Block dblock@gene.pbi.nrc.ca
Thu, 25 Oct 2001 13:34:49 -0600 (CST)


On Thu, 25 Oct 2001, Ewan Birney wrote:

> On Wed, 24 Oct 2001, David Block wrote:
> 
> > As part of our work in Genquire (soon to be released BSD! Finally!) we
> > have come to a point where it makes sense to include delete_feature
> > functionality in our Bio::SeqI implementation.
> 
> Great!
> 
> > 
> > This should be extended to Bio::Seq generally because:
> > 
> 
> No... see below...
> 
> > With complex GeneStructure objects, rebuilding a hierarchy of annotations
> > is not trivial.  The old technique, flush/add, flattens the hierarchy and
> > can result in multiple copies of exons being added to the sequence.
> > 
> > delete_feature can understand the feature's context and remove only those
> > parts of the parent gene that make sense.
> > 
> > Our implementation looks like this:
> >   my $orphanlist = $seq->delete_feature($feature, $transcript, $gene);
> > 
> > This allows the current exon/transcript/gene hierarchy to be passed to the
> > sequence.  It returns a list of features which are no longer part of a
> > coherent gene structure, i.e. if you want to delete one of two
> > transcripts, but want the hypothetical exons that make up the transcript 
> > to stick around, the exons will be attached as top-level features and
> > returned to you.
> > 
> > This allows our gui to function as expected.
> > 
> > I volunteer to bolt this functionality on to Bio::Seq or one of its
> > descendants, if that's better.  We want it in SeqI, at least as a stub,
> > so SeqCanvas doesn't barf if it's given any SeqI and asked to delete
> > something.
> 
> I'm really against adding this functionality to the Bio::Seq
> implementation.
> 
> 
> You are really forcing the Genquire update model (deleting individual
> features) into the default Bioperl sequence object. I think this is a bad
> idea and should be discouraged.
> 
This is a natural outgrowth of SeqFeature::Gene::GeneStructure et al.
Once you have structured data in memory, flush/add is not appropriate all
of the time.

And what is the problem with having a working implementation of
delete_feature?  You don't have to use it...
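To make the orphan-list behaviour described above concrete, here is a minimal, self-contained sketch. It does not use BioPerl: ToySeq, add_gene and top_features are hypothetical names invented for illustration, features are plain strings, and the two-argument delete_feature is a simplification of the three-argument call quoted earlier. It only shows the idea that exons no longer used by any remaining transcript get re-attached at top level and handed back.

```perl
use strict;
use warnings;

# Toy stand-in for a sequence holding a gene -> transcript -> exon hierarchy.
# Every name here (ToySeq, add_gene, top_features) is hypothetical, not BioPerl API.
package ToySeq;

sub new {
    my $class = shift;
    return bless { genes => {}, top => [] }, $class;
}

# Register a gene as a set of transcript_name => [exon names] pairs.
sub add_gene {
    my ($self, $gene, %transcripts) = @_;
    $self->{genes}{$gene} = {%transcripts};
    return $self;
}

sub top_features { return @{ $_[0]{top} } }

# Remove $transcript from $gene. Exons used by no remaining transcript
# are re-attached as top-level features and returned as the orphan list.
sub delete_feature {
    my ($self, $transcript, $gene) = @_;
    my $transcripts = $self->{genes}{$gene}           or return [];
    my $exons       = delete $transcripts->{$transcript} or return [];

    my %still_used = map { $_ => 1 } map { @$_ } values %$transcripts;
    my @orphans    = grep { !$still_used{$_} } @$exons;

    push @{ $self->{top} }, @orphans;
    return \@orphans;
}

package main;

my $seq = ToySeq->new->add_gene(
    geneA => (t1 => [qw(e1 e2)], t2 => [qw(e2 e3)]),
);

# Deleting t2 leaves e2 in place (t1 still uses it) and orphans e3.
my $orphanlist = $seq->delete_feature('t2', 'geneA');
print "orphans: @$orphanlist\n";    # prints "orphans: e3"
```

A flush/add cycle would have to rebuild the whole hierarchy to reach the same state; here only the one transcript is touched.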

> 
> There are other "update policy" systems for database access (Bioperl-db
> follows more of a CVS-style publish/update model, ish).
> 

Of course, that's fine if that's what you want, but how do you dynamically
decide what goes into the next update?  Bio::Seq should be able to remove
one of its features from memory - this may or may not affect the
underlying persistent storage.

> 
> Furthermore, I have a sneaking suspicion that the feature delete
> requirements become a bit of a can of worms wrt things like multiple
> users.
> 
We have a lock system so that users must register locks on portions of the
database.  Bio::Seq _has_no_persistence_mechanism_, so no two users are
ever looking at the same memory space.  What's your problem?

> 
> What I think you should go for is this sort of model
> 
> 
>   # interface that GenQuire needs for a sequence object to be
>   # editable
> 
I already have all of this - I want to make this portable for the benefit
of SeqCanvas, not for Genquire.

>   Bio::GenQuire::UpdateableSeqI 
> 
>   # implementation of this in pure Perl, can inherit from
>   # Bio::Seq if so wished to reduce coding
> 
>   Bio::GenQuire::Seq
> 
>   # implementation of this with DB backend
> 
>   Bio::GenQuire::DB::DavidsNameSpace::Whatever
> 
> 
> This mirrors what we have done in Ensembl, separating out the "Ensembl
> specific" interfaces into Bio::EnsEMBL::* space, and therefore not
> inflicting Ensembl's update model on everyone else (not that we have one).
> 
But that doesn't allow people to use Ensembl on any old Bio::Seq, which is
what we want from Bio::Tk::SeqCanvas.


> 
> 
> Does this make sense? Do other people have views?
> 
> 
> The important thing is to keep "updatability" coupled with the update
> policy/functionality of the editor/system, as this is very variable.
> 
Updatability can be crippled by default, and made available only under
certain conditions (such as the presence of GeneStructures).
 
> > SeqCanvas does nothing if nothing is returned, again as you would expect.
> > 
So a stub in Bio::SeqI that returns nothing and does nothing accomplishes
most of what we want.
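A rough sketch of what such a stub could look like, again without BioPerl. ToySeqI and ToyEditableSeq are hypothetical stand-ins for an interface and an implementation that opts in, features are plain strings, and returning a plain list (rather than the orphan-list reference used above) is an assumption made to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical interface-level default: deletion is unsupported, so the
# stub changes nothing and returns nothing.
package ToySeqI;

sub new {
    my $class = shift;
    return bless { features => [@_] }, $class;
}

sub features       { return @{ $_[0]{features} } }
sub delete_feature { return }

# Hypothetical implementation that opts in to real deletion.
package ToyEditableSeq;
our @ISA = ('ToySeqI');

sub delete_feature {
    my ($self, $feature) = @_;
    my @all  = $self->features;
    my @kept = grep { $_ ne $feature } @all;
    return if @kept == @all;          # nothing matched, nothing removed
    $self->{features} = \@kept;
    return ($feature);
}

package main;

# A SeqCanvas-like caller: it reacts only if something was actually removed.
sub delete_and_report {
    my ($seq, $feature) = @_;
    my @removed = $seq->delete_feature($feature);
    return @removed ? "removed @removed" : "nothing to do";
}

print delete_and_report(ToySeqI->new(qw(exon1 exon2)),        'exon1'), "\n";   # nothing to do
print delete_and_report(ToyEditableSeq->new(qw(exon1 exon2)), 'exon1'), "\n";   # removed exon1
```

The caller never needs to know which kind of sequence it was handed; an empty return is simply treated as "no change".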

-- 
David Block
soon to be moving
dblock@gene.pbi.nrc.ca
http://bioinfo.pbi.nrc.ca/wiki
NRC Plant Biotechnology Institute
Saskatoon, SK, Canada