[Bioperl-l] Opening up deleting features from Seq objects again

Ewan Birney birney@ebi.ac.uk
Thu, 25 Oct 2001 03:28:34 +0100 (BST)


On Wed, 24 Oct 2001, David Block wrote:

> As part of our work in Genquire (soon to be released BSD! Finally!) we
> have come to a point where it makes sense to include delete_feature
> functionality in our Bio::SeqI implementation.

Great!

> 
> This should be extended to Bio::Seq generally because:
> 

No... see below...

> With complex GeneStructure objects, rebuilding a hierarchy of annotations
> is not trivial.  The old technique, flush/add, flattens the hierarchy and
> can result in multiple copies of exons being added to the sequence.
> 
> delete_feature can understand the feature's context and remove only those
> parts of the parent gene that make sense.
> 
> Our implementation looks like this:
> my $orphanlist=$seq->delete_feature($feature,$transcript,$gene);
> 
> This allows the current exon/transcript/gene hierarchy to be passed to the
> sequence.  It returns a list of features which are no longer part of a
> coherent gene structure, i.e. if you want to delete one of two
> transcripts, but want the hypothetical exons that make up the transcript 
> to stick around, the exons will be attached as top-level features and
> returned to you.
> 
> This allows our gui to function as expected.
> 
> I volunteer to bolt this functionality on to Bio::Seq or one of its
> descendants, if that's better.  We want it in SeqI, at least as a stub,
> so SeqCanvas doesn't barf if it's given any SeqI and asked to delete
> something.

I'm really against adding this functionality to the Bio::Seq
implementation.


You are really forcing the Genquire update model (deleting individual
features) into the default Bioperl sequence object. I think this is a bad
idea and should be discouraged.


There are other "update policy" systems for database access (Bioperl-db,
for example, follows something closer to a CVS-style publish/update model).


Furthermore, I have a sneaking suspicion that the feature-deletion
requirements become a bit of a can of worms with respect to things like
multiple users.


What I think you should go for is this sort of model:


  # interface that Genquire needs for a sequence object to be
  # editable

  Bio::GenQuire::UpdateableSeqI 

  # implementation of this in pure Perl; can inherit from
  # Bio::Seq if desired, to reduce coding

  Bio::GenQuire::Seq

  # implementation of this with DB backend

  Bio::GenQuire::DB::DavidsNameSpace::Whatever
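

A minimal sketch of how the first two layers could fit together, purely
for illustration. The package names are the placeholders from the list
above; the methods (new, add_feature, features, and the one-argument
delete_feature) are assumptions and not existing Bioperl or Genquire
API. Features are plain strings here, and the sketch does not inherit
from Bio::Seq, so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical interface: what an editor needs from an editable sequence.
package Bio::GenQuire::UpdateableSeqI;

sub delete_feature {
    my ($self) = @_;
    die ref($self) . " does not implement delete_feature\n";
}

# Hypothetical pure-Perl implementation. A real one might also inherit
# from Bio::Seq; that is left out here to keep the sketch standalone.
package Bio::GenQuire::Seq;
our @ISA = ('Bio::GenQuire::UpdateableSeqI');

sub new {
    my ($class) = @_;
    return bless { features => [] }, $class;
}

sub add_feature {
    my ($self, @features) = @_;
    push @{ $self->{features} }, @features;
    return;
}

sub features {
    my ($self) = @_;
    return @{ $self->{features} };
}

# Remove one top-level feature. Returns a reference to the list of
# orphaned features, which is always empty in this flat sketch.
sub delete_feature {
    my ($self, $feature) = @_;
    $self->{features} = [ grep { $_ ne $feature } @{ $self->{features} } ];
    return [];
}

package main;

my $seq = Bio::GenQuire::Seq->new;
$seq->add_feature('exon1', 'exon2');

# An editor only attempts a delete on objects that advertise the
# updateable interface; any other sequence object is left alone.
if ($seq->isa('Bio::GenQuire::UpdateableSeqI')) {
    $seq->delete_feature('exon1');
}

print join(',', $seq->features), "\n";
```

With this arrangement the editor checks for the Genquire-specific
interface before deleting anything, so the default sequence object never
has to know about deletion at all.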


This mirrors what we have done in Ensembl, separating out the "Ensembl
specific" interfaces into Bio::EnsEMBL::* space, and therefore not
inflicting Ensembl's update model on everyone else (not that we have one).



Does this make sense? Do other people have views?


The important thing is to keep "updatability" coupled with the update
policy/functionality of the editor/system, as this is very variable.





> 
> SeqCanvas does nothing if nothing is returned, again as you would expect.
> 
> Let me know...
> 
> -- 
> David Block
> soon to be moving
> dblock@gene.pbi.nrc.ca
> http://bioinfo.pbi.nrc.ca/wiki
> NRC Plant Biotechnology Institute
> Saskatoon, SK, Canada
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------