[Bioperl-l] Re: [Fwd: Re: marker manipulation in bioperl]

Heikki Lehvaslaiho heikki@ebi.ac.uk
Fri, 19 Jan 2001 14:05:14 +0000


Arek Kasprzyk wrote:
> 
> On Fri, 19 Jan 2001, Heikki Lehvaslaiho wrote:
> 
> Hi guys,
> I have not been following this discussion very closely but
> thought you might find it useful to poke around a set of ensembl modules
> called ensembl-map. I think that some of the ideas you are talking
> about have been implemented there.

The URL is:

http://www.ensembl.org/cgi-bin/cvsweb/cvsweb.cgi/ensembl-map/modules/Bio/EnsEMBL/Map/

	-Heikki

> Arek
> 
> 
> > -------- Original Message --------
> > Subject: Re: marker manipulation in bioperl
> > Date: Thu, 18 Jan 2001 13:06:26 -0500 (EST)
> > From: Jason Stajich <jason@chg.mc.duke.edu>
> > To: Heikki Lehvaslaiho <heikki@ebi.ac.uk>
> > CC: Eric Snyder <SnyderEE@pbrc.edu>
> >
> > Heikki - yes I think going via Variation::VariantI is a good way - I am
> > not as familiar as I'd like to be with the Variation objects, but this
> > makes sense and I could imagine actually having ways to handle alleles
> > later on which might become useful.
> >
> > I'd still like to have an interface that describes a Marker so we can do
> > some fun inheritance things later with different types of markers.  So
> > I'd make a MarkerI and it would subclass VariantI and add the methods
> > pcr_fwd, pcr_rev (or a more appropriate function name).
> >
> > Eric [ might want to read below first ] does the OO stuff make sense
> > here?  If we make MarkerI with basic methods pcrprimers, chrom, sequence
> > location then a concrete implementation of this can be GenericMarker,
> > with various subclasses - RhMarker, STSMarker, MicrosatteliteMarker or
> > GeneticMarker, RhMarker, ... depending on how you want to describe them
> > and on whether they have specific attributes or methods that are
> > particular to that type of marker.
> >
> > Then on the Maps front, something like a LinkageMap could then be built
> > using GeneticMarkers or STSMarkers, as long as they implement a function
> > like get_genetic_location... or get_location('cM');
> >
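The hierarchy proposed above can be sketched roughly as follows. This is
only an illustration, written in Python for brevity rather than in the Perl
the real modules would use. VariantI here is a bare stand-in for
Bio::Variation::VariantI, and the constructor arguments, the names() helper
and the marker names are assumptions made for the example, not an agreed
design.

```python
from abc import ABC, abstractmethod


class VariantI(ABC):
    """Bare stand-in for Bio::Variation::VariantI (assumption)."""

    @abstractmethod
    def length(self):
        ...


class MarkerI(VariantI):
    """Hypothetical marker interface: primers plus a map location."""

    @abstractmethod
    def pcr_fwd(self):
        ...

    @abstractmethod
    def pcr_rev(self):
        ...

    @abstractmethod
    def get_location(self, unit):
        ...


class GenericMarker(MarkerI):
    """Concrete implementation shared by the specific marker types."""

    def __init__(self, name, fwd, rev, sequence, locations=None):
        self.name = name
        self._fwd = fwd
        self._rev = rev
        self._sequence = sequence
        # Map from unit (e.g. 'cM') to position in that unit.
        self._locations = dict(locations or {})

    def length(self):
        return len(self._sequence)

    def pcr_fwd(self):
        return self._fwd

    def pcr_rev(self):
        return self._rev

    def get_location(self, unit):
        if unit not in self._locations:
            raise ValueError(f"{self.name} has no location in {unit}")
        return self._locations[unit]


class STSMarker(GenericMarker):
    pass


class GeneticMarker(GenericMarker):
    pass


class LinkageMap:
    """Orders any markers that can report a location in cM."""

    def __init__(self, markers):
        self.markers = sorted(markers, key=lambda m: m.get_location("cM"))

    def names(self):
        return [m.name for m in self.markers]
```

The point of the sketch is that LinkageMap depends only on
get_location('cM'), so any marker type that implements it can be placed on
the map, whatever else it adds.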
> > Am I too far out there in interface land for you?
> >
> > -jason
> > On Thu, 18 Jan 2001, Heikki Lehvaslaiho wrote:
> > >
> > > Jason,
> > >
> > > I finally found my notes on upgrading the Ensembl Variation class.
> > > The problem there is that the SNP with an ID can have several
> > > locations in a genome. At the moment when several locations are needed
> > > I simply return several Variation objects with same ID. Not very
> > > pretty, but the interface requires me to return SeqFeature objects not
> > > something that contains them.
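The workaround described above (one feature object per location, all
sharing the same ID) might look like the sketch below. It is written in
Python for brevity, not taken from the Ensembl code; the Variation fields
and both helper functions are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variation:
    """Hypothetical flat feature: one object per genomic location."""
    id: str
    clone: str
    start: int
    end: int


def variations_for(snp_id, locations):
    """Return one Variation per (clone, start, end), all with the same ID."""
    return [Variation(snp_id, clone, start, end)
            for clone, start, end in locations]


def group_by_id(features):
    """Caller-side regrouping, since the interface returns a flat list."""
    grouped = defaultdict(list)
    for feature in features:
        grouped[feature.id].append(feature)
    return dict(grouped)
```

A caller that wants "the SNP and all its locations" has to regroup the flat
list itself, which is the inelegance mentioned above.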
> > >
> > > So, your needs. You said that you need the following methods:
> > >
> > > fwd_primer, rev_primer, length, genetic_location, marker_sequence
> > >
> > > The following lists where they could go (+) or are already in the
> > > Variation classes (%):
> > >
> > > Bio::Variation::VariantI
> > >  subclassed by DNAMutation, RNAChange, AAChange
> > >
> > > + fwd_primer, (moltype not protein)
> > > + rev_primer, (moltype not protein)
> > > % length,
> > > % add_DBLink
> > > % each_DBLink
> > > % status
> > >
> > > Bio::Variation::SeqDiff (VariantI holder class)
> > > % chromosome
> > > + genetic_location, (for strings like 12p13.3 )
> > >
> > > Bio::Variation::Allele
> > >     isa Bio::PrimarySeq
> > > % marker_sequence
> > >     ->seq
> > >     has additional methods repeat_unit and repeat_count
> > >     to describe the sequence: e.g. (CA)5
> > >
> > >
> > > Separately, these are the methods that I have in Variation:
> > >
> > > Bio::Ensembl::ExternalData::Variation
> > > -------------------------------------
> > >  same inheritance as in VariantI
> > >
> > > in addition:
> > >
> > > start_in_clone_coord
> > > end_in_clone_coord
> > > (status)
> > > alleles         (string as opposed to Allele object in VariantI)
> > > (upStreamSeq) (same as in VariantI)
> > > (dnStreamSeq) (same as in VariantI)
> > >
> > >
> > > So, it seems to me almost everything can be accommodated within
> > > VariantI implementing objects.
> > >
> > > Do you want to say whether the marker is defined on DNA or RNA?
> > > A moltype method?
> > > What additional methods can you think of having?
> > >
> > >
> > > It might be enough just to have a
> > > Bio::Variation::Marker class (isa Bio::Variation::VariantI) and add
> > >   + fwd_primer, (moltype not protein)
> > >   + rev_primer, (moltype not protein)
> > > into Bio::Variation::VariantI
> > >
> > > and have a method for genetic_location and override the status method to
> > > accept any scalar (it is now restricted to values 'suspected'/'proven').
> > > It might be a good idea to have a separate chromosome method a la
> > > GenBank/EMBL?
> > >
> > > + chromosome
> > > + genetic_location
> > > + status
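The status override suggested above could be sketched like this. It uses
Python rather than Perl, with Perl-style combined get/set accessors.
Variant is a stand-in for a VariantI implementation, and the
_check_status hook is an assumption made for the example, not an existing
method.

```python
class Variant:
    """Stand-in for a VariantI implementation with a restricted status."""

    _ALLOWED_STATUS = ("suspected", "proven")

    def __init__(self):
        self._status = None

    def status(self, value=None):
        """Set the status when a value is passed; always return it."""
        if value is not None:
            self._check_status(value)
            self._status = value
        return self._status

    def _check_status(self, value):
        if value not in self._ALLOWED_STATUS:
            raise ValueError(f"status must be one of {self._ALLOWED_STATUS}")


class Marker(Variant):
    """Hypothetical Marker: relaxed status plus the two added methods."""

    def __init__(self):
        super().__init__()
        self._chromosome = None
        self._genetic_location = None

    def _check_status(self, value):
        # Override: accept any scalar, reject containers.
        if isinstance(value, (list, tuple, dict, set)):
            raise ValueError("status must be a scalar")

    def chromosome(self, value=None):
        if value is not None:
            self._chromosome = value
        return self._chromosome

    def genetic_location(self, value=None):
        if value is not None:
            self._genetic_location = value
        return self._genetic_location
```

Only the validation step is overridden, so the accessor itself stays in one
place and the base class keeps its 'suspected'/'proven' restriction.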
> > >
> > > You could use the Allele class and VariantI methods to manipulate the
> > > sequence data, or you could come up with a simpler implementation or
> > > interface.
> > >
> > > What do you think?
> > >
> > > Yours,
> > >
> > >     -Heikki
> > >
> > >
> > >
> > > Jason Stajich wrote:
> > > >
> > > > I won't be writing anything substantial until holidays are over, I have
> > > > just been thinking about this and had some time to play last week as
> > > > things were slow for me.  I guessed you would have some ideas and insight.
> > > > Let's see if we can start coming up with an interface or extensions to
> > > > VariantI after Jan 1st.
> > > >
> > > > Happy holidays.
> > > > -jason
> > > >
> > > > On Sat, 23 Dec 2000, Heikki Lehvaslaiho wrote:
> > > >
> > > > > Hi Jason,
> > > > >
> > > > > Sorry I have not answered. I am on holiday and Christmas is in a day
> > > > > or two.
> > > > >
> > > > >
> > > > > Jason Stajich wrote:
> > > > > >
> > > > > > I'm trying to write some code that allows me to manipulate marker
> > > > > > information (SNPs, Microsatellites, STS).  Thought it might be a useful
> > > > > > bioperl object.  Right now I want to associate the following data with a
> > > > > > marker name - fwd_primer, rev_primer, length, genetic_location,
> > > > > > marker_sequence.  I am also querying GDB, genbank, and local databases for
> > > > > > this and thought it would make sense to create a reusable object.  Does
> > > > > > any/all of this fit into any of the Variation modules?  I feel like if
> > > > >
> > > > > It fits fine. You could also have a look at what I have put into
> > > > > ensembl-external as a Variation class. That is a rough and dirty class
> > > > > for holding SNP information.
> > > > >
> > > > > I have plans somewhere to extend it .... (I can not find the text I
> > > > > wrote...have to look with more time on my hands.... )
> > > > >
> > > > > > there isn't one already this should somehow fall into the Variation
> > > > > > category.  I have already written many throw away scripts to manipulate
> > > > > > the information, but it seems to me that this should be a object.  I can
> > > > > > relate the information to physical sequence via blast and the
> > > > > > marker_sequence or e-PCR and the primers, but often I might want to
> > > > > > process the markers for something else.
> > > > > >
> > > > > > Bio::Variation::GeneticMarker?  A SNP would be a sequence change, but also
> > > > > > a marker ... I imagine this working on multiple levels - sequence, maps,
> > > > > > etc.
> > > > >
> > > > > I think we should see what could be put into an interface file and what
> > > > > into an instantiable class.
> > > > >
> > > > > Bio::Variation::MarkerI
> > > > > Bio::Variation::Marker
> > > > >
> > > > > Alternatively, Bio::Variation::VariantI is already there and can be
> > > > > extended.
> > > > >
> > > > > I have to go...
> > > > > Are you going to write this right now or can we think about this
> > > > > over the holidays?
> > > > >
> > > > >       -Heikki
> > > > >
> > > > > > Jason Stajich
> > > > > > jason@chg.mc.duke.edu
> > > > > > Center for Human Genetics
> > > > > > Duke University Medical Center
> > > > > > http://www.chg.duke.edu/
> > > > >
> > > > > --
> > > > > ______ _/      _/_____________________________________________________
> > > > >       _/      _/                      http://www.ebi.ac.uk/mutations/
> > > > >      _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
> > > > >     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
> > > > >    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
> > > > >   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
> > > > >      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> > > > > ___ _/_/_/_/_/________________________________________________________
> > > > >
> > > >
> > >
> > >
> >
> >
> 
> -------------------------------------------------------------------------------
> Dr Arek Kasprzyk
> EMBL-European Bioinformatics Institute.
> Wellcome Trust Genome Campus, Hinxton,
> Cambridge CB10 1SD, UK.
> Tel: +44-(0)1223-494606
> Fax: +44-(0)1223-494468
> -------------------------------------------------------------------------------

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________