[Bioperl-l] Re: [Fwd: Re: marker manipulation in bioperl]

Arek Kasprzyk arek@ebi.ac.uk
Fri, 19 Jan 2001 09:58:45 +0000 (GMT)


On Fri, 19 Jan 2001, Heikki Lehvaslaiho wrote:


Hi guys,
I have not been following this discussion very closely, but I
thought you might find it useful to poke around a set of Ensembl
modules called ensembl-map. I think that some of the ideas you are
talking about have been implemented there.

Arek




 
> -------- Original Message --------
> Subject: Re: marker manipulation in bioperl
> Date: Thu, 18 Jan 2001 13:06:26 -0500 (EST)
> From: Jason Stajich <jason@chg.mc.duke.edu>
> To: Heikki Lehvaslaiho <heikki@ebi.ac.uk>
> CC: Eric Snyder <SnyderEE@pbrc.edu>
> 
> Heikki - yes, I think going via Variation::VariantI is a good way. I am
> not as familiar as I'd like to be with the Variation objects, but this
> makes sense, and I could imagine actually having ways to handle alleles
> later on, which might become useful.
> 
> I'd still like to have an interface describe a Marker so we can do some
> fun inheritance things later with different types of markers.  So I'd
> make a MarkerI that subclasses VariantI and adds the methods pcr_fwd and
> pcr_rev (or more appropriate function names).
> 
> Eric [ might want to read below first ], does the OO stuff make sense
> here?  If we make MarkerI with basic methods (pcrprimers, chrom,
> sequence location), then a concrete implementation of this can be
> GenericMarker, with various subclasses - RhMarker, STSMarker,
> MicrosatteliteMarker or GeneticMarker, RhMarker, ... depending on how
> you want to describe them and on whether they have specific attributes
> or methods that are particular to that type of marker.
> 
> Then on the Maps front, something like a LinkageMap could then be
> built using GeneticMarkers or STSMarkers, as long as they implement a
> function like get_genetic_location... or get_location('cM');
> 
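For illustration only, the hierarchy described above might be sketched
as follows. This is a rough Python rendering under stated assumptions,
not BioPerl code: VariantI here is a minimal stand-in, the constructor
arguments, add_marker and ordered_names are hypothetical, and only the
names MarkerI, GenericMarker, GeneticMarker, STSMarker, LinkageMap,
pcr_fwd, pcr_rev and get_location('cM') come from the message.

```python
from abc import ABC, abstractmethod


class VariantI(ABC):
    """Minimal stand-in for the VariantI interface (assumption)."""

    @abstractmethod
    def length(self):
        ...


class MarkerI(VariantI):
    """Marker interface: a variant that also exposes PCR primers."""

    @abstractmethod
    def pcr_fwd(self):
        ...

    @abstractmethod
    def pcr_rev(self):
        ...

    @abstractmethod
    def get_location(self, unit):
        ...


class GenericMarker(MarkerI):
    """Concrete implementation; the constructor signature is hypothetical."""

    def __init__(self, name, fwd, rev, length, locations=None):
        self.name = name
        self._fwd, self._rev, self._length = fwd, rev, length
        self._locations = dict(locations or {})  # e.g. {"cM": 42.5}

    def length(self):
        return self._length

    def pcr_fwd(self):
        return self._fwd

    def pcr_rev(self):
        return self._rev

    def get_location(self, unit):
        try:
            return self._locations[unit]
        except KeyError:
            raise ValueError(f"{self.name} has no location in {unit}")


class GeneticMarker(GenericMarker):
    pass


class STSMarker(GenericMarker):
    pass


class LinkageMap:
    """Accepts any marker that can report a position in cM."""

    def __init__(self):
        self._markers = []

    def add_marker(self, marker):
        marker.get_location("cM")  # raises ValueError if there is none
        self._markers.append(marker)

    def ordered_names(self):
        ordered = sorted(self._markers, key=lambda m: m.get_location("cM"))
        return [m.name for m in ordered]
```

The map depends only on get_location, so any marker subclass that can
answer in cM could be added without the map knowing its concrete type.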
> Am I too far out there in interface land for you?
> 
> -jason
> On Thu, 18 Jan 2001, Heikki Lehvaslaiho wrote:
> > 
> > Jason,
> > 
> > I finally found my notes on upgrading the Ensembl Variation class.
> > The problem there is that a SNP with a single ID can have several
> > locations in a genome. At the moment, when several locations are
> > needed, I simply return several Variation objects with the same ID.
> > Not very pretty, but the interface requires me to return SeqFeature
> > objects, not something that contains them.
> > 
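A minimal sketch of the workaround just described, assuming a
simplified feature record: one flat object is returned per location,
all sharing the same ID, and a caller can regroup them afterwards. The
Variation fields and both helper functions are hypothetical names used
for illustration, not the Ensembl API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variation:
    """Simplified feature record (hypothetical fields)."""
    id: str
    start: int
    end: int


def variations_for(snp_id, locations):
    """Return one feature per (start, end) location, all with the same ID."""
    return [Variation(snp_id, start, end) for start, end in locations]


def group_by_id(features):
    """Regroup flat features into {id: [(start, end), ...]}."""
    grouped = defaultdict(list)
    for feature in features:
        grouped[feature.id].append((feature.start, feature.end))
    return dict(grouped)
```

The caller still receives plain feature objects, as the interface
requires, and pays for that with an extra grouping step when it needs
all locations of one SNP together.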
> > So, your needs. You said that you need the following methods:
> > 
> > fwd_primer, rev_primer, length, genetic_location, marker_sequence
> > 
> > The following lists where they could go (+) or where they already
> > are in the Variation classes (%):
> > 
> > Bio::Variation::VariantI
> >  subclassed by DNAMutation, RNAChange, AAChange
> > 
> > + fwd_primer, (moltype not protein)
> > + rev_primer, (moltype not protein)
> > % length,
> > % add_DBLink
> > % each_DBLink
> > % status
> > 
> > Bio::Variation::SeqDiff (VariantI holder class)
> > % chromosome
> > + genetic_location, (for strings like 12p13.3 )
> > 
> > Bio::Variation::Allele
> > 	isa Bio::PrimarySeq
> > % marker_sequence
> > 	->seq
> > 	has additional methods repeat_unit and repeat_count
> > 	to describe the sequence: e.g. (CA)5
> > 
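As a rough illustration of how repeat_unit and repeat_count could
describe a sequence such as (CA)5, here is a small Python sketch. It is
not the Bio::Variation::Allele implementation: expanding the sequence
from the repeat description, and the notation method, are assumptions
made for the example.

```python
class Allele:
    """Toy allele holding a sequence and an optional repeat description."""

    def __init__(self, seq=None, repeat_unit=None, repeat_count=None):
        if seq is None:
            if repeat_unit is None or repeat_count is None:
                raise ValueError("need seq, or repeat_unit and repeat_count")
            # Assumption: derive the sequence from the repeat description.
            seq = repeat_unit * repeat_count
        self.seq = seq
        self.repeat_unit = repeat_unit
        self.repeat_count = repeat_count

    def notation(self):
        """Return compact repeat notation if available, else the sequence."""
        if self.repeat_unit and self.repeat_count:
            return f"({self.repeat_unit}){self.repeat_count}"
        return self.seq
```

With these assumptions, Allele(repeat_unit="CA", repeat_count=5) holds
the sequence CACACACACA and reports itself as (CA)5.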
> > 
> > Separately, these are the methods that I have in Variation:
> > 
> > Bio::Ensembl::ExternalData::Variation
> > -------------------------------------
> >  same inheritance as in VariantI
> > 
> > in addition:
> > 
> > start_in_clone_coord
> > end_in_clone_coord
> > (status)
> > alleles	    (string as opposed to Allele object in VariantI)
> > (upStreamSeq) (same as in VariantI)
> > (dnStreamSeq) (same as in VariantI)
> > 
> > 
> > So, it seems to me almost everything can be accommodated within
> > VariantI implementing objects.
> > 
> > Do you want to say whether a marker is defined on DNA or RNA? 
> > A moltype method?
> > What additional methods can you think of having?
> > 
> > 
> > It might be enough just to have a 
> > Bio::Variation::Marker class (isa Bio::Variation::VariantI)
> > add 
> >   + fwd_primer, (moltype not protein)
> >   + rev_primer, (moltype not protein)
> > into Bio::Variation::VariantI
> > 
> > and have a method for genetic_location, and override the status
> > method to accept any scalar (it is now restricted to the values
> > 'suspected'/'proven').  It might be a good idea to have a separate
> > chromosome method a la GenBank/EMBL?
> > 
> > + chromosome
> > + genetic_location
> > + status
> > 
> > You could use the Allele class and VariantI methods to manipulate the
> > sequence data, or you could come up with a simpler implementation or
> > interface.
> > 
> > What do you think?
> > 
> > Yours,
> > 
> > 	-Heikki
> > 
> > 
> > 
> > Jason Stajich wrote:
> > > 
> > > I won't be writing anything substantial until holidays are over, I have
> > > just been thinking about this and had some time to play last week as
> > > things were slow for me.  I guessed you would have some ideas and insight.
> > > Let's see if we start coming up with an interface or extensions to
> > > VariationI after Jan 1st.
> > > 
> > > Happy holidays.
> > > -jason
> > > 
> > > On Sat, 23 Dec 2000, Heikki Lehvaslaiho wrote:
> > > 
> > > > Hi Jason,
> > > >
> > > > Sorry I have not answered. I am on holiday and Christmas is in a day
> > > > or two.
> > > >
> > > >
> > > > Jason Stajich wrote:
> > > > >
> > > > > I'm trying to write some code that allows me to manipulate marker
> > > > > information (SNPs, Microsatellites, STS).  Thought it might be a useful
> > > > > bioperl object.  Right now I want to associate the following data with a
> > > > > marker name - fwd_primer, rev_primer, length, genetic_location,
> > > > > marker_sequence.  I am also querying GDB, GenBank, and local databases for
> > > > > this and thought it would make sense to create a reusable object.  Does
> > > > > any/all of this fit into any of the Variation modules?  I feel like if
> > > >
> > > > It fits fine. You could also have a look at what I have put into
> > > > ensembl-external as a Variation class. That is a rough and dirty class
> > > > for holding SNP information.
> > > >
> > > > I have plans somewhere to extend it .... (I can not find the text I
> > > > wrote...have to look with more time on my hands.... )
> > > >
> > > > > there isn't one already this should somehow fall into the Variation
> > > > > category.  I have already written many throw away scripts to manipulate
> > > > > the information, but it seems to me that this should be an object.  I can
> > > > > relate the information to physical sequence via blast and the
> > > > > marker_sequence or e-PCR and the primers, but often I might want to
> > > > > process the markers for something else.
> > > > >
> > > > > Bio::Variation::GeneticMarker?  A SNP would be a sequence change, but also
> > > > > a marker ... I imagine this working on multiple levels - sequence, maps,
> > > > > etc.
> > > >
> > > > I think we should see what could be put into an interface file and what
> > > > into an instantiable class.
> > > >
> > > > Bio::Variation::MarkerI
> > > > Bio::Variation::Marker
> > > >
> > > > Alternatively, Bio::Variation::VariationI is already there and can be
> > > > extended.
> > > >
> > > > I have to go...
> > > > Are you going to write this right now or can we think about this
> > > > over the holidays?
> > > >
> > > >       -Heikki
> > > >
> > > > > Jason Stajich
> > > > > jason@chg.mc.duke.edu
> > > > > Center for Human Genetics
> > > > > Duke University Medical Center
> > > > > http://www.chg.duke.edu/
> > > >
> > > > --
> > > > ______ _/      _/_____________________________________________________
> > > >       _/      _/                      http://www.ebi.ac.uk/mutations/
> > > >      _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
> > > >     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
> > > >    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
> > > >   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
> > > >      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> > > > ___ _/_/_/_/_/________________________________________________________
> > > >
> > > 
> > 
> 
> 

-------------------------------------------------------------------------------
Dr Arek Kasprzyk
EMBL-European Bioinformatics Institute.
Wellcome Trust Genome Campus, Hinxton, 
Cambridge CB10 1SD, UK.
Tel: +44-(0)1223-494606
Fax: +44-(0)1223-494468
-------------------------------------------------------------------------------