From jungdl at ornl.gov Tue Mar 29 17:52:12 2005
From: jungdl at ornl.gov (David Jung)
Date: Tue Mar 29 17:46:28 2005
Subject: [Biojava-dev] Sequence cloning with annotations & features...
Message-ID: <1205.160.91.212.176.1112136732.squirrel@callisto.ornl.gov>

Hello.

I was wondering if someone could suggest the best way to clone a Sequence (since it isn't a Cloneable). I'd like to clone the raw sequence, annotations and features. I know how to clone the raw sequence and annotations, but I'm not sure how to clone features, including all nested features recursively. (This is so that I can add features to the clone and they won't appear on the original Sequence.)

Thanks,
-David Jung.

From mark.schreiber at novartis.com Tue Mar 29 20:51:02 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Mar 29 20:45:57 2005
Subject: [Biojava-dev] Sequence cloning with annotations & features...
Message-ID:

A very crude way to "deep clone" would be to serialize the Sequence object and deserialize it as another object. We did a lot of work on serialization a few years back to get BioJava to work with it. If you notice anything odd, let us know. The advantage of this is that both objects would be exactly the same, even down to the implementing classes.

Another, less crude method with a bigger overhead would be to serialize to BioSQL and then read it back in.

You could also write the sequence to a flat file (EMBL format or similar) and read it back in again. Actually, that may destroy your nested features, as I don't think there is a flat-file format that can support them.

Finally, and most elegantly, you could code up a Sequence copier that regenerates the annotation and feature hierarchy. This would be ideal and possibly not too hard to do simplistically. If you wanted to get really fancy you could use reflection to make sure you even got the exact same implementation of Sequence, Feature etc.
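
The "Sequence copier" idea above could be sketched roughly as below. This is only an illustration of the recursion, not BioJava code: the `Node` class and `deepCopy` method are hypothetical stand-ins for a feature that holds nested features. In BioJava 1.x the corresponding calls would, as far as I recall, be `Feature.makeTemplate()` on the source feature and `FeatureHolder.createFeature(Feature.Template)` on the target; treat that mapping as an assumption to check against the Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

public class FeatureCopySketch {

    /** Hypothetical stand-in for a feature that can hold nested features. */
    static class Node {
        final String type;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Recreates src and, recursively, every nested child as new objects. */
    static Node deepCopy(Node src) {
        Node copy = new Node(src.type, src.start, src.end);
        for (Node child : src.children) {
            copy.children.add(deepCopy(child));
        }
        return copy;
    }

    public static void main(String[] args) {
        Node gene = new Node("gene", 1, 900);
        gene.children.add(new Node("exon", 1, 300));

        Node clone = deepCopy(gene);
        // Adding to the clone must not show up on the original.
        clone.children.add(new Node("exon", 600, 900));

        System.out.println(gene.children.size());  // prints 1
        System.out.println(clone.children.size()); // prints 2
    }
}
```

The point of copying child by child is that the clone ends up with its own child lists at every level, so later additions to the clone stay local to it.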
Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

_______________________________________________
biojava-dev mailing list
biojava-dev@biojava.org
http://biojava.org/mailman/listinfo/biojava-dev

______________________________________________________________________
The Novartis email address format has changed to firstname.lastname@novartis.com. Please update your address book accordingly.
______________________________________________________________________

From heuermh at acm.org Wed Mar 30 02:05:55 2005
From: heuermh at acm.org (Michael Heuer)
Date: Wed Mar 30 02:12:09 2005
Subject: [Biojava-dev] Sequence cloning with annotations & features...
In-Reply-To:
Message-ID:

On Wed, 30 Mar 2005 mark.schreiber@novartis.com wrote:

> A very crude way to "deep clone" would be to serialize the Sequence object
> and deserialize it as another object. We did a lot of work on
> Serialization a few years back to get biojava to work with it. If you
> notice anything odd let us know. The advantage of this is that both would
> be exactly the same, even down to the implementing classes.

Presumably if you serialize a Sequence to a stream or file and then deserialize it back into memory again, shouldn't the before and after copies be "equal" to each other?

Thinking back to a chapter in Bloch's Effective Java that I haven't read in a while...

   michael

From mark.schreiber at novartis.com Wed Mar 30 02:42:34 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Mar 30 02:37:09 2005
Subject: [Biojava-dev] Sequence cloning with annotations & features...
Message-ID:

Only if you reassign it to the same reference. Otherwise there is a true deep copy, e.g.:

    Sequence seqA = ... // code to initialize seqA
    Sequence seqB = null;
    ObjectOutputStream oos = ...
    ObjectInputStream ois = ...

    oos.writeObject(seqA);
    seqB = (Sequence) ois.readObject();

    seqA == seqB // this is false; there are now two copies in memory in different locations

Unless someone has done some major work on overriding the readResolve method of the Sequence implementation, seqA and seqB do not reference the same memory location.

    seqA.equals(seqB) // probably true, depending on the implementation of equals()

This is a true deep copy. Most clone operations are shallow, so many of the internal objects of a cloned object are canonical: their references point to the same memory location (== returns true).

There are, however, some tricks in BioJava. Because lots of things are singletons (e.g. instances of Alphabet), there are readResolve methods to prevent these being duplicated. If you serialized a Distribution over DNA and then read it back in, both Distributions would still reference the same Alphabet object even though everything else has been duplicated. The regenerated Distribution is simply reconnected with an existing Alphabet instance. This even happens across JVMs!

- Mark
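
For readers who want to try the round trip and the readResolve behaviour described above without BioJava on the classpath, here is a minimal self-contained sketch using only java.io. The classes `Alpha` and `Holder` are hypothetical stand-ins (for a singleton Alphabet and for a Distribution or Sequence that refers to it), and `deepCopy` is an illustrative helper, not a BioJava method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialCopySketch {

    /** Hypothetical singleton, playing the role of a shared Alphabet. */
    static final class Alpha implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Alpha DNA = new Alpha();

        private Alpha() {}

        /** Called during deserialization; swaps the fresh copy for the singleton. */
        private Object readResolve() {
            return DNA;
        }
    }

    /** Hypothetical holder, playing the role of a Distribution or Sequence. */
    static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Alpha alphabet = Alpha.DNA;
        final StringBuilder data = new StringBuilder("gattaca");
    }

    /** Deep copy by writing the object graph to a byte array and reading it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Holder a = new Holder();
        Holder b = deepCopy(a);

        System.out.println(a == b);                   // prints false: two objects
        System.out.println(a.data == b.data);         // prints false: internals duplicated
        System.out.println(a.alphabet == b.alphabet); // prints true: singleton reconnected
    }
}
```

Whether `equals()` reports the two copies as equal is a separate question: it depends entirely on whether the class overrides `equals()`, which the stand-in classes here do not.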

From carys689 at yahoo.com Fri Mar 4 09:48:57 2005
From: carys689 at yahoo.com (Cary Scofield)
Date: Sun Apr 10 22:08:49 2005
Subject: [Biojava-dev] missing docbook file
Message-ID: <20050304145413.87797.qmail@web31010.mail.mud.yahoo.com>

Hello,

I have the source to biojava-1.4pre1 and when I tried to execute the ant target "docbook", it complained that it could not find biojava-doc-main.xml file. If anyone has this file, could you please email it to me.

Thanks,
Cary Scofield

=====
Cary Scofield
carys689@yahoo.com
cps0@labs.gte.com

From uwaje at affectis.com Wed Mar 23 09:35:18 2005
From: uwaje at affectis.com (Nkemdilim Uwaje)
Date: Sun Apr 10 22:11:47 2005
Subject: [Biojava-dev] WG: pI Calculation in BioJava
Message-ID:

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 2950 bytes
Desc: image001.jpg
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050323/72476b84/attachment.jpg

From pac at lifeformulae.com Wed Mar 23 14:13:09 2005
From: pac at lifeformulae.com (Pamela Culpepper)
Date: Sun Apr 10 22:11:48 2005
Subject: [Biojava-dev] LifeFormulae ASN.1 ReaderToolSet (LARTS)
Message-ID: <1111605703.4699.54.camel@localhost.localdomain>

LifeFormulae ASN.1 ReaderToolSet (LARTS)

The LifeFormulae ASN.1 Reader Tool Set (LARTS) presents a new approach to the data mining of binary ASN.1 genomic data.

- NCBI ASN.1 binary data converted to XML data stream
- Complete Conversion of All ASN.1 data fields
- XML Schema Provided for Converted Data
- 100% Java - runs on any platform
- Uses standard Java XMLFilter interface to emit SAX events
- Bundled XML Data Stream Filter Set Includes --
    - Sequence Conversion
    - Fasta-Formatted Sequences
    - GenBank-type data reports

LARTS is intended to read binary ASN.1 data files from the National Center for Biotechnology Information (NCBI) and convert this data into an XML data stream. An XML schema is created to describe it. There is no need to write any code or use the NCBI ASN.1 programming toolkit. It uses the standard Java XMLFilter interface, emitting SAX events, and decodes that part of the ASN.1 standard that is needed to support the NCBI ASN.1 types.

LARTS takes the NCBI ASN.1 grammar and presents it so that freely available software such as Java and XSLT can process and present genomics data in a manner consistent with extensive data mining efforts.

LARTS permits filtering of this XML stream, providing further data enhancements such as sequence conversion. Bundled filters produce Fasta-formatted sequences and GenBank data reports. LARTS enables XML filter chaining, allowing unparalleled access to NCBI genomics data for manipulation.

LARTS was developed by William J.
Eaton (wje@lifeformulae.com) and Pamela A. Culpepper (pac@lifeformulae.com) at LifeFormulae, L.L.C., 2476 Bolsover #524, Houston, TX 77005. URL: http://www.lifeformulae.com. 281-431-2351.

LifeFormulae is currently accepting applications for three beta test sites. Please forward contact information to one of the above company representatives. Deadline for submissions is April 8, 2005.

From smh1008 at cam.ac.uk Wed Mar 30 01:51:07 2005
From: smh1008 at cam.ac.uk (David Huen)
Date: Sun Apr 10 22:11:49 2005
Subject: [Biojava-dev] Sequence cloning with annotations & features...
In-Reply-To: <1205.160.91.212.176.1112136732.squirrel@callisto.ornl.gov>
References: <1205.160.91.212.176.1112136732.squirrel@callisto.ornl.gov>
Message-ID:

On Mar 29 2005, David Jung wrote:

> Hello.
> I was wondering if someone could suggest the best way to
> clone a Sequence (since it isn't a Cloneable). I'd like
> to clone the raw sequence, annotations and features.
> I know how to clone the raw sequence and annotations,
> but aren't sure how to clone features, including
> all nested features recursively.
> (This is so that I can add features to the clone
> and they won't appear on the original Sequence)

If persistence of the cloned sequence is not required, I think it is possible to create a ViewSequence on the original. All features of the original are visible in the view but features added to the view are not visible on the original.

Regards,
David Huen
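
The view idea in the last message can be illustrated with a small overlay sketch. This is not BioJava's ViewSequence implementation, only a guess at the general pattern it describes: the `Base` and `View` classes below are hypothetical stand-ins, with plain strings in place of features.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ViewSketch {

    /** Hypothetical stand-in for the original sequence and its features. */
    static class Base {
        private final List<String> features = new ArrayList<>();

        void addFeature(String feature) {
            features.add(feature);
        }

        List<String> features() {
            return Collections.unmodifiableList(features);
        }
    }

    /** Hypothetical view: shows the base's features plus its own additions. */
    static class View {
        private final Base base;
        private final List<String> added = new ArrayList<>();

        View(Base base) {
            this.base = base;
        }

        /** Stored on the view only; the base is never touched. */
        void addFeature(String feature) {
            added.add(feature);
        }

        List<String> features() {
            List<String> all = new ArrayList<>(base.features());
            all.addAll(added);
            return all;
        }
    }

    public static void main(String[] args) {
        Base original = new Base();
        original.addFeature("gene");

        View view = new View(original);
        view.addFeature("my-annotation");

        System.out.println(original.features()); // prints [gene]
        System.out.println(view.features());     // prints [gene, my-annotation]
    }
}
```

Compared with a deep copy, nothing is duplicated here, which is cheap but also means the view depends on the original staying in memory, in line with the caveat about persistence.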