From brault at embl.de Mon Aug 4 07:47:28 2008
From: brault at embl.de (brault at embl.de)
Date: Mon, 4 Aug 2008 13:47:28 +0200
Subject: [Biojava-l] Blast Parsing MultiQuery
Message-ID: <2ca8e1f10808040447ue17ddc0jafc38dcef06cea95@mail.gmail.com>

Hello,

I would like to know how I can parse an XML file from a BLAST multi-query
run. With the BLAST parser from
http://biojava.org/wiki/BioJava:CookBook:Blast:Parser
I can't find where the tag is caught.

Cheers,

From gwaldon at geneinfinity.org Mon Aug 4 20:22:20 2008
From: gwaldon at geneinfinity.org (George Waldon)
Date: Mon, 04 Aug 2008 17:22:20 -0700
Subject: [Biojava-l] Short names for Amino acid symbols
Message-ID: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>

The link
http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel
does not seem to work.

Did you try SymbolTokenization? Something like:

    Symbol sym; // the symbol to convert
    SymbolTokenization tok = ProteinTools.getTAlphabet().getTokenization("token");
    String s = tok.tokenizeSymbol(sym);

Should give you the short name of any given symbol.

- George

> -----Original Message-----
> From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org]
> On Behalf Of Peter Robinson
> Sent: Sunday, July 27, 2008 8:57 AM
> To: biojava-l at lists.open-bio.org
> Subject: [Biojava-l] Short names for Amino acid symbols
>
> Hi,
>
> thanks to all on the list who helped me get started with Biojava, and by
> the way, the online documents are quite helpful!
>
> I am trying to develop some code to look for signs of positive selection
> in human sequences by making multiple alignments of protein sequences
> and mapping the nucleotide sequences onto this alignment and checking
> synonymous and nonsynonymous nucleotide substitutions in several species
> (etc).
>
> A few small questions;
> 1) I have written a class to encapsulate all I need from a given Genbank
> mRNA sequence; the entire mRNA, the CDS and the corresponding protein
> sequence.
> I have some methods such as the following:
>
> private void setCDSSequence() {
>     Feature CDS = getCDSFeature(this.completeSequence);
>     Location loc = CDS.getLocation();
>     SymbolList symL = this.completeSequence.subList(loc.getMin(),
>         loc.getMax()-3); //-3 to remove stop codon
>     this.CDS= symL;
> }
>
> Question: Why is there (seemingly) no way in Biojava to create a
> Sequence object instead of a SymbolList object? Or did I miss something?
>
> 2) I would then like to printout the protein alignment to check for
> correctness, and it seems there is no way of getting from a symbol to
> the one-letter aminoacid code. That is,
>
> proteinAlignment.get(j).symbolAt(k).getName()
>
> will return "Ala" instead of "A" etc. Is there a good way of getting the
> short symbols?
>
> Thanks, Peter
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From community at struck.lu Tue Aug 5 08:04:29 2008
From: community at struck.lu (community at struck.lu)
Date: Tue, 05 Aug 2008 14:04:29 +0200
Subject: [Biojava-l] Short names for Amino acid symbols
In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
Message-ID:

The link should have been:
http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbi

Hope this time my webmail doesn't garble the message :-(

Greetings,
Daniel

"George Waldon" <gwaldon at geneinfinity.org> wrote:
> The link
> http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does
> not seem to work.
>
> Did you try SymbolTokenization? Something like:
>
>     Symbol sym; // the symbol to convert
>     SymbolTokenization tok = ProteinTools.getTAlphabet().getTokenization("token");
>     String s = tok.tokenizeSymbol(sym);
>
> Should give you the short name of any given symbol.
>
> - George
>
> > -----Original Message-----
> > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org]
> > On Behalf Of Peter Robinson
> > Sent: Sunday, July 27, 2008 8:57 AM
> > To: biojava-l at lists.open-bio.org
> > Subject: [Biojava-l] Short names for Amino acid symbols
> >
> > Hi,
> >
> > thanks to all on the list who helped me get started with Biojava, and by
> > the way, the online documents are quite helpful!
> >
> > I am trying to develop some code to look for signs of positive selection
> > in human sequences by making multiple alignments of protein sequences
> > and mapping the nucleotide sequences onto this alignment and checking
> > synonymous and nonsynonymous nucleotide substitutions in several species
> > (etc).
> >
> > A few small questions;
> > 1) I have written a class to encapsulate all I need from a given Genbank
> > mRNA sequence; the entire mRNA, the CDS and the corresponding protein
> > sequence. I have some methods such as the following:
> >
> > private void setCDSSequence() {
> >     Feature CDS = getCDSFeature(this.completeSequence);
> >     Location loc = CDS.getLocation();
> >     SymbolList symL = this.completeSequence.subList(loc.getMin(),
> >         loc.getMax()-3); //-3 to remove stop codon
> >     this.CDS= symL;
> > }
> >
> > Question: Why is there (seemingly) no way in Biojava to create a
> > Sequence object instead of a SymbolList object? Or did I miss something?
> >
> > 2) I would then like to printout the protein alignment to check for
> > correctness, and it seems there is no way of getting from a symbol to
> > the one-letter aminoacid code. That is,
> >
> > proteinAlignment.get(j).symbolAt(k).getName()
> >
> > will return "Ala" instead of "A" etc. Is there a good way of getting the
> > short symbols?
> >
> > Thanks, Peter
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
_________________________________________________________
Mail sent using root eSolutions Webmailer - www.root.lu

From community at struck.lu Tue Aug 5 08:33:58 2008
From: community at struck.lu (community at struck.lu)
Date: Tue, 05 Aug 2008 14:33:58 +0200
Subject: [Biojava-l] Short names for Amino acid symbols
In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
Message-ID:

The link should have been:
http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbi

Hope this time my webmail doesn't garble the message :-(

Greetings,
Daniel

"George Waldon" wrote:
> The link
> http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does
> not seem to work.
>
> Did you try SymbolTokenization? Something like:
>
>     Symbol sym; // the symbol to convert
>     SymbolTokenization tok = ProteinTools.getTAlphabet().getTokenization("token");
>     String s = tok.tokenizeSymbol(sym);
>
> Should give you the short name of any given symbol.
>
> - George
>
> > -----Original Message-----
> > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org]
> > On Behalf Of Peter Robinson
> > Sent: Sunday, July 27, 2008 8:57 AM
> > To: biojava-l at lists.open-bio.org
> > Subject: [Biojava-l] Short names for Amino acid symbols
> >
> > Hi,
> >
> > thanks to all on the list who helped me get started with Biojava, and by
> > the way, the online documents are quite helpful!
> >
> > I am trying to develop some code to look for signs of positive selection
> > in human sequences by making multiple alignments of protein sequences
> > and mapping the nucleotide sequences onto this alignment and checking
> > synonymous and nonsynonymous nucleotide substitutions in several species
> > (etc).
> >
> > A few small questions;
> > 1) I have written a class to encapsulate all I need from a given Genbank
> > mRNA sequence; the entire mRNA, the CDS and the corresponding protein
> > sequence. I have some methods such as the following:
> >
> > private void setCDSSequence() {
> >     Feature CDS = getCDSFeature(this.completeSequence);
> >     Location loc = CDS.getLocation();
> >     SymbolList symL = this.completeSequence.subList(loc.getMin(),
> >         loc.getMax()-3); //-3 to remove stop codon
> >     this.CDS= symL;
> > }
> >
> > Question: Why is there (seemingly) no way in Biojava to create a
> > Sequence object instead of a SymbolList object? Or did I miss something?
> >
> > 2) I would then like to printout the protein alignment to check for
> > correctness, and it seems there is no way of getting from a symbol to
> > the one-letter aminoacid code. That is,
> >
> > proteinAlignment.get(j).symbolAt(k).getName()
> >
> > will return "Ala" instead of "A" etc. Is there a good way of getting the
> > short symbols?
> >
> > Thanks, Peter
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
_________________________________________________________
Mail sent using root eSolutions Webmailer - www.root.lu

From peter.robinson at t-online.de Wed Aug 6 03:16:01 2008
From: peter.robinson at t-online.de (Peter Robinson)
Date: Wed, 06 Aug 2008 09:16:01 +0200
Subject: [Biojava-l] Short names for Amino acid symbols
In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com>
Message-ID: <48994FB1.1020204@t-online.de>

George Waldon wrote:
> The link http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does not seem to work.
>
> Did you try SymbolTokenization? Something like:
>
>     Symbol sym; // the symbol to convert
>     SymbolTokenization tok = ProteinTools.getTAlphabet().getTokenization("token");
>     String s = tok.tokenizeSymbol(sym);
>
> Should give you the short name of any given symbol.
>
> - George
>
>

Thanks George,

I got things working using the code in
http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbi
which is equivalent to the above.

-Peter

>> -----Original Message-----
>> From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org]
>> On Behalf Of Peter Robinson
>> Sent: Sunday, July 27, 2008 8:57 AM
>> To: biojava-l at lists.open-bio.org
>> Subject: [Biojava-l] Short names for Amino acid symbols
>>
>> Hi,
>>
>> thanks to all on the list who helped me get started with Biojava, and by
>> the way, the online documents are quite helpful!
>>
>> I am trying to develop some code to look for signs of positive selection
>> in human sequences by making multiple alignments of protein sequences
>> and mapping the nucleotide sequences onto this alignment and checking
>> synonymous and nonsynonymous nucleotide substitutions in several species
>> (etc).
>>
>> A few small questions;
>> 1) I have written a class to encapsulate all I need from a given Genbank
>> mRNA sequence; the entire mRNA, the CDS and the corresponding protein
>> sequence. I have some methods such as the following:
>>
>> private void setCDSSequence() {
>>     Feature CDS = getCDSFeature(this.completeSequence);
>>     Location loc = CDS.getLocation();
>>     SymbolList symL = this.completeSequence.subList(loc.getMin(),
>>         loc.getMax()-3); //-3 to remove stop codon
>>     this.CDS= symL;
>> }
>>
>> Question: Why is there (seemingly) no way in Biojava to create a
>> Sequence object instead of a SymbolList object? Or did I miss something?
>>
>> 2) I would then like to printout the protein alignment to check for
>> correctness, and it seems there is no way of getting from a symbol to
>> the one-letter aminoacid code. That is,
>>
>> proteinAlignment.get(j).symbolAt(k).getName()
>>
>> will return "Ala" instead of "A" etc. Is there a good way of getting the
>> short symbols?
>>
>> Thanks, Peter
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>

From willishf at ufl.edu Sat Aug 9 16:13:38 2008
From: willishf at ufl.edu (Scooter Willis)
Date: Sat, 09 Aug 2008 16:13:38 -0400
Subject: [Biojava-l] Quicktree
Message-ID: <489DFA72.4040709@ufl.edu>

I am searching for a Java implementation of Quicktree or other accepted
methods for reconstructing phylogenies from aligned sequence data.
I am currently using quicktree but it requires sequence data to be in
Stockholm format, which requires a conversion step, followed by running
quicktree in Cygwin on Windows and then parsing the Newick/New Hampshire
format. I use quicktree because it is fast against large sequences. I
would like to integrate the tree construction step as part of my Java
application. Anyone know of an accepted Java library for building trees?
There are plenty of tree viewers, but I just can't seem to find anything
to construct trees from aligned sequence data.

Thanks

Scooter

From andreas at sdsc.edu Sun Aug 10 11:59:12 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 10 Aug 2008 08:59:12 -0700
Subject: [Biojava-l] biojava paper published
Message-ID: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com>

Hi All,

I am glad to announce that an Application Note describing BioJava has
been accepted for publication in Bioinformatics.
The advance access manuscript is available from:

http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn397v1?ijkey=jIKd6VUGPrgshbv&keytype=ref

As always,

happy biojava-ing,

Andreas

From hlapp at gmx.net Sun Aug 10 13:21:57 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 10 Aug 2008 13:21:57 -0400
Subject: [Biojava-l] biojava paper published
In-Reply-To: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com>
References: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com>
Message-ID: <7A99248B-DB41-4FCE-AFA2-CE1D0E083827@gmx.net>

Congratulations!!!

-hilmar

On Aug 10, 2008, at 11:59 AM, Andreas Prlic wrote:

> Hi All,
>
> I am glad to announce that an Application Note describing BioJava has
> been accepted for publication in Bioinformatics.
> The advance access manuscript is available from:
>
> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn397v1?ijkey=jIKd6VUGPrgshbv&keytype=ref
>
> As always,
>
> happy biojava-ing,
>
> Andreas
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jimp at compbio.dundee.ac.uk Tue Aug 12 08:39:46 2008
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Tue, 12 Aug 2008 13:39:46 +0100
Subject: [Biojava-l] Quicktree
In-Reply-To: <489DFA72.4040709@ufl.edu>
References: <489DFA72.4040709@ufl.edu>
Message-ID: <48A18492.1020009@compbio.dundee.ac.uk>

Hi Scooter

Scooter Willis wrote:
> I am searching for a Java implementation of Quicktree or other accepted
> methods for reconstructing phylogenies from aligned sequence data. I am
> currently using quicktree but it requires sequence data to be in
> stockholm format which requires a conversion step, followed by running
> quicktree in cygwin on windows and then parse the Newick/New Hampshire
> format. I use quicktree because it is fast against large sequences. I
> would like to integrate the tree construction step as part of my Java
> application. Anyone know of an accepted Java library for building trees?
> Plenty of tree viewers just can't seem to find anything to construct
> from aligned sequence data.

You could use the neighbour joining implementation from the Jalview
source, since it is GPL. It will construct a tree from aligned data, but
it uses the Jalview datamodel, which may cause you problems.

Alternatively, there's a library called PAL, but I'm not sure if that's
actually being supported now.

Jim Procter.
--
-------------------------------------------------------------------
J. B. Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.

From hlapp at gmx.net Tue Aug 12 22:02:33 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Aug 2008 22:02:33 -0400
Subject: [Biojava-l] Quicktree
In-Reply-To: <48A18492.1020009@compbio.dundee.ac.uk>
References: <489DFA72.4040709@ufl.edu> <48A18492.1020009@compbio.dundee.ac.uk>
Message-ID: <3B68B799-0E14-4FCC-ABFB-1E51DC3B67F9@gmx.net>

On Aug 12, 2008, at 8:39 AM, James Procter wrote:

> Alternatively, there's a library called PAL, but I'm not sure if that's
> actually being supported now.

There is a successor project called JEBL (http://jebl.sf.net). It
sounds like it has a NJ implementation:

http://jebl.sourceforge.net/doc/api/jebl/evolution/trees/NeighborJoiningTreeBuilder.html

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From watson at ebi.ac.uk Wed Aug 13 05:00:24 2008
From: watson at ebi.ac.uk (James Watson)
Date: Wed, 13 Aug 2008 10:00:24 +0100
Subject: [Biojava-l] Hands-on course at the European Bioinformatics Institute - Programmatic access in Java: webservices and work flows
Message-ID: <48A2A2A8.7010706@ebi.ac.uk>

Dear colleagues,

A hands-on course called "Programmatic access in Java: webservices &
work flows" will be held on 24-27 November 2008 at the European
Bioinformatics Institute in Hinxton, Cambridgeshire, UK.

This course will give you the skills to leverage webservice technology
to access and manipulate bioinformatics data resources and tools. You
will start with simple scripts accessing individual services and then
build upon this to create work flows to solve more complex problems in a
reusable manner.
Participants will be exposed to open standards such as Simple Object
Access Protocol (SOAP); the Distributed Annotation System (DAS); REST
services and the BioMart web service. Several examples of specific web
services will be included, covering programmatic access to both
databases and tools at the EBI.

The course costs £75 and interested candidates are encouraged to apply
online at the URL below (the training is free, however we need to charge
participants an administration fee of £25 per day to cover food and
materials, and participants need to pay their own travel and
accommodation):

www.ebi.ac.uk/training/handson/course_081124_javawebservices.html

This course will have a maximum number of 40 participants on a
first-come, first-served basis, so please register early to avoid
disappointment.

*The deadline for registering for this event is Monday 27 October 2008.*

Best regards,

James Watson

--
James D Watson
Scientific Training Officer
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Tel: +44(0)1223 492541
http://www.ebi.ac.uk/training/

Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/):
26-27 August 2008: Interactions and Pathways
1-3 September 2008: Joint EBI-ENFIN workshop - Protein function prediction tools
8-11 September 2008: Programmatic access in Perl: webservices & work flows
6-8 October 2008: 2-day dip into the EBI's data resources: Understanding your data
24-27 November 2008: Programmatic access in Java: webservices & work flows

From phidias51 at gmail.com Thu Aug 14 14:26:06 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Thu, 14 Aug 2008 11:26:06 -0700
Subject: [Biojava-l] BioGroovy
Message-ID: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>

I've been using the biojava library with groovy lately and I ran across
the BioGroovy.org site. The site seems to be a placeholder and doesn't
really have much information on it. I was wondering if it was an
official Bio* site?
Has anyone else been using Groovy (or any other scripting languages)
with BioJava?

Also has anyone looked at using Grails with BioSQL? It would seem like
an easy way to get something started quickly.

Regards,

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From andreas at sdsc.edu Thu Aug 14 23:15:00 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 14 Aug 2008 20:15:00 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
Message-ID: <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>

Hi Mark,

On Thu, Aug 14, 2008 at 11:26 AM, Mark Fortner wrote:
> I've been using the biojava library with groovy lately and I ran
> across the BioGroovy.org site. The site seems to be a placeholder and
> doesn't really have much information on it. I was wondering if it was
> an official Bio* site?

I don't think it is an official site from the Open Bioinformatics
Foundation. If you do a whois for biogroovy.org it gives the address
of a bioinformatics center in Korea. In comparison the whois for
biojava points to Chris Dagdigian from the OBF.

Cheers,
Andreas

From ayates at ebi.ac.uk Fri Aug 15 04:55:15 2008
From: ayates at ebi.ac.uk (Andy Yates)
Date: Fri, 15 Aug 2008 09:55:15 +0100
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
Message-ID: <48A54473.1060006@ebi.ac.uk>

Hi Mark,

There has been talk in the past about a groovier version of BioJava or
at the very least showing where Groovy can help to reduce the verbosity
of some parts of the biojava framework. I've done a tiny bit but my work
has only been into prototyping Java code & quickly asserting some
assumptions I had about BioJava (as in how the framework works).

What would be a brilliant step in a Groovier BioJava is to start
leveraging the builders (http://groovy.codehaus.org/Builders). I mean
imagine being able to write something like:

def myReferences = getReferences();

new EmblBuilder().build {
    id('U00096')
    myReferences.each{ ref ->
        reference {
            //
        }
    }
}

I admit it's not a fully formed idea at the moment but hopefully you can
see where I'm going with this :)

WRT Grails; our supported BioSQL API is written in Hibernate; just the
same as GORM (Grails' ORM solution). So technically I cannot see a
reason why it wouldn't be possible; my only wonder is how Grails
controls transaction boundaries and translating this to our BioSQL.

Andy

Mark Fortner wrote:
> I've been using the biojava library with groovy lately and I ran
> across the BioGroovy.org site. The site seems to be a placeholder and
> doesn't really have much information on it. I was wondering if it was
> an official Bio* site? Has anyone else been using Groovy (or any
> other scripting languages) with BioJava?
>
> Also has anyone looked at using Grails with BioSQL? It would seem
> like an easy way to get something started quickly.
>
> Regards,
>

From phidias51 at gmail.com Fri Aug 15 11:24:49 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:24:49 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com> <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
Message-ID: <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>

Thanks Andreas. That confirmed my suspicions.

It sounds like there's some interest in Groovy from the community. At
some point it might be worth putting together a cookbook, although I'm
not sure what site would be appropriate for it.

Mark

On Thu, Aug 14, 2008 at 8:15 PM, Andreas Prlic wrote:
> Hi Mark,
>
> On Thu, Aug 14, 2008 at 11:26 AM, Mark Fortner wrote:
>> I've been using the biojava library with groovy lately and I ran
>> across the BioGroovy.org site. The site seems to be a placeholder and
>> doesn't really have much information on it. I was wondering if it was
>> an official Bio* site?
>
>
> I don't think it is an official site from the Open Bioinformatics
> Foundation. If you do a whois for biogroovy.org it gives the address
> of a bioinformatics center in Korea. In comparison the whois for
> biojava points to Chris Dagdigian from the OBF.
>
> Cheers,
> Andreas
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From andreas at sdsc.edu Fri Aug 15 11:31:45 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 15 Aug 2008 08:31:45 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com> <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com> <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>
Message-ID: <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>

I can help you with the contacts to OBF. One possibility is to ask
the Korean people if they would be interested in collaborating on this
and perhaps transferring the domain to OBF.

Andreas

On Fri, Aug 15, 2008 at 8:24 AM, Mark Fortner wrote:
> Thanks Andreas. That confirmed my suspicions.
>
> It sounds like there's some interest in Groovy from the community. At
> some point it might be worth putting together a cookbook, although I'm
> not sure what site would be appropriate for it.
>
> Mark
>
> On Thu, Aug 14, 2008 at 8:15 PM, Andreas Prlic wrote:
>> Hi Mark,
>>
>> On Thu, Aug 14, 2008 at 11:26 AM, Mark Fortner wrote:
>>> I've been using the biojava library with groovy lately and I ran
>>> across the BioGroovy.org site. The site seems to be a placeholder and
>>> doesn't really have much information on it. I was wondering if it was
>>> an official Bio* site?
>>
>>
>> I don't think it is an official site from the Open Bioinformatics
>> Foundation. If you do a whois for biogroovy.org it gives the address
>> of a bioinformatics center in Korea. In comparison the whois for
>> biojava points to Chris Dagdigian from the OBF.
>>
>> Cheers,
>> Andreas
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>
>
> --
> Mark Fortner
>
> blog: http://feeds.feedburner.com/jroller/ideafactory
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From phidias51 at gmail.com Fri Aug 15 11:41:49 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:41:49 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com> <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com> <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com> <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>
Message-ID: <6e1d61f50808150841t1a8aceb3qba8df2048c0af22a@mail.gmail.com>

Hi Andreas,

I'll check with the owners and see what they say. I think at some point
we'll need a code repository. I didn't see any links from the current
page to either a cvs or svn repository. They may be interested in
hosting the wiki, and perhaps the code could be hosted with the OBF.

I'll let you know what I find out.

Mark

On Fri, Aug 15, 2008 at 8:31 AM, Andreas Prlic wrote:
> I can help you with the contacts to OBF. One possibility is to ask
> the Korean people if they would be interested in collaborating on this
> and perhaps transferring the domain to OBF.
>
> Andreas
>
>
>
> On Fri, Aug 15, 2008 at 8:24 AM, Mark Fortner wrote:
>> Thanks Andreas. That confirmed my suspicions.
>>
>> It sounds like there's some interest in Groovy from the community. At
>> some point it might be worth putting together a cookbook, although I'm
>> not sure what site would be appropriate for it.
>>
>> Mark
>>
>> On Thu, Aug 14, 2008 at 8:15 PM, Andreas Prlic wrote:
>>> Hi Mark,
>>>
>>> On Thu, Aug 14, 2008 at 11:26 AM, Mark Fortner wrote:
>>>> I've been using the biojava library with groovy lately and I ran
>>>> across the BioGroovy.org site. The site seems to be a placeholder and
>>>> doesn't really have much information on it. I was wondering if it was
>>>> an official Bio* site?
>>>
>>>
>>> I don't think it is an official site from the Open Bioinformatics
>>> Foundation. If you do a whois for biogroovy.org it gives the address
>>> of a bioinformatics center in Korea. In comparison the whois for
>>> biojava points to Chris Dagdigian from the OBF.
>>>
>>> Cheers,
>>> Andreas
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>
>>
>>
>> --
>> Mark Fortner
>>
>> blog: http://feeds.feedburner.com/jroller/ideafactory
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From phidias51 at gmail.com Fri Aug 15 11:49:30 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:49:30 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <48A54473.1060006@ebi.ac.uk>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com> <48A54473.1060006@ebi.ac.uk>
Message-ID: <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>

Hi Andy,
The builders and closures definitely make it easier to use and cut
down on the verbosity of the language. You also have built-in support
for CLI, and can leverage libraries like ORO for regular expression
handling. The XmlSlurper makes it easier to handle downloading and
parsing XML.

I created a roadmap for a series of blog articles on various common
bioinformatics-related tasks. I started out with a couple of quick
entries on using NCBI's EUtils with Groovy. If there's some interest,
I'll see about posting the roadmap on a wiki somewhere (along with
some of the "recipes" that I've written). Anyone who's interested
could then contribute their own "recipes" to it.

I'm just getting started with Grails. My initial thought was to
identify the BioSQL objects (i.e. SimpleNamespace, SimpleNCBITaxon,
SimpleBioEntry, etc) as domain objects and have the grails ant script
handle generating the gui and persistence stacks for them (perhaps
using Derby).

This might be overly-simplistic, but I'm looking for ways to make
biojavax, and biosql more easily accessible.

Mark

On Fri, Aug 15, 2008 at 1:55 AM, Andy Yates wrote:
> Hi Mark,
>
> There has been talk in the past about a groovier version of BioJava or at
> the very least showing where Groovy can help to reduce the verbosity of some
> parts of the biojava framework. I've done a tiny bit but my work has only
> been into prototyping Java code & quickly asserting some assumptions I had
> about BioJava (as in how the framework works).
>
> What would be a brilliant step in a Groovier BioJava is to start leveraging
> the builders (http://groovy.codehaus.org/Builders). I mean imagine being
> able to write something like:
>
> def myReferences = getReferences();
>
> new EmblBuilder().build {
>     id('U00096')
>     myReferences.each{ ref ->
>         reference {
>             //
>         }
>     }
> }
>
> I admit it's not a fully formed idea at the moment but hopefully you can see
> where I'm going with this :)
>
> WRT Grails; our supported BioSQL API is written in Hibernate; just the same
> as GORM (Grails' ORM solution). So technically I cannot see a reason why it
> wouldn't be possible; my only wonder is how Grails controls transaction
> boundaries and translating this to our BioSQL.
>
> Andy
>
> Mark Fortner wrote:
>>
>> I've been using the biojava library with groovy lately and I ran
>> across the BioGroovy.org site. The site seems to be a placeholder and
>> doesn't really have much information on it. I was wondering if it was
>> an official Bio* site? Has anyone else been using Groovy (or any
>> other scripting languages) with BioJava?
>>
>> Also has anyone looked at using Grails with BioSQL? It would seem
>> like an easy way to get something started quickly.
>>
>> Regards,
>>
>

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From koen.bruynseels at cropdesign.com Fri Aug 15 12:10:27 2008
From: koen.bruynseels at cropdesign.com (koen.bruynseels at cropdesign.com)
Date: Fri, 15 Aug 2008 18:10:27 +0200
Subject: [Biojava-l] Koen Bruynseels is out of the office.
Message-ID:

I will be out of the office starting 14/08/2008 and will not return until
01/09/2008. I will respond to your message when I return.

From markjschreiber at gmail.com Sat Aug 16 06:11:35 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sat, 16 Aug 2008 18:11:35 +0800
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
 <48A54473.1060006@ebi.ac.uk>
 <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
Message-ID: <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>

Hi -

There are really 2 approaches you could take with Groovy. One would be to
write an entire API. This would be a "BioGroovy". The other approach would
be to use the BioJava API and use Groovy to string together the BioJava
objects to write the programs. I tend to think the second is usually the
better option for dynamic languages like Groovy. If you go for the second
option the BioJava cookbook would be a suitable place for examples.
Actually using Groovy to make programs with the BioJava library could smooth
the learning curve of BioJava a little.

It's worth being mindful of the performance of Groovy at this stage. While
this will undoubtedly improve with future versions you can currently expect
Groovy code to run about 10x slower than Java so it might not be good to
implement any kind of sequence alignment or HMM algorithm in Groovy.

- Mark

From phidias51 at gmail.com Sat Aug 16 12:31:47 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Sat, 16 Aug 2008 09:31:47 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
 <48A54473.1060006@ebi.ac.uk>
 <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
 <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
Message-ID: <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>

Hi Mark,

So far, most of my experiments have made use of BioJava, and CDK. I'm
hoping to try out a few things with the JRI (Java R Interface) and
JEMBOSS as well. I agree, there's no point in simply rewriting BioJava
code in Groovy. I'd rather leverage the appropriate libraries wherever
possible.

This brings up another question, if we're leveraging libraries other
than BioJava does it make sense to post the BioGroovy cookbooks on the
BioJava site?

Also, as Andy intimated though, there is probably going to be a point
where standard Groovy things (like Builders) aren't available in
BioJava. When those things happen, some decision will need to be made
as to whether the Builder should be implemented in Java or in Groovy.
My tendency would be to make it available in BioJava (or BioJavaX),
which would let others leverage the simplified syntax not only in Java
but in scripting languages (such as JRuby and Jython) as well.

As for performance, when you use the Eclipse Groovy plugin it
automatically compiles the Groovy script to Java bytecode, so I
haven't really noticed any difference in speed -- although most of
what I've tried hasn't been computationally challenging either.

Mark

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From markjschreiber at gmail.com Sat Aug 16 23:16:55 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sun, 17 Aug 2008 11:16:55 +0800
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
 <48A54473.1060006@ebi.ac.uk>
 <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
 <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
 <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>
Message-ID: <93b45ca50808162016l2f5160b3qf13ca97e02299154@mail.gmail.com>

> This brings up another question, if we're leveraging libraries other
> than BioJava does it make sense to post the BioGroovy cookbooks on the
> BioJava site?
>

Sure. Maybe put them in their own section but they should go into the
BioJava cookbook.
> Also, as Andy intimated though, there is probably going to be a point
> where standard Groovy things (like Builders) aren't available in
> BioJava. When those things happen, some decision will need to be made
> as to whether the Builder should be implemented in Java or in Groovy.
> My tendency would be to make it available in BioJava (or BioJavaX),
> which would let others leverage the simplified syntax not only in Java
> but in scripting languages (such as JRuby and Jython) as well.
>

I think Groovy could be incorporated in future versions of BioJava.
Java and Groovy are very complementary so there is no real reason not to. I
wouldn't be too surprised if Groovy got absorbed into the JDK at some point.

> As for performance, when you use the Eclipse Groovy plugin it
> automatically compiles the Groovy script to Java bytecode, so I
> haven't really noticed any difference in speed -- although most of
> what I've tried hasn't been computationally challenging either.

Yes. This helps in many cases. It also helps (for compiled applications) if
you define types where you can. This means the runtime doesn't have to do so
much introspection.

- Mark

From augustovmail-java at yahoo.com.br Wed Aug 20 09:36:27 2008
From: augustovmail-java at yahoo.com.br (Augusto Fernandes Vellozo)
Date: Wed, 20 Aug 2008 15:36:27 +0200
Subject: [Biojava-l] Exception org.hibernate.NonUniqueObjectException
Message-ID: <381a3e850808200636n50e8700ap21d54a4554dd2fb5@mail.gmail.com>

Hi,

I am trying to load a large number of features from a file into MySQL and I
am having problems doing this with BioJava/Hibernate.
If I don't flush and clear the session, I get an exception like OutOfMemory.
But after I do the flush/clear, the second query throws this exception:

org.hibernate.NonUniqueObjectException: a different object with the same identifier value was already associated with the session: [Term#23755]

I've already tried clearing the RichObjectFactory cache, but that doesn't help.
Does anyone know what could be happening? Any suggestions?
The code is below.

Thanks,

--
Augusto F. Vellozo

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.TreeSet;

import org.biojava.bio.BioException;
import org.biojavax.RichObjectFactory;
import org.biojavax.SimpleRichAnnotation;
import org.biojavax.bio.seq.RichFeature;
import org.biojavax.bio.seq.SimplePosition;
import org.biojavax.bio.seq.SimpleRichFeature;
import org.biojavax.bio.seq.SimpleRichLocation;
import org.biojavax.bio.seq.RichLocation.Strand;
import org.biojavax.bio.taxa.NCBITaxon;
import org.biojavax.ontology.SimpleComparableOntology;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

public class LoadORFVRTest {

    public static void main(String[] args) {
        SessionFactory sessionFactory =
                new Configuration().configure("hibernate.cfg.xml").buildSessionFactory();
        Session session = sessionFactory.openSession();
        RichObjectFactory.connectToBioSQL(session);
        RichObjectFactory.setDefaultNamespaceName(Messages.getString("nameSpaceDefault"));
        Transaction tx = session.beginTransaction();
        try {
            //file orfs
            File fileOrfs;
            fileOrfs = new File(args[0]);
            String orfName, geneName = "";
            BufferedReader br = new BufferedReader(new FileReader(fileOrfs));
            String line, line2, line3, lineAmino;
            int countOrfs = 0;
            int beginPos = -1, endPos = -1, nextPos = -1;
            int strand = 0;
            int stepORF = Integer.parseInt(Messages.getString("LoadORFVR.printORF"));
            while ((line = br.readLine()) != null) {
                if (line.length() > 0) {
                    if (line.startsWith(">")) {
                        //ORF heading
                        //new ORF
                        //save last ORF
                        if (strand != 0) {
                            saveORF(session, strand, beginPos, endPos, nextPos - 1, geneName,
                                    Integer.parseInt(args[1]));
                            countOrfs++;
                        }
                        if (countOrfs % stepORF == 0) {
                            System.out.println(countOrfs);
                            session.flush();
                            tx.commit();
                            session.clear();
                            session.close();
                            RichObjectFactory.clearLRUCache();
                            session = sessionFactory.openSession();
                            RichObjectFactory.connectToBioSQL(session);
                            RichObjectFactory.setDefaultNamespaceName(
                                    Messages.getString("nameSpaceDefault"));
                            tx = session.beginTransaction();
                        }
                        orfName = line.substring(1);
                        geneName = orfName.substring(0, orfName.indexOf("_"));
                        line = br.readLine();
                        if (line.startsWith("Reading frame: ")) {
                            strand = Integer.parseInt(line.substring(15));
                            if (strand == 0) {
                                System.out.println("Format error, strand = 0");
                            } else {
                                nextPos = 1;
                                beginPos = -1;
                                endPos = -1;
                            }
                        } else {
                            System.out.println("Format error in line 'Reading frame':" + line);
                            strand = 0;
                        }
                        br.readLine(); // empty line
                    } else if (strand != 0) {
                        //ORF sequence
                        line2 = br.readLine();
                        line3 = br.readLine();
                        br.readLine(); // empty line
                        if (strand < 0) {
                            lineAmino = line3;
                        } else {
                            lineAmino = line;
                        }
                        lineAmino = lineAmino.substring(3, lineAmino.length() - 1);
                        if (lineAmino.trim().length() != 0) {
                            if (beginPos < 0) {
                                beginPos = nextPos + firstPosNotSpace(lineAmino) - 1;
                            }
                            endPos = nextPos + lastPosNotSpace(lineAmino) + 1;
                        }
                        nextPos += lineAmino.length();
                    }
                }
            }
            if (strand != 0) {
                saveORF(session, strand, beginPos, endPos, nextPos - 1, geneName,
                        Integer.parseInt(args[1]));
            }
            session.flush();
            tx.commit();
            session.clear();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (tx.isActive()) {
                tx.rollback();
            }
            session.close();
        }
    }

    public static void saveORF(Session session, int strand, int beginPos, int endPos,
            int lastPos, String geneName, int ncbiTaxonId) throws BioException {
        SimplePosition beginPosition, endPosition;
        if (strand < 0 && beginPos < 4) {
            beginPosition = new SimplePosition(true, false, beginPos);
        } else {
            beginPosition = new SimplePosition(beginPos);
        }
        if (strand > 0 && (endPos == lastPos)) {
            endPosition = new SimplePosition(false, true, endPos);
        } else {
            endPosition = new SimplePosition(endPos);
        }
        // save;
        NCBITaxon taxon = (NCBITaxon) session.createQuery(
                "from Taxon where ncbi_taxon_id=:ncbiTaxonNumber").setInteger(
                "ncbiTaxonNumber", ncbiTaxonId).uniqueResult();
        SimpleComparableOntology ontFeatures =
                (SimpleComparableOntology) RichObjectFactory.getObject(
                        SimpleComparableOntology.class,
                        new Object[] {Messages.getString("ontologyFeatures")});
        SimpleComparableOntology ontGeneral =
                ((SimpleComparableOntology) RichObjectFactory.getObject(
                        SimpleComparableOntology.class,
                        new Object[] {Messages.getString("ontologyGeneral")}));
        SimpleRichFeature featureGene = (SimpleRichFeature) session.createQuery(
                "select f from Feature as f join f.parent as b where "
                + "f.name=:geneName and f.typeTerm=:geneTerm and b.taxon=:taxonId ")
                .setString("geneName", geneName)
                .setParameter("taxonId", taxon)
                .setParameter("geneTerm",
                        ontFeatures.getOrCreateTerm(Messages.getString("termGene")))
                .uniqueResult();
        RichFeature.Template ft = new RichFeature.Template();
        ft.location = featureGene.getLocation().translate(0);
        ft.sourceTerm = ontGeneral.getOrCreateTerm(Messages.getString("termVR"));
        ft.typeTerm = ontFeatures.getOrCreateTerm(Messages.getString("termMRNA"));
        ft.annotation = new SimpleRichAnnotation();
        ft.featureRelationshipSet = new TreeSet();
        ft.rankedCrossRefs = new TreeSet();
        SimpleRichFeature featureMRNA = (SimpleRichFeature) featureGene.createFeature(ft);
        featureMRNA.setName(geneName);
        ft = new RichFeature.Template();
        if (strand < 0) {
            ft.location = new SimpleRichLocation(beginPosition, endPosition, 0,
                    Strand.NEGATIVE_STRAND);
        } else {
            ft.location = new SimpleRichLocation(beginPosition, endPosition, 0,
                    Strand.POSITIVE_STRAND);
        }
        ft.sourceTerm = ontGeneral.getOrCreateTerm(Messages.getString("termVR"));
        ft.typeTerm = ontFeatures.getOrCreateTerm(Messages.getString("termORF"));
        ft.annotation = new SimpleRichAnnotation();
        ft.featureRelationshipSet = new TreeSet();
        ft.rankedCrossRefs = new TreeSet();
        SimpleRichFeature featureORF = (SimpleRichFeature) featureMRNA.createFeature(ft);
        featureORF.setName(geneName);
    }

    public static int firstPosNotSpace(String str) {
        int i = 0;
        while (i < str.length() && str.charAt(i) == ' ') {
            i++;
        }
        return i;
    }

    public static int lastPosNotSpace(String str) {
        int i = str.length() - 1;
        while (i >= 0 && str.charAt(i) == ' ') {
            i--;
        }
        return i;
    }
}

From John.Kneisler at USPTO.GOV Wed Aug 20 21:22:28 2008
From: John.Kneisler at USPTO.GOV (Kneisler, John (Raytheon))
Date: Wed, 20 Aug 2008 21:22:28 -0400
Subject: [Biojava-l] reverse translation
Message-ID: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov>

I am new to BioJava and I am trying to write a reverse translation
function. The goal is to find the most probable DNA SymbolList for a given
protein SymbolList and a translation table. I tried using an RNA
alphabet and protein alphabet and then I tried to use the
SimpleReversibleTranslationTable's untranslate method and I got an
"org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create
translation table as the alphabets were different sizes:". Is there any
way to create a reverse translation function using the BioJava
framework?

Thanks

John

From holland at eaglegenomics.com Tue Aug 26 10:20:50 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Tue, 26 Aug 2008 15:20:50 +0100
Subject: [Biojava-l] reverse translation
In-Reply-To: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov>
References: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov>
Message-ID:

Have you tried using the Protein-Term alphabet instead?

cheers,
Richard

2008/8/21 Kneisler, John (Raytheon) :
> I am new to BioJava and I am trying to write a reverse translation
> function. The goal is to find the most probable DNA SymbolList for a given
> protein SymbolList and a translation table. I tried using an RNA
> alphabet and protein alphabet and then I tried to use the
> SimpleReversibleTranslationTable's untranslate method and I got an
> "org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create
> translation table as the alphabets were different sizes:". Is there any
> way to create a reverse translation function using the BioJava
> framework?
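
For what it's worth, the "most probable DNA" idea in the question can be sketched without any BioJava classes at all. The following is only a rough plain-Java illustration, not BioJava's translation-table API: the class ReverseTranslate, its method names and the codon weights are all made up for this example (the weights are placeholders, not measured codon usage), and only a handful of residues are covered. It picks the highest-weighted codon for each residue:

```java
import java.util.HashMap;
import java.util.Map;

final class ReverseTranslate {

    // Rows of {codon, one-letter amino acid ('*' = stop), weight}.
    // The weights are placeholders for illustration, NOT measured codon usage.
    private static final String[][] CODON_WEIGHTS = {
        {"ATG", "M", "1.0"},
        {"GCC", "A", "0.4"}, {"GCT", "A", "0.3"}, {"GCA", "A", "0.2"}, {"GCG", "A", "0.1"},
        {"AAG", "K", "0.6"}, {"AAA", "K", "0.4"},
        {"TGG", "W", "1.0"},
        {"TGA", "*", "0.5"}, {"TAA", "*", "0.3"}, {"TAG", "*", "0.2"}
    };

    // For each amino acid, keep the codon with the highest weight.
    static Map<Character, String> bestCodons() {
        Map<Character, String> best = new HashMap<>();
        Map<Character, Double> bestWeight = new HashMap<>();
        for (String[] row : CODON_WEIGHTS) {
            char aa = row[1].charAt(0);
            double weight = Double.parseDouble(row[2]);
            if (!bestWeight.containsKey(aa) || weight > bestWeight.get(aa)) {
                bestWeight.put(aa, weight);
                best.put(aa, row[0]);
            }
        }
        return best;
    }

    // Replace every residue by its preferred codon; unknown residues are an error.
    static String reverseTranslate(String protein) {
        Map<Character, String> best = bestCodons();
        StringBuilder dna = new StringBuilder(protein.length() * 3);
        for (char aa : protein.toCharArray()) {
            String codon = best.get(aa);
            if (codon == null) {
                throw new IllegalArgumentException("No codon known for residue: " + aa);
            }
            dna.append(codon);
        }
        return dna.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseTranslate("MAKW*"));
    }
}
```

Running main prints the DNA string for the peptide MAKW followed by a stop. A real version would load a complete, organism-specific codon usage table instead of the toy one above.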
>
> Thanks
>
> John
>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

--
Richard Holland
Finance Director
Eagle Genomics
http://www.eaglegenomics.com/

From dmitry.repchevski at bsc.es Thu Aug 28 05:14:08 2008
From: dmitry.repchevski at bsc.es (Dmitry Repchevsky)
Date: Thu, 28 Aug 2008 11:14:08 +0200
Subject: [Biojava-l] Structure.getChains()
Message-ID: <48B66C60.8000104@bsc.es>

Hello!

I have a PDB (1pio) with 2 chains and 1 HETATM solvent (water).
A Structure object contains "models" and "compounds".
"models" contains ALL elements (amino acid chains AND solvents) and
"compounds" only amino acids.
The method Structure.getChains() returns ALL elements... is that OK? I mean
that when I ask for a "chain" I do not expect to get a solvent...

Just a curiosity,

Dmitry

From gabrielle_doan at gmx.net Thu Aug 28 10:16:52 2008
From: gabrielle_doan at gmx.net (Gabrielle Doan)
Date: Thu, 28 Aug 2008 16:16:52 +0200
Subject: [Biojava-l] Problems with adding miRNA to sequence
Message-ID: <48B6B354.6010307@gmx.net>

Hi all,

I would like to insert new features (miRNA) into my existing BioSQL
database. At the moment the database contains the chromosomes 1-22, X, Y and
MT downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/.
And now I have tried to add the information about miRNA from
http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl into my
database with the following code:

private void makeAFeature(String id, String chr, int startpos, int endpos,
        Strand strand, float score, String gene)
        throws ChangeVetoException, IllegalSymbolException {
    RichSequence rs = chromosomes.get(chr);
    if (rs == null) {
        rs = db.SearchForSequence(chr);
        chromosomes.put(chr, rs);
    }
    RichFeature feat = RichFeature.Tools.makeEmptyFeature();
    feat.setName(id);
    RichLocation rl = new SimpleRichLocation(new SimplePosition(startpos),
            new SimplePosition(endpos), 1, strand);
    feat.setLocation(rl);
    try {
        feat.setTypeTerm(RichObjectFactory.getDefaultOntology()
                .getOrCreateTerm("miRNA"));
        feat.setType(feat.getTypeTerm().getName());
    } catch (InvalidTermException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    feat.getAnnotation().setProperty("score", Float.valueOf(score));
    feat.getAnnotation().setProperty("gene", gene);
    feat.setParent(rs);
    rs.getFeatureSet().add(feat);
}

I successfully inserted the information for chromosomes 3-22, X, Y and MT.
But when I try to deal with chromosomes 1 and 2 in the same way I get the
following message:

org.hibernate.exception.DataException: could not insert: [Feature]
    at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:77)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
    at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:40)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2163)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2643)
    at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:51)
    at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)
    at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:298)
    at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
    at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:107)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
    at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
    at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
    at org.hibernate.engine.CascadingAction$5.cascade(CascadingAction.java:218)
    at org.hibernate.engine.Cascade.cascadeToOne(Cascade.java:268)
    at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:216)
    at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
    at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
    at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
    at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
    at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
    at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
    at org.hibernate.event.def.AbstractFlushingEventListener.cascadeOnFlush(AbstractFlushingEventListener.java:131)
    at org.hibernate.event.def.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:122)
    at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:65)
    at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:26)
    at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
    at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
    at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:106)
    at org.viewer.db.HBioSQLDB.updateSequence(HBioSQLDB.java:254)
    at org.viewer.io.MakeMiRNA.splitLine(MakeMiRNA.java:220)
    at org.viewer.io.MakeMiRNA.main(MakeMiRNA.java:57)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value adjusted for column 'rank' at row 1
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2973)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1600)
    at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1129)
    at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:681)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1368)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1283)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1268)
    at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAndExtract(IdentityGenerator.java:73)
    at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:33)
    ... 32 more

It would be very nice if someone could help me. I am grateful for any hints.
Thanks a lot.

Cheers,
Gabrielle

From andreas at sdsc.edu Thu Aug 28 10:38:45 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 28 Aug 2008 07:38:45 -0700
Subject: [Biojava-l] Structure.getChains()
In-Reply-To: <48B66C60.8000104@bsc.es>
References: <48B66C60.8000104@bsc.es>
Message-ID: <59a41c430808280738r2a84af03k21bdb7a228b25f92@mail.gmail.com>

Hi Dmitry,

The object model reflects the organization of data in PDB files
http://www.wwpdb.org/docs.html. Chains can contain a mix of different
groups of atoms. As such the BioJava object model allows you to
distinguish between amino acids, nucleotides and hetatoms on the Group
level, rather than on the chain level.

Andreas

On Thu, Aug 28, 2008 at 2:14 AM, Dmitry Repchevsky wrote:
> Hello!
>
> I have a PDB (1pio) with 2 chains and 1 HETATM solvent (water).
> A Structure object contains "models" and "compounds".
> "models" contains ALL elements (amino acid chains AND solvents) and
> "compounds" only amino acids.
> The method Structure.getChains() returns ALL elements... is that OK? I mean
> that when I ask for a "chain" I do not expect to get a solvent...
> > Just a curiosity, > > Dmitry > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas at sdsc.edu Thu Aug 28 20:40:47 2008 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 28 Aug 2008 17:40:47 -0700 Subject: [Biojava-l] Structure.getChains() In-Reply-To: <48B6C3A3.5010102@bsc.es> References: <48B66C60.8000104@bsc.es> <59a41c430808280714g272fe9cewb2307f759598e779@mail.gmail.com> <48B6BB40.3090406@bsc.es> <59a41c430808280755p1ff06afdl443629d343f4eb51@mail.gmail.com> <48B6C3A3.5010102@bsc.es> Message-ID: <59a41c430808281740v2a228dd1jf99aa185ad01f2fe@mail.gmail.com> Hi Dmitry, Yes you are right. The compound object just contains pointers to the Chains if the info is available from the COMPND records in the PDB headers. If you want to be sure to have a full set of chains, please access them via the structure.getChains() or structure.getChains(modelNr) methods. If you think that is helpful, I can add a note for this into the javadoc for the compound class. It is not having much javadoc anyway, feel free to provide a patch! ;-) Andreas On Thu, Aug 28, 2008 at 8:26 AM, Dmitry Repchevsky wrote: > Hello, > > Yeah, but you have: > > Structure > | > +-Model(s) > | | > | Chain(s) // ALL chains (ATOM/HETATM...) > | > +-Compound(s) > | > Chain(s) // Only those that are in "COMPND" > > Nothing wrong here. I just thought that Structure.getChains() returns chains > from COMPND (Aminoacids) > Cheers, > > Dmitry > > Andreas Prlic wrote: >> >> not sure if I understand you correctly. If you look at >> http://biojava.org/wiki/BioJava:CookBook:PDB:atoms >> you will see how the object model hierarchy looks like. >> Groups are below chain. 
>> >> A >> >> On Thu, Aug 28, 2008 at 7:50 AM, Dmitry Repchevsky >> wrote: >> >>> >>> Hello Andreas, >>> >>> I thought that chains are part of Compound (COMPND) so calling >>> Structure.getChains() would get them from Compound and not from "groups". >>> I was wrong. :-) >>> >>> It would be nice to put this in javadoc of the getChains() method... >>> >>> Thank you very much, >>> >>> Dmitry >>> >>> Andreas Prlic wrote: >>> >>>> >>>> Hi Dmitry, >>>> >>>> The object model reflects the organization of data in PDB files >>>> http://www.wwpdb.org/docs.html. >>>> >>>> Chains can contain a mix of different groups of atoms. As such the >>>> BioJava object model allows you to distinguish between amino acids >>>> nucleotides and hetatoms on the Group level, rather than on the chain >>>> level. >>>> >>>> Andreas >>>> >>>> >>>> >>>> On Thu, Aug 28, 2008 at 2:14 AM, Dmitry Repchevsky >>>> wrote: >>>> >>>> >>>>> >>>>> Hello! >>>>> >>>>> I have a PDB (1pio) with 2 chains and 1 HETATM solvent (water). >>>>> A Structure object contains "models" and "compounds". >>>>> "models" contains ALL elements (aminoacid chains AND solvents) and >>>>> "compaunds" only aminoacids. >>>>> The method Structure.getChains() returns me ALL elements... is it ok? I >>>>> mean >>>>> that when I'm asking for a "chain" I do not expect to get a solvent... 
>>>>> >>>>> Just a curiosity, >>>>> >>>>> Dmitry >>>>> _______________________________________________ >>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>>> >>>>> >>>>> >>>> >>>> >>> >>> >> >> > > From holland at eaglegenomics.com Fri Aug 29 04:08:34 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 29 Aug 2008 09:08:34 +0100 Subject: [Biojava-l] Fwd: reverse translation In-Reply-To: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov> References: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov> Message-ID: I've forwarded this to the list in case someone comes up with an answer quicker than me. :) ---------- Forwarded message ---------- From: Kneisler, John (Raytheon) Date: 2008/8/28 Subject: RE: [Biojava-l] reverse translation To: holland at eaglegenomics.com Richard, Thanks for your reply. I think I was using the Protein-Term alphabet since I assigned a protein FiniteAlphabet variable to ProteinTools.getTAlphabet(). The problem I am having is matching the correct RNA alphabet to the correct protein alphabet. Here are the runtime exceptions I am getting: org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create translation table as the alphabets were different sizes: 22:PROTEIN64:(RNA x RNA x RNA) at org.biojava.bio.symbol.SimpleReversibleTranslationTable.(SimpleRev ersibleTranslationTable.java:88) I have tried using the RNATools.getCodonAlphabet(); method to get an RNA codon alphabet thinking that would match the protein alphabet when assigning a translation table. Unfortunately my lack of experience and understanding is showing. Any help would be appreciated. 
Thanks John Kneisler -----Original Message----- From: dicknetherlands at gmail.com [mailto:dicknetherlands at gmail.com] On Behalf Of Richard Holland Sent: Tuesday, August 26, 2008 10:21 AM To: Kneisler, John (Raytheon) Cc: biojava-l at lists.open-bio.org Subject: Re: [Biojava-l] reverse translation Have you tried using the Protein-Term alphabet instead? cheers, Richard 2008/8/21 Kneisler, John (Raytheon) : > I am new to BioJava and I am trying to write a reverse translation > function. Find the most probable DNA symbolList using a given a > protein symbolList and a translation table. I tried using an RNA > alphabet and protein alphabet and then I tried to use the > SimpleReversibleTranslationTable's untranslate method and I got an > "org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create > translation table as the alphabets were different sizes:". Is there any > way to create a reverse translation function using the BioJava > framework? > > Thanks > > John > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland Finance Director Eagle Genomics http://www.eaglegenomics.com/ -- Richard Holland Finance Director Eagle Genomics http://www.eaglegenomics.com/ From markjschreiber at gmail.com Fri Aug 29 05:11:55 2008 From: markjschreiber at gmail.com (Mark Schreiber) Date: Fri, 29 Aug 2008 17:11:55 +0800 Subject: [Biojava-l] Fwd: reverse translation In-Reply-To: References: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov> Message-ID: <93b45ca50808290211h766b29a7uf1c039b11518817d@mail.gmail.com> Hi - The best approach will be to use the CodonPref and SimpleCodonPref classes. These let you determine the frequency of all synonymous codons for a given amino acid. There is a CodonPrefTools class that contains convenience methods and also has built in codon use tables for some common organisms. 
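To make the codon-preference idea concrete, here is a minimal stand-alone sketch in plain Java. It deliberately does not call the BioJava CodonPref API: the class name, both helper methods and the tiny usage table are hypothetical, and the frequency values are made up for illustration rather than taken from any organism. It only shows the principle of picking the most frequent synonymous codon for each residue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ReverseTranslationSketch {

    // Hypothetical, partial codon-usage table: residue -> (codon -> relative frequency).
    // The frequencies are invented illustration values, not real organism data.
    static Map<Character, Map<String, Double>> demoUsage() {
        Map<Character, Map<String, Double>> usage = new LinkedHashMap<>();
        usage.put('M', codons("AUG", 1.00));
        usage.put('W', codons("UGG", 1.00));
        usage.put('K', codons("AAA", 0.43, "AAG", 0.57));
        usage.put('F', codons("UUU", 0.45, "UUC", 0.55));
        return usage;
    }

    // Builds one synonym map from alternating codon / frequency arguments.
    private static Map<String, Double> codons(Object... pairs) {
        Map<String, Double> m = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            m.put((String) pairs[i], (Double) pairs[i + 1]);
        }
        return m;
    }

    // Returns the codon with the highest frequency; on ties the first one listed wins.
    static String mostFrequentCodon(Map<String, Double> synonyms) {
        String best = null;
        double bestFreq = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : synonyms.entrySet()) {
            if (e.getValue() > bestFreq) {
                bestFreq = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Reverse-translates a one-letter protein string into the most probable RNA string.
    static String reverseTranslate(String protein, Map<Character, Map<String, Double>> usage) {
        StringBuilder rna = new StringBuilder();
        for (char residue : protein.toCharArray()) {
            Map<String, Double> synonyms = usage.get(residue);
            if (synonyms == null || synonyms.isEmpty()) {
                throw new IllegalArgumentException("No codons known for residue " + residue);
            }
            rna.append(mostFrequentCodon(synonyms));
        }
        return rna.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseTranslate("MKFW", demoUsage()));
    }
}
```

In a real program the table would come from a codon usage source for the organism of interest, which is the role the CodonPref classes play in BioJava.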
These classes are in the org.biojava.bio.symbol package. - Mark On 8/29/08, Richard Holland wrote: > I've forwarded this to the list in case someone comes up with an > answer quicker than me. :) > > > ---------- Forwarded message ---------- > From: Kneisler, John (Raytheon) > Date: 2008/8/28 > Subject: RE: [Biojava-l] reverse translation > To: holland at eaglegenomics.com > > > Richard, > Thanks for your reply. I think I was using the Protein-Term > alphabet since I assigned a protein FiniteAlphabet variable to > ProteinTools.getTAlphabet(). The problem I am having is matching the > correct RNA alphabet to the correct protein alphabet. Here are the > runtime exceptions I am getting: > org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create > translation table as the alphabets were different sizes: > 22:PROTEIN64:(RNA x RNA x RNA) > at > org.biojava.bio.symbol.SimpleReversibleTranslationTable.(SimpleRev > ersibleTranslationTable.java:88) > > I have tried using the RNATools.getCodonAlphabet(); method to get an RNA > codon alphabet thinking that would match the protein alphabet when > assigning a translation table. Unfortunately my lack of experience and > understanding is showing. Any help would be appreciated. > > Thanks > John Kneisler > > -----Original Message----- > From: dicknetherlands at gmail.com [mailto:dicknetherlands at gmail.com] On > Behalf Of Richard Holland > Sent: Tuesday, August 26, 2008 10:21 AM > To: Kneisler, John (Raytheon) > Cc: biojava-l at lists.open-bio.org > Subject: Re: [Biojava-l] reverse translation > > Have you tried using the Protein-Term alphabet instead? > > cheers, > Richard > > 2008/8/21 Kneisler, John (Raytheon) : > > I am new to BioJava and I am trying to write a reverse translation > > function. Find the most probable DNA symbolList using a given a > > protein symbolList and a translation table. 
I tried using an RNA > > alphabet and protein alphabet and then I tried to use the > > SimpleReversibleTranslationTable's untranslate method and I got an > > "org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create > > translation table as the alphabets were different sizes:". Is there > any > > way to create a reverse translation function using the BioJava > > framework? > > > > Thanks > > > > John > > > > > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > > -- > Richard Holland > Finance Director > Eagle Genomics > http://www.eaglegenomics.com/ > > > > > > -- > Richard Holland > Finance Director > Eagle Genomics > http://www.eaglegenomics.com/ > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From holland at eaglegenomics.com Sat Aug 30 15:00:37 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Sat, 30 Aug 2008 20:00:37 +0100 Subject: [Biojava-l] [Biojava-dev] Does biojava can calculate evalue ? In-Reply-To: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> Message-ID: BioJava does not include functions to calculate e-values - nor should it, as the e-value you wish to calculate depends entirely on the search algorithm the e-value is associated with. You will need to determine an algorithm of your own that will calculate e-values based on your own knowledge of the sequence search algorithm that you have developed. The simplest form of e-value is a function of the query sequence length and the total length of the sequences available to search. 
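As one possible instance of such a function, the sketch below uses the Karlin-Altschul form that BLAST reports, E = K * m * n * exp(-lambda * S), where m is the query length, n the total database length and S the raw score. Whether this model fits your own search algorithm is an assumption you would have to check; the class and method names are hypothetical, and the lambda and K values in main are placeholders that would have to be estimated for your scoring scheme.

```java
final class EValueSketch {

    // Karlin-Altschul expectation: E = K * m * n * exp(-lambda * S).
    static double eValue(double rawScore, long queryLength, long databaseLength,
                         double lambda, double k) {
        return k * queryLength * databaseLength * Math.exp(-lambda * rawScore);
    }

    // Normalised bit score: S' = (lambda * S - ln K) / ln 2, so that E = m * n * 2^(-S').
    static double bitScore(double rawScore, double lambda, double k) {
        return (lambda * rawScore - Math.log(k)) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // Placeholder statistical parameters (assumed, not derived from any real search).
        double lambda = 0.267;
        double k = 0.041;
        double score = 50.0;
        System.out.println("E-value:   " + eValue(score, 250, 1_000_000L, lambda, k));
        System.out.println("Bit score: " + bitScore(score, lambda, k));
    }
}
```

The E-value grows linearly with the size of the search space and falls exponentially with the score, which matches the description of the simplest form as a function of the two lengths.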
More complex forms include such things as analyses of the content of the sequences and the percent identity of the match, or in fact anything that may be appropriate and significant to the way the search algorithm works. It's basically up to you how to define it. Good luck! cheers, Richard 2008/8/30 simpleyrx : > > Dear experts, > > I am developing a sequence search program. My program can now calculate the score value, and I want to provide an expectation value (like the BLAST e-value) to the user. I do not know how to do this step. Can BioJava do it? Thank you in advance. > > > -- > > > Student > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > > -- Richard Holland Finance Director Eagle Genomics http://www.eaglegenomics.com/ From brault at embl.de Mon Aug 4 11:47:28 2008 From: brault at embl.de (brault at embl.de) Date: Mon, 4 Aug 2008 13:47:28 +0200 Subject: [Biojava-l] Blast Parsing MultiQuery Message-ID: <2ca8e1f10808040447ue17ddc0jafc38dcef06cea95@mail.gmail.com> Hello, I would like to know how I can parse an XML file from a BLAST multi-query run. With the BLAST parser from http://biojava.org/wiki/BioJava:CookBook:Blast:Parser I can't find where the tag is caught. Cheers, From gwaldon at geneinfinity.org Tue Aug 5 00:22:20 2008 From: gwaldon at geneinfinity.org (George Waldon) Date: Mon, 04 Aug 2008 17:22:20 -0700 Subject: [Biojava-l] Short names for Amino acid symbols Message-ID: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> The link http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does not seem to work. Did you try SymbolTokenization? Something like: Symbol sym; SymbolTokenization tok = ProteinTools.getTAlphabet().getTokenization("token"); String s = tok.tokenizeSymbol(sym); Should give you the short name of any given symbol.
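For readers who only need the mapping itself, the following is a stand-alone sketch in plain Java that converts a three-letter residue name such as "Ala" into its one-letter code. It is not the BioJava tokenization API: the class and method names are hypothetical, it covers only the 20 standard amino acids, and returning 'X' for anything unknown is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class OneLetterCodeSketch {

    // Standard three-letter / one-letter amino acid codes.
    private static final String[][] CODES = {
        {"ALA", "A"}, {"ARG", "R"}, {"ASN", "N"}, {"ASP", "D"}, {"CYS", "C"},
        {"GLN", "Q"}, {"GLU", "E"}, {"GLY", "G"}, {"HIS", "H"}, {"ILE", "I"},
        {"LEU", "L"}, {"LYS", "K"}, {"MET", "M"}, {"PHE", "F"}, {"PRO", "P"},
        {"SER", "S"}, {"THR", "T"}, {"TRP", "W"}, {"TYR", "Y"}, {"VAL", "V"}
    };

    private static final Map<String, Character> LOOKUP = new HashMap<>();
    static {
        for (String[] pair : CODES) {
            LOOKUP.put(pair[0], pair[1].charAt(0));
        }
    }

    // Case-insensitive lookup; unknown names map to 'X' (a choice made for this sketch).
    static char oneLetter(String threeLetterName) {
        String key = threeLetterName.trim().toUpperCase(Locale.ROOT);
        Character code = LOOKUP.get(key);
        return code == null ? 'X' : code;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Ala", "Trp", "Lys"}) {
            System.out.println(name + " -> " + oneLetter(name));
        }
    }
}
```

Inside BioJava the tokenization approach is preferable, because it also handles gaps and ambiguity symbols that a fixed table like this one ignores.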
- George > -----Original Message----- > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l- > bounces at lists.open-bio.org] On Behalf Of Peter Robinson > Sent: Sunday, July 27, 2008 8:57 AM > To: biojava-l at lists.open-bio.org > Subject: [Biojava-l] Short names for Amino acid symbols > > Hi, > > thanks to all on the list who helped me get started with Biojava, and by > the way, the online documents are quite helpful! > > I am trying to develop some code to look for signs of positive selection > in human sequences by making multiple alignments of protein sequences > and mapping the nucleotide sequences onto this alignment and checking > synonymous and nonsynonymous nucleotide substitutions in several species > (etc). > > A few small questions; > 1) I have written a class to encapsulate all I need from a given Genbank > mRNA sequence; the entire mRNA, the CDS and the corresponding protein > sequence. I have some methods such as the following: > > private void setCDSSequence() { > Feature CDS = getCDSFeature(this.completeSequence); > Location loc = CDS.getLocation(); > SymbolList symL = this.completeSequence.subList(loc.getMin(), > loc.getMax()-3); //-3 to remove stop codon > this.CDS= symL; > } > > Question: Why is there (seemingly) no way in Biojava to create a > Sequence object instead of a SymbolList object? Or did I miss something? > > 2) I would then like to printout the protein alignment to check for > correctness, and it seems there is no way of getting from a symbol to > the one-letter aminoacid code. That is, > > proteinAlignment.get(j).symbolAt(k).getName() > > will return "Ala" instead of "A" etc. Is there a good way of getting the > short symbols? 
> > Thanks, Peter > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From community at struck.lu Tue Aug 5 12:04:29 2008 From: community at struck.lu (community at struck.lu) Date: Tue, 05 Aug 2008 14:04:29 +0200 Subject: [Biojava-l] Short names for Amino acid symbols In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> Message-ID: The link should have been:http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiHope this time my webmail doesn't garble the message :-(Greetings,Daniel"George Waldon" <gwaldon at geneinfinity.org> wrote: > The link > http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does > not seem to work. > > Did you try SymbolTonenization? Something like: > > Symbol s; > SymbolTokenization tok= ProteinTools.getTAlphabet().getTokenization("token"); > String s = tok.tokenizeSymbol(s); > > Should give you the short name of any given symbol. > > - George > > > -----Original Message----- > > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l- > > bounces at lists.open-bio.org] On Behalf Of Peter Robinson > > Sent: Sunday, July 27, 2008 8:57 AM > > To: biojava-l at lists.open-bio.org > > Subject: [Biojava-l] Short names for Amino acid symbols > > > > Hi, > > > > thanks to all on the list who helped me get started with Biojava, and by > > the way, the online documents are quite helpful! > > > > I am trying to develop some code to look for signs of positive selection > > in human sequences by making multiple alignments of protein sequences > > and mapping the nucleotide sequences onto this alignment and checking > > synonymous and nonsynonymous nucleotide substitutions in several species > > (etc). 
> > > > A few small questions; > > 1) I have written a class to encapsulate all I need from a given Genbank > > mRNA sequence; the entire mRNA, the CDS and the corresponding protein > > sequence. I have some methods such as the following: > > > > private void setCDSSequence() { > > Feature CDS = getCDSFeature(this.completeSequence); > > Location loc = CDS.getLocation(); > > SymbolList symL = this.completeSequence.subList(loc.getMin(), > > loc.getMax()-3); //-3 to remove stop codon > > this.CDS= symL; > > } > > > > Question: Why is there (seemingly) no way in Biojava to create a > > Sequence object instead of a SymbolList object? Or did I miss something? > > > > 2) I would then like to printout the protein alignment to check for > > correctness, and it seems there is no way of getting from a symbol to > > the one-letter aminoacid code. That is, > > > > proteinAlignment.get(j).symbolAt(k).getName() > > > > will return "Ala" instead of "A" etc. Is there a good way of getting the > > short symbols? 
> > > > Thanks, Peter > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > _________________________________________________________ Mail sent using root eSolutions Webmailer - www.root.lu From community at struck.lu Tue Aug 5 12:33:58 2008 From: community at struck.lu (community at struck.lu) Date: Tue, 05 Aug 2008 14:33:58 +0200 Subject: [Biojava-l] Short names for Amino acid symbols In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> Message-ID: The link should have been: http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbi Hope this time my webmail doesn't garble the message :-( Greetings, Daniel "George Waldon" wrote: > The link > http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does > not seem to work. > > Did you try SymbolTonenization? Something like: > > Symbol s; > SymbolTokenization tok= ProteinTools.getTAlphabet().getTokenization("token"); > String s = tok.tokenizeSymbol(s); > > Should give you the short name of any given symbol. > > - George > > > -----Original Message----- > > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l- > > bounces at lists.open-bio.org] On Behalf Of Peter Robinson > > Sent: Sunday, July 27, 2008 8:57 AM > > To: biojava-l at lists.open-bio.org > > Subject: [Biojava-l] Short names for Amino acid symbols > > > > Hi, > > > > thanks to all on the list who helped me get started with Biojava, and by > > the way, the online documents are quite helpful! 
> > > > I am trying to develop some code to look for signs of positive selection > > in human sequences by making multiple alignments of protein sequences > > and mapping the nucleotide sequences onto this alignment and checking > > synonymous and nonsynonymous nucleotide substitutions in several species > > (etc). > > > > A few small questions; > > 1) I have written a class to encapsulate all I need from a given Genbank > > mRNA sequence; the entire mRNA, the CDS and the corresponding protein > > sequence. I have some methods such as the following: > > > > private void setCDSSequence() { > > Feature CDS = getCDSFeature(this.completeSequence); > > Location loc = CDS.getLocation(); > > SymbolList symL = this.completeSequence.subList(loc.getMin(), > > loc.getMax()-3); //-3 to remove stop codon > > this.CDS= symL; > > } > > > > Question: Why is there (seemingly) no way in Biojava to create a > > Sequence object instead of a SymbolList object? Or did I miss something? > > > > 2) I would then like to printout the protein alignment to check for > > correctness, and it seems there is no way of getting from a symbol to > > the one-letter aminoacid code. That is, > > > > proteinAlignment.get(j).symbolAt(k).getName() > > > > will return "Ala" instead of "A" etc. Is there a good way of getting the > > short symbols? 
> > > > Thanks, Peter > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > _________________________________________________________ Mail sent using root eSolutions Webmailer - www.root.lu From peter.robinson at t-online.de Wed Aug 6 07:16:01 2008 From: peter.robinson at t-online.de (Peter Robinson) Date: Wed, 06 Aug 2008 09:16:01 +0200 Subject: [Biojava-l] Short names for Amino acid symbols In-Reply-To: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> References: <20080805002220.23235.qmail@mmm1924.dulles19-verio.com> Message-ID: <48994FB1.1020204@t-online.de> George Waldon wrote: > The link http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbiDaniel does not seem to work. > > Did you try SymbolTonenization? Something like: > > Symbol s; > SymbolTokenization tok= ProteinTools.getTAlphabet().getTokenization("token"); > String s = tok.tokenizeSymbol(s); > > Should give you the short name of any given symbol. > > - George > > Thanks George, I got things working using the code in http://biojava.org/wiki/BioJava:Cookbook:Translation:OneLetterAmbi which is equivalent to the above. -Peter >> -----Original Message----- >> From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l- >> bounces at lists.open-bio.org] On Behalf Of Peter Robinson >> Sent: Sunday, July 27, 2008 8:57 AM >> To: biojava-l at lists.open-bio.org >> Subject: [Biojava-l] Short names for Amino acid symbols >> >> Hi, >> >> thanks to all on the list who helped me get started with Biojava, and by >> the way, the online documents are quite helpful! 
>> >> I am trying to develop some code to look for signs of positive selection >> in human sequences by making multiple alignments of protein sequences >> and mapping the nucleotide sequences onto this alignment and checking >> synonymous and nonsynonymous nucleotide substitutions in several species >> (etc). >> >> A few small questions; >> 1) I have written a class to encapsulate all I need from a given Genbank >> mRNA sequence; the entire mRNA, the CDS and the corresponding protein >> sequence. I have some methods such as the following: >> >> private void setCDSSequence() { >> Feature CDS = getCDSFeature(this.completeSequence); >> Location loc = CDS.getLocation(); >> SymbolList symL = this.completeSequence.subList(loc.getMin(), >> loc.getMax()-3); //-3 to remove stop codon >> this.CDS= symL; >> } >> >> Question: Why is there (seemingly) no way in Biojava to create a >> Sequence object instead of a SymbolList object? Or did I miss something? >> >> 2) I would then like to printout the protein alignment to check for >> correctness, and it seems there is no way of getting from a symbol to >> the one-letter aminoacid code. That is, >> >> proteinAlignment.get(j).symbolAt(k).getName() >> >> will return "Ala" instead of "A" etc. Is there a good way of getting the >> short symbols? >> >> Thanks, Peter >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From willishf at ufl.edu Sat Aug 9 20:13:38 2008 From: willishf at ufl.edu (Scooter Willis) Date: Sat, 09 Aug 2008 16:13:38 -0400 Subject: [Biojava-l] Quicktree Message-ID: <489DFA72.4040709@ufl.edu> I am searching for a Java implementation of Quicktree or other accepted methods for reconstructing phylogenies from aligned sequence data. 
I am currently using quicktree, but it requires sequence data to be in Stockholm format, which means a conversion step, followed by running quicktree in Cygwin on Windows and then parsing the Newick/New Hampshire format. I use quicktree because it is fast against large sequences. I would like to integrate the tree construction step as part of my Java application. Does anyone know of an accepted Java library for building trees? There are plenty of tree viewers, but I can't seem to find anything that constructs a tree from aligned sequence data. Thanks Scooter From andreas at sdsc.edu Sun Aug 10 15:59:12 2008 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 10 Aug 2008 08:59:12 -0700 Subject: [Biojava-l] biojava paper published Message-ID: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com> Hi All, I am glad to announce that an Application Note describing BioJava has been accepted for publication in Bioinformatics. The advance access manuscript is available from: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn397v1?ijkey=jIKd6VUGPrgshbv&keytype=ref As always, happy biojava-ing, Andreas From hlapp at gmx.net Sun Aug 10 17:21:57 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Sun, 10 Aug 2008 13:21:57 -0400 Subject: [Biojava-l] biojava paper published In-Reply-To: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com> References: <59a41c430808100859k6c8dd460kd2ca8c65de650583@mail.gmail.com> Message-ID: <7A99248B-DB41-4FCE-AFA2-CE1D0E083827@gmx.net> Congratulations!!! -hilmar On Aug 10, 2008, at 11:59 AM, Andreas Prlic wrote: > Hi All, > > I am glad to announce that an Application Note describing BioJava has > been accepted for publication in Bioinformatics.
> The advance access manuscript is available from: > > http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn397v1?ijkey=jIKd6VUGPrgshbv&keytype=ref > > As alwyas, > > happy biojava-ing, > > Andreas > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jimp at compbio.dundee.ac.uk Tue Aug 12 12:39:46 2008 From: jimp at compbio.dundee.ac.uk (James Procter) Date: Tue, 12 Aug 2008 13:39:46 +0100 Subject: [Biojava-l] Quicktree In-Reply-To: <489DFA72.4040709@ufl.edu> References: <489DFA72.4040709@ufl.edu> Message-ID: <48A18492.1020009@compbio.dundee.ac.uk> Hi Scooter Scooter Willis wrote: > I am searching for a Java implementation of Quicktree or other accepted > methods for reconstructing phylogenies from aligned sequence data. I am > currently using quicktree but it requires sequence data to be in > stockholm format which requires a conversion step, followed by running > quicktree in cygwin on windows and then parse the Newick/Hew Hampshire > format. I use quicktree because it is fast against large sequences. I > would like to integrate the tree construction step as part of my Java > application. Anyone know of an accepted Java library for building trees? > Plenty of tree viewers just can't seem to find anything to construct > from aligned sequence data. You could use the neighbour joining implementation from the Jalview source - since it is GPL. It will construct a tree from aligned data, but it uses the Jalview datamodel, which may cause you problems. Alternatively, there's a library called PAL, but I'm not sure if that's actually being supported now. Jim Procter. -- ------------------------------------------------------------------- J. B. 
Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk The University of Dundee is a Scottish Registered Charity, No. SC015096. From hlapp at gmx.net Wed Aug 13 02:02:33 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 12 Aug 2008 22:02:33 -0400 Subject: [Biojava-l] Quicktree In-Reply-To: <48A18492.1020009@compbio.dundee.ac.uk> References: <489DFA72.4040709@ufl.edu> <48A18492.1020009@compbio.dundee.ac.uk> Message-ID: <3B68B799-0E14-4FCC-ABFB-1E51DC3B67F9@gmx.net> On Aug 12, 2008, at 8:39 AM, James Procter wrote: > Alternatively, there's a library called PAL, but I'm not sure if > that's > actually being supported now. There is a successor project called JEBL (http://jebl.sf.net). It sounds like it has a NJ implementation: http://jebl.sourceforge.net/doc/api/jebl/evolution/trees/NeighborJoiningTreeBuilder.html -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From watson at ebi.ac.uk Wed Aug 13 09:00:24 2008 From: watson at ebi.ac.uk (James Watson) Date: Wed, 13 Aug 2008 10:00:24 +0100 Subject: [Biojava-l] Hands-on course at the European Bioinformatics Institute - Programmatic access in Java: webservices and work flows Message-ID: <48A2A2A8.7010706@ebi.ac.uk> Dear colleagues, A hands-on course called "Programmatic access in Java: webservices & work flows" will be held on 24-27 November 2008 at the European Bioinformatics Institute in Hinxton, Cambridgeshire, UK. This course will give you the skills to leverage webservice technology to access and manipulate bioinformatics data resources and tools. You will start with simple scripts accessing individual services and then build upon this to create work flows to solve more complex problems in a reusable manner. 
Participants will be exposed to open standards such as Simple Object Access Protocol (SOAP); the Distributed Annotation System (DAS); REST services and the BioMart web service. Several examples of specific web services will be included, covering programmatic access to both databases and tools at the EBI. The course costs £75 and interested candidates are encouraged to apply online at the URL below (the training is free, however we need to charge participants an administration fee of £25 per day to cover food and materials, and participants need to pay their own travel and accommodation): www.ebi.ac.uk/training/handson/course_081124_javawebservices.html This course will have a maximum number of 40 participants on a first come first serve basis, so please register early to avoid disappointment. *The deadline for registering for this event is Monday 27 October 2008.* Best regards, James Watson -- James D Watson Scientific Training Officer EMBL-EBI Wellcome Trust Genome Campus Hinxton Tel: +44(0)1223 492541 http://www.ebi.ac.uk/training/ Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/): 26-27 August 2008: Interactions and Pathways 1-3 September 2008: Joint EBI-ENFIN workshop - Protein function prediction tools 8-11 September 2008: Programmatic access in Perl: webservices & work flows 6-8 October 2008: 2-day dip into the EBI's data resources: Understanding your data 24-27 November 2008: Programmatic access in Java: webservices & work flows From phidias51 at gmail.com Thu Aug 14 18:26:06 2008 From: phidias51 at gmail.com (Mark Fortner) Date: Thu, 14 Aug 2008 11:26:06 -0700 Subject: [Biojava-l] BioGroovy Message-ID: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com> I've been using the biojava library with groovy lately and I ran across the BioGroovy.org site. The site seems to be a placeholder and doesn't really have much information on it. I was wondering if it was an official Bio* site?
Has anyone else been using Groovy (or any other scripting languages)
with BioJava?

Also has anyone looked at using Grails with BioSQL? It would seem
like an easy way to get something started quickly.

Regards,

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From andreas at sdsc.edu Fri Aug 15 03:15:00 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 14 Aug 2008 20:15:00 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
Message-ID: <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>

Hi Mark,

On Thu, Aug 14, 2008 at 11:26 AM, Mark Fortner wrote:
> I've been using the biojava library with groovy lately and I ran
> across the BioGroovy.org site. The site seems to be a placeholder and
> doesn't really have much information on it. I was wondering if it was
> an official Bio* site?

I don't think it is an official site of the Open Bioinformatics
Foundation. If you do a whois for biogroovy.org it gives the address
of a bioinformatics center in Korea. In comparison the whois for
biojava points to Chris Dagdigian from the OBF.

Cheers,
Andreas

From ayates at ebi.ac.uk Fri Aug 15 08:55:15 2008
From: ayates at ebi.ac.uk (Andy Yates)
Date: Fri, 15 Aug 2008 09:55:15 +0100
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
Message-ID: <48A54473.1060006@ebi.ac.uk>

Hi Mark,

There has been talk in the past about a groovier version of BioJava or
at the very least showing where Groovy can help to reduce the verbosity
of some parts of the BioJava framework. I've done a tiny bit but my work
has only been into prototyping Java code & quickly asserting some
assumptions I had about BioJava (as in how the framework works).
What would be a brilliant step in a Groovier BioJava is to start
leveraging the builders (http://groovy.codehaus.org/Builders). I mean
imagine being able to write something like:

def myReferences = getReferences();

new EmblBuilder().build {
  id('U00096')
  myReferences.each{ ref ->
    reference {
      //
    }
  }
}

I admit it's not a fully formed idea at the moment but hopefully you can
see where I'm going with this :)

WRT Grails; our supported BioSQL API is written in Hibernate; just the
same as GORM (Grails' ORM solution). So technically I cannot see a
reason why it wouldn't be possible; my only wonder is how Grails
controls transaction boundaries and translating this to our BioSQL.

Andy

From phidias51 at gmail.com Fri Aug 15 15:24:49 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:24:49 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
Message-ID: <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>

Thanks Andreas. That confirmed my suspicions.

It sounds like there's some interest in Groovy from the community. At
some point it might be worth putting together a cookbook, although I'm
not sure what site would be appropriate for it.

Mark

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From andreas at sdsc.edu Fri Aug 15 15:31:45 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 15 Aug 2008 08:31:45 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
    <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>
Message-ID: <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>

I can help you with the contacts to OBF. One possibility is to ask
the Korean people if they would be interested in collaborating on this
and perhaps transferring the domain to OBF.

Andreas

From phidias51 at gmail.com Fri Aug 15 15:41:49 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:41:49 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <59a41c430808142015q5c3a57fdn41b8010f2be10d6c@mail.gmail.com>
    <6e1d61f50808150824p261cdf74l4e4a61dfeb860d8c@mail.gmail.com>
    <59a41c430808150831m46e4558bx63d3adc13374bf23@mail.gmail.com>
Message-ID: <6e1d61f50808150841t1a8aceb3qba8df2048c0af22a@mail.gmail.com>

Hi Andreas,
I'll check with the owners and see what they say. I think at some
point we'll need a code repository. I didn't see any links from the
current page to either a cvs or svn repository. They may be
interested in hosting the wiki, and perhaps the code could be hosted
with the OBF. I'll let you know what I find out.

Mark

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From phidias51 at gmail.com Fri Aug 15 15:49:30 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Fri, 15 Aug 2008 08:49:30 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <48A54473.1060006@ebi.ac.uk>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <48A54473.1060006@ebi.ac.uk>
Message-ID: <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>

Hi Andy,
The builders and closures definitely make it easier to use and cut
down on the verbosity of the language. You also have built-in support
for CLI, and can leverage libraries like ORO for regular expression
handling. The XmlSlurper makes it easier to handle downloading and
parsing XML.

I created a roadmap for a series of blog articles on various common
bioinformatics-related tasks. I started out with a couple of quick
entries on using NCBI's EUtils with Groovy. If there's some interest,
I'll see about posting the roadmap on a wiki somewhere (along with
some of the "recipes" that I've written). Anyone who's interested
could then contribute their own "recipes" to it.

I'm just getting started with Grails. My initial thought was to
identify the BioSQL objects (e.g. SimpleNamespace, SimpleNCBITaxon,
SimpleBioEntry, etc.) as domain objects and have the grails ant script
handle generating the gui and persistence stacks for them (perhaps
using Derby).

This might be overly simplistic, but I'm looking for ways to make
BioJavaX and BioSQL more easily accessible.

Mark

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From koen.bruynseels at cropdesign.com Fri Aug 15 16:10:27 2008
From: koen.bruynseels at cropdesign.com (koen.bruynseels at cropdesign.com)
Date: Fri, 15 Aug 2008 18:10:27 +0200
Subject: [Biojava-l] Koen Bruynseels is out of the office.
Message-ID:

I will be out of the office starting 14/08/2008 and will not return
until 01/09/2008.

I will respond to your message when I return.

From markjschreiber at gmail.com Sat Aug 16 10:11:35 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sat, 16 Aug 2008 18:11:35 +0800
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <48A54473.1060006@ebi.ac.uk>
    <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
Message-ID: <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>

Hi -

There are really 2 approaches you could take with Groovy. One would be
to write an entire API. This would be a "BioGroovy". The other approach
would be to use the BioJava API and use Groovy to string together the
BioJava objects to write the programs. I tend to think the second is
usually the better option for dynamic languages like Groovy. If you go
for the second option the BioJava cookbook would be a suitable place
for examples. Actually using Groovy to make programs with the BioJava
library could smooth the learning curve of BioJava a little.

It's worth being mindful of the performance of Groovy at this stage.
While this will undoubtedly improve with future versions you can
currently expect Groovy code to run about 10x slower than Java so it
might not be good to implement any kind of sequence alignment or HMM
algorithm in Groovy.

- Mark

From phidias51 at gmail.com Sat Aug 16 16:31:47 2008
From: phidias51 at gmail.com (Mark Fortner)
Date: Sat, 16 Aug 2008 09:31:47 -0700
Subject: [Biojava-l] BioGroovy
In-Reply-To: <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <48A54473.1060006@ebi.ac.uk>
    <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
    <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
Message-ID: <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>

Hi Mark,
So far, most of my experiments have made use of BioJava and CDK. I'm
hoping to try out a few things with the JRI (Java R Interface) and
JEMBOSS as well. I agree, there's no point in simply rewriting
BioJava code in Groovy. I'd rather leverage the appropriate libraries
wherever possible.

This brings up another question, if we're leveraging libraries other
than BioJava does it make sense to post the BioGroovy cookbooks on the
BioJava site?

Also, as Andy intimated though, there is probably going to be a point
where standard Groovy things (like Builders) aren't available in
BioJava.
When those things happen, some decision will need to be made
as to whether the Builder should be implemented in Java or in Groovy.
My tendency would be to make it available in BioJava (or BioJavaX),
which would let others leverage the simplified syntax not only in Java
but in scripting languages (such as JRuby and Jython) as well.

As for performance, when you use the Eclipse Groovy plugin it
automatically compiles the Groovy script to Java bytecode, so I
haven't really noticed any difference in speed -- although most of
what I've tried hasn't been computationally challenging either.

Mark

--
Mark Fortner

blog: http://feeds.feedburner.com/jroller/ideafactory

From markjschreiber at gmail.com Sun Aug 17 03:16:55 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sun, 17 Aug 2008 11:16:55 +0800
Subject: [Biojava-l] BioGroovy
In-Reply-To: <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>
References: <6e1d61f50808141126h1ed50d50g2392c0816cfb1ab5@mail.gmail.com>
    <48A54473.1060006@ebi.ac.uk>
    <6e1d61f50808150849p1e2cf0eak8f6d614355f41a9e@mail.gmail.com>
    <93b45ca50808160311r3bdb8937n96a2cafed2641952@mail.gmail.com>
    <6e1d61f50808160931v34f040f8o24083f501c1c5800@mail.gmail.com>
Message-ID: <93b45ca50808162016l2f5160b3qf13ca97e02299154@mail.gmail.com>

> This brings up another question, if we're leveraging libraries other
> than BioJava does it make sense to post the BioGroovy cookbooks on the
> BioJava site?
>

Sure. Maybe put them in their own section but they should go into the
BioJava cookbook.

> Also, as Andy intimated though, there is probably going to be a point
> where standard Groovy things (like Builders) aren't available in
> BioJava. When those things happen, some decision will need to be made
> as to whether the Builder should be implemented in Java or in Groovy.
> My tendency would be to make it available in BioJava (or BioJavaX),
> which would let others leverage the simplified syntax not only in Java
> but in scripting languages (such as JRuby and Jython) as well.
>

I think in future versions of BioJava that Groovy could be
incorporated. Java and Groovy are very complementary so there is no
real reason not to. I wouldn't be too surprised if Groovy got absorbed
into the JDK at some point.

> As for performance, when you use the Eclipse Groovy plugin it
> automatically compiles the Groovy script to Java bytecode, so I
> haven't really noticed any difference in speed -- although most of
> what I've tried hasn't been computationally challenging either.

Yes. This helps in many cases. It also helps (for compiled
applications) if you define types where you can. This means the
runtime doesn't have to do so much introspection.

- Mark

From augustovmail-java at yahoo.com.br Wed Aug 20 13:36:27 2008
From: augustovmail-java at yahoo.com.br (Augusto Fernandes Vellozo)
Date: Wed, 20 Aug 2008 15:36:27 +0200
Subject: [Biojava-l] Exception org.hibernate.NonUniqueObjectException
Message-ID: <381a3e850808200636n50e8700ap21d54a4554dd2fb5@mail.gmail.com>

Hi,

I am trying to load a lot of features from one file into MySQL and I am
having problems doing this with BioJava/Hibernate. If I don't
flush/clear the session, I get an OutOfMemory exception. But after I do
the flush/clear, the second query throws the exception:

org.hibernate.NonUniqueObjectException: a different object with the
same identifier value was already associated with the session:
[Term#23755]

I've already tried to clean the RichObjectFactory, but it doesn't work.

Does anyone know what could be happening? Any suggestions? The code is
below.

Thanks,

--
Augusto F.
Vellozo

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.TreeSet;
import org.biojava.bio.BioException;
import org.biojavax.RichObjectFactory;
import org.biojavax.SimpleRichAnnotation;
import org.biojavax.bio.seq.RichFeature;
import org.biojavax.bio.seq.SimplePosition;
import org.biojavax.bio.seq.SimpleRichFeature;
import org.biojavax.bio.seq.SimpleRichLocation;
import org.biojavax.bio.seq.RichLocation.Strand;
import org.biojavax.bio.taxa.NCBITaxon;
import org.biojavax.ontology.SimpleComparableOntology;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

public class LoadORFVRTest {

    public static void main(String[] args) {
        SessionFactory sessionFactory =
                new Configuration().configure("hibernate.cfg.xml").buildSessionFactory();
        Session session = sessionFactory.openSession();
        RichObjectFactory.connectToBioSQL(session);
        RichObjectFactory.setDefaultNamespaceName(Messages.getString("nameSpaceDefault"));
        Transaction tx = session.beginTransaction();
        try {
            //file orfs
            File fileOrfs;
            fileOrfs = new File(args[0]);
            String orfName, geneName = "";
            BufferedReader br = new BufferedReader(new FileReader(fileOrfs));
            String line, line2, line3, lineAmino;
            int countOrfs = 0;
            int beginPos = -1, endPos = -1, nextPos = -1;
            int strand = 0;
            int stepORF = Integer.parseInt(Messages.getString("LoadORFVR.printORF"));
            while ((line = br.readLine()) != null) {
                if (line.length() > 0) {
                    if (line.startsWith(">")) {
                        //ORF heading
                        //new ORF
                        //save last ORF
                        if (strand != 0) {
                            saveORF(session, strand, beginPos, endPos, nextPos - 1, geneName,
                                    Integer.parseInt(args[1]));
                            countOrfs++;
                        }
                        if (countOrfs % stepORF == 0) {
                            System.out.println(countOrfs);
                            session.flush();
                            tx.commit();
                            session.clear();
                            session.close();
                            RichObjectFactory.clearLRUCache();
                            session = sessionFactory.openSession();
                            RichObjectFactory.connectToBioSQL(session);
                            RichObjectFactory.setDefaultNamespaceName(
                                    Messages.getString("nameSpaceDefault"));
                            tx = session.beginTransaction();
                        }
                        orfName = line.substring(1);
                        geneName = orfName.substring(0, orfName.indexOf("_"));
                        line = br.readLine();
                        if (line.startsWith("Reading frame: ")) {
                            strand = Integer.parseInt(line.substring(15));
                            if (strand == 0) {
                                System.out.println("Format error, strand = 0");
                            } else {
                                nextPos = 1;
                                beginPos = -1;
                                endPos = -1;
                            }
                        } else {
                            System.out.println("Format error in line 'Reading frame':" + line);
                            strand = 0;
                        }
                        br.readLine(); // empty line
                    } else if (strand != 0) {
                        //ORF sequence
                        line2 = br.readLine();
                        line3 = br.readLine();
                        br.readLine(); // empty line
                        if (strand < 0) {
                            lineAmino = line3;
                        } else {
                            lineAmino = line;
                        }
                        lineAmino = lineAmino.substring(3, lineAmino.length() - 1);
                        if (lineAmino.trim().length() != 0) {
                            if (beginPos < 0) {
                                beginPos = nextPos + firstPosNotSpace(lineAmino) - 1;
                            }
                            endPos = nextPos + lastPosNotSpace(lineAmino) + 1;
                        }
                        nextPos += lineAmino.length();
                    }
                }
            }
            if (strand != 0) {
                saveORF(session, strand, beginPos, endPos, nextPos - 1, geneName,
                        Integer.parseInt(args[1]));
            }
            session.flush();
            tx.commit();
            session.clear();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (tx.isActive()) {
                tx.rollback();
            }
            session.close();
        }
    }

    public static void saveORF(Session session, int strand, int beginPos, int endPos,
            int lastPos, String geneName, int ncbiTaxonId) throws BioException {
        SimplePosition beginPosition, endPosition;
        if (strand < 0 && beginPos < 4) {
            beginPosition = new SimplePosition(true, false, beginPos);
        } else {
            beginPosition = new SimplePosition(beginPos);
        }
        if (strand > 0 && (endPos == lastPos)) {
            endPosition = new SimplePosition(false, true, endPos);
        } else {
            endPosition = new SimplePosition(endPos);
        }
        // save;
        NCBITaxon taxon = (NCBITaxon) session.createQuery(
                "from Taxon where ncbi_taxon_id=:ncbiTaxonNumber").setInteger(
                "ncbiTaxonNumber", ncbiTaxonId).uniqueResult();
        SimpleComparableOntology ontFeatures =
                (SimpleComparableOntology) RichObjectFactory.getObject(
                SimpleComparableOntology.class,
                new Object[] {Messages.getString("ontologyFeatures")});
        SimpleComparableOntology ontGeneral =
                ((SimpleComparableOntology) RichObjectFactory.getObject(
                SimpleComparableOntology.class,
                new Object[] {Messages.getString("ontologyGeneral")}));
        SimpleRichFeature featureGene = (SimpleRichFeature) session.createQuery(
                "select f from Feature as f join f.parent as b where "
                + "f.name=:geneName and f.typeTerm=:geneTerm and b.taxon=:taxonId ")
                .setString("geneName", geneName)
                .setParameter("taxonId", taxon)
                .setParameter("geneTerm",
                        ontFeatures.getOrCreateTerm(Messages.getString("termGene")))
                .uniqueResult();
        RichFeature.Template ft = new RichFeature.Template();
        ft.location = featureGene.getLocation().translate(0);
        ft.sourceTerm = ontGeneral.getOrCreateTerm(Messages.getString("termVR"));
        ft.typeTerm = ontFeatures.getOrCreateTerm(Messages.getString("termMRNA"));
        ft.annotation = new SimpleRichAnnotation();
        ft.featureRelationshipSet = new TreeSet();
        ft.rankedCrossRefs = new TreeSet();
        SimpleRichFeature featureMRNA = (SimpleRichFeature) featureGene.createFeature(ft);
        featureMRNA.setName(geneName);
        ft = new RichFeature.Template();
        if (strand < 0) {
            ft.location = new SimpleRichLocation(beginPosition, endPosition, 0,
                    Strand.NEGATIVE_STRAND);
        } else {
            ft.location = new SimpleRichLocation(beginPosition, endPosition, 0,
                    Strand.POSITIVE_STRAND);
        }
        ft.sourceTerm = ontGeneral.getOrCreateTerm(Messages.getString("termVR"));
        ft.typeTerm = ontFeatures.getOrCreateTerm(Messages.getString("termORF"));
        ft.annotation = new SimpleRichAnnotation();
        ft.featureRelationshipSet = new TreeSet();
        ft.rankedCrossRefs = new TreeSet();
        SimpleRichFeature featureORF = (SimpleRichFeature) featureMRNA.createFeature(ft);
        featureORF.setName(geneName);
    }

    public static int firstPosNotSpace(String str) {
        int i = 0;
        while (i < str.length() && str.charAt(i) == ' ') {
            i++;
        }
        return i;
    }

    public static int lastPosNotSpace(String str) {
        int
i = str.length() - 1; while (i >= 0 && str.charAt(i) == ' ') { i--; } return i; } } From John.Kneisler at USPTO.GOV Thu Aug 21 01:22:28 2008 From: John.Kneisler at USPTO.GOV (Kneisler, John (Raytheon)) Date: Wed, 20 Aug 2008 21:22:28 -0400 Subject: [Biojava-l] reverse translation Message-ID: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov> I am new to BioJava and I am trying to write a reverse translation function. Find the most probable DNA symbolList using a given a protein symbolList and a translation table. I tried using an RNA alphabet and protein alphabet and then I tried to use the SimpleReversibleTranslationTable's untranslate method and I got an "org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create translation table as the alphabets were different sizes:". Is there any way to create a reverse translation function using the BioJava framework? Thanks John From holland at eaglegenomics.com Tue Aug 26 14:20:50 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Tue, 26 Aug 2008 15:20:50 +0100 Subject: [Biojava-l] reverse translation In-Reply-To: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov> References: <49B5900D63261343A9653545FA485DEF01F81457@EXCHANGE3.uspto.gov> Message-ID: Have you tried using the Protein-Term alphabet instead? cheers, Richard 2008/8/21 Kneisler, John (Raytheon) : > I am new to BioJava and I am trying to write a reverse translation > function. Find the most probable DNA symbolList using a given a > protein symbolList and a translation table. I tried using an RNA > alphabet and protein alphabet and then I tried to use the > SimpleReversibleTranslationTable's untranslate method and I got an > "org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create > translation table as the alphabets were different sizes:". Is there any > way to create a reverse translation function using the BioJava > framework? 
>
> Thanks
>
> John
>

--
Richard Holland
Finance Director
Eagle Genomics
http://www.eaglegenomics.com/

From dmitry.repchevski at bsc.es Thu Aug 28 09:14:08 2008
From: dmitry.repchevski at bsc.es (Dmitry Repchevsky)
Date: Thu, 28 Aug 2008 11:14:08 +0200
Subject: [Biojava-l] Structure.getChains()
Message-ID: <48B66C60.8000104@bsc.es>

Hello!

I have a PDB (1pio) with 2 chains and 1 HETATM solvent (water).
A Structure object contains "models" and "compounds".
"models" contains ALL elements (amino acid chains AND solvents) and "compounds" only amino acids.
The method Structure.getChains() returns ALL elements... is that OK? I mean that when I ask for a "chain" I do not expect to get a solvent...

Just a curiosity,

Dmitry

From gabrielle_doan at gmx.net Thu Aug 28 14:16:52 2008
From: gabrielle_doan at gmx.net (Gabrielle Doan)
Date: Thu, 28 Aug 2008 16:16:52 +0200
Subject: [Biojava-l] Problems with adding miRNA to sequence
Message-ID: <48B6B354.6010307@gmx.net>

Hi all,

I would like to insert new features (miRNA) into my existing BioSQL database. At the moment the database contains the chromosomes 1-22, X, Y and MT downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/.
Now I have tried to add the miRNA information from http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl to my database with the following code:

private void makeAFeature(String id, String chr, int startpos, int endpos, Strand strand,
        float score, String gene) throws ChangeVetoException, IllegalSymbolException {
    RichSequence rs = chromosomes.get(chr);
    if (rs == null) {
        rs = db.SearchForSequence(chr);
        chromosomes.put(chr, rs);
    }
    RichFeature feat = RichFeature.Tools.makeEmptyFeature();
    feat.setName(id);
    RichLocation rl = new SimpleRichLocation(new SimplePosition(startpos), new SimplePosition(endpos), 1, strand);
    feat.setLocation(rl);
    try {
        feat.setTypeTerm(RichObjectFactory.getDefaultOntology().getOrCreateTerm("miRNA"));
        feat.setType(feat.getTypeTerm().getName());
    } catch (InvalidTermException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    feat.getAnnotation().setProperty("score", Float.valueOf(score));
    feat.getAnnotation().setProperty("gene", gene);
    feat.setParent(rs);
    rs.getFeatureSet().add(feat);
}

I successfully inserted the information for chromosomes 3-22, X, Y and MT.
But when I try to deal with chromosomes 1 and 2 in the same way, I get the following message:

org.hibernate.exception.DataException: could not insert: [Feature]
    at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:77)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
    at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:40)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2163)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2643)
    at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:51)
    at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:279)
    at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:298)
    at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:181)
    at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:107)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:187)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:172)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.performSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:94)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:70)
    at org.hibernate.impl.SessionImpl.fireSaveOrUpdate(SessionImpl.java:507)
    at org.hibernate.impl.SessionImpl.saveOrUpdate(SessionImpl.java:499)
    at org.hibernate.engine.CascadingAction$5.cascade(CascadingAction.java:218)
    at org.hibernate.engine.Cascade.cascadeToOne(Cascade.java:268)
    at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:216)
    at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
    at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
    at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
    at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
    at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
    at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
    at org.hibernate.event.def.AbstractFlushingEventListener.cascadeOnFlush(AbstractFlushingEventListener.java:131)
    at org.hibernate.event.def.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:122)
    at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:65)
    at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:26)
    at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
    at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
    at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:106)
    at org.viewer.db.HBioSQLDB.updateSequence(HBioSQLDB.java:254)
    at org.viewer.io.MakeMiRNA.splitLine(MakeMiRNA.java:220)
    at org.viewer.io.MakeMiRNA.main(MakeMiRNA.java:57)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value adjusted for column 'rank' at row 1
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2973)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1600)
    at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1129)
    at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:681)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1368)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1283)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1268)
    at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAndExtract(IdentityGenerator.java:73)
    at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(AbstractReturningDelegate.java:33)
    ... 32 more

It would be very nice if someone could help me. I am grateful for any hints.

Thanks a lot.

Cheers,
Gabrielle

From andreas at sdsc.edu Thu Aug 28 14:38:45 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 28 Aug 2008 07:38:45 -0700
Subject: [Biojava-l] Structure.getChains()
In-Reply-To: <48B66C60.8000104@bsc.es>
References: <48B66C60.8000104@bsc.es>
Message-ID: <59a41c430808280738r2a84af03k21bdb7a228b25f92@mail.gmail.com>

Hi Dmitry,

The object model reflects the organization of data in PDB files (http://www.wwpdb.org/docs.html).

Chains can contain a mix of different groups of atoms. As such, the BioJava object model allows you to distinguish between amino acids, nucleotides and hetatoms on the Group level, rather than on the chain level.

Andreas

On Thu, Aug 28, 2008 at 2:14 AM, Dmitry Repchevsky wrote:
> Hello!
>
> I have a PDB (1pio) with 2 chains and 1 HETATM solvent (water).
> A Structure object contains "models" and "compounds".
> "models" contains ALL elements (aminoacid chains AND solvents) and
> "compounds" only aminoacids.
> The method Structure.getChains() returns me ALL elements... is it ok? I mean
> that when I'm asking for a "chain" I do not expect to get a solvent...
>
> Just a curiosity,
>
> Dmitry

From andreas at sdsc.edu Fri Aug 29 00:40:47 2008
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 28 Aug 2008 17:40:47 -0700
Subject: [Biojava-l] Structure.getChains()
In-Reply-To: <48B6C3A3.5010102@bsc.es>
References: <48B66C60.8000104@bsc.es> <59a41c430808280714g272fe9cewb2307f759598e779@mail.gmail.com> <48B6BB40.3090406@bsc.es> <59a41c430808280755p1ff06afdl443629d343f4eb51@mail.gmail.com> <48B6C3A3.5010102@bsc.es>
Message-ID: <59a41c430808281740v2a228dd1jf99aa185ad01f2fe@mail.gmail.com>

Hi Dmitry,

Yes, you are right. The Compound object just contains pointers to the Chains, if that information is available from the COMPND records in the PDB header. If you want to be sure to have the full set of chains, please access them via the structure.getChains() or structure.getChains(modelNr) methods.

If you think it would be helpful, I can add a note about this to the javadoc for the Compound class. It does not have much javadoc anyway, so feel free to provide a patch! ;-)

Andreas

On Thu, Aug 28, 2008 at 8:26 AM, Dmitry Repchevsky wrote:
> Hello,
>
> Yeah, but you have:
>
> Structure
>  |
>  +-Model(s)
>  |  |
>  |  Chain(s) // ALL chains (ATOM/HETATM...)
>  |
>  +-Compound(s)
>     |
>     Chain(s) // Only those that are in "COMPND"
>
> Nothing wrong here. I just thought that Structure.getChains() returns chains
> from COMPND (Aminoacids)
> Cheers,
>
> Dmitry
>
> Andreas Prlic wrote:
>>
>> not sure if I understand you correctly. If you look at
>> http://biojava.org/wiki/BioJava:CookBook:PDB:atoms
>> you will see what the object model hierarchy looks like.
>> Groups are below chain.
>>
>> A
>>
>> On Thu, Aug 28, 2008 at 7:50 AM, Dmitry Repchevsky wrote:
>>>
>>> Hello Andreas,
>>>
>>> I thought that chains are part of Compound (COMPND) so calling
>>> Structure.getChains() would get them from Compound and not from "groups".
>>> I was wrong. :-)
>>>
>>> It would be nice to put this in javadoc of the getChains() method...
>>>
>>> Thank you very much,
>>>
>>> Dmitry

From holland at eaglegenomics.com Fri Aug 29 08:08:34 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Fri, 29 Aug 2008 09:08:34 +0100
Subject: [Biojava-l] Fwd: reverse translation
In-Reply-To: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov>
References: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov>
Message-ID:

I've forwarded this to the list in case someone comes up with an answer quicker than me. :)

---------- Forwarded message ----------
From: Kneisler, John (Raytheon)
Date: 2008/8/28
Subject: RE: [Biojava-l] reverse translation
To: holland at eaglegenomics.com

Richard,

Thanks for your reply. I think I was using the Protein-Term alphabet, since I assigned a protein FiniteAlphabet variable to ProteinTools.getTAlphabet(). The problem I am having is matching the correct RNA alphabet to the correct protein alphabet. Here is the runtime exception I am getting:

org.biojava.bio.symbol.IllegalAlphabetException: Couldn't create translation table as the alphabets were different sizes: 22:PROTEIN64:(RNA x RNA x RNA)
    at org.biojava.bio.symbol.SimpleReversibleTranslationTable.<init>(SimpleReversibleTranslationTable.java:88)

I have tried using the RNATools.getCodonAlphabet() method to get an RNA codon alphabet, thinking that would match the protein alphabet when assigning a translation table. Unfortunately my lack of experience and understanding is showing. Any help would be appreciated.

Thanks
John Kneisler

--
Richard Holland
Finance Director
Eagle Genomics
http://www.eaglegenomics.com/

From markjschreiber at gmail.com Fri Aug 29 09:11:55 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Fri, 29 Aug 2008 17:11:55 +0800
Subject: [Biojava-l] Fwd: reverse translation
In-Reply-To:
References: <49B5900D63261343A9653545FA485DEF01F8149D@EXCHANGE3.uspto.gov>
Message-ID: <93b45ca50808290211h766b29a7uf1c039b11518817d@mail.gmail.com>

Hi -

The best approach will be to use the CodonPref and SimpleCodonPref classes. These let you determine the frequency of all synonymous codons for a given amino acid. There is also a CodonPrefTools class that contains convenience methods and has built-in codon usage tables for some common organisms.
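
To make the idea concrete, here is a minimal plain-Java sketch of "pick the most frequent synonymous codon for each residue". It deliberately does not call the BioJava API: the class ReverseTranslationSketch, its method names, the six-codon excerpt of the genetic code and the frequency values are all assumptions made up for illustration. With BioJava the frequencies would come from a CodonPref rather than a hand-written map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of BioJava: reverse translation by codon preference. */
final class ReverseTranslationSketch {

    /** Tiny excerpt of the standard genetic code (DNA codon -> one-letter amino acid). */
    static Map<String, Character> demoGeneticCode() {
        Map<String, Character> code = new LinkedHashMap<>();
        code.put("ATG", 'M');
        code.put("AAA", 'K');
        code.put("AAG", 'K');
        code.put("GAA", 'E');
        code.put("GAG", 'E');
        code.put("TGG", 'W');
        return code;
    }

    /** Illustrative placeholder frequencies; these are NOT measured codon usage data. */
    static Map<String, Double> demoFrequencies() {
        Map<String, Double> freq = new LinkedHashMap<>();
        freq.put("AAA", 0.4);
        freq.put("AAG", 0.6);
        freq.put("GAA", 0.7);
        freq.put("GAG", 0.3);
        return freq;
    }

    /** For every amino acid, keep the synonymous codon with the highest frequency. */
    static Map<Character, String> preferredCodons(Map<String, Character> geneticCode,
                                                  Map<String, Double> codonFrequency) {
        Map<Character, String> best = new LinkedHashMap<>();
        for (Map.Entry<String, Character> entry : geneticCode.entrySet()) {
            String codon = entry.getKey();
            Character aminoAcid = entry.getValue();
            String current = best.get(aminoAcid);
            double f = codonFrequency.getOrDefault(codon, 0.0);
            // The first codon seen for a residue is kept unless a later one is strictly more frequent.
            if (current == null || f > codonFrequency.getOrDefault(current, 0.0)) {
                best.put(aminoAcid, codon);
            }
        }
        return best;
    }

    /** Concatenates the preferred codon of each residue; fails on residues without a codon. */
    static String reverseTranslate(String protein, Map<Character, String> preferred) {
        StringBuilder dna = new StringBuilder(protein.length() * 3);
        for (char residue : protein.toCharArray()) {
            String codon = preferred.get(residue);
            if (codon == null) {
                throw new IllegalArgumentException("No codon known for residue: " + residue);
            }
            dna.append(codon);
        }
        return dna.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> preferred = preferredCodons(demoGeneticCode(), demoFrequencies());
        System.out.println(reverseTranslate("MKEW", preferred));
    }
}
```

Note that the result is only the single most likely sequence under the chosen codon table; a protein has many possible coding sequences, which is why a one-to-one reversible table between the protein and codon alphabets cannot be built.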
These classes are in the org.biojava.bio.symbol package.

- Mark

From holland at eaglegenomics.com Sat Aug 30 19:00:37 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Sat, 30 Aug 2008 20:00:37 +0100
Subject: [Biojava-l] [Biojava-dev] Does biojava can calculate evalue ?
In-Reply-To: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com>
References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com>
Message-ID:

BioJava does not include functions to calculate e-values - nor should it, as the e-value you wish to calculate depends entirely on the search algorithm the e-value is associated with.

You will need to determine an algorithm of your own that will calculate e-values based on your own knowledge of the sequence search algorithm that you have developed.

The simplest form of e-value is a function of the query sequence length and the total length of the sequences available to search.
More complex forms include such things as analyses of the content of the sequences and the percent identity of the match, or in fact anything that may be appropriate and significant to the way the search algorithm works. It's basically up to you how to define it.

Good luck!

cheers,
Richard

2008/8/30 simpleyrx:
>
> Dear experts,
>
> I am developing a sequence search program. My program can already calculate the score value, and I want to provide an expectation value (like the BLAST e-value) to the user. I do not know how to do this step. Can BioJava do it? Thank you in advance.
>
> --
> Student
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

--
Richard Holland
Finance Director
Eagle Genomics
http://www.eaglegenomics.com/
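
As an illustration of an e-value that depends on the query length and the total database length, the sketch below uses the Karlin-Altschul form known from BLAST, E = K * m * n * exp(-lambda * S). This is a swapped-in, well-known formula and not something BioJava provides. The class name EValueSketch is hypothetical, and K and lambda are assumptions: they must be estimated for your own scoring scheme, and the values in main are placeholders only.

```java
/** Hypothetical helper, not part of BioJava: BLAST-style expectation values. */
final class EValueSketch {

    /**
     * Karlin-Altschul expectation value: E = K * m * n * exp(-lambda * S).
     *
     * @param rawScore       alignment score S from the search algorithm
     * @param queryLength    m, length of the query sequence
     * @param databaseLength n, total length of all sequences searched
     * @param k              statistical parameter K (assumed to be fitted elsewhere)
     * @param lambda         statistical parameter lambda (assumed to be fitted elsewhere)
     */
    static double eValue(double rawScore, long queryLength, long databaseLength,
                         double k, double lambda) {
        if (queryLength <= 0 || databaseLength <= 0) {
            throw new IllegalArgumentException("Sequence lengths must be positive");
        }
        if (k <= 0 || lambda <= 0) {
            throw new IllegalArgumentException("K and lambda must be positive");
        }
        return k * (double) queryLength * (double) databaseLength * Math.exp(-lambda * rawScore);
    }

    /** Normalised bit score S' = (lambda * S - ln K) / ln 2, so that E = m * n * 2^(-S'). */
    static double bitScore(double rawScore, double k, double lambda) {
        return (lambda * rawScore - Math.log(k)) / Math.log(2.0);
    }

    public static void main(String[] args) {
        double k = 0.1;      // placeholder, not a fitted value
        double lambda = 0.3; // placeholder, not a fitted value
        double e = eValue(50.0, 200, 1_000_000L, k, lambda);
        System.out.println("E-value: " + e + ", bit score: " + bitScore(50.0, k, lambda));
    }
}
```

Whether this formula is appropriate depends on the search algorithm: it assumes local alignment scores that follow an extreme value distribution, which has to be checked for a custom scoring scheme.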