From hlapp at drycafe.net Sat Feb 5 18:45:47 2011
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Sat, 5 Feb 2011 18:45:47 -0500
Subject: [Biojava-l] NESCent Seeks Hackathon Whitepapers
In-Reply-To: <0D7D89E4-C0D4-4347-A94C-21800E927746@ad.unc.edu>
References: <0D7D89E4-C0D4-4347-A94C-21800E927746@ad.unc.edu>
Message-ID: <066A2391-6041-408C-B26E-9B867DE785C7@drycafe.net>

The National Evolutionary Synthesis Center (NESCent), in keeping with its objective to promote collaborative development of open-source, reusable, and standards-supporting informatics resources, sponsors highly collaborative, face-to-face software development events, called "hackathons" (see [1]).

To ensure that this program continues to be responsive to user needs and to tap into the expertise and creativity of the evolutionary biology community, NESCent is soliciting short whitepapers (2-6 pages) [2] on potential target areas for future hackathons.

To further encourage submissions, we have now distilled specific guidelines for proposing hackathon events, based on the experiences gained from events we have sponsored in the past:

http://informatics.nescent.org/wiki/Hackathon_Whitepaper_Guidelines

The Center's Call for Informatics Whitepapers [3] includes not only hackathons, but also a large spectrum of other initiatives to be undertaken by the Center, including training, software development, collaborative ontology development, and coordination of data standards. Whitepapers are accepted at any time and reviewed on an ongoing basis.

URLs:
[1] Collaborative cyberinfrastructure events and programs organized by NESCent: http://informatics.nescent.org/wiki/Main_Page
[2] NESCent Call for Informatics Whitepapers: http://www.nescent.org/informatics/whitepapers.php
[3] Hackathon Whitepaper Guidelines: http://informatics.nescent.org/wiki/Hackathon_Whitepaper_Guidelines
[4] Past NESCent-sponsored hackathons: http://informatics.nescent.org/wiki/Main_Page#Hackathons

From jayunit100 at gmail.com Mon Feb 7 13:44:05 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Mon, 7 Feb 2011 13:44:05 -0500
Subject: [Biojava-l] average or minimal energy structure for nmr structures...
Message-ID:

Hi guys: Does anybody know a simple BioJava way to calculate an average structure or, alternatively, to find the lowest-energy structure from an NMR structure with multiple models? I'm doing it using a set of nested for loops... I thought maybe there might be support in the API.

for(Group g1 : c.getAtomGroups())
    for(Atom a1 : g1.getAtoms())
    {
        float xAvg,yAvg,zAvg;
        for(int i = 0 ; i < s.nrModels(); i++)
        {
            ///etc....
        }
    }

--
Jay Vyas
MMSB/UCHC

From andreas at sdsc.edu Tue Feb 8 01:38:14 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 7 Feb 2011 22:38:14 -0800
Subject: [Biojava-l] average or minimal energy structure for nmr structures...
In-Reply-To:
References:
Message-ID:

Hi Jay,

You could calculate average coordinates and RMSD with BioJava. However, in terms of energy calculations BioJava can't do much so far...

Andreas

On Mon, Feb 7, 2011 at 10:44 AM, Jay Vyas wrote:
> Hi guys: Does anybody know a simple BioJava way to calculate an average
> structure or, alternatively, to find the lowest-energy structure from an NMR
> structure with multiple models?
> I'm doing it using a set of nested for loops... I thought maybe there might
> be support in the API.
>
>
> for(Group g1 : c.getAtomGroups())
>     for(Atom a1 : g1.getAtoms())
>     {
>         float xAvg,yAvg,zAvg;
>         for(int i = 0 ; i < s.nrModels(); i++)
>         {
>             ///etc....
>         }
>     }
>
> --
> Jay Vyas
> MMSB/UCHC
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From uchathuranga at gmail.com Wed Feb 9 00:19:00 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Wed, 9 Feb 2011 10:49:00 +0530
Subject: [Biojava-l] Project Ideas GSOC 2011
Message-ID:

hi,

I am a student at the University of Moratuwa, Sri Lanka. I am planning to participate in GSoC 2011 and I would like to work on a project regarding bioinformatics. So I would like to know whether you are planning to be a mentor organization this year also. If so, I would like to know what kinds of projects you are planning to propose.

Thanks
Regards
udana

From andreas at sdsc.edu Wed Feb 9 00:49:23 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 8 Feb 2011 21:49:23 -0800
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Udana,

It is a bit early, but given the success from last year I think it is for sure that BioJava (as part of the Open Bioinformatics Foundation) will apply again to become a mentoring organisation. Google will announce the list of accepted mentoring organizations on March 18th; afterwards the time to discuss proposals will start.

http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/faqs#timeline

Last year we had several mentor-proposed projects. Google is also very interested in seeing student-proposed projects and they recommend those due to their high success rate. If you want to prepare a proposal you could already start thinking of what kind of project you would like to work on. Similarly, everybody else who wants to propose a project and perhaps become a mentor can also start to think about that.
Still, we will have to wait until March 18th before we know if there is actually any funding this year.

Andreas

On Tue, Feb 8, 2011 at 9:19 PM, udana chathuranga wrote:
> hi,
>
> I am a student at the University of Moratuwa, Sri Lanka. I am planning to
> participate in GSoC 2011 and I would like to work on a project regarding
> bioinformatics. So I would like to know whether you are planning to be a
> mentor organization this year also. If so, I would like to know what kinds of
> projects you are planning to propose.
>
> Thanks
> Regards
> udana
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From uchathuranga at gmail.com Wed Feb 9 01:13:34 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Wed, 9 Feb 2011 11:43:34 +0530
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Andreas,

Thanks for the quick reply. I appreciate your help and I want to know how I can contribute to the project even if you are not participating in this year's GSoC. I have studied bioinformatics as one of my subjects at the university and I would like to apply that knowledge to this project. I would be thankful if you could guide me on how to start participating in the project.

I have a solid background in Java and bioinformatics-related areas like

    Biological Sequence data analysis -
        The Sequence assembly problem, The Gene Finding Problem, The Sequence Alignment Problem, The Genome Rearrangement problem, The Protein Folding Problem, The Motif Discovery Problem
    Phylogenetics
    Biological Network Motifs

The areas mentioned above are some of the subjects that I have learned in our course, and I would like to know how I can apply this knowledge to this project.

Thanks
Regards
Udana

From andreas at sdsc.edu Wed Feb 9 11:27:35 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 9 Feb 2011 08:27:35 -0800
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Udana,

There is some overlap between your background and many aspects of BioJava. I recommend taking a look at the Cookbook as a start to learning more about BioJava. To give you an idea of what was going on last year, the two BioJava-related projects that were funded were

A) Development of an all-Java multiple sequence alignment module
B) Identification and classification of posttranslational protein modifications.

Here are the project pages for more details: http://biojava.org/wiki/Google_Summer_of_Code

Andreas

On Tue, Feb 8, 2011 at 10:13 PM, udana chathuranga wrote:
> Hi Andreas,
>
> Thanks for the quick reply. I appreciate your help and I want to know how I can
> contribute to the project even if you are not participating in this year's
> GSoC. I have studied bioinformatics as one of my subjects at the
> university and I would like to apply that knowledge to this project. I
> would be thankful if you could guide me on how to start participating in the
> project.
> I have a solid background in Java and bioinformatics-related areas like
>     Biological Sequence data analysis -
>
> The Sequence assembly problem, The Gene Finding Problem, The Sequence
> Alignment Problem, The Genome Rearrangement problem, The Protein Folding
> Problem, The Motif Discovery Problem
>
>     Phylogenetics
>
>     Biological Network Motifs
>
> The areas mentioned above are some of the subjects that I have learned in our
> course, and I would like to know how I can apply this knowledge to this
> project.
>
>
> Thanks
>
> Regards
>
> Udana
>

From uchathuranga at gmail.com Wed Feb 9 13:45:55 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Thu, 10 Feb 2011 00:15:55 +0530
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Andreas,

First of all, thanks for the guide. I have started reading the CookBook at http://biojava.org/wiki/BioJava:CookBook. I have also done some sample coding using biojava3-core, found http://biojava.org/wiki/BioJava:CookBook:Core:FastaReadWrite and looked in detail at the FASTA file format, which I studied at http://www.ncbi.nlm.nih.gov/BLAST/fasta.shtml.

As you pointed out, I have also looked at last year's two GSoC projects. After going through those projects I had this question: what kinds of modules can I work on? I would like to hear some suggestions about modules that you are looking forward to. I would also like to know what kinds of file formats I should look into in order to develop a good module for the system.

Thanks
regards
Udana

From jayunit100 at gmail.com Wed Feb 9 14:22:05 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Wed, 9 Feb 2011 14:22:05 -0500
Subject: [Biojava-l] average structure
Message-ID:

Hi Guys: I was also wondering if BioJava supported creating an average or mean structure from a bundle........

--
Jay Vyas
MMSB/UCHC

From andreas at sdsc.edu Wed Feb 9 16:48:40 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 9 Feb 2011 13:48:40 -0800
Subject: [Biojava-l] average structure
In-Reply-To:
References:
Message-ID:

A geometric average should be easy to calculate. All NMR models have the same number of atoms, so you could just average over all alternative positions for an atom and use that to build up a mean structure... However, I would take that with a grain of salt and make sure there are no distance violations by visual inspection and some quality control scripts.
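
A minimal sketch of that geometric average, assuming the coordinates have already been copied out of the models into plain arrays. It deliberately does not use BioJava's structure classes; `ModelAverager`, `averageCoordinates` and `distance` are hypothetical names chosen for illustration. The `distance` helper is the kind of check one might run on bonded atom pairs of the averaged model to spot distance violations.

```java
import java.util.Arrays;

/** Averages atom positions across NMR models held as plain arrays (illustration only). */
final class ModelAverager {

    /**
     * models[m][a] holds the {x, y, z} position of atom a in model m.
     * Every model must list the same atoms in the same order.
     */
    static double[][] averageCoordinates(double[][][] models) {
        if (models.length == 0) {
            throw new IllegalArgumentException("need at least one model");
        }
        int atomCount = models[0].length;
        double[][] mean = new double[atomCount][3];
        for (double[][] model : models) {
            if (model.length != atomCount) {
                throw new IllegalArgumentException("models differ in atom count");
            }
            // accumulate the sum of each coordinate over all models
            for (int a = 0; a < atomCount; a++) {
                for (int d = 0; d < 3; d++) {
                    mean[a][d] += model[a][d];
                }
            }
        }
        // divide the sums by the number of models to get the mean position
        for (double[] atom : mean) {
            for (int d = 0; d < 3; d++) {
                atom[d] /= models.length;
            }
        }
        return mean;
    }

    /** Euclidean distance between two positions, e.g. two bonded atoms of the averaged model. */
    static double distance(double[] p, double[] q) {
        double dx = p[0] - q[0];
        double dy = p[1] - q[1];
        double dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    public static void main(String[] args) {
        // two toy "models" with two atoms each
        double[][][] models = {
            {{0.0, 0.0, 0.0}, {1.5, 0.0, 0.0}},
            {{0.0, 2.0, 0.0}, {1.5, 2.0, 0.0}},
        };
        double[][] mean = averageCoordinates(models);
        System.out.println(Arrays.deepToString(mean));
        System.out.println(distance(mean[0], mean[1]));
    }
}
```

Averaging position by position only makes sense if the models are already superimposed in a common frame, which is an assumption of this sketch.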
There are standard/ideal coordinates available for all amino acids that you could use to compare with the distances between atoms in your average structure.

Andreas

On Wed, Feb 9, 2011 at 11:22 AM, Jay Vyas wrote:
> Hi Guys: I was also wondering if BioJava supported creating an average or
> mean structure from a bundle........
>
> --
> Jay Vyas
> MMSB/UCHC
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From craig.adrian.berry at gmail.com Fri Feb 11 05:54:35 2011
From: craig.adrian.berry at gmail.com (Craig Berry)
Date: Fri, 11 Feb 2011 10:54:35 +0000
Subject: [Biojava-l] GFF3 Reader
Message-ID:

Can I just have someone validate this logic in the GFF3Reader for me to see if this is a bug or not.

If I have a GFF3 file with the following line:

chrI SGD repeat_region 1 62 . - . ID=TEL01L-TR;Name=TEL01L-TR;Note=Terminal%20stretch%20of%20telomeric%20repeats%20on%20the%20left%20arm%20of%20Chromosome%20I;dbxref=SGD:S000028864

When parsing the file then, the class calls Location.fromBio using the start 1, end 62 and strand -ve. Since the strand is -ve it needs to convert the positions to negative values and reverse the start and end. However, as the javadocs explain:

"In biocoordinates, the start index of a range is represented in origin 1 (ie the very first index is 1, not 0), and end= start + length - 1."

So before the end is reassigned its value is reduced by 1 and then negated: e = - ( start - 1)

With a start value of 1 as in this case, the end then becomes 0, such that the range now runs -62 to 0.

This causes a problem when adding this Feature to the feature collection since it considers a position of value 0 to be on the +ve strand, such that when Location.plus() is called the check for a negative location (i.e.
both start and end being < 0) returns false and so you end up trying to create a Location with a -ve start position but a +ve end, which throws an IllegalArgumentException.

So is there something fishy here or not? I'm assuming that the GFF content is valid.

Thanks in advance

Craig

From willishf at ufl.edu Fri Feb 11 09:14:08 2011
From: willishf at ufl.edu (Scooter Willis)
Date: Fri, 11 Feb 2011 09:14:08 -0500
Subject: [Biojava-l] GFF3 Reader
In-Reply-To:
References:
Message-ID:

Craig

I think I fixed this problem but I am not sure if the fix is in the currently released jar files. The < logic was wrong, so 0 wasn't being treated properly. I will send you an updated jar to see if the problem goes away.

Scooter

On Fri, Feb 11, 2011 at 5:54 AM, Craig Berry wrote:
> Can I just have someone validate this logic in the GFF3Reader for me
> to see if this is a bug or not.
>
> If I have a GFF3 file with the following line:
>
> chrI SGD repeat_region 1 62 . - . ID=TEL01L-TR;Name=TEL01L-TR;Note=Terminal%20stretch%20of%20telomeric%20repeats%20on%20the%20left%20arm%20of%20Chromosome%20I;dbxref=SGD:S000028864
>
> When parsing the file then, the class calls Location.fromBio using the
> start 1, end 62 and strand -ve.
> Since the strand is -ve it needs to convert the positions to negative
> values and reverse the start and end. However, as the javadocs
> explain:
>
> "In biocoordinates, the start index of a range is represented in
> origin 1 (ie the very first index is 1, not 0), and end= start +
> length - 1."
>
> So before the end is reassigned its value is reduced by 1 and then
> negated: e = - ( start - 1) With a start value of 1 as in this case,
> the end then becomes 0, such that the range now runs -62 to 0.
>
> This causes a problem when adding this Feature to the feature
> collection since it considers a position of value 0 to be on the +ve
> strand, such that when Location.plus() is called the check for a
> negative location (i.e.
> both start and end being < 0) returns false
> and so you end up trying to create a Location with a -ve start
> position but a +ve end, which throws an IllegalArgumentException.
>
> So is there something fishy here or not? I'm assuming that the GFF
> content is valid.
>
> Thanks in advance
>
> Craig
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>

From uchathuranga at gmail.com Fri Feb 11 10:15:19 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Fri, 11 Feb 2011 20:45:19 +0530
Subject: [Biojava-l] Problem with MultipleSequenceAlignment
Message-ID:

hi all,

I was going through the BioJava cookbook, as I am interested in this project. I tried the example on the page http://biojava.org/wiki/BioJava:CookBook3:MSA and I got a class-not-found exception for the line "Profile profile = Alignments.getMultipleSequenceAlignment(lst);".

Error Message:

Exception in thread "main" java.lang.NoClassDefFoundError: org/forester/phylogenyinference/DistanceMatrix
    at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176)
    at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29)
    at CookbookMSA.main(CookbookMSA.java:18)
Caused by: java.lang.ClassNotFoundException: org.forester.phylogenyinference.DistanceMatrix
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClassInternal(Unknown Source)

Is this a known issue or am I doing something wrong with the code?

Thanks
Regards
udana.
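
A NoClassDefFoundError caused by a ClassNotFoundException, as in the trace above, usually means the named class was available when the calling code was compiled but its jar is not on the classpath at run time. The sketch below is one way to check that before calling into the library; it uses only the JDK, and `ClasspathCheck` and `isLoadable` are hypothetical names for illustration. Which jar actually provides the class is not something this code can tell you.

```java
/** Reports whether a class can be loaded from the current classpath (illustration only). */
final class ClasspathCheck {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // not found, or found but one of its own dependencies is missing
            return false;
        }
    }

    public static void main(String[] args) {
        // default to the class named in the stack trace; pass another name as the first argument
        String name = args.length > 0 ? args[0] : "org.forester.phylogenyinference.DistanceMatrix";
        System.out.println(name + (isLoadable(name) ? " is on the classpath" : " is NOT on the classpath"));
    }
}
```

Running it with the same classpath as the failing program shows whether the class is visible there; if it is not, the jar that contains it has to be added.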

From willishf at ufl.edu Fri Feb 11 11:28:32 2011
From: willishf at ufl.edu (Scooter Willis)
Date: Fri, 11 Feb 2011 11:28:32 -0500
Subject: [Biojava-l] Problem with MultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

You are probably missing a reference to the forester jar file located in the biojava3-phylo module.

On Fri, Feb 11, 2011 at 10:15 AM, udana chathuranga wrote:
> hi all,
>
> I was going through the BioJava cookbook, as I am interested in this
> project. I tried the example on the page
> http://biojava.org/wiki/BioJava:CookBook3:MSA and I got a class-not-found
> exception for the line "Profile profile
> = Alignments.getMultipleSequenceAlignment(lst);".
>
> Error Message:
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/forester/phylogenyinference/DistanceMatrix
>     at
> org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176)
>     at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29)
>     at CookbookMSA.main(CookbookMSA.java:18)
> Caused by: java.lang.ClassNotFoundException:
> org.forester.phylogenyinference.DistanceMatrix
>     at java.net.URLClassLoader$1.run(Unknown Source)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at java.lang.ClassLoader.loadClassInternal(Unknown Source)
>
> Is this a known issue or am I doing something wrong with the code?
>
> Thanks
> Regards
> udana.
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>

From uchathuranga at gmail.com Fri Feb 11 11:58:48 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Fri, 11 Feb 2011 22:28:48 +0530
Subject: [Biojava-l] Problem with MultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

>
>
Hi Scooter,

Thanks Scooter, that works.

Regards
udana

From mandarijnopw8 at gmail.com Sun Feb 13 09:32:14 2011
From: mandarijnopw8 at gmail.com (Shamanou van Leeuwen)
Date: Sun, 13 Feb 2011 15:32:14 +0100
Subject: [Biojava-l] programming error
Message-ID: <4D57EB6E.8010004@gmail.com>

hi guys,

I am making a tool to translate DNA to protein using BioJava.
But I am getting some errors that I do not fully understand.
Can somebody please tell me what I am doing wrong?

script:
http://pastebin.com/TJSjkgqK

errors:
http://pastebin.com/iFXYWEZB

From anantpossible at gmail.com Sun Feb 13 12:47:28 2011
From: anantpossible at gmail.com (Anant Jain)
Date: Sun, 13 Feb 2011 23:17:28 +0530
Subject: [Biojava-l] programming error
In-Reply-To: <4D57EB6E.8010004@gmail.com>
References: <4D57EB6E.8010004@gmail.com>
Message-ID:

Hi Shamanou,

I can see the error is around lines 192~193. I would suggest that you print the codons list first, like below.

192 SOP (CODON LIST);
193 prot = SymbolListViews.translate(codons,RNATools.getGeneticCode( "UNIVERSAL"));

Paste the result in the mail thread, then we can start debugging.

Regards,
Anant

On Sun, Feb 13, 2011 at 8:02 PM, Shamanou van Leeuwen < mandarijnopw8 at gmail.com> wrote:
> hi guys,
>
> I am making a tool to translate DNA to protein using BioJava.
> But I am getting some errors that I do not fully understand.
> Can somebody please tell me what I am doing wrong?
>
>
> script:
> http://pastebin.com/TJSjkgqK
>
> errors:
> http://pastebin.com/iFXYWEZB
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

--
Anant Jain
B.Tech Bioinformatics, RHCE
Software Engineer,
Persistent Systems Limited, Pune

From willishf at ufl.edu Sun Feb 13 14:11:58 2011
From: willishf at ufl.edu (Scooter Willis)
Date: Sun, 13 Feb 2011 14:11:58 -0500
Subject: [Biojava-l] programming error
In-Reply-To: <4D57EB6E.8010004@gmail.com>
References: <4D57EB6E.8010004@gmail.com>
Message-ID:

Depending on what you need you may want to try out biojava3.

On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen wrote:
> hi guys,
>
> I am making a tool to translate DNA to protein using BioJava.
> But I am getting some errors that I do not fully understand.
> Can somebody please tell me what I am doing wrong?
>
>
> script:
> http://pastebin.com/TJSjkgqK
>
> errors:
> http://pastebin.com/iFXYWEZB
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>

From mandarijnopw8 at gmail.com Sun Feb 13 21:27:32 2011
From: mandarijnopw8 at gmail.com (Shamanou van Leeuwen)
Date: Mon, 14 Feb 2011 03:27:32 +0100
Subject: [Biojava-l] programming error
In-Reply-To:
References: <4D57EB6E.8010004@gmail.com>
Message-ID: <4D589314.5040705@gmail.com>

On 13-02-11 20:11, Scooter Willis wrote:
> Depending on what you need you may want to try out biojava3.
>
> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen
> wrote:
>> hi guys,
>>
>> I am making a tool to translate DNA to protein using BioJava.
>> But I am getting some errors that I do not fully understand.
>> Can somebody please tell me what I am doing wrong?
>>
>>
>> script:
>> http://pastebin.com/TJSjkgqK
>>
>> errors:
>> http://pastebin.com/iFXYWEZB
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>>
I am trying BioJava 3 now but I am still doing something wrong.

http://pastebin.com/uLz934gr

From andreas at sdsc.edu Mon Feb 14 03:16:48 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 14 Feb 2011 00:16:48 -0800
Subject: [Biojava-l] BioJava 3.0.1 released
Message-ID:

BioJava 3.0.1 has been released and is available from http://www.biojava.org/wiki/BioJava:Download .

The 3.0.1 release is mainly a bug fix release for the recent 3.0 code base, which provides a major rewrite of BioJava. A couple of noteworthy improvements:

- core: fixed an issue with sequence index positions, new utility methods for memory efficient parsing of large fasta files
- structure: Fixed issues with PDB header parsing and more stability with non-standard PDB files. Added new algorithm to automatically infer protein domain boundaries.
- web services: Fixed wrong dependency on old codebase and overall improvements in functionality of remote blast web service calls.
- protein modifications: Minor bugfixes

In parallel the biojava-legacy code base has been updated to version 1.8.1 and it contains a bug fix related to circular locations.

Thanks to all contributors for making this release possible.

About BioJava: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats, and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language.

Happy Biojava-ing

Andreas

From ayates at ebi.ac.uk Mon Feb 14 05:07:12 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 14 Feb 2011 10:07:12 +0000
Subject: [Biojava-l] programming error
In-Reply-To: <4D589314.5040705@gmail.com>
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID:

If you want to do frame based translation then there is an easier way of accomplishing this. The TranscriptionEngine allows you to translate in multiple frames and retrieve that information in a Map such as:

    TranscriptionEngine te = TranscriptionEngine.getDefault();
    Frame[] frames = Frame.getForwardFrames();
    Map<Frame, Sequence<AminoAcidCompound>> results = te.multipleFrameTranslation(dna, frames);

Change the static call on Frame to Frame.getAllFrames() then you will do a full 6 frame translation.

Also I would avoid calling the getSequenceAsString() method until you need to output it to screen. The Sequence interface provides an adequate set of methods for testing length of a sequence. However if what you want is the longest translation then I would replace that with (assuming the above code):

    List<Sequence<AminoAcidCompound>> translations = new ArrayList<Sequence<AminoAcidCompound>>(results.values());
    Collections.sort(translations, new Comparator<Sequence<AminoAcidCompound>>() {
        public int compare(Sequence<AminoAcidCompound> o1, Sequence<AminoAcidCompound> o2) {
            Integer o1Length = o1.getLength();
            Integer o2Length = o2.getLength();
            return o1Length.compareTo(o2Length);
        }
    });
    Sequence<AminoAcidCompound> longest = translations.get(translations.size()-1);

However I would like to see what errors you are pulling up from BioJava3 in case there is a scenario we are not currently taking into account

Andy

On 14 Feb 2011, at 02:27, Shamanou van Leeuwen wrote:

> On 13-02-11 20:11, Scooter Willis wrote:
>> Depending on what you need you may want to try out biojava3.
>>
>> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen
>> wrote:
>>> hi guys,
>>>
>>> I am making a tool to translate DNA to protein using BioJava.
>>> But I am getting some errors that I do not fully understand.
>>> Can somebody please tell me what I am doing wrong?
>>>
>>>
>>> script:
>>> http://pastebin.com/TJSjkgqK
>>>
>>> errors:
>>> http://pastebin.com/iFXYWEZB
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>>
> I am trying BioJava 3 now but I am still doing something wrong.
>
> http://pastebin.com/uLz934gr
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/

From andreas at sdsc.edu Mon Feb 14 10:42:57 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 14 Feb 2011 07:42:57 -0800
Subject: [Biojava-l] programming error
In-Reply-To:
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID:

Hi Andy,

Could we get a Cookbook page for this? sounds like it would be good to have a bit more docu on this topic ...

Thanks!

Andreas

On Mon, Feb 14, 2011 at 2:07 AM, Andy Yates wrote:
> If you want to do frame based translation then there is an easier way of
> accomplishing this. The TranscriptionEngine allows you to translate in
> multiple frames and retrieve that information in a Map such as:
>
> TranscriptionEngine te = TranscriptionEngine.getDefault();
> Frame[] frames = Frame.getForwardFrames();
> Map<Frame, Sequence<AminoAcidCompound>> results =
>     te.multipleFrameTranslation(dna, frames);
>
> Change the static call on Frame to Frame.getAllFrames() then you will do a
> full 6 frame translation.
>
> Also I would avoid calling the getSequenceAsString() method until you need
> to output it to screen. The Sequence interface provides an adequate set of
> methods for testing length of a sequence.
> However if what you want is the longest
> translation then I would replace that with (assuming the above code):
>
> List<Sequence<AminoAcidCompound>> translations =
>     new ArrayList<Sequence<AminoAcidCompound>>(results.values());
> Collections.sort(translations, new Comparator<Sequence<AminoAcidCompound>>() {
>     public int compare(Sequence<AminoAcidCompound> o1, Sequence<AminoAcidCompound> o2) {
>         Integer o1Length = o1.getLength();
>         Integer o2Length = o2.getLength();
>         return o1Length.compareTo(o2Length);
>     }
> });
> Sequence<AminoAcidCompound> longest = translations.get(translations.size()-1);
>
> However I would like to see what errors you are pulling up from BioJava3 in
> case there is a scenario we are not currently taking into account
>
> Andy
>
> On 14 Feb 2011, at 02:27, Shamanou van Leeuwen wrote:
>
> > On 13-02-11 20:11, Scooter Willis wrote:
> >> Depending on what you need you may want to try out biojava3.
> >>
> >> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen
> >> wrote:
> >>> hi guys,
> >>>
> >>> I am making a tool to translate DNA to protein using BioJava.
> >>> But I am getting some errors that I do not fully understand.
> >>> Can somebody please tell me what I am doing wrong?
> >>>
> >>>
> >>> script:
> >>> http://pastebin.com/TJSjkgqK
> >>>
> >>> errors:
> >>> http://pastebin.com/iFXYWEZB
> >>> _______________________________________________
> >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>>
> >>>
> > I am trying BioJava 3 now but I am still doing something wrong.
> >
> > http://pastebin.com/uLz934gr
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> --
> Andrew Yates                   Ensembl Genomes Engineer
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From ayates at ebi.ac.uk Mon Feb 14 11:13:35 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 14 Feb 2011 16:13:35 +0000
Subject: [Biojava-l] programming error
In-Reply-To:
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID: <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk>

I've added a bit to the original cookbook page we had. I can migrate these changes to the cookbook front page later on in the week

Andy

On 14 Feb 2011, at 15:42, Andreas Prlic wrote:

> Hi Andy,
>
> Could we get a Cookbook page for this? sounds like it would be good to have a bit more docu on this topic ...
>
> Thanks!
>
> Andreas
>
> On Mon, Feb 14, 2011 at 2:07 AM, Andy Yates wrote:
> If you want to do frame based translation then there is an easier way of accomplishing this.
> The TranscriptionEngine allows you to translate in multiple frames and retrieve that information in a Map such as:
>
> TranscriptionEngine te = TranscriptionEngine.getDefault();
> Frame[] frames = Frame.getForwardFrames();
> Map<Frame, Sequence<AminoAcidCompound>> results = te.multipleFrameTranslation(dna, frames);
>
> Change the static call on Frame to Frame.getAllFrames() then you will do a full 6 frame translation.
>
> Also I would avoid calling the getSequenceAsString() method until you need to output it to screen. The Sequence interface provides an adequate set of methods for testing length of a sequence. However if what you want is the longest translation then I would replace that with (assuming the above code):
>
> List<Sequence<AminoAcidCompound>> translations = new ArrayList<Sequence<AminoAcidCompound>>(results.values());
> Collections.sort(translations, new Comparator<Sequence<AminoAcidCompound>>() {
>     public int compare(Sequence<AminoAcidCompound> o1, Sequence<AminoAcidCompound> o2) {
>         Integer o1Length = o1.getLength();
>         Integer o2Length = o2.getLength();
>         return o1Length.compareTo(o2Length);
>     }
> });
> Sequence<AminoAcidCompound> longest = translations.get(translations.size()-1);
>
> However I would like to see what errors you are pulling up from BioJava3 in case there is a scenario we are not currently taking into account
>
> Andy
>
> On 14 Feb 2011, at 02:27, Shamanou van Leeuwen wrote:
>
> > On 13-02-11 20:11, Scooter Willis wrote:
> >> Depending on what you need you may want to try out biojava3.
> >>
> >> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen
> >> wrote:
> >>> hi guys,
> >>>
> >>> I am making a tool to translate DNA to protein using BioJava.
> >>> But I am getting some errors that I do not fully understand.
> >>> Can somebody please tell me what I am doing wrong?
> >>> > >>> > >>> script: > >>> http://pastebin.com/TJSjkgqK > >>> > >>> errors: > >>> http://pastebin.com/iFXYWEZB > >>> _______________________________________________ > >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/biojava-l > >>> > >>> > > i a, trying biojava 3 now but i am still doing something wrong. > > > > http://pastebin.com/uLz934gr > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > -- > Andrew Yates Ensembl Genomes Engineer > EMBL-EBI Tel: +44-(0)1223-492538 > Wellcome Trust Genome Campus Fax: +44-(0)1223-494468 > Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/ > > > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > -- > ----------------------------------------------------------------------- > Dr. Andreas Prlic > Senior Scientist, RCSB PDB Protein Data Bank > University of California, San Diego > (+1) 858.246.0526 > ----------------------------------------------------------------------- -- Andrew Yates Ensembl Genomes Engineer EMBL-EBI Tel: +44-(0)1223-492538 Wellcome Trust Genome Campus Fax: +44-(0)1223-494468 Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/ From andreas at sdsc.edu Mon Feb 14 18:57:07 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 14 Feb 2011 15:57:07 -0800 Subject: [Biojava-l] programming error In-Reply-To: <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk> References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com> <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk> Message-ID: Thanks Andy, yea, would be good to add links on the cookbook front page, otherwise the sections are hard to find... 
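
The "pick the longest translation" step quoted in this thread can be sketched without sorting the whole list. This is only an illustration in plain Java, not BioJava API: the class name, the frame labels and the example peptides are made up, and strings stand in for the translated sequences. Collections.max with a length comparator returns the longest entry directly.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;

public class LongestTranslation {

    // Returns the longest translation among the per-frame results.
    // Throws NoSuchElementException if the map is empty.
    static String longest(Map<String, String> translationsByFrame) {
        return Collections.max(translationsByFrame.values(),
                Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        // Hypothetical per-frame translations, keyed by a frame label.
        Map<String, String> byFrame = new LinkedHashMap<>();
        byFrame.put("ONE", "MKV");
        byFrame.put("TWO", "MKVLAAG");
        byFrame.put("THREE", "M");
        System.out.println(longest(byFrame));
    }
}
```

With real BioJava 3 sequences the comparator would presumably compare getLength() values instead of String lengths; that substitution is an assumption, not something tested here.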
Andreas

On Mon, Feb 14, 2011 at 8:13 AM, Andy Yates wrote:
> I've added a bit to the original cookbook page we had. I can migrate these
> changes to the cookbook front page later on in the week
>
> Andy
>
> On 14 Feb 2011, at 15:42, Andreas Prlic wrote:
>
> > Hi Andy,
> >
> > Could we get a Cookbook page for this? Sounds like it would be good to have a bit more docu on this topic ...
> >
> > Thanks!
> >
> > Andreas
> >
> > On Mon, Feb 14, 2011 at 2:07 AM, Andy Yates wrote:
> > If you want to do frame based translation then there is an easier way of accomplishing this. The TranscriptionEngine allows you to translate in multiple frames and retrieve that information in a Map such as:
> >
> > TranscriptionEngine te = TranscriptionEngine.getDefault();
> > Frame[] frames = Frame.getForwardFrames();
> > Map<Frame, Sequence<AminoAcidCompound>> results = te.multipleFrameTranslation(dna, frames);
> >
> > Change the static call on Frame to Frame.getAllFrames() and you will do a full 6 frame translation.
> >
> > Also I would avoid calling the getSequenceAsString() method until you need to output it to screen. The Sequence interface provides an adequate set of methods for testing the length of a sequence.
> > However, if what you want is the longest translation then I would replace that with (assuming the above code):
> >
> > List<Sequence<AminoAcidCompound>> translations = new ArrayList<Sequence<AminoAcidCompound>>(results.values());
> > Collections.sort(translations, new Comparator<Sequence<AminoAcidCompound>>() {
> >   public int compare(Sequence<AminoAcidCompound> o1, Sequence<AminoAcidCompound> o2) {
> >     Integer o1Length = o1.getLength();
> >     Integer o2Length = o2.getLength();
> >     return o1Length.compareTo(o2Length);
> >   }
> > });
> > Sequence<AminoAcidCompound> longest = translations.get(translations.size()-1);
> >
> > However I would like to see what errors you are pulling up from BioJava3 in case there is a scenario we are not currently taking into account
> >
> > Andy
> >
> > On 14 Feb 2011, at 02:27, Shamanou van Leeuwen wrote:
> >
> > > On 13-02-11 20:11, Scooter Willis wrote:
> > >> Depending on what you need you may want to try out biojava3.
> > >>
> > >> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen
> > >> wrote:
> > >>> hi guys,
> > >>>
> > >>> I am making a tool to translate dna to protein using biojava.
> > >>> But I am getting some errors that I do not fully understand.
> > >>> Can somebody please tell me what I am doing wrong?
> > >>>
> > >>>
> > >>> script:
> > >>> http://pastebin.com/TJSjkgqK
> > >>>
> > >>> errors:
> > >>> http://pastebin.com/iFXYWEZB
> > >>> _______________________________________________
> > >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> > >>>
> > >>>
> > > I am trying biojava 3 now but I am still doing something wrong.
> > > > > > http://pastebin.com/uLz934gr > > > _______________________________________________ > > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > -- > > Andrew Yates Ensembl Genomes Engineer > > EMBL-EBI Tel: +44-(0)1223-492538 > > Wellcome Trust Genome Campus Fax: +44-(0)1223-494468 > > Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/ > > > > > > > > > > > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > > > > -- > > ----------------------------------------------------------------------- > > Dr. Andreas Prlic > > Senior Scientist, RCSB PDB Protein Data Bank > > University of California, San Diego > > (+1) 858.246.0526 > > ----------------------------------------------------------------------- > > -- > Andrew Yates Ensembl Genomes Engineer > EMBL-EBI Tel: +44-(0)1223-492538 > Wellcome Trust Genome Campus Fax: +44-(0)1223-494468 > Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/ > > > > > From andreas at sdsc.edu Tue Feb 15 21:45:49 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 15 Feb 2011 18:45:49 -0800 Subject: [Biojava-l] GSoC status Message-ID: Hi Udana, I would like to motivate student proposed projects as much as possible this year. This means if you want to distinguish yourself with your application, it would be great if you would suggest a project by yourself that you find exciting. If our application to become a sponsoring organisation gets approved by Google, we will start discussing proposals around March 18th. We will probably suggest possible projects at some point, but I really would like to give preference to student proposed projects this year. To get ideas about possible student proposed projects I recommend getting a module that is related to your studies and try to do some work with it. 
This might give you ideas about what is missing and what could be added in a project. Andreas On Tue, Feb 15, 2011 at 11:35 AM, udana chathuranga wrote: > Hi Andreas, > > Any updates about project ideas to gsoc2011 ? > > > > > Thanks > Regards > Udana > From jw12 at sanger.ac.uk Thu Feb 17 09:30:08 2011 From: jw12 at sanger.ac.uk (Jonathan Warren) Date: Thu, 17 Feb 2011 14:30:08 +0000 Subject: [Biojava-l] DAS Workshop Registration Closing Soon Message-ID: <0BCCE860-9AEA-4377-A9D6-F28E264DE43A@sanger.ac.uk> Registration closes for the DAS workshop at 5pm this Friday GMT. Limited places still available. Please note that for the tutorials day (Day 1) it is advisable to know at least one of PERL, Java or Javascript. Further information and registration from here: http://www.ebi.ac.uk/training/onsite/110302DAS.html There are still a few places for short talks on the second day if you have anything to talk about of interest to the DAS community. Jonathan Warren Senior Developer and DAS coordinator blog: http://biodasman.wordpress.com/ jw12 at sanger.ac.uk Ext: 2314 Telephone: 01223 492314 -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From watson at ebi.ac.uk Fri Feb 18 03:46:53 2011 From: watson at ebi.ac.uk (James Watson) Date: Fri, 18 Feb 2011 08:46:53 +0000 Subject: [Biojava-l] Hands-on training at EBI - Programmatic access to biological databases (Java) Message-ID: <4D5E31FD.6080905@ebi.ac.uk> *Date:* 9-13 May 2011 *Venue:* EMBL-EBI, Hinxton, Nr Cambridge, CB10 1SD, UK *Registration Deadline:* 9th April 2011 This Java-based course in programmatic access to biological databases is ideal for bioinformaticians and biological researchers looking to develop data analysis pipelines or access data in an automated manner for integration into their own applications. 
What will it cover? - Introduction to Web Services and resources available at the EBI - REST & SOAP - their application to EBI services and databases - BioMart and BioMart Web Services - Linking services together to construct workflows (Taverna, Enfin, Encore) For a more detailed programme and information on registration please go to http://www.ebi.ac.uk/training/handson/course_110509_progjava.html Many thanks, James Watson -- James D Watson Scientific Training Officer EMBL-EBI Wellcome Trust Genome Campus Hinxton Tel: +44(0)1223 492541 http://www.ebi.ac.uk/training/ Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/): 28 Mar -- 1 Apr 2011: PSIMEx Workshop: Interactions and Pathways 9-13 May 2011: Programmatic access to biological databases (Java) 23-27 May 2011: FEBS: In silico systems biology for complex diseases: network reconstruction, analysis and network based modelling From bli0406 at gmail.com Wed Feb 23 02:16:29 2011 From: bli0406 at gmail.com (Bo Li) Date: Wed, 23 Feb 2011 02:16:29 -0500 Subject: [Biojava-l] question regarding MSA Message-ID: Hi, Sorry for the bothering. I tried the MSA feature by following the link: http://www.biojava.org/wiki/BioJava:CookBook3:MSA However, I can't see the symbols like ".", ":", and "*" like I can see from the output ClustalW. So is there a way for users to obtain such information in the output from MSA? Thanks, Bo Li From andreas at sdsc.edu Thu Feb 24 03:06:59 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 24 Feb 2011 00:06:59 -0800 Subject: [Biojava-l] question regarding MSA In-Reply-To: References: Message-ID: Hi Bo Li, The printing method currently does not add those characters to the display of the aligned sequences. If you need it you would have to patch the printing method... Andreas On Tue, Feb 22, 2011 at 11:16 PM, Bo Li wrote: > Hi, > > Sorry for the bothering. 
> I tried the MSA feature by following the link:
>
> http://www.biojava.org/wiki/BioJava:CookBook3:MSA
>
> However, I can't see the symbols like ".", ":", and "*" like I can see in
> the output of ClustalW.
>
> So is there a way for users to obtain such information in the output from
> MSA?
>
> Thanks,
> Bo Li
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From p.v.troshin at dundee.ac.uk Thu Feb 24 10:33:41 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Thu, 24 Feb 2011 15:33:41 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
Message-ID: <4D667A55.5040404@dundee.ac.uk>

Hi,

I've noticed that BioJava up to about version 1.7 had an org.biojava.bio.proteomics package, which had methods for isoelectric point and molecular weight calculations for peptides. I could not find this package in the BioJava 3.0.1 API. I'd like to use these methods and wonder if there are any equivalent methods available in the latest version of BioJava?

Thank you for your help,

Kind regards,
Peter

Dr Peter Troshin
Bioinformatics Software Developer
Phone: +44 (0)1382 388589
Fax: +44 (0)1382 385764
The Barton Group
College of Life Sciences
Medical Sciences Institute
University of Dundee
Dundee
DD1 5EH
UK

From andreas at sdsc.edu Thu Feb 24 11:54:14 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 24 Feb 2011 08:54:14 -0800
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To: <4D667A55.5040404@dundee.ac.uk>
References: <4D667A55.5040404@dundee.ac.uk>
Message-ID:

Hi Peter,

if you get a copy of biojava 1.8, it is still there. However I would like to port this to biojava 3 as well.. George, do you want to help me with that, since you are one of the authors of this package? The basic support for chemistry in BioJava 3 is a bit better... (e.g.
Element class)

Andreas

On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin wrote:
> Hi,
>
> I've noticed that BioJava up to about version 1.7 had an
> org.biojava.bio.proteomics package, which had methods for isoelectric point
> and molecular weight calculations for peptides. I could not find this
> package in the BioJava 3.0.1 API. I'd like to use these methods and wonder
> if there are any equivalent methods available in the latest version of
> BioJava?
>
> Thank you for your help,
>
> Kind regards,
> Peter
>
> Dr Peter Troshin
> Bioinformatics Software Developer
> Phone: +44 (0)1382 388589
> Fax: +44 (0)1382 385764
> The Barton Group
> College of Life Sciences
> Medical Sciences Institute
> University of Dundee
> Dundee
> DD1 5EH
> UK
>
>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From p.v.troshin at dundee.ac.uk Thu Feb 24 12:44:07 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Thu, 24 Feb 2011 17:44:07 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To:
References: <4D667A55.5040404@dundee.ac.uk>
Message-ID: <4D6698E7.3080202@dundee.ac.uk>

Hi Andreas,

In fact I'd be happy to help with the development of the tools for simple physico-chemical properties calculation for peptides. We could port George's code (assuming he is happy with this) from BioJava 1.8 but we can also provide a few other methods. A couple of projects in the lab where I work would have benefited from having these calculations readily available.

I was thinking about participation in the Google Summer of Code (GSoC) this year as a mentor, and I think this would be an easy project for a student. What do you think about this?

Thank you for your prompt reply.

Regards,
Peter

On 24/02/2011 16:54, Andreas Prlic wrote:
> Hi Peter,
>
> if you get a copy of biojava 1.8, it is still there.
However I would > like to port this to biojava 3 as well.. George do you want to help me > with that, since you are one of the authors of this package? The basic > support for chemistry in BioJava 3 is a bit better... (e.g. Element > class) > > Andreas > > On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin wrote: >> Hi, >> >> I've noticed that BioJava up to about version 1.7 had an >> org.biojava.bio.proteomics package, which had methods for isoelectric point >> and molecular weight calculations for peptides. I could not find this >> package in the BioJava 3.0.1 API. I?d like to use these methods and wonder >> if there are any equivalent methods available in the latest version of >> BioJava? >> >> Thank you for your help, >> >> Kind regards, >> Peter >> >> Dr Peter Troshin >> Bioinformatics Software Developer >> Phone: +44 (0)1382 388589 >> Fax: +44 (0)1382 385764 >> The Barton Group >> College of Life Sciences >> Medical Sciences Institute >> University of Dundee >> Dundee >> DD1 5EH >> UK >> >> >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> From gwaldon at geneinfinity.org Thu Feb 24 14:15:06 2011 From: gwaldon at geneinfinity.org (George Waldon) Date: Thu, 24 Feb 2011 13:15:06 -0600 Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava In-Reply-To: <4D6698E7.3080202@dundee.ac.uk> References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> Message-ID: <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com> Hello Peter & Andreas I effectively did some work on these methods, mostly fixing and adding the ExPASy algorithm that was kindly provided to me. I think it makes a lot of sense to port all physico-chemical property calculations related to amino acids and polypeptides to bj3, as suggested by Andreas, and I definitively support the effort. 
We could smoothly deprecate the bj1 package when this is done. Let me know how I could help. Thanks George Quoting Peter Troshin : > Hi Andreas, > > In fact I'd be happy to help with the development of the tools for > simple physico-chemical properties calculation for peptides. We > could port George?s code (assuming he is happy with this) from > BioJava 1.8 but we can also provide a few other methods. A couple of > projects in the lab where I work would have benefited from having > these calculations readily available. > > I was thinking about participation in the Google Summer of Code > (GoSC) this year as a mentor, and I think this would be an easy > project for a student. What do you think about this? > > Thank you for your prompt reply. > > Regards, > Peter > > > > On 24/02/2011 16:54, Andreas Prlic wrote: >> Hi Peter, >> >> if you get a copy of biojava 1.8, it is still there. However I would >> like to port this to biojava 3 as well.. George do you want to help me >> with that, since you are one of the authors of this package? The basic >> support for chemistry in BioJava 3 is a bit better... (e.g. Element >> class) >> >> Andreas >> >> On Thu, Feb 24, 2011 at 7:33 AM, Peter >> Troshin wrote: >>> Hi, >>> >>> I've noticed that BioJava up to about version 1.7 had an >>> org.biojava.bio.proteomics package, which had methods for isoelectric point >>> and molecular weight calculations for peptides. I could not find this >>> package in the BioJava 3.0.1 API. I?d like to use these methods and wonder >>> if there are any equivalent methods available in the latest version of >>> BioJava? 
>>> >>> Thank you for your help, >>> >>> Kind regards, >>> Peter >>> >>> Dr Peter Troshin >>> Bioinformatics Software Developer >>> Phone: +44 (0)1382 388589 >>> Fax: +44 (0)1382 385764 >>> The Barton Group >>> College of Life Sciences >>> Medical Sciences Institute >>> University of Dundee >>> Dundee >>> DD1 5EH >>> UK >>> >>> >>> >>> _______________________________________________ >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>> > > From willishf at ufl.edu Thu Feb 24 23:08:53 2011 From: willishf at ufl.edu (Scooter Willis) Date: Thu, 24 Feb 2011 23:08:53 -0500 Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava In-Reply-To: <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com> References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com> Message-ID: We put in some basics regarding modeling amino acid properties in the core module but really didn't have any pressing use cases to drive the api beyond calculating the mass of a peptide. We currently have getMolecularWeight() as a method in AbstractCompound but never added a getSequenceMolecularWeight() to AbstractSequence. It would be great to get the attributes/features of amino acids properly modeled in core and extend when reasonable useful summary methods at higher levels. You should be able to query mass of a peptide and have it valid for an amino acid with a PTM which means the amino acid needs to support the ability to be modified in a flexible manner. I spent the last year+ developing a software suite for peptide detection in MS data for deuterium exchange where automated PTM detection was important. Would be great to get some focused attention on the core to make sure we can model nucleotides and amino acids with a chemistry friendly API. 
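
A minimal sketch of the peptide mass calculation discussed here, not the BioJava API: it sums average residue masses and adds one water for the free termini, with an optional mass delta standing in for a modification such as a PTM. The class name, method names and the delta parameter are assumptions made for illustration; the residue masses are the commonly tabulated average values and should be checked against a reference table before being relied on.

```java
import java.util.HashMap;
import java.util.Map;

public class PeptideMass {

    // Average mass of one water molecule (Da), added once per peptide chain.
    static final double WATER = 18.01528;

    // Average residue masses in Da (amino acid minus water).
    static final Map<Character, Double> RESIDUE = new HashMap<>();
    static {
        RESIDUE.put('G', 57.0519);  RESIDUE.put('A', 71.0788);
        RESIDUE.put('S', 87.0782);  RESIDUE.put('P', 97.1167);
        RESIDUE.put('V', 99.1326);  RESIDUE.put('T', 101.1051);
        RESIDUE.put('C', 103.1388); RESIDUE.put('L', 113.1594);
        RESIDUE.put('I', 113.1594); RESIDUE.put('N', 114.1038);
        RESIDUE.put('D', 115.0886); RESIDUE.put('Q', 128.1307);
        RESIDUE.put('K', 128.1741); RESIDUE.put('E', 129.1155);
        RESIDUE.put('M', 131.1926); RESIDUE.put('H', 137.1411);
        RESIDUE.put('F', 147.1766); RESIDUE.put('R', 156.1875);
        RESIDUE.put('Y', 163.1760); RESIDUE.put('W', 186.2132);
    }

    // Mass of an unmodified peptide given in one-letter code.
    static double molecularWeight(String peptide) {
        return molecularWeight(peptide, 0.0);
    }

    // Same, plus a caller-supplied total mass shift for modifications.
    static double molecularWeight(String peptide, double modificationDelta) {
        if (peptide.isEmpty()) {
            throw new IllegalArgumentException("Empty peptide");
        }
        double sum = WATER + modificationDelta;
        for (char c : peptide.toCharArray()) {
            Double mass = RESIDUE.get(Character.toUpperCase(c));
            if (mass == null) {
                throw new IllegalArgumentException("Unknown residue: " + c);
            }
            sum += mass;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.printf("%.4f%n", molecularWeight("GA"));
    }
}
```

Passing the modification as a plain mass delta keeps the sketch small; a chemistry-friendly API of the kind asked for in this thread would more likely attach modifications to individual residues.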
Thanks Scooter On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote: > Hello Peter & Andreas > > I effectively did some work on these methods, mostly fixing and adding the > ExPASy algorithm that was kindly provided to me. I think it makes a lot of > sense to port all physico-chemical property calculations related to amino > acids and polypeptides to bj3, as suggested by Andreas, and I definitively > support the effort. We could smoothly deprecate the bj1 package when this is > done. Let me know how I could help. > > Thanks > George > > Quoting Peter Troshin : > >> Hi Andreas, >> >> In fact I'd be happy to help with the development of the tools for simple >> physico-chemical properties calculation for peptides. We could port George?s >> code (assuming he is happy with this) from BioJava 1.8 but we can also >> provide a few other methods. A couple of projects in the lab where I work >> would have benefited from having these calculations readily available. >> >> I was thinking about participation in the Google Summer of Code (GoSC) >> this year as a mentor, and I think this would be an easy project for a >> student. What do you think about this? >> >> Thank you for your prompt reply. >> >> Regards, >> Peter >> >> >> >> On 24/02/2011 16:54, Andreas Prlic wrote: >>> >>> Hi Peter, >>> >>> if you get a copy of biojava 1.8, it is still there. However I would >>> like to port this to biojava 3 as well.. George do you want to help me >>> with that, since you are one of the authors of this package? The basic >>> support for chemistry in BioJava 3 is a bit better... (e.g. Element >>> class) >>> >>> Andreas >>> >>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin >>> ?wrote: >>>> >>>> Hi, >>>> >>>> I've noticed that BioJava up to about version 1.7 had an >>>> org.biojava.bio.proteomics package, which had methods for isoelectric >>>> point >>>> and molecular weight calculations for peptides. I could not find this >>>> package in the BioJava 3.0.1 API. 
I?d like to use these methods and >>>> wonder >>>> if there are any equivalent methods available in the latest version of >>>> BioJava? >>>> >>>> Thank you for your help, >>>> >>>> Kind regards, >>>> Peter >>>> >>>> Dr Peter Troshin >>>> Bioinformatics Software Developer >>>> Phone: +44 (0)1382 388589 >>>> Fax: +44 (0)1382 385764 >>>> The Barton Group >>>> College of Life Sciences >>>> Medical Sciences Institute >>>> University of Dundee >>>> Dundee >>>> DD1 5EH >>>> UK >>>> >>>> >>>> >>>> _______________________________________________ >>>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >> >> > > > > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas at sdsc.edu Fri Feb 25 00:12:17 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 24 Feb 2011 21:12:17 -0800 Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava In-Reply-To: References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com> Message-ID: Great, seems we have an agreement that we want to improve functionality for this. How complex is this going to be? From quickly checking the 1.8 source it looks like just a few classes that need to be converted and not too painful. What other functionality would you like to see that is currently not there? Andreas On Thu, Feb 24, 2011 at 8:08 PM, Scooter Willis wrote: > We put in some basics regarding modeling amino acid properties in the > core module but really didn't have any pressing use cases to drive the > api beyond calculating the mass of a peptide. We currently have > getMolecularWeight() as a method in AbstractCompound but never added a > getSequenceMolecularWeight() to AbstractSequence. 
It would be great to > get the attributes/features of amino acids properly modeled in core > and extend when reasonable useful summary methods at higher levels. > You should be able to query mass of a peptide and have it valid for an > amino acid with a PTM which means the amino acid needs to support the > ability to be modified in a flexible manner. I spent the last year+ > developing a software suite for peptide detection in MS data for > deuterium exchange where automated PTM detection was important. Would > be great to get some focused attention on the core to make sure we can > model nucleotides and amino acids with a chemistry friendly API. > > Thanks > > Scooter > > On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote: >> Hello Peter & Andreas >> >> I effectively did some work on these methods, mostly fixing and adding the >> ExPASy algorithm that was kindly provided to me. I think it makes a lot of >> sense to port all physico-chemical property calculations related to amino >> acids and polypeptides to bj3, as suggested by Andreas, and I definitively >> support the effort. We could smoothly deprecate the bj1 package when this is >> done. Let me know how I could help. >> >> Thanks >> George >> >> Quoting Peter Troshin : >> >>> Hi Andreas, >>> >>> In fact I'd be happy to help with the development of the tools for simple >>> physico-chemical properties calculation for peptides. We could port George?s >>> code (assuming he is happy with this) from BioJava 1.8 but we can also >>> provide a few other methods. A couple of projects in the lab where I work >>> would have benefited from having these calculations readily available. >>> >>> I was thinking about participation in the Google Summer of Code (GoSC) >>> this year as a mentor, and I think this would be an easy project for a >>> student. What do you think about this? >>> >>> Thank you for your prompt reply. 
>>> >>> Regards, >>> Peter >>> >>> >>> >>> On 24/02/2011 16:54, Andreas Prlic wrote: >>>> >>>> Hi Peter, >>>> >>>> if you get a copy of biojava 1.8, it is still there. However I would >>>> like to port this to biojava 3 as well.. George do you want to help me >>>> with that, since you are one of the authors of this package? The basic >>>> support for chemistry in BioJava 3 is a bit better... (e.g. Element >>>> class) >>>> >>>> Andreas >>>> >>>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin >>>> ?wrote: >>>>> >>>>> Hi, >>>>> >>>>> I've noticed that BioJava up to about version 1.7 had an >>>>> org.biojava.bio.proteomics package, which had methods for isoelectric >>>>> point >>>>> and molecular weight calculations for peptides. I could not find this >>>>> package in the BioJava 3.0.1 API. I?d like to use these methods and >>>>> wonder >>>>> if there are any equivalent methods available in the latest version of >>>>> BioJava? >>>>> >>>>> Thank you for your help, >>>>> >>>>> Kind regards, >>>>> Peter >>>>> >>>>> Dr Peter Troshin >>>>> Bioinformatics Software Developer >>>>> Phone: +44 (0)1382 388589 >>>>> Fax: +44 (0)1382 385764 >>>>> The Barton Group >>>>> College of Life Sciences >>>>> Medical Sciences Institute >>>>> University of Dundee >>>>> Dundee >>>>> DD1 5EH >>>>> UK >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>>> >>> >>> >> >> >> >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > -- ----------------------------------------------------------------------- Dr. 
Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From flf.mib at gmail.com Fri Feb 25 17:21:19 2011
From: flf.mib at gmail.com (François Le Fevre)
Date: Fri, 25 Feb 2011 23:21:19 +0100
Subject: [Biojava-l] biojava3 and symbol
Message-ID: <4D682B5F.60807@gmail.com>

Hello
I am a newbie in biojava3.
I have installed the maven version.

I have a question: is there a way to go from DNASequence to SymbolList?
I would like to study codons and their frequency in several organisms.
It was easy with biojava with

"SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
and
"DistributionTrainerContext"

But now I am a little lost between maven biojava3 and biojava.
So if anyone could explain to me how to get a codon view from a DNASequence, it would be great.

Thanks a lot!

Francois

--
----------------------
Francois LE FEVRE

From andreas at sdsc.edu Mon Feb 28 01:04:46 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 27 Feb 2011 22:04:46 -0800
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <4D682B5F.60807@gmail.com>
References: <4D682B5F.60807@gmail.com>
Message-ID:

Hi Francois,

SymbolLists are part of BioJava 1.8 (the legacy version) and not 3.0. You might want to get the previous version installed. It is available from Maven as well...

Andreas

2011/2/25 François Le Fevre :
> Hello
> I am a newbie in biojava3.
> I have installed the maven version.
>
> I have a question: is there a way to go from DNASequence to SymbolList?
> I would like to study codons and their frequency in several organisms.
> It was easy with biojava with
>
> "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> and
> "DistributionTrainerContext"
>
> But now I am a little lost between maven biojava3 and biojava.
> So if anyone could explain to me how to get a codon view from a DNASequence, it
> would be great.
>
> Thanks a lot!
>
> Francois
>
>
> --
> ----------------------
> Francois LE FEVRE
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From ayates at ebi.ac.uk Mon Feb 28 05:06:16 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 28 Feb 2011 10:06:16 +0000
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <4D682B5F.60807@gmail.com>
References: <4D682B5F.60807@gmail.com>
Message-ID: <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>

Hi Francois,

To get a windowed view over any sequence you should use the following:

DNASequence seq = new DNASequence("ATGCTG");
// non-overlapping windows of 3 nucleotides, i.e. one view per codon
Iterable<SequenceView<NucleotideCompound>> w = new WindowedSequence<NucleotideCompound>(seq, 3);
for(SequenceView<NucleotideCompound> triplet: w) {
  System.out.println(triplet);
}

HTH

Andy

On 25 Feb 2011, at 22:21, François Le Fevre wrote:

> Hello
> I am a newbie in biojava3.
> I have installed the maven version.
>
> I have a question: is there a way to go from DNASequence to SymbolList?
> I would like to study codons and their frequency in several organisms.
> It was easy with biojava with
>
> "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> and
> "DistributionTrainerContext"
>
> But now I am a little lost between maven biojava3 and biojava.
> So if anyone could explain to me how to get a codon view from a DNASequence, it would be great.
>
> Thanks a lot!
>
> Francois
>
>
> --
> ----------------------
> Francois LE FEVRE
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/

From flf.mib at gmail.com Mon Feb 28 06:12:51 2011
From: flf.mib at gmail.com (Francois Le Fevre)
Date: Mon, 28 Feb 2011 12:12:51 +0100
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>
References: <4D682B5F.60807@gmail.com> <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>
Message-ID:

Perfect, thanks a lot.
Have a good day
Francois

2011/2/28 Andy Yates

> Hi Francois,
>
> To get a windowed view over any sequence you should use the following:
>
> DNASequence seq = new DNASequence("ATGCTG");
> Iterable<SequenceView<NucleotideCompound>> w = new WindowedSequence<NucleotideCompound>(seq, 3);
> for(SequenceView<NucleotideCompound> triplet: w) {
>   System.out.println(triplet);
> }
>
> HTH
>
> Andy
>
> On 25 Feb 2011, at 22:21, François Le Fevre wrote:
>
> > Hello
> > I am a newbie in biojava3.
> > I have installed the maven version.
> >
> > I have a question: is there a way to go from DNASequence to SymbolList?
> > I would like to study codons and their frequency in several organisms.
> > It was easy with biojava with
> >
> > "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> > and
> > "DistributionTrainerContext"
> >
> > But now I am a little lost between maven biojava3 and biojava.
> > So if anyone could explain to me how to get a codon view from a DNASequence, it would be great.
> >
> > Thanks a lot!
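
For the codon frequency question in this thread, the windowing idea can be sketched in plain Java. This is an illustration only and does not use BioJava: the class and method names are made up, and a String stands in for the DNA sequence. It walks the sequence in non-overlapping windows of three letters, which is the same idea as the windowed view above, and counts how often each codon occurs.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class CodonCounts {

    // Counts codons in reading frame 1; a trailing partial codon is ignored.
    static Map<String, Integer> countCodons(String dna) {
        Map<String, Integer> counts = new TreeMap<>();
        String s = dna.toUpperCase(Locale.ROOT);
        for (int i = 0; i + 3 <= s.length(); i += 3) {
            counts.merge(s.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countCodons("ATGCTGATG"));
    }
}
```

Dividing each count by the total number of codons gives relative frequencies; a TreeMap is used only so that the codons print in a stable, sorted order.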
> >
> > Francois
> >
> >
> > --
> > ----------------------
> > Francois LE FEVRE
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> --
> Andrew Yates                   Ensembl Genomes Engineer
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
>
>
>
>

--
Francois Le Fevre
Management Informatique Innovation Biotechnologies
Paris, France
- Before printing, think of the environment

From p.v.troshin at dundee.ac.uk Mon Feb 28 10:18:15 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Mon, 28 Feb 2011 15:18:15 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To:
References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com>
Message-ID: <4D6BBCB7.3010203@dundee.ac.uk>

>>>What other functionality would you
>>>like to see that is currently not there?

I think that the methods below would be a good starting point; the Google Summer of Code student can then propose anything else that he/she would fancy implementing.

Molecular weight
Extinction coefficient
Instability index
Aliphatic index
Grand Average of Hydropathy
Isoelectric point
Number of amino acids in the protein (His, Met, Cys)

I know BioJava projects were managed under the Open Bioinformatics Foundation (OBF) during last year's GSoC. Is there a page for this year's GSoC ideas somewhere?

Regards,
Peter

On 25/02/2011 05:12, Andreas Prlic wrote:
> Great, seems we have an agreement that we want to improve
> functionality for this. How complex is this going to be? From quickly
> checking the 1.8 source it looks like just a few classes that need to
> be converted and not too painful. What other functionality would you
> like to see that is currently not there?
>
> Andreas
>
>
> On Thu, Feb 24, 2011 at 8:08 PM, Scooter Willis wrote:
>> We put in some basics regarding modeling amino acid properties in the
>> core module but really didn't have any pressing use cases to drive the
>> api beyond calculating the mass of a peptide. We currently have
>> getMolecularWeight() as a method in AbstractCompound but never added a
>> getSequenceMolecularWeight() to AbstractSequence. It would be great to
>> get the attributes/features of amino acids properly modeled in core
>> and extend when reasonable useful summary methods at higher levels.
>> You should be able to query mass of a peptide and have it valid for an
>> amino acid with a PTM which means the amino acid needs to support the
>> ability to be modified in a flexible manner. I spent the last year+
>> developing a software suite for peptide detection in MS data for
>> deuterium exchange where automated PTM detection was important. Would
>> be great to get some focused attention on the core to make sure we can
>> model nucleotides and amino acids with a chemistry friendly API.
>>
>> Thanks
>>
>> Scooter
>>
>> On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote:
>>> Hello Peter & Andreas
>>>
>>> I effectively did some work on these methods, mostly fixing and adding the
>>> ExPASy algorithm that was kindly provided to me. I think it makes a lot of
>>> sense to port all physico-chemical property calculations related to amino
>>> acids and polypeptides to bj3, as suggested by Andreas, and I definitively
>>> support the effort. We could smoothly deprecate the bj1 package when this is
>>> done. Let me know how I could help.
>>>
>>> Thanks
>>> George
>>>
>>> Quoting Peter Troshin:
>>>
>>>> Hi Andreas,
>>>>
>>>> In fact I'd be happy to help with the development of the tools for simple
>>>> physico-chemical properties calculation for peptides. We could port George's
>>>> code (assuming he is happy with this) from BioJava 1.8 but we can also
>>>> provide a few other methods.
A couple of projects in the lab where I work
>>>> would have benefited from having these calculations readily available.
>>>>
>>>> I was thinking about participation in the Google Summer of Code (GoSC)
>>>> this year as a mentor, and I think this would be an easy project for a
>>>> student. What do you think about this?
>>>>
>>>> Thank you for your prompt reply.
>>>>
>>>> Regards,
>>>> Peter
>>>>
>>>>
>>>>
>>>> On 24/02/2011 16:54, Andreas Prlic wrote:
>>>>> Hi Peter,
>>>>>
>>>>> if you get a copy of biojava 1.8, it is still there. However I would
>>>>> like to port this to biojava 3 as well.. George do you want to help me
>>>>> with that, since you are one of the authors of this package? The basic
>>>>> support for chemistry in BioJava 3 is a bit better... (e.g. Element
>>>>> class)
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've noticed that BioJava up to about version 1.7 had an
>>>>>> org.biojava.bio.proteomics package, which had methods for isoelectric
>>>>>> point
>>>>>> and molecular weight calculations for peptides. I could not find this
>>>>>> package in the BioJava 3.0.1 API. I'd like to use these methods and
>>>>>> wonder
>>>>>> if there are any equivalent methods available in the latest version of
>>>>>> BioJava?
>>>>>> >>>>>> Thank you for your help, >>>>>> >>>>>> Kind regards, >>>>>> Peter >>>>>> >>>>>> Dr Peter Troshin >>>>>> Bioinformatics Software Developer >>>>>> Phone: +44 (0)1382 388589 >>>>>> Fax: +44 (0)1382 385764 >>>>>> The Barton Group >>>>>> College of Life Sciences >>>>>> Medical Sciences Institute >>>>>> University of Dundee >>>>>> Dundee >>>>>> DD1 5EH >>>>>> UK >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>>>> >>>> >>> >>> >>> _______________________________________________ >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>> > > From uchathuranga at gmail.com Thu Feb 10 12:01:57 2011 From: uchathuranga at gmail.com (udana chathuranga) Date: Thu, 10 Feb 2011 17:01:57 -0000 Subject: [Biojava-l] Problem with Multiple Sequence Alignment in BioJava Message-ID: hi all, When I was going through the biojava cookbook as I was interested in this project. I tried the example in the page http://biojava.org/wiki/BioJava:CookBook3:MSA and I got a classnotfound exception for the line "Profile profile = Alignments.getMultipleSequenceAlignment(lst);". 
Error Message: Exception in thread "main" java.lang.NoClassDefFoundError: org/forester/phylogenyinference/DistanceMatrix at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176) at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29) at CookbookMSA.main(CookbookMSA.java:18) Caused by: java.lang.ClassNotFoundException: org.forester.phylogenyinference.DistanceMatrix at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClassInternal(Unknown Source) Is this a know issue or Am I doing something wrong with the code? Help me on this I have attached the java source file that I have tried. Thanks Regards udana. -------------- next part -------------- A non-text attachment was scrubbed... Name: CookbookMSA.java Type: application/octet-stream Size: 1579 bytes Desc: not available URL:
From uchathuranga at gmail.com Wed Feb 9 05:19:00 2011 From: uchathuranga at gmail.com (udana chathuranga) Date: Wed, 9 Feb 2011 10:49:00 +0530 Subject: [Biojava-l] Project Ideas GSOC 2011 Message-ID: hi, I am a student a University of Moratuwa , Sri lanka. I am planning to participate gsoc 2011 and I would like to work on a project regarding bioinfomatics. So I would like to know whether you are planning to be a mentor organization this year also. If so i would like to know what kinds of projects that you are planning to propose.
Thanks
Regard
udana

From andreas at sdsc.edu Wed Feb 9 05:49:23 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 8 Feb 2011 21:49:23 -0800
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Udana,

It is a bit early, but given the success from last year I think it is for sure that BioJava (as part of the Open Bioinformatics Foundation) will apply again to become a mentoring organisation. Google will announce the list of accepted mentoring organizations on March 18th; after that, the time to discuss proposals will start.

http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/faqs#timeline

Last year we had several mentor-proposed projects. Google is also very interested in seeing student-proposed projects and recommends those because of their high success rate. If you want to prepare a proposal you could already start thinking about what kind of project you would like to work on. Similarly, everybody else who wants to propose a project and perhaps become a mentor can also start to think about that. Still, we will have to wait until March 18th before we know whether there is actually any funding this year.

Andreas

On Tue, Feb 8, 2011 at 9:19 PM, udana chathuranga wrote:
> hi,
>
> I am a student a University of Moratuwa , Sri lanka. I am planning to
> participate gsoc 2011 and I would like to work on a project regarding
> bioinfomatics. So I would like to know whether you are planning to be a
> mentor organization this year also. If so i would like to know what kinds of
> projects that you are planning to propose.
>
> Thanks
> Regard
> udana
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
From uchathuranga at gmail.com Wed Feb 9 06:13:34 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Wed, 9 Feb 2011 11:43:34 +0530
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Andreas,

Thanks for the quick reply. I appreciate your help, and I want to know how I can contribute to the project even if you are not participating in this year's GSoC. I have studied bioinformatics as one of my subjects at the university and I would like to apply that knowledge to this project. I would be thankful if you could guide me on how to start participating in the project.

I have a solid background in Java and in bioinformatics-related areas such as:

Biological sequence data analysis -
The Sequence Assembly Problem, The Gene Finding Problem, The Sequence Alignment Problem, The Genome Rearrangement Problem, The Protein Folding Problem, The Motif Discovery Problem

Phylogenetics

Biological Network Motifs

The areas mentioned above are some of the subjects that I have learned in our course, and I would like to know how I can apply this knowledge to this project.

Thanks
Regards
Udana

From andreas at sdsc.edu Wed Feb 9 16:27:35 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 9 Feb 2011 08:27:35 -0800
Subject: [Biojava-l] Project Ideas GSOC 2011
In-Reply-To:
References:
Message-ID:

Hi Udana,

There is some overlap between your background and many aspects of BioJava. I recommend taking a look at the Cookbook as a start for learning more about BioJava. To give you an idea of what was going on last year, the two BioJava-related projects that were funded were

A) Development of an all Java multiple sequence alignment module
B) Identification and classification of posttranslational protein modifications.
Here the project pages for more details: http://biojava.org/wiki/Google_Summer_of_Code Andreas On Tue, Feb 8, 2011 at 10:13 PM, udana chathuranga wrote: > Hi Andreas, > > Thanks for the quick reply.I appreciate your help and I want know how I can > contribute to the project even if you are not participating for this year > gsoc project. I have studied bioinfomatics as one of my subject in the > university and I would like to apply those knowledge in to this project. I > would be thankful if you can guide me to start to participate in the > project. > I have solid background on java and bioinomatics related areas like > ? ? Biological Sequence data analysis - > > The Sequence assembly problem, The Gene Finding Problem, The Sequence > Alignment Problem, The Genome Rearrangement problem, The Protein Folding > Problem, The Motif Discovery Problem > > ??? Phylogenetics > > ??? Biological Network Motifs > > Above mention areas are some of the subjects that I have learned in our > course and I would like to know how can I apply this knowledge to this > project? > > > Thanks > > Regards > > Udana > From uchathuranga at gmail.com Wed Feb 9 18:45:55 2011 From: uchathuranga at gmail.com (udana chathuranga) Date: Thu, 10 Feb 2011 00:15:55 +0530 Subject: [Biojava-l] Project Ideas GSOC 2011 In-Reply-To: References: Message-ID: Hi Andreas, First of all thanks for the guide.i have started reading the CookBook in http://biojava.org/wiki/BioJava:CookBook. I have also did some sample coding using the biojava3-core and found this http://biojava.org/wiki/BioJava:CookBook:Core:FastaReadWrite and looked in details about the fasta file format.About fasta file format I studied about in the link http://www.ncbi.nlm.nih.gov/BLAST/fasta.shtml. As you pointed out I have also looked in the last years two gsoc projects.After going through those project I had this question what kinds of modules that I can work on. I would like hear some suggestions about modules that you are looking forward to. 
I would also like to know what kinds of file formats I should look into in the future in order to develop a good module for the system.

Thanks
regards
Udana

From jayunit100 at gmail.com Wed Feb 9 19:22:05 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Wed, 9 Feb 2011 14:22:05 -0500
Subject: [Biojava-l] average structure
Message-ID:

Hi Guys : I was also wondering if biojava supported creating an average or mean structure from a bundle........

--
Jay Vyas
MMSB/UCHC

From andreas at sdsc.edu Wed Feb 9 21:48:40 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 9 Feb 2011 13:48:40 -0800
Subject: [Biojava-l] average structure
In-Reply-To:
References:
Message-ID:

A geometric average should be easy to calculate. All NMR models have the same number of atoms, so you could just average over all alternative positions for an atom and use that to build up a mean structure... However I would take that with a grain of salt and make sure there are no distance violations by visual inspection and some quality control scripts. There are standard/ideal coordinates available for all amino acids that you could use to compare with the distances between atoms in your average structure.

Andreas

On Wed, Feb 9, 2011 at 11:22 AM, Jay Vyas wrote:
> Hi Guys : I was also wondering if biojava supported creating an average or
> mean structure from a bundle........
>
> --
> Jay Vyas
> MMSB/UCHC
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
From craig.adrian.berry at gmail.com Fri Feb 11 10:54:35 2011
From: craig.adrian.berry at gmail.com (Craig Berry)
Date: Fri, 11 Feb 2011 10:54:35 +0000
Subject: [Biojava-l] GFF3 Reader
Message-ID:

Can I just have someone validate this logic in the GFF3Reader for me to see if this is a bug or not.

If I have a GFF3 file with the following line:

chrI SGD repeat_region 1 62 . - .
ID=TEL01L-TR;Name=TEL01L-TR;Note=Terminal%20stretch%20of%20telomeric%20repeats%20on%20the%20left%20arm%20of%20Chromosome%20I;dbxref=SGD:S000028864

When parsing the file, the class calls Location.fromBio using the start 1, end 62 and strand -ve. Since the strand is -ve it needs to convert the positions to negative values and reverse the start and end. However, as the javadoc explains:

"In biocoordinates, the start index of a range is represented in origin 1 (ie the very first index is 1, not 0), and end = start + length - 1."

So before the end is reassigned, its value is reduced by 1 and then negated: e = -(start - 1). With a start value of 1 as in this case, the end then becomes 0, such that the range now runs -62 to 0.

This causes a problem when adding this Feature to the feature collection, since it considers a position of value 0 to be on the +ve strand. When Location.plus() is called, the check for a negative location (i.e. both start and end being < 0) returns false, and so you end up trying to create a Location with a -ve start position but a +ve end, which throws an IllegalArgumentException.

So is there something fishy here or not? I'm assuming that the GFF content is valid.

Thanks in advance

Craig

From willishf at ufl.edu Fri Feb 11 14:14:08 2011
From: willishf at ufl.edu (Scooter Willis)
Date: Fri, 11 Feb 2011 09:14:08 -0500
Subject: [Biojava-l] GFF3 Reader
In-Reply-To:
References:
Message-ID:

Craig

I think I fixed this problem but I am not sure if the fix is in the currently released jar files. The < logic was wrong, so 0 wasn't being treated properly. I will send you an updated jar to see if the problem goes away.

Scooter

On Fri, Feb 11, 2011 at 5:54 AM, Craig Berry wrote:
> Can I just have someone validate this logic in the GFF3Reader for me
> to see if this is a bug or not.
>
> If I have a GFF3 file with the following line:
>
> chrI SGD repeat_region 1 62 . - .
ID=TEL01L-TR;Name=TEL01L-TR;Note=Terminal%20stretch%20of%20telomeric%20repeats%20on%20the%20left%20arm%20of%20Chromosome%20I;dbxref=SGD:S000028864 > > When parsing the file then, the class calls Location.fromBio using the > start 1, end 62 and strand ?ve. > Since the strand is ?ve it needs to convert the positions to negative > values and reverse the start and end. However, as the javadocs > explains: > > ?In biocoordinates, the start index of a range is represented in > origin 1 (ie the very first index is 1, not 0), ?and end= start + > length - 1.? > > So before the end is reassigned its value is reduced by 1 and then > negated: e = - ( start ? 1) With a start value of 1 as in this case, > the end then becomes 0, such that the range now runs -62 to 0. > > This causes a problem when adding this Feature to the feature > collection since it considers a position of value 0 to be on the +ve > strand, such that when Location.plus() is called the check for a > negative location (i.e. both start and end being < 0) returns false > and so you end up trying to create a Location with a ?ne start > position but a +ve end, which throws an IllegalArgumentException. > > So is there something fishy here or not? I?m assuming that the GFF > content is valid. > > Thanks in advance > > Craig > > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From uchathuranga at gmail.com Fri Feb 11 15:15:19 2011 From: uchathuranga at gmail.com (udana chathuranga) Date: Fri, 11 Feb 2011 20:45:19 +0530 Subject: [Biojava-l] Problem with MultipleSequenceAlignment Message-ID: hi all, When I was going through the biojava cookbook as I was interested in this project. I tried the example in the page http://biojava.org/wiki/BioJava:CookBook3:MSA and I got a classnotfound exception for the line "Profile profile = Alignments. getMultipleSequenceAlignment(lst);". 
Error Message: Exception in thread "main" java.lang.NoClassDefFoundError: org/forester/phylogenyinference/DistanceMatrix at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176) at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29) at CookbookMSA.main(CookbookMSA.java:18) Caused by: java.lang.ClassNotFoundException: org.forester.phylogenyinference.DistanceMatrix at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClassInternal(Unknown Source) Is this a know issue or Am I doing something wrong with the code? Thanks Regards udana. From willishf at ufl.edu Fri Feb 11 16:28:32 2011 From: willishf at ufl.edu (Scooter Willis) Date: Fri, 11 Feb 2011 11:28:32 -0500 Subject: [Biojava-l] Problem with MultipleSequenceAlignment In-Reply-To: References: Message-ID: You are probably missing a reference to the forester jar file located in the biojava3-phylo module. On Fri, Feb 11, 2011 at 10:15 AM, udana chathuranga wrote: > hi all, > > When I was going through the biojava cookbook as I was interested in this > project. I tried the example in the page > http://biojava.org/wiki/BioJava:CookBook3:MSA and I got a classnotfound > exception for the line "Profile profile > = Alignments. > getMultipleSequenceAlignment(lst);". > > Error Message: > > Exception in thread "main" java.lang.NoClassDefFoundError: > org/forester/phylogenyinference/DistanceMatrix > ? ?at > org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176) > ? ?at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29) > ? 
?at CookbookMSA.main(CookbookMSA.java:18) > Caused by: java.lang.ClassNotFoundException: > org.forester.phylogenyinference.DistanceMatrix > ? ?at java.net.URLClassLoader$1.run(Unknown Source) > ? ?at java.security.AccessController.doPrivileged(Native Method) > ? ?at java.net.URLClassLoader.findClass(Unknown Source) > ? ?at java.lang.ClassLoader.loadClass(Unknown Source) > ? ?at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) > ? ?at java.lang.ClassLoader.loadClass(Unknown Source) > ? ?at java.lang.ClassLoader.loadClassInternal(Unknown Source) > > Is this a know issue or Am I doing something wrong with the code? > > Thanks > Regards > udana. > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From uchathuranga at gmail.com Fri Feb 11 16:58:48 2011 From: uchathuranga at gmail.com (udana chathuranga) Date: Fri, 11 Feb 2011 22:28:48 +0530 Subject: [Biojava-l] Problem with MultipleSequenceAlignment In-Reply-To: References: Message-ID: > > Hi Scooter, Thanks Scooter, that's works. Regards udana From mandarijnopw8 at gmail.com Sun Feb 13 14:32:14 2011 From: mandarijnopw8 at gmail.com (Shamanou van Leeuwen) Date: Sun, 13 Feb 2011 15:32:14 +0100 Subject: [Biojava-l] programming error Message-ID: <4D57EB6E.8010004@gmail.com> hi guys, i made am making a tool to translate dna to protein using biojava. But i am getting some errors that is do not fully understand. Can somebody please tell me what i am doing wrong? 
script: http://pastebin.com/TJSjkgqK errors: http://pastebin.com/iFXYWEZB From anantpossible at gmail.com Sun Feb 13 17:47:28 2011 From: anantpossible at gmail.com (Anant Jain) Date: Sun, 13 Feb 2011 23:17:28 +0530 Subject: [Biojava-l] programming error In-Reply-To: <4D57EB6E.8010004@gmail.com> References: <4D57EB6E.8010004@gmail.com> Message-ID: Hi Shamanou, I can see the error is around line no 192~193, I would suggest you to print the codons list first, like below. 192 SOP (CODON LIST); 193 prot = SymbolListViews.translate(codons,RNATools.getGeneticCode( "UNIVERSAL")); Paste result in mail thread then we can start debugging . Regards, Anant On Sun, Feb 13, 2011 at 8:02 PM, Shamanou van Leeuwen < mandarijnopw8 at gmail.com> wrote: > hi guys, > > i made am making a tool to translate dna to protein using biojava. > But i am getting some errors that is do not fully understand. > Can somebody please tell me what i am doing wrong? > > > script: > http://pastebin.com/TJSjkgqK > > errors: > http://pastebin.com/iFXYWEZB > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Anant Jain B.Tech Bioinformatics, RHCE Software Engineer, Persistent Systems Limited, Pune From willishf at ufl.edu Sun Feb 13 19:11:58 2011 From: willishf at ufl.edu (Scooter Willis) Date: Sun, 13 Feb 2011 14:11:58 -0500 Subject: [Biojava-l] programming error In-Reply-To: <4D57EB6E.8010004@gmail.com> References: <4D57EB6E.8010004@gmail.com> Message-ID: Depending on what you need you may want to try out biojava3. On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen wrote: > hi guys, > > i made am making a tool to translate dna to protein using biojava. > But ?i am getting some errors that is do not fully understand. > Can somebody please tell me what i am doing wrong? 
> > > script: > http://pastebin.com/TJSjkgqK > > errors: > http://pastebin.com/iFXYWEZB > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From mandarijnopw8 at gmail.com Mon Feb 14 02:27:32 2011 From: mandarijnopw8 at gmail.com (Shamanou van Leeuwen) Date: Mon, 14 Feb 2011 03:27:32 +0100 Subject: [Biojava-l] programming error In-Reply-To: References: <4D57EB6E.8010004@gmail.com> Message-ID: <4D589314.5040705@gmail.com> On 13-02-11 20:11, Scooter Willis wrote: > Depending on what you need you may want to try out biojava3. > > On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen > wrote: >> hi guys, >> >> i made am making a tool to translate dna to protein using biojava. >> But i am getting some errors that is do not fully understand. >> Can somebody please tell me what i am doing wrong? >> >> >> script: >> http://pastebin.com/TJSjkgqK >> >> errors: >> http://pastebin.com/iFXYWEZB >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> i a, trying biojava 3 now but i am still doing something wrong. http://pastebin.com/uLz934gr From andreas at sdsc.edu Mon Feb 14 08:16:48 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 14 Feb 2011 00:16:48 -0800 Subject: [Biojava-l] BioJava 3.0.1 released Message-ID: BioJava 3.0.1 has been released and is available from http://www.biojava.org/wiki/BioJava:Download . The 3.0.1 release is mainly a bug fix release for the recent 3.0 code base, which provides a major rewrite of the biojava. A couple of noteworthy improvements: - core: fixed an issue with sequence index positions, new utility methods for memory efficient parsing of large fasta files - structure: Fixed issues with PDB header parsing and more stability with non-standard PDB files. 
Added new algorithm to automatically infer protein domain boundaries.
- web services: Fixed wrong dependency on old codebase and overall improvements in functionality of remote blast web service calls.
- protein modifications: Minor bugfixes

In parallel the biojava-legacy code base has been updated to version 1.8.1 and it contains a bug fix related to circular locations.

Thanks to all contributors for making this release possible.

About BioJava: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats, and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language.

Happy Biojava-ing

Andreas

From ayates at ebi.ac.uk Mon Feb 14 10:07:12 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 14 Feb 2011 10:07:12 +0000
Subject: [Biojava-l] programming error
In-Reply-To: <4D589314.5040705@gmail.com>
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID:

If you want to do frame based translation then there is an easier way of accomplishing this. The TranscriptionEngine allows you to translate in multiple frames and retrieve that information in a Map such as:

TranscriptionEngine te = TranscriptionEngine.getDefault();
Frame[] frames = Frame.getForwardFrames();
Map<Frame, Sequence<AminoAcidCompound>> results = te.multipleFrameTranslation(dna, frames);

If you change the static call on Frame to Frame.getAllFrames() then you will do a full 6 frame translation.

Also I would avoid calling the getSequenceAsString() method until you need to output it to screen. The Sequence interface provides an adequate set of methods for testing length of a sequence.
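
For readers without BioJava 3 on the classpath, the idea behind frame-based translation can be sketched in plain Java. This is only an illustration under stated assumptions, not the TranscriptionEngine implementation: the class FrameTranslationSketch and its methods are hypothetical names, the standard genetic code is assumed, only the three forward frames are handled, and any codon containing a base other than A, C, G or T is written as X.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, library-independent sketch; it does not use the BioJava API.
class FrameTranslationSketch {
    // Standard genetic code, with codons ordered by base T, C, A, G at each position.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*W" + "LLLLPPPPHHQQRRRR"
          + "IIIMTTTTNNKKSSRR" + "VVVVAAAADDEEGGGG";

    // Translates dna starting at the given 0-based offset; an incomplete trailing codon is dropped.
    static String translate(String dna, int offset) {
        String s = dna.toUpperCase();
        StringBuilder protein = new StringBuilder();
        for (int i = offset; i + 3 <= s.length(); i += 3) {
            int a = BASES.indexOf(s.charAt(i));
            int b = BASES.indexOf(s.charAt(i + 1));
            int c = BASES.indexOf(s.charAt(i + 2));
            // Unknown bases (for example N) give an unknown residue.
            protein.append(a < 0 || b < 0 || c < 0
                    ? 'X'
                    : AMINO_ACIDS.charAt(16 * a + 4 * b + c));
        }
        return protein.toString();
    }

    // Maps forward frame number (1, 2, 3) to its translation.
    static Map<Integer, String> translateForwardFrames(String dna) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (int frame = 0; frame < 3; frame++) {
            result.put(frame + 1, translate(dna, frame));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, String> e : translateForwardFrames("ATGGCCTGA").entrySet()) {
            System.out.println("frame " + e.getKey() + ": " + e.getValue());
        }
    }
}
```

Frame 1 of ATGGCCTGA reads ATG GCC TGA, which gives MA* under the standard code; frames 2 and 3 start one and two bases later and drop the incomplete trailing codon. A six-frame version would apply the same method to the reverse complement as well.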
However, if what you want is the longest translation then I would replace that with (assuming the above code):

    // needs java.util.ArrayList, Collections, Comparator and List
    List<Sequence<AminoAcidCompound>> translations =
        new ArrayList<Sequence<AminoAcidCompound>>(results.values());
    // sort ascending by length, so the longest translation ends up last
    Collections.sort(translations, new Comparator<Sequence<AminoAcidCompound>>() {
        public int compare(Sequence<AminoAcidCompound> o1, Sequence<AminoAcidCompound> o2) {
            Integer o1Length = o1.getLength();
            Integer o2Length = o2.getLength();
            return o1Length.compareTo(o2Length);
        }
    });
    Sequence<AminoAcidCompound> longest = translations.get(translations.size() - 1);

However, I would like to see what errors you are pulling up from BioJava3 in case there is a scenario we are not currently taking into account.

Andy

On 14 Feb 2011, at 02:27, Shamanou van Leeuwen wrote:

> On 13-02-11 20:11, Scooter Willis wrote:
>> Depending on what you need you may want to try out biojava3.
>>
>> On Sun, Feb 13, 2011 at 9:32 AM, Shamanou van Leeuwen wrote:
>>> hi guys,
>>>
>>> I am making a tool to translate DNA to protein using BioJava.
>>> But I am getting some errors that I do not fully understand.
>>> Can somebody please tell me what I am doing wrong?
>>>
>>> script:
>>> http://pastebin.com/TJSjkgqK
>>>
>>> errors:
>>> http://pastebin.com/iFXYWEZB
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
> I am trying BioJava 3 now, but I am still doing something wrong.
>
> http://pastebin.com/uLz934gr
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/

From andreas at sdsc.edu  Mon Feb 14 15:42:57 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 14 Feb 2011 07:42:57 -0800
Subject: [Biojava-l] programming error
In-Reply-To:
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID:

Hi Andy,

Could we get a Cookbook page for this? sounds like it would be good to have a bit more docu on this topic ...

Thanks!

Andreas

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From ayates at ebi.ac.uk  Mon Feb 14 16:13:35 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 14 Feb 2011 16:13:35 +0000
Subject: [Biojava-l] programming error
In-Reply-To:
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com>
Message-ID: <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk>

I've added a bit to the original cookbook page we had. I can migrate these changes to the cookbook front page later on in the week

Andy

--
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/

From andreas at sdsc.edu  Mon Feb 14 23:57:07 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 14 Feb 2011 15:57:07 -0800
Subject: [Biojava-l] programming error
In-Reply-To: <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk>
References: <4D57EB6E.8010004@gmail.com> <4D589314.5040705@gmail.com> <7CBE613F-7F05-47C6-B917-8A1A3EF82E4C@ebi.ac.uk>
Message-ID:

Thanks Andy,

yea, would be good to add links on the cookbook front page, otherwise the sections are hard to find...

Andreas

From andreas at sdsc.edu  Wed Feb 16 02:45:49 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 15 Feb 2011 18:45:49 -0800
Subject: [Biojava-l] GSoC status
Message-ID:

Hi Udana,

I would like to motivate student proposed projects as much as possible this year. This means if you want to distinguish yourself with your application, it would be great if you would suggest a project by yourself that you find exciting.

If our application to become a sponsoring organisation gets approved by Google, we will start discussing proposals around March 18th. We will probably suggest possible projects at some point, but I really would like to give preference to student proposed projects this year.

To get ideas about possible student proposed projects I recommend getting a module that is related to your studies and try to do some work with it.
This might give you ideas about what is missing and what could be added in a project.

Andreas

On Tue, Feb 15, 2011 at 11:35 AM, udana chathuranga wrote:
> Hi Andreas,
>
> Any updates about project ideas to gsoc2011 ?
>
> Thanks
> Regards
> Udana
>

From jw12 at sanger.ac.uk  Thu Feb 17 14:30:08 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 17 Feb 2011 14:30:08 +0000
Subject: [Biojava-l] DAS Workshop Registration Closing Soon
Message-ID: <0BCCE860-9AEA-4377-A9D6-F28E264DE43A@sanger.ac.uk>

Registration closes for the DAS workshop at 5pm this Friday GMT. Limited places still available.

Please note that for the tutorials day (Day 1) it is advisable to know at least one of PERL, Java or Javascript.

Further information and registration from here:
http://www.ebi.ac.uk/training/onsite/110302DAS.html

There are still a few places for short talks on the second day if you have anything to talk about of interest to the DAS community.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From watson at ebi.ac.uk  Fri Feb 18 08:46:53 2011
From: watson at ebi.ac.uk (James Watson)
Date: Fri, 18 Feb 2011 08:46:53 +0000
Subject: [Biojava-l] Hands-on training at EBI - Programmatic access to biological databases (Java)
Message-ID: <4D5E31FD.6080905@ebi.ac.uk>

*Date:* 9-13 May 2011
*Venue:* EMBL-EBI, Hinxton, Nr Cambridge, CB10 1SD, UK
*Registration Deadline:* 9th April 2011

This Java-based course in programmatic access to biological databases is ideal for bioinformaticians and biological researchers looking to develop data analysis pipelines or access data in an automated manner for integration into their own applications.
What will it cover?

- Introduction to Web Services and resources available at the EBI
- REST & SOAP - their application to EBI services and databases
- BioMart and BioMart Web Services
- Linking services together to construct workflows (Taverna, Enfin, Encore)

For a more detailed programme and information on registration please go to
http://www.ebi.ac.uk/training/handson/course_110509_progjava.html

Many thanks,

James Watson

--
James D Watson
Scientific Training Officer
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Tel: +44(0)1223 492541
http://www.ebi.ac.uk/training/

Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/):

28 Mar -- 1 Apr 2011: PSIMEx Workshop: Interactions and Pathways
9-13 May 2011: Programmatic access to biological databases (Java)
23-27 May 2011: FEBS: In silico systems biology for complex diseases: network reconstruction, analysis and network based modelling

From bli0406 at gmail.com  Wed Feb 23 07:16:29 2011
From: bli0406 at gmail.com (Bo Li)
Date: Wed, 23 Feb 2011 02:16:29 -0500
Subject: [Biojava-l] question regarding MSA
Message-ID:

Hi,

Sorry for the bothering. I tried the MSA feature by following the link:

http://www.biojava.org/wiki/BioJava:CookBook3:MSA

However, I can't see the symbols like ".", ":", and "*" like I can see from the output ClustalW.

So is there a way for users to obtain such information in the output from MSA?

Thanks,
Bo Li

From andreas at sdsc.edu  Thu Feb 24 08:06:59 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 24 Feb 2011 00:06:59 -0800
Subject: [Biojava-l] question regarding MSA
In-Reply-To:
References:
Message-ID:

Hi Bo Li,

The printing method currently does not add those characters to the display of the aligned sequences. If you need it you would have to patch the printing method...

Andreas

From p.v.troshin at dundee.ac.uk  Thu Feb 24 15:33:41 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Thu, 24 Feb 2011 15:33:41 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
Message-ID: <4D667A55.5040404@dundee.ac.uk>

Hi,

I've noticed that BioJava up to about version 1.7 had an org.biojava.bio.proteomics package, which had methods for isoelectric point and molecular weight calculations for peptides. I could not find this package in the BioJava 3.0.1 API. I'd like to use these methods and wonder if there are any equivalent methods available in the latest version of BioJava?

Thank you for your help,

Kind regards,
Peter

Dr Peter Troshin
Bioinformatics Software Developer
Phone: +44 (0)1382 388589
Fax: +44 (0)1382 385764
The Barton Group
College of Life Sciences
Medical Sciences Institute
University of Dundee
Dundee
DD1 5EH
UK

From andreas at sdsc.edu  Thu Feb 24 16:54:14 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 24 Feb 2011 08:54:14 -0800
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To: <4D667A55.5040404@dundee.ac.uk>
References: <4D667A55.5040404@dundee.ac.uk>
Message-ID:

Hi Peter,

if you get a copy of biojava 1.8, it is still there. However I would like to port this to biojava 3 as well.. George do you want to help me with that, since you are one of the authors of this package? The basic support for chemistry in BioJava 3 is a bit better... (e.g. Element class)

Andreas

From p.v.troshin at dundee.ac.uk  Thu Feb 24 17:44:07 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Thu, 24 Feb 2011 17:44:07 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To:
References: <4D667A55.5040404@dundee.ac.uk>
Message-ID: <4D6698E7.3080202@dundee.ac.uk>

Hi Andreas,

In fact I'd be happy to help with the development of the tools for simple physico-chemical properties calculation for peptides. We could port George's code (assuming he is happy with this) from BioJava 1.8 but we can also provide a few other methods. A couple of projects in the lab where I work would have benefited from having these calculations readily available.

I was thinking about participation in the Google Summer of Code (GoSC) this year as a mentor, and I think this would be an easy project for a student. What do you think about this?

Thank you for your prompt reply.

Regards,
Peter

From gwaldon at geneinfinity.org  Thu Feb 24 19:15:06 2011
From: gwaldon at geneinfinity.org (George Waldon)
Date: Thu, 24 Feb 2011 13:15:06 -0600
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To: <4D6698E7.3080202@dundee.ac.uk>
References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk>
Message-ID: <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com>

Hello Peter & Andreas

I effectively did some work on these methods, mostly fixing and adding the ExPASy algorithm that was kindly provided to me. I think it makes a lot of sense to port all physico-chemical property calculations related to amino acids and polypeptides to bj3, as suggested by Andreas, and I definitively support the effort. We could smoothly deprecate the bj1 package when this is done. Let me know how I could help.

Thanks
George

From willishf at ufl.edu  Fri Feb 25 04:08:53 2011
From: willishf at ufl.edu (Scooter Willis)
Date: Thu, 24 Feb 2011 23:08:53 -0500
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To: <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com>
References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com>
Message-ID:

We put in some basics regarding modeling amino acid properties in the core module but really didn't have any pressing use cases to drive the api beyond calculating the mass of a peptide. We currently have getMolecularWeight() as a method in AbstractCompound but never added a getSequenceMolecularWeight() to AbstractSequence. It would be great to get the attributes/features of amino acids properly modeled in core and extend when reasonable useful summary methods at higher levels. You should be able to query mass of a peptide and have it valid for an amino acid with a PTM which means the amino acid needs to support the ability to be modified in a flexible manner. I spent the last year+ developing a software suite for peptide detection in MS data for deuterium exchange where automated PTM detection was important. Would be great to get some focused attention on the core to make sure we can model nucleotides and amino acids with a chemistry friendly API.
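
As a rough illustration of the peptide-mass idea discussed above, the following plain-Java sketch sums average residue masses, adds one water for the free termini, and lets individual positions carry an extra mass shift to stand in for a PTM. It does not use BioJava: the class and method names are hypothetical, the residue masses are the commonly tabulated average values rounded to four decimals, and the 79.9799 Da phosphorylation shift in main is only an example value. A real API would presumably attach modifications to compound objects rather than to positions in a string.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: names are hypothetical, not part of any BioJava release.
class PeptideMassSketch {
    // Average mass of water, added once for the free N- and C-termini.
    private static final double WATER = 18.01528;

    // Average residue masses (amino acid minus water), in daltons.
    private static final Map<Character, Double> RESIDUE_MASS = new HashMap<Character, Double>();
    static {
        String codes = "GASPVTCLINDQKEMHFRYW";
        double[] masses = {
            57.0519, 71.0788, 87.0782, 97.1167, 99.1326,
            101.1051, 103.1388, 113.1594, 113.1594, 114.1038,
            115.0886, 128.1307, 128.1741, 129.1155, 131.1926,
            137.1411, 147.1766, 156.1875, 163.1760, 186.2132
        };
        for (int i = 0; i < codes.length(); i++) {
            RESIDUE_MASS.put(codes.charAt(i), masses[i]);
        }
    }

    // massShifts maps a 0-based residue position to an extra mass (e.g. a PTM).
    static double averageMass(String peptide, Map<Integer, Double> massShifts) {
        if (peptide.isEmpty()) {
            throw new IllegalArgumentException("Empty peptide");
        }
        double mass = WATER;
        for (int i = 0; i < peptide.length(); i++) {
            Double residue = RESIDUE_MASS.get(peptide.charAt(i));
            if (residue == null) {
                throw new IllegalArgumentException("Unknown residue: " + peptide.charAt(i));
            }
            mass += residue;
            Double shift = massShifts.get(i);
            if (shift != null) {
                mass += shift;
            }
        }
        return mass;
    }

    public static void main(String[] args) {
        Map<Integer, Double> none = Collections.<Integer, Double>emptyMap();
        System.out.println("GS unmodified: " + averageMass("GS", none));

        // Example shift only: phosphorylation of the serine at position 1.
        Map<Integer, Double> phospho = new HashMap<Integer, Double>();
        phospho.put(1, 79.9799);
        System.out.println("GS phosphorylated: " + averageMass("GS", phospho));
    }
}
```

Keeping the modification as a per-position mass shift is the simplest way to make the queried mass stay valid for a modified residue; it says nothing about how the modification should be represented chemically.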
Thanks Scooter On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote: > Hello Peter & Andreas > > I effectively did some work on these methods, mostly fixing and adding the > ExPASy algorithm that was kindly provided to me. I think it makes a lot of > sense to port all physico-chemical property calculations related to amino > acids and polypeptides to bj3, as suggested by Andreas, and I definitively > support the effort. We could smoothly deprecate the bj1 package when this is > done. Let me know how I could help. > > Thanks > George > > Quoting Peter Troshin : > >> Hi Andreas, >> >> In fact I'd be happy to help with the development of the tools for simple >> physico-chemical properties calculation for peptides. We could port George?s >> code (assuming he is happy with this) from BioJava 1.8 but we can also >> provide a few other methods. A couple of projects in the lab where I work >> would have benefited from having these calculations readily available. >> >> I was thinking about participation in the Google Summer of Code (GoSC) >> this year as a mentor, and I think this would be an easy project for a >> student. What do you think about this? >> >> Thank you for your prompt reply. >> >> Regards, >> Peter >> >> >> >> On 24/02/2011 16:54, Andreas Prlic wrote: >>> >>> Hi Peter, >>> >>> if you get a copy of biojava 1.8, it is still there. However I would >>> like to port this to biojava 3 as well.. George do you want to help me >>> with that, since you are one of the authors of this package? The basic >>> support for chemistry in BioJava 3 is a bit better... (e.g. Element >>> class) >>> >>> Andreas >>> >>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin >>> ?wrote: >>>> >>>> Hi, >>>> >>>> I've noticed that BioJava up to about version 1.7 had an >>>> org.biojava.bio.proteomics package, which had methods for isoelectric >>>> point >>>> and molecular weight calculations for peptides. I could not find this >>>> package in the BioJava 3.0.1 API. 
I?d like to use these methods and >>>> wonder >>>> if there are any equivalent methods available in the latest version of >>>> BioJava? >>>> >>>> Thank you for your help, >>>> >>>> Kind regards, >>>> Peter >>>> >>>> Dr Peter Troshin >>>> Bioinformatics Software Developer >>>> Phone: +44 (0)1382 388589 >>>> Fax: +44 (0)1382 385764 >>>> The Barton Group >>>> College of Life Sciences >>>> Medical Sciences Institute >>>> University of Dundee >>>> Dundee >>>> DD1 5EH >>>> UK >>>> >>>> >>>> >>>> _______________________________________________ >>>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >> >> > > > > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas at sdsc.edu Fri Feb 25 05:12:17 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 24 Feb 2011 21:12:17 -0800 Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava In-Reply-To: References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com> Message-ID: Great, seems we have an agreement that we want to improve functionality for this. How complex is this going to be? From quickly checking the 1.8 source it looks like just a few classes that need to be converted and not too painful. What other functionality would you like to see that is currently not there? Andreas On Thu, Feb 24, 2011 at 8:08 PM, Scooter Willis wrote: > We put in some basics regarding modeling amino acid properties in the > core module but really didn't have any pressing use cases to drive the > api beyond calculating the mass of a peptide. We currently have > getMolecularWeight() as a method in AbstractCompound but never added a > getSequenceMolecularWeight() to AbstractSequence. 
It would be great to > get the attributes/features of amino acids properly modeled in core > and extend when reasonable useful summary methods at higher levels. > You should be able to query mass of a peptide and have it valid for an > amino acid with a PTM which means the amino acid needs to support the > ability to be modified in a flexible manner. I spent the last year+ > developing a software suite for peptide detection in MS data for > deuterium exchange where automated PTM detection was important. Would > be great to get some focused attention on the core to make sure we can > model nucleotides and amino acids with a chemistry friendly API. > > Thanks > > Scooter > > On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote: >> Hello Peter & Andreas >> >> I effectively did some work on these methods, mostly fixing and adding the >> ExPASy algorithm that was kindly provided to me. I think it makes a lot of >> sense to port all physico-chemical property calculations related to amino >> acids and polypeptides to bj3, as suggested by Andreas, and I definitively >> support the effort. We could smoothly deprecate the bj1 package when this is >> done. Let me know how I could help. >> >> Thanks >> George >> >> Quoting Peter Troshin : >> >>> Hi Andreas, >>> >>> In fact I'd be happy to help with the development of the tools for simple >>> physico-chemical properties calculation for peptides. We could port George?s >>> code (assuming he is happy with this) from BioJava 1.8 but we can also >>> provide a few other methods. A couple of projects in the lab where I work >>> would have benefited from having these calculations readily available. >>> >>> I was thinking about participation in the Google Summer of Code (GoSC) >>> this year as a mentor, and I think this would be an easy project for a >>> student. What do you think about this? >>> >>> Thank you for your prompt reply. 
>>>
>>> Regards,
>>> Peter
>>>
>>>
>>>
>>> On 24/02/2011 16:54, Andreas Prlic wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> if you get a copy of biojava 1.8, it is still there. However I would
>>>> like to port this to biojava 3 as well.. George do you want to help me
>>>> with that, since you are one of the authors of this package? The basic
>>>> support for chemistry in BioJava 3 is a bit better... (e.g. Element
>>>> class)
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've noticed that BioJava up to about version 1.7 had an
>>>>> org.biojava.bio.proteomics package, which had methods for isoelectric
>>>>> point
>>>>> and molecular weight calculations for peptides. I could not find this
>>>>> package in the BioJava 3.0.1 API. I'd like to use these methods and
>>>>> wonder
>>>>> if there are any equivalent methods available in the latest version of
>>>>> BioJava?
>>>>>
>>>>> Thank you for your help,
>>>>>
>>>>> Kind regards,
>>>>> Peter
>>>>>
>>>>> Dr Peter Troshin
>>>>> Bioinformatics Software Developer
>>>>> Phone: +44 (0)1382 388589
>>>>> Fax: +44 (0)1382 385764
>>>>> The Barton Group
>>>>> College of Life Sciences
>>>>> Medical Sciences Institute
>>>>> University of Dundee
>>>>> Dundee
>>>>> DD1 5EH
>>>>> UK
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>

--
-----------------------------------------------------------------------
Dr.
Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From flf.mib at gmail.com Fri Feb 25 22:21:19 2011
From: flf.mib at gmail.com (François Le Fevre)
Date: Fri, 25 Feb 2011 23:21:19 +0100
Subject: [Biojava-l] biojava3 and symbol
Message-ID: <4D682B5F.60807@gmail.com>

Hello
I am a newbie in biojava3.
I have installed the maven version.

I have a question : is there a way to go from DNASequence to SymbolList?
I would like to study codons and their frequency in several organisms.
It was easy with biojava with

"SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
and
"DistributionTrainerContext"

But now I am a little lost between maven biojava3 and biojava.
So if anyone could explain me how to get codons view from DNASequence, it
could be great.

Thanks a lot!

Francois

--
----------------------
Francois LE FEVRE

From andreas at sdsc.edu Mon Feb 28 06:04:46 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 27 Feb 2011 22:04:46 -0800
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <4D682B5F.60807@gmail.com>
References: <4D682B5F.60807@gmail.com>
Message-ID:

Hi Francois,

SymbolLists are part of BioJava 1.8 (the legacy version) and not 3.0.
You might want to get the previous version installed. It is available
from Maven as well...

Andreas

2011/2/25 François Le Fevre:
> Hello
> I am a newbie in biojava3.
> I have installed the maven version.
>
> I have a question : is there a way to go from DNASequence to SymbolList?
> I would like to study codons and their frequency in several organisms.
> It was easy with biojava with
>
> "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> and
> "DistributionTrainerContext"
>
> But now I am a little lost between maven biojava3 and biojava.
> So if anyone could explain me how to get codons view from DNASequence, it
> could be great.
>
> Thanks a lot!
>
> Francois
>
>
> --
> ----------------------
> Francois LE FEVRE
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From ayates at ebi.ac.uk Mon Feb 28 10:06:16 2011
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 28 Feb 2011 10:06:16 +0000
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <4D682B5F.60807@gmail.com>
References: <4D682B5F.60807@gmail.com>
Message-ID: <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>

Hi Francois,

To get a windowed view over any sequence you should use the following:

DNASequence seq = new DNASequence("ATGCTG");
Iterable<SequenceView<NucleotideCompound>> w =
    new WindowedSequence<NucleotideCompound>(seq, 3);
for (SequenceView<NucleotideCompound> triplet : w) {
    System.out.println(triplet);
}

HTH

Andy

On 25 Feb 2011, at 22:21, François Le Fevre wrote:

> Hello
> I am a newbie in biojava3.
> I have installed the maven version.
>
> I have a question : is there a way to go from DNASequence to SymbolList?
> I would like to study codons and their frequency in several organisms.
> It was easy with biojava with
>
> "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> and
> "DistributionTrainerContext"
>
> But now I am a little lost between maven biojava3 and biojava.
> So if anyone could explain me how to get codons view from DNASequence, it could be great.
>
> Thanks a lot!
>
> Francois
>
>
> --
> ----------------------
> Francois LE FEVRE
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/

From flf.mib at gmail.com Mon Feb 28 11:12:51 2011
From: flf.mib at gmail.com (Francois Le Fevre)
Date: Mon, 28 Feb 2011 12:12:51 +0100
Subject: [Biojava-l] biojava3 and symbol
In-Reply-To: <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>
References: <4D682B5F.60807@gmail.com> <245D3308-6EB9-44C0-A954-DEA0D9D28F94@ebi.ac.uk>
Message-ID:

Perfect, thanks a lot.
Have a good day
Francois

2011/2/28 Andy Yates

> Hi Francois,
>
> To get a windowed view over any sequence you should use the following:
>
> DNASequence seq = new DNASequence("ATGCTG");
> Iterable<SequenceView<NucleotideCompound>> w =
>     new WindowedSequence<NucleotideCompound>(seq, 3);
> for (SequenceView<NucleotideCompound> triplet : w) {
>     System.out.println(triplet);
> }
>
> HTH
>
> Andy
>
> On 25 Feb 2011, at 22:21, François Le Fevre wrote:
>
> > Hello
> > I am a newbie in biojava3.
> > I have installed the maven version.
> >
> > I have a question : is there a way to go from DNASequence to SymbolList?
> > I would like to study codons and their frequency in several organisms.
> > It was easy with biojava with
> >
> > "SymbolList codons = SymbolListViews.windowedSymbolList(seq, 3);"
> > and
> > "DistributionTrainerContext"
> >
> > But now I am a little lost between maven biojava3 and biojava.
> > So if anyone could explain me how to get codons view from DNASequence, it
> could be great.
> >
> > Thanks a lot!
> >
> > Francois
> >
> >
> > --
> > ----------------------
> > Francois LE FEVRE
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> --
> Andrew Yates                   Ensembl Genomes Engineer
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
>
>
>
>

--
Francois Le Fevre
Management Informatique Innovation Biotechnologies
Paris, France
- Before printing, think of the environment

From p.v.troshin at dundee.ac.uk Mon Feb 28 15:18:15 2011
From: p.v.troshin at dundee.ac.uk (Peter Troshin)
Date: Mon, 28 Feb 2011 15:18:15 +0000
Subject: [Biojava-l] Isoelectric point and molecular weight calculations with BioJava
In-Reply-To:
References: <4D667A55.5040404@dundee.ac.uk> <4D6698E7.3080202@dundee.ac.uk> <20110224131506.17104xy7rpe7n30g@gator1273.hostgator.com>
Message-ID: <4D6BBCB7.3010203@dundee.ac.uk>

>>> What other functionality would you
>>> like to see that is currently not there?

I think that the methods below would be a good starting point; the
Google Summer of Code student can then propose something else that
he/she would fancy implementing.

Molecular weight
Extinction coefficient
Instability index
Aliphatic index
Grand Average of Hydropathy
Isoelectric point
Number of amino acids in the protein (His, Met, Cys)

I know BioJava projects were managed under the Open Bioinformatics
Foundation (OBF) during last year's GSoC. Is there a page for this
year's GSoC ideas somewhere?

Regards,
Peter

On 25/02/2011 05:12, Andreas Prlic wrote:
> Great, seems we have an agreement that we want to improve
> functionality for this. How complex is this going to be? From quickly
> checking the 1.8 source it looks like just a few classes that need to
> be converted and not too painful. What other functionality would you
> like to see that is currently not there?
>
> Andreas
>
>
> On Thu, Feb 24, 2011 at 8:08 PM, Scooter Willis wrote:
>> We put in some basics regarding modeling amino acid properties in the
>> core module but really didn't have any pressing use cases to drive the
>> api beyond calculating the mass of a peptide. We currently have
>> getMolecularWeight() as a method in AbstractCompound but never added a
>> getSequenceMolecularWeight() to AbstractSequence. It would be great to
>> get the attributes/features of amino acids properly modeled in core
>> and extend when reasonable useful summary methods at higher levels.
>> You should be able to query mass of a peptide and have it valid for an
>> amino acid with a PTM which means the amino acid needs to support the
>> ability to be modified in a flexible manner. I spent the last year+
>> developing a software suite for peptide detection in MS data for
>> deuterium exchange where automated PTM detection was important. Would
>> be great to get some focused attention on the core to make sure we can
>> model nucleotides and amino acids with a chemistry friendly API.
>>
>> Thanks
>>
>> Scooter
>>
>> On Thu, Feb 24, 2011 at 2:15 PM, George Waldon wrote:
>>> Hello Peter & Andreas
>>>
>>> I effectively did some work on these methods, mostly fixing and adding the
>>> ExPASy algorithm that was kindly provided to me. I think it makes a lot of
>>> sense to port all physico-chemical property calculations related to amino
>>> acids and polypeptides to bj3, as suggested by Andreas, and I definitively
>>> support the effort. We could smoothly deprecate the bj1 package when this is
>>> done. Let me know how I could help.
>>>
>>> Thanks
>>> George
>>>
>>> Quoting Peter Troshin:
>>>
>>>> Hi Andreas,
>>>>
>>>> In fact I'd be happy to help with the development of the tools for simple
>>>> physico-chemical properties calculation for peptides. We could port George's
>>>> code (assuming he is happy with this) from BioJava 1.8 but we can also
>>>> provide a few other methods.
>>>> A couple of projects in the lab where I work
>>>> would have benefited from having these calculations readily available.
>>>>
>>>> I was thinking about participation in the Google Summer of Code (GSoC)
>>>> this year as a mentor, and I think this would be an easy project for a
>>>> student. What do you think about this?
>>>>
>>>> Thank you for your prompt reply.
>>>>
>>>> Regards,
>>>> Peter
>>>>
>>>>
>>>>
>>>> On 24/02/2011 16:54, Andreas Prlic wrote:
>>>>> Hi Peter,
>>>>>
>>>>> if you get a copy of biojava 1.8, it is still there. However I would
>>>>> like to port this to biojava 3 as well.. George do you want to help me
>>>>> with that, since you are one of the authors of this package? The basic
>>>>> support for chemistry in BioJava 3 is a bit better... (e.g. Element
>>>>> class)
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Thu, Feb 24, 2011 at 7:33 AM, Peter Troshin
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've noticed that BioJava up to about version 1.7 had an
>>>>>> org.biojava.bio.proteomics package, which had methods for isoelectric
>>>>>> point
>>>>>> and molecular weight calculations for peptides. I could not find this
>>>>>> package in the BioJava 3.0.1 API. I'd like to use these methods and
>>>>>> wonder
>>>>>> if there are any equivalent methods available in the latest version of
>>>>>> BioJava?
>>>>>>
>>>>>> Thank you for your help,
>>>>>>
>>>>>> Kind regards,
>>>>>> Peter
>>>>>>
>>>>>> Dr Peter Troshin
>>>>>> Bioinformatics Software Developer
>>>>>> Phone: +44 (0)1382 388589
>>>>>> Fax: +44 (0)1382 385764
>>>>>> The Barton Group
>>>>>> College of Life Sciences
>>>>>> Medical Sciences Institute
>>>>>> University of Dundee
>>>>>> Dundee
>>>>>> DD1 5EH
>>>>>> UK
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>
>

From uchathuranga at gmail.com Thu Feb 10 17:01:57 2011
From: uchathuranga at gmail.com (udana chathuranga)
Date: Thu, 10 Feb 2011 17:01:57 -0000
Subject: [Biojava-l] Problem with Multiple Sequence Alignment in BioJava
Message-ID:

Hi all,

I was going through the BioJava cookbook, as I am interested in this
project. I tried the example on the page
http://biojava.org/wiki/BioJava:CookBook3:MSA and got a class-not-found
exception at the line
"Profile profile = Alignments.getMultipleSequenceAlignment(lst);".
Error Message:

Exception in thread "main" java.lang.NoClassDefFoundError: org/forester/phylogenyinference/DistanceMatrix
    at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:176)
    at CookbookMSA.multipleSequenceAlignment(CookbookMSA.java:29)
    at CookbookMSA.main(CookbookMSA.java:18)
Caused by: java.lang.ClassNotFoundException: org.forester.phylogenyinference.DistanceMatrix
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClassInternal(Unknown Source)

Is this a known issue, or am I doing something wrong with the code?
Please help me with this. I have attached the Java source file that I tried.

Thanks

Regards
udana.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CookbookMSA.java
Type: application/octet-stream
Size: 1579 bytes
Desc: not available
URL:
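
The trace above names a class from the forester library (org.forester.phylogenyinference.DistanceMatrix) that could not be loaded at run time, which usually means the jar providing it is missing from the runtime classpath. As a minimal sketch for confirming that, the helper below uses only the JDK to test whether a named class is visible. ClasspathCheck and isOnClasspath are hypothetical names chosen for illustration; they are not part of BioJava or forester.

```java
// Hypothetical diagnostic helper (not part of BioJava or forester):
// reports whether a class can be located on the current classpath.
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but not loadable (e.g. one of its own dependencies is missing)
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace; another name may be passed as an argument
        String name = args.length > 0
                ? args[0]
                : "org.forester.phylogenyinference.DistanceMatrix";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found on classpath"));
    }
}
```

Run with the same classpath as CookbookMSA. If it reports "NOT found", adding the jar that contains the class to the classpath would be the first thing to try; which artifact provides it depends on the BioJava build in use.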