From andreas at sdsc.edu Wed Mar 7 14:39:38 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 7 Mar 2012 11:39:38 -0800
Subject: [Biojava-l] Google Summer of code preparations
Message-ID:

Hi,

If you want to add any project ideas for BioJava's Google summer of
code application for 2012, please add them to

http://biojava.org/wiki/Google_Summer_of_Code_2012

ideally by tomorrow. Adding more details to the already existing
projects is also good. Friday is the application deadline for
organisations.

Andreas

From andreas.prlic at gmail.com Wed Mar 7 21:46:56 2012
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Wed, 7 Mar 2012 18:46:56 -0800
Subject: [Biojava-l] Biojava3 pairwise aligner result is different emboss' needle
In-Reply-To:
References:
Message-ID:

Hi,

looks like the emboss' alignment is not penalising end gaps. You could
try to use the smith waterman algorithm instead...

Andreas

On Wed, Mar 7, 2012 at 6:33 PM, ??? wrote:
> Hello
> My name is Oh jeongsu, I am a student at Chungbuk National University in
> korea.
>
> I've been run global alignments with biojava and needle. but biojava3
> pairwise aligner result is different emboss' needle.
> > here my option and code >>query > AAAAAGAATAACAATTGGAAACGATTGCTAATACTTTATATGCTGAGAAGTTAAACGGATTACCGCCTAAAGAATGAGCTTGCGTCTGATTAGCTAGTTGGTAAGGTAAAAGCTTACCAAGGCAATTGTCAGTAGTTGGTCTGAGAGGATGATCAACCACACTGGGACTGAGACACGGCCCAG >>target > AACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGCATCCTTCGGGGTGAGCGGCGGACGGGTTAGTAACGCGTGGGAACGTACCCTTTCTAAGGAATAATCATTGGAAATGATGACTAATACCTTATACGCCCTTTGGGGGAAAGATTTATCGGAGAAGGATCGGCCCGCGTTAGATTAGATAGTTGGTGGGGTAATGGCCTACCAAGTCTACGATCTATAGCTGGTTTTAGAGGATGATCAGCAACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATCTTAGACAATGGGCGCAAGCCTGATCTAGCCATGCCGCGTGAGTGATGAAGGTCTTAGGATCGTAAAGCTCTTTCGCTGGGGAAGATAATGACTGTACCCAGTAAAGAAGTCCCGGCTAACTCCGTGC > > Gap opening penalty : -10 > Gap extension penalty : -0.5 > > SubstitutionMatrix : biojava - nuc4_4 , needle - ednafull > > biojava result > query > AA-------------------------------------------------------------------------------------------AAAGAATAACAATTGGAAACGATTGCTAATACTTTATATGC----TGAG---AAG-TTAAACGGATTACCGCCTAAAGAATGA---GCTTGCGTCTGATTAGCTAGTTGGTAAGGTAAAAGCTTACCAAGGCAATTGTCAGTAGTTGGTCTGAGAGGATGATCAACCACACTGGGACTGAGACACGGCCCAG------------------------------------------------------------------------------------------------------------------------------------------------------------ > target > AACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGCATCCTTCGGGGTGAGCGGCGGACGGGTTAGTAACGCGTGGGAACGTACCCTTTCTAAGGAATAATCATTGGAAATGATGACTAATACCTTATACGCCCTTTGGGGGAAAGATTTATCGGA------------GAAGGATCGGCCCGCGTTAGATTAGATAGTTGGTGGGGTAATGGCCTACCAAGTCTACGATCTATAGCTGGTTTTAGAGGATGATCAGCAACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATCTTAGACAATGGGCGCAAGCCTGATCTAGCCATGCCGCGTGAGTGATGAAGGTCTTAGGATCGTAAAGCTCTTTCGCTGGGGAAGATAATGACTGTACCCAGTAAAGAAGTCCCGGCTAACTCCGTGC > > needle result > query > 
-------------------------------------------------------------------------------AA------------AAAGAATAACAATTGGAAACGATTGCTAATACTTTATATGC----TGAG---AAG-TTAAACGGATTACCGCCTAAAGAATGAGCTTGCGTCTGATTAGCTAGTTGGTAAGGTAAAAGCTTACCAAGGCAATTGTCAGTAGTTGGTCTGAGAGGATGATCAACCACACTGGGACTGAGACACGGCCCAG------------------------------------------------------------------------------------------------------------------------------------------------------------ > target > AACGCTGGCGGCAGGCTTAACACATGCAAGTCGAGCGCATCCTTCGGGGTGAGCGGCGGACGGGTTAGTAACGCGTGGGAACGTACCCTTTCTAAGGAATAATCATTGGAAATGATGACTAATACCTTATACGCCCTTTGGGGGAAAGATTTATCGGA---------GAAGGATCGGCCCGCGTTAGATTAGATAGTTGGTGGGGTAATGGCCTACCAAGTCTACGATCTATAGCTGGTTTTAGAGGATGATCAGCAACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATCTTAGACAATGGGCGCAAGCCTGATCTAGCCATGCCGCGTGAGTGATGAAGGTCTTAGGATCGTAAAGCTCTTTCGCTGGGGAAGATAATGACTGTACCCAGTAAAGAAGTCCCGGCTAACTCCGTGC > > > i want to same result , could you tell me what is wrong? > > thansks > > Oh jeongsu From andreas at sdsc.edu Thu Mar 8 14:43:50 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 8 Mar 2012 11:43:50 -0800 Subject: [Biojava-l] BioJava 3.0.3. release plan Message-ID: Hi, We are planning to release BioJava 3.0.3 ( and the legacy BioJava 1.8.2.) next Friday, March 16th. The code freeze will happen on Thursday March 15th, please commit any code before that. Also please make sure all documentation is up-to-date, both in javadoc and on the wiki.. Andreas From tariq_cp at hotmail.com Mon Mar 12 02:19:52 2012 From: tariq_cp at hotmail.com (Muhammad Tariq Pervez) Date: Mon, 12 Mar 2012 06:19:52 +0000 Subject: [Biojava-l] Biojava-l Digest, Vol 110, Issue 2 In-Reply-To: References: Message-ID: A great news. 
Muhammad Tariq Pervez
Ph.D Bioinformatics Scholar
Department of Computer Science
Virtual University of Pakistan, Lahore
Tel: (042) 9203114-7
URL: www.vu.edu.pk
Mobile: +923364120541, +923214602694

From nickengland at gmail.com Mon Mar 12 11:04:05 2012
From: nickengland at gmail.com (Nick England)
Date: Mon, 12 Mar 2012 15:04:05 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
Message-ID:

Hello all,

I am trying to read some .scf files into BioJava3. I have found the
example code for 1.3
(http://biojava.org/wiki/BioJava:Cookbook:SeqIO:ABItoSequence) but I
can't find any classes in the 3.0 API which look at all related to
those ones. Is it possible, or should I downgrade to 1.3 if I want to
be able to read .scf files?

Thanks,

Nick

From andreas at sdsc.edu Mon Mar 12 12:43:43 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 09:43:43 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

Hi Nick,

this feature has not been ported to biojava3 so far and you can get it
via biojava 1.8.1

Andreas

From nickengland at gmail.com Mon Mar 12 13:39:55 2012
From: nickengland at gmail.com (Nick England)
Date: Mon, 12 Mar 2012 17:39:55 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

Thanks for the help, 1.8.1 seems to have the classes I need. But I
haven't been able to get maven to work with biojava 1.8. It works fine
with 3.X, but when I try to use the dependency:

    <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava-legacy</artifactId>
        <version>1.8.1</version>
    </dependency>

it fails, even though there appears to be the corresponding pom on the
repository at http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.1/

I have the repo at :

    <repository>
        <id>biojava-maven-repo</id>
        <name>BioJava repository</name>
        <url>http://www.biojava.org/download/maven/</url>
    </repository>

Am I doing something wrong, or has the release version 1.8.1 somehow
broken on the repository?

Thanks,

Nick

From andreas at sdsc.edu Mon Mar 12 13:54:29 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 10:54:29 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

I am not aware of any issues with that. Does either 1.8 or
1.8.2-SNAPSHOT work? Depending on your IDE, you just might have to
refresh your Maven dependencies?

Andreas

From heuermh at gmail.com Mon Mar 12 16:03:35 2012
From: heuermh at gmail.com (Michael Heuer)
Date: Mon, 12 Mar 2012 15:03:35 -0500
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

Hello Andreas,

I noticed this in the pom for das 1.8.2-SNAPSHOT

    <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava3-structure</artifactId>
        <version>3.0-alpha1</version>
        <scope>compile</scope>
    </dependency>

I don't think that would cause trouble as reported below, but it is
odd in any case. 1.8.x shouldn't have a dependency on 3.x.
Hello Nick,

What kind of maven failure are you seeing?

   michael

From andreas at sdsc.edu Mon Mar 12 16:38:25 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 13:38:25 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

> I don't think that would cause trouble as reported below, but it is
> odd in any case. 1.8.x shouldn't have a dependency on 3.x.

In principle I agree, however there are historic reasons for this. If
this config causes a problem then I would vote for removing it. The
biojava-legacy project contains no structure sources any more since it
was completely upgraded into the 3.X series. The API is compatible to
a large extent. There is no reason for anybody to still be using a
1.X version of structure.

Andreas

From biojava at hannes.oib.com Tue Mar 13 01:56:29 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Tue, 13 Mar 2012 06:56:29 +0100
Subject: [Biojava-l] From : Chromatagraph reading in BioJava3?
Message-ID:

On Mon, Mar 12, 2012 at 16:04, Nick England wrote:
> Hello all,
>
> I am trying to read some .scf files into BioJava3. I have found the
> example code for 1.3
> (http://biojava.org/wiki/BioJava:Cookbook:SeqIO:ABItoSequence) but I
> can't find any classes in the 3.0 API which look at all related to
> those ones. Is it possible, or should I downgrade to 1.3 if I want to
> be able to read .scf files?
>
> Thanks,
>
> Nick

Should we port that to 3.0.x after the next release? Or for 3.0.3?

Sounds like another nice addition to org.biojava3.genome.parsers -
I'll take a look at it to see what info is stored in these scf files.

Hannes

From nickengland at gmail.com Tue Mar 13 06:25:26 2012
From: nickengland at gmail.com (Nick England)
Date: Tue, 13 Mar 2012 10:25:26 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

If I try to run maven I got the following:

[INFO] Building sequence analysis 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
[WARNING] The POM for org.biojava:biojava-legacy:jar:1.8.2 is missing,
no dependency information available
Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE

I get this for 1.8 and 1.8.1 as well:

Could not find artifact org.biojava:biojava-legacy:jar:1.8 in
biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]
Could not find artifact org.biojava:biojava-legacy:jar:1.8.1 in
biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]

If I try to install it from source, I'm getting some class cast
exceptions during the unit tests for some tests in core:

Failed tests:
  testGetNoteSet(org.biojavax.bio.SimpleBioEntryTest)
  testGetRankedCrossRefs(org.biojavax.bio.SimpleBioEntryTest)

3.0 works fine, but lacks the class I want!
Presumably BioJava 1.8 hasn't always been called biojava-legacy? Since
maven releases should be immutable, it should still be available from
the original release maven repo?

Thanks for the help,

Nick

From nickengland at gmail.com Tue Mar 13 06:48:11 2012
From: nickengland at gmail.com (Nick England)
Date: Tue, 13 Mar 2012 10:48:11 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

Aha, solved the test problem, seems the unit tests fail under JDK 7 as
a TreeSet now fails if you add a single non-comparable object, while
before it failed only when you added a second one and a comparison was
actually needed. Since new Object() doesn't implement Comparable, the
tests were failing under maven for me.

I can at least locally install 1.8 now, but would be nice to have a
maven repo dependency on 1.8 work in the future!
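
The JDK 7 behaviour described above can be reproduced outside the BioJava test suite. What follows is a minimal sketch, not code taken from BioJava: the class name TreeSetFirstAddDemo and the method firstAddThrows are made up for illustration. It adds a single plain Object, which does not implement Comparable, to an empty TreeSet created without a Comparator, and reports whether that first add is rejected.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are illustrative and not part of BioJava.
public class TreeSetFirstAddDemo {

    /**
     * Returns true if adding one non-Comparable element to an empty
     * TreeSet (natural ordering, no Comparator) throws ClassCastException.
     */
    public static boolean firstAddThrows() {
        Set<Object> set = new TreeSet<Object>();
        try {
            // java.lang.Object does not implement Comparable.
            set.add(new Object());
            // Reaching this line means the first add was accepted
            // without any comparison being attempted.
            return false;
        } catch (ClassCastException e) {
            // The element was checked against itself and rejected.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("first add throws ClassCastException: " + firstAddThrows());
    }
}
```

On JDK 7 or later the method should return true, because the first add already fails with a ClassCastException. Going by the description above, older JDKs accepted the first element and only failed once a second one forced a comparison. A test that needs a TreeSet of arbitrary objects can avoid this by supplying an explicit Comparator or by using elements that implement Comparable.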
Cheers,

Nick

From biojava at hannes.oib.com Tue Mar 13 08:23:37 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Tue, 13 Mar 2012 13:23:37 +0100
Subject: [Biojava-l] Chromatagraph reading in BioJava3?
In-Reply-To:
References:
Message-ID:

On Tue, Mar 13, 2012 at 11:25, Nick England wrote:
> If I try to run maven I got the following:
>
> [INFO] Building sequence analysis 0.0.1-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
> Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
> [WARNING] The POM for org.biojava:biojava-legacy:jar:1.8.2 is missing,
> no dependency information available
> Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
> Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
>
> I get this for 1.8 and 1.8.1 as well:
>
> Could not find artifact org.biojava:biojava-legacy:jar:1.8 in
> biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]
> Could not find artifact org.biojava:biojava-legacy:jar:1.8.1 in
> biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]

I just tried it myself, and I see the same results. My maven tool usually
helps me fill out the pom.xml file and suggests valid entries. In
the artifactId field, I only get:

biojava
biojava3-alignment
biojava3-core
biojava3-phylo

and when selecting biojava, I only get 3.0 and 3.0.2 in the version
field -> looks like something is seriously wrong with the maven repo.

Note: when I enter it manually, the 1.8.1 pom is downloaded, but the
jar fails.
Hannes

From komalsnehal1991 at gmail.com Wed Mar 14 15:27:50 2012
From: komalsnehal1991 at gmail.com (Komal Sanjeev)
Date: Thu, 15 Mar 2012 00:57:50 +0530
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012
Message-ID:

Hi Everyone,

I am Komal, an undergraduate student from IT-BHU, India. I'm interested in
working with BioJava for GSoC 2012. I am particularly interested in working
on the 'Porting an Algorithm to Java' project.
Kindly help me about how I should proceed.

Thanks,
Komal

From andreas at sdsc.edu Wed Mar 14 15:47:32 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 14 Mar 2012 12:47:32 -0700
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012
In-Reply-To:
References:
Message-ID:

Hi Komal,

stay tuned to this list, we still don't know if we will get funded
from Google this year.

Andreas

From superrubiroyd at gmail.com Fri Mar 16 19:57:33 2012
From: superrubiroyd at gmail.com (superrubiroyd)
Date: Sat, 17 Mar 2012 01:57:33 +0200
Subject: [Biojava-l] GSOC 2012
Message-ID: <4F63D36D.1010401@gmail.com>

Hi,

I am the final year student in Ukraine. I have 2.5 year of Java experience.
I would like to work with project 'New File Parsers for BioJava'
during GSOC 2012.
Can you explain a little more about what should be done in this project, and can you give some advice on how to make a correct application? Thanks in advance. With best regards, Evgeniy Berlog From andreas at sdsc.edu Fri Mar 16 20:41:59 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 16 Mar 2012 17:41:59 -0700 Subject: [Biojava-l] BioJava at at Google Summer of Code 2012 Message-ID: Hi All, The Open Bioinformatics Foundation, as an umbrella organisation for BioJava, has been accepted to participate in this year's Google Summer of Code. See the announcement message below. This means we will again be able to offer mentoring through BioJava this year. Accepted students will get a stipend of $5,000 from Google. Participation is possible from most countries in the world, as long as you are eligible to work in the country in which you'll reside throughout the duration of the program. If you are interested in working on a BioJava related project, now is the time to start preparing and discussing your proposals. For the last two years we had many applications for the projects proposed by mentors. If you want to distinguish your application, I recommend proposing your own project. Don't forget to discuss any proposal with us before you submit it. We will try to provide feedback and match you with a suitable mentor. Also see http://biojava.org/wiki/Google_Summer_of_Code and Google's FAQs: http://www.google-melange.com/document/show/gsoc_program/google/gsoc2012/faqs The student application deadline is April 6th. Google will announce which proposals got accepted on April 23rd. Andreas ---------- Forwarded message ---------- From: Robert Buels Date: Fri, Mar 16, 2012 at 12:47 PM Subject: [Open-bio-l] Google Summer of Code is *ON* for OBF projects! To: Open-Bio List Hi all, Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!
GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents). Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2012 FAQ at http://goo.gl/kNv48 Student applications are due April 6, 2012 at 19:00 UTC. Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and whom to contact about applying. For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas. Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page. Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code! Rob Buels OBF GSoC 2012 Administrator _______________________________________________ Open-Bio-l mailing list Open-Bio-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/open-bio-l -- ----------------------------------------------------------------------- Dr. Andreas Prlic Senior Scientist, RCSB PDB Protein Data Bank University of California, San Diego (+1) 858.246.0526 ----------------------------------------------------------------------- From andreas at sdsc.edu Fri Mar 16 21:30:11 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 16 Mar 2012 18:30:11 -0700 Subject: [Biojava-l] BioJava 3.0.3 released Message-ID: BioJava 3.0.3 has been released and is available from http://www.biojava.org/wiki/BioJava:Download as well as from the BioJava maven repository at http://www.biojava.org/download/maven/ .
BioJava 3.0.3 adds several new features: - Significant improvements for the web service module (ncbi blast and hmmer web services) - Fastq parser (ported from the biojava 1 series to version 3) - Support for SIFTS-PDB to UniProt mapping - Improved support for working with external protein domain definitions - Protmod module renamed to modfinder - Numerous improvements all over the place (several hundred commits since last release) - We are also working on an update for the legacy biojava 1.8 series. This release would not have been possible without contributions from numerous people, thanks to all for their support! About BioJava: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats, and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Happy BioJava-ing, Andreas From anupam.aries19 at gmail.com Sat Mar 17 04:04:53 2012 From: anupam.aries19 at gmail.com (Anupam Singh) Date: Sat, 17 Mar 2012 13:34:53 +0530 Subject: [Biojava-l] gsoc Message-ID: Sir, I am Anupam Singh, a second year student pursuing a M.sc Tech degree in Information Systems from Bits Pilani in India. I went through the Biojava project details. I found the project very interesting and would genuinely like to work on those features of Biojava3, like the Cath parser and Genbank parser, to make it more user friendly and versatile. I have been coding in Java/C/C++ for the past 4 years and have a good command over the language to make this project possible. Regards, Anupam Singh From ritishalaungani at gmail.com Sat Mar 17 05:50:13 2012 From: ritishalaungani at gmail.com (Ritisha Laungani) Date: Sat, 17 Mar 2012 15:20:13 +0530 Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java Message-ID: Hello, I am Ritisha Laungani, a pre-final year student currently pursuing *MSc Tech.
Information Systems* at Birla Institute of Technology and Science, Goa, India. I would like to apply for the BioJava project as I have worked in all 3 fields this project requires: C, Java and Bio! As far as I understand, in simple terms, the project's goal is to convert the existing HMMER source code, which is written in C, to Java code using language processing tools. Do correct me if I am wrong here! I must admit here that I am new to open source software development and also unfamiliar with HMMER. But I would love to learn if given a chance and the correct resources! :) Eagerly awaiting a reply, which could guide me to the next step. Regards, Ritisha Laungani From komalsnehal1991 at gmail.com Sat Mar 17 18:59:06 2012 From: komalsnehal1991 at gmail.com (Komal Sanjeev) Date: Sun, 18 Mar 2012 04:29:06 +0530 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: Hi all, Introducing myself a bit more, I also work as a remote intern for DARNED. DARNED is a database of RNA Editing, and currently new features are being added into the project, one of which is incorporating the BLAST feature for sequence based search. Having worked on a similar project recently, I think I will be comfortable working with the 'Porting an Algorithm to Java' project. The following is what I understood about the project. Please correct me if I am wrong. This (link) is the current method used for BLAST, which accesses the NCBI website each time. The NCBIQBlastService class is currently used. The project aims at replacing this with code which will perform the search within Biojava. I downloaded the source code of BLAST and HMMER. My job will be to convert these to Java. Regarding the C/C++ to Java converter, I found this on the internet: http://tangiblesoftwaresolutions.com/Product_Details/CPlusPlus_to_Java_Converter_Details.html but it is not free of cost.
Apart from this, I saw that many people discourage the use of C/C++ to Java tools saying that they are not efficient. Does anyone know of any better tool which can do this? Regarding the JNI, would it not be better if the whole code was written in Java, rather than a part of it being in C/C++? I haven't used it before, but if it is better than converting the code, I don't have a problem working with it. Kindly clear my doubts. Thanks in advance, Komal From amr_alhossary at hotmail.com Sun Mar 18 00:05:41 2012 From: amr_alhossary at hotmail.com (Amr AL-Hossary) Date: Sun, 18 Mar 2012 06:05:41 +0200 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: Dear Komal, As far as I know, The project is about porting the code to Java, not using existing C code within JNI. That means you should be able to digest the algorithm first, Build it using Java from scratch, depending on C code as a reference Implementation. Regards Amr From komalsnehal1991 at gmail.com Sun Mar 18 03:42:29 2012 From: komalsnehal1991 at gmail.com (Komal Sanjeev) Date: Sun, 18 Mar 2012 13:12:29 +0530 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: Hi Amr, The following has been mentioned in the project description: "Converting C or C++ source code by hand is not a trivial undertaking and it is recommended that a C/C++ to Java conversion tool be used to do as much of the work as possible. It is also an option to consider a JNI approach for integrating these applications into Java." I am a bit confused. Kindly tell me what exactly has to be done in the project. Thanks, Komal From biojava at hannes.oib.com Sun Mar 18 07:10:15 2012 From: biojava at hannes.oib.com (Hannes Brandstätter-Müller) Date: Sun, 18 Mar 2012 12:10:15 +0100 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: While the ideal solution would be a clean port of the BLAST algorithm to Java, the complexity of such a mature and performance-optimized code might be a bit too much for a GSOC project. The JNI solution, while not optimal, would be an acceptable fallback solution. If implemented properly, a base for using existing programs like BLAST, HMMER, ... via JNI could be created. The project goal is: "Make BLAST work locally, without internet connection in addition to the existing NCBIQBlastService method" Hannes From amr_alhossary at hotmail.com Sun Mar 18 07:06:01 2012 From: amr_alhossary at hotmail.com (Amr AL-Hossary) Date: Sun, 18 Mar 2012 13:06:01 +0200 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: Let me refer to Dr. Andreas & be back to you. P.S. please stop sending to both lists.
Amr From andreas at sdsc.edu Sun Mar 18 12:37:06 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 18 Mar 2012 09:37:06 -0700 Subject: [Biojava-l] GSoC 2012 - how to get started with your proposals Message-ID: Hi, It is great to see so much interest for GSoC again this year. To get started with a proposal I would recommend to look at the BioJava project proposals from the last two years (they are on the wiki) and see what kind of projects got funded and how those proposals were written. Think about what you would like to work on. Get a copy of BioJava and see how related features are working. Come up with a plan on how to extend this. We are fairly flexible regarding what kind of projects we will run this summer and this really depends on the submitted project proposals. All proposals will be compared and ranked together with other projects from the Bio* projects. As such a good proposal is key to get funded. A good proposal shows - the motivation of the student - that the candidate is qualified to do what he is proposing - adds useful new functionality to BioJava - discusses possible risks and what to do about them It is difficult to answer questions like "how should I perform this or that project?" - There is more than one possible path and it depends on your skills and interest what will be the best answer for this. Overall I recommend to pick a project on a topic that is close to your (future?)
thesis, or is of particular interest to you. Here are a couple more thoughts which are project specific: - The best projects are those that you come up with yourself. If you want to distinguish yours from every other proposal, suggest something we have not been thinking of as of yet. - File parsers: if you want to work on file parsers take a look at existing ones. What features do they provide? How can they be extended? For example if you want to work on the CATH parser, take a look at how the SCOP parser works. What features are available around this (access to domains) and how can something like this be set up for CATH. Look at how the CATH website provides files. - Porting of algorithms: There are several approaches possible for doing this. I recommend that you have some background both in C and in Java for this. Get a copy of the algorithm you want to port, compile it, and take a look at the source. There are several ways to proceed with the actual port, and having a good strategy for this is key for this proposal. Perhaps try to use your strategy on some simple test case to see how this might work. - BioJava in the cloud: The goal here is parallelization of existing code. What parts of biojava are suitable for this? How can they be parallelized and moved to current cloud infrastructure? There is a lot of online material available for this which will be helpful here. Andreas From sbliven at ucsd.edu Sun Mar 18 16:47:10 2012 From: sbliven at ucsd.edu (Spencer Bliven) Date: Sun, 18 Mar 2012 13:47:10 -0700 Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java In-Reply-To: References: Message-ID: Unfortunately, HMMER is licensed as GPL. As such, we can't port it to BioJava or even link against it with JNI. A 2009 post indicates that they are not interested in re-licensing HMMER under a less restrictive license. I think we should move away from any HMMER-port project, and focus on porting other important algorithms such as BLAST (public domain).
I went ahead and removed HMMER from the GSoC wiki page. I was trying to think of other LGPL-compatable bioinformatics projectswhich would be nice to port to biojava. Maybe a sequence browser, such as incorporating/linking the Integrated Genome Browser? Anyone have ideas other than BLAST? -Spencer On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani wrote: > Hello, > > I am Ritisha Laungani, a pre-final year student currently persuing *MSc > Tech. Information Systems* at Birla Institute of Technology and Science, > Goa, India. > > I would like to apply for the BioJava project as i have worked into all the > 3 fields this projects requires- C, Java and Bio! > > As far as i understand, in simple terms, the project's goal is to convert > an existing HMMER source code, which is written in C, to a java code using > language processing tools. Do correct me if i am wrong here! > > I must admit here that i am new to open source software development and > also unaware of HMMER. But i would love to learn if given a chance and the > correct resources! :) > > Eagerly awaiting a reply, which could guide me to the next step. > > Regards, > > Ritisha Laungani > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From biojava at hannes.oib.com Sun Mar 18 17:07:45 2012 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Sun, 18 Mar 2012 22:07:45 +0100 Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java In-Reply-To: References: Message-ID: What about: http://en.wikipedia.org/wiki/MUSCLE_(alignment_software) It's public domain. Another Multiple Sequence Alignment would be t-coffee, but it's GPL again. Hannes On Sun, Mar 18, 2012 at 21:47, Spencer Bliven wrote: > Unfortunately, HMMER is licensed as GPL. As such, we can't port it to > BioJava or even link against it with JNI. 
A 2009 > postindicates that > they are not interested in re-licensing HMMER under a less > restrictive license. I think we should move away from any HMMER-port > project, and focus on porting other important algorithms such as BLAST (public > domain > ). > > I went ahead and removed HMMER from the GSoC wiki > page. > I was trying to think of other LGPL-compatable bioinformatics > projectswhich > would be nice to port to biojava. Maybe a sequence browser, such as > incorporating/linking the Integrated Genome > Browser? > Anyone have ideas other than BLAST? > > -Spencer > > On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani > wrote: > >> Hello, >> >> I am Ritisha Laungani, a pre-final year student currently persuing *MSc >> Tech. Information Systems* at Birla Institute of Technology and Science, >> Goa, India. >> >> I would like to apply for the BioJava project as i have worked into all the >> 3 fields this projects requires- C, Java and Bio! >> >> As far as i understand, in simple terms, the project's goal is to convert >> an existing HMMER source code, which is written in C, to a java code using >> language processing tools. ?Do correct me if i am wrong here! >> >> I must admit here that i am new to open source software development and >> also unaware of HMMER. But i would love to learn if given a chance and the >> correct resources! ?:) >> >> Eagerly awaiting a reply, which could guide me to the next step. 
>> >> Regards, >> >> Ritisha Laungani >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From andreas at sdsc.edu Mon Mar 19 01:19:19 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 18 Mar 2012 22:19:19 -0700 Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java In-Reply-To: References: Message-ID: A worst case scenario could be to host an independent and GPLed project on the BioJava SVN. However I see your point. These licensing issues contribute to the complexity around such a project and make it much more difficult. In terms of other algorithms, BioJava already contains a multiple sequence alignment algorithm, as such I would rather see that one getting extended, than a 2nd algorithm being implemented. Andreas On Sun, Mar 18, 2012 at 1:47 PM, Spencer Bliven wrote: > Unfortunately, HMMER is licensed as GPL. As such, we can't port it to > BioJava or even link against it with JNI. A 2009 > postindicates that > they are not interested in re-licensing HMMER under a less > restrictive license. I think we should move away from any HMMER-port > project, and focus on porting other important algorithms such as BLAST (public > domain > ). > > I went ahead and removed HMMER from the GSoC wiki > page. > I was trying to think of other LGPL-compatable bioinformatics > projectswhich > would be nice to port to biojava. Maybe a sequence browser, such as > incorporating/linking the Integrated Genome > Browser? > Anyone have ideas other than BLAST? > > -Spencer > > On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani > wrote: > >> Hello, >> >> I am Ritisha Laungani, a pre-final year student currently persuing *MSc >> Tech. 
Information Systems* at Birla Institute of Technology and Science,
>> Goa, India.
>>
>> I would like to apply for the BioJava project as i have worked into all the
>> 3 fields this projects requires- C, Java and Bio!
>>
>> As far as i understand, in simple terms, the project's goal is to convert
>> an existing HMMER source code, which is written in C, to a java code using
>> language processing tools. Do correct me if i am wrong here!
>>
>> I must admit here that i am new to open source software development and
>> also unaware of HMMER. But i would love to learn if given a chance and the
>> correct resources! :)
>>
>> Eagerly awaiting a reply, which could guide me to the next step.
>>
>> Regards,
>>
>> Ritisha Laungani
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From darnells at dnastar.com Mon Mar 19 12:18:02 2012
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 19 Mar 2012 16:18:02 +0000
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

Hi Andreas,

We spoke offline about the HMMER/GPL issue this weekend. I think it is premature to remove the HMMER option from the GSoC wiki page. I would like to clarify the Sean Eddy blog post linked to by Spencer (_**_ emphasis mine):

From the LICENSE section:

The only thing the GPLv3 really blocks is someone forking a derivative copy of HMMER and distributing it under a different license, such as a closed-source proprietary license; to do that, _*you'd need to negotiate a non-GPL license with us first*_.

From the COPYRIGHT section:

_*We really don't expect to negotiate any non-GPL licenses, though*_. We want to enable many different people _*to contribute to a single open source HMMER codebase*_, as a shared codebase for bioinformatics and computational biology.
From the TRADEMARK section:

_*Did I mention, we want to enable a single open source HMMER codebase?*_

==========

Sean Eddy and the Howard Hughes Medical Institute are the main copyright holders. The main goal is clear... maintain a single open source HMMER codebase. The choice of license for HMMER (GPL v3) was to persuade people to contribute back. However, OBF might be able to negotiate other arrangements (perhaps a non-GPL library that can only be distributed with BioJava and any contributions made by the GSoC student must be licensed back to HHMI under GPL?). I do not know how hopeful to be about that possibility, but it cannot hurt to ask.

I would like to dissuade GSoC students from directly contacting Sean Eddy or HHMI about this possibility. This task is most appropriate for a senior BioJava representative and it is up to the "Port an Algorithm to Java" mentors on how to proceed.

Just my $0.02.

Regards,
Steve

--
Steve Darnell
DNASTAR, Inc.
Madison, WI USA

-----Original Message-----
From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of Andreas Prlic
Sent: Monday, March 19, 2012 12:19 AM
To: Spencer Bliven; Hannes Brandstätter-Müller
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] GSoC 2012- Port an Algorithm to Java

A worst case scenario could be to host an independent and GPLed project on the BioJava SVN. However I see your point. These licensing issues contribute to the complexity around such a project and make it much more difficult.

In terms of other algorithms, BioJava already contains a multiple sequence alignment algorithm, as such I would rather see that one getting extended, than a 2nd algorithm being implemented.

Andreas

On Sun, Mar 18, 2012 at 1:47 PM, Spencer Bliven wrote:
> Unfortunately, HMMER is licensed as GPL. As such, we can't port it to
> BioJava or even link against it with JNI.
A 2009 > postindicates that > they are not interested in re-licensing HMMER under a less restrictive > license. I think we should move away from any HMMER-port project, and > focus on porting other important algorithms such as BLAST (public > domain pts/projects/blast/LICENSE> > ). > > I went ahead and removed HMMER from the GSoC wiki > page. > I was trying to think of other LGPL-compatable bioinformatics > projects cs_software>which would be nice to port to biojava. Maybe a sequence > browser, such as incorporating/linking the Integrated Genome > Browser? > Anyone have ideas other than BLAST? > > -Spencer > > On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani > wrote: > >> Hello, >> >> I am Ritisha Laungani, a pre-final year student currently persuing >> *MSc Tech. Information Systems* at Birla Institute of Technology and >> Science, Goa, India. >> >> I would like to apply for the BioJava project as i have worked into >> all the >> 3 fields this projects requires- C, Java and Bio! >> >> As far as i understand, in simple terms, the project's goal is to >> convert an existing HMMER source code, which is written in C, to a >> java code using language processing tools. ?Do correct me if i am wrong here! >> >> I must admit here that i am new to open source software development >> and also unaware of HMMER. But i would love to learn if given a >> chance and the correct resources! ?:) >> >> Eagerly awaiting a reply, which could guide me to the next step. >> >> Regards, >> >> Ritisha Laungani >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l -- ----------------------------------------------------------------------- Dr. 
Andreas Prlic Senior Scientist, RCSB PDB Protein Data Bank University of California, San Diego (+1) 858.246.0526 ----------------------------------------------------------------------- _______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l From shashank201091 at gmail.com Mon Mar 19 15:45:57 2012 From: shashank201091 at gmail.com (Shashank Gupta) Date: Tue, 20 Mar 2012 01:15:57 +0530 Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 In-Reply-To: References: Message-ID: Hello I am a beginner in the open source development but I am trying to give my best shot at this year's GSoC. In "Porting an Algorithm to Java" what I infer is we can export the svn and then using a C++ to Java code converter convert the whole source code into Java. After which open the project through a Java based IDE and compile the source code and then fix the errors that have crept in the code while the conversion. After thorough testing and debugging we clean the code and give the final source. I know I am a newbie and i'll be grateful if you could let me know why this method won't work, considering if it won't. Regards Shashank Gupta From andreas at sdsc.edu Mon Mar 19 16:10:40 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 19 Mar 2012 13:10:40 -0700 Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 In-Reply-To: References: Message-ID: Hi Shashank, You are right, this is on a high level how such a conversion could be performed. Andreas On Mon, Mar 19, 2012 at 12:45 PM, Shashank Gupta wrote: > Hello > > I am a beginner in the open source development but I am trying to give my > best shot at this year's GSoC. In "Porting an Algorithm to Java" what I > infer is we can export the svn and then using a C++ to Java code converter > convert the whole source code into Java. 
After which open the project > through a Java based IDE and compile the source code and then fix the > errors that have crept in the code while the conversion. > After thorough testing and debugging we clean the code and give the final > source. I know I am a newbie and i'll be grateful if you could let me know > why this method won't work, considering if it won't. > > Regards > Shashank Gupta > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From russ at kepler-eng.com Mon Mar 19 16:48:57 2012 From: russ at kepler-eng.com (Russ Kepler) Date: Mon, 19 Mar 2012 14:48:57 -0600 Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 In-Reply-To: References: Message-ID: <2678562.mqEKBcxuqW@main> On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote: > You are right, this is on a high level how such a conversion could be > performed. In my experience Java makes a poor C++ (or C) emulator. The "convert C++ to Java" might make an "OK" first pass but in the end you're going to want to recode critical sections in original Java. I've found that the really big performance improvements are in using smarter algorithms in Java vs. some of the 'brute force' approaches that work in C. From andreas at sdsc.edu Mon Mar 19 17:15:13 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 19 Mar 2012 14:15:13 -0700 Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java In-Reply-To: References: Message-ID: Thanks Steve, good points. Let's conclude this discussion with the take home message that converting GPLed code to BioJava has licensing issues and requires additional negotiations. Before embarking on such a project the mentors will have a discussion about licensing with HHMI (or any other license holder for other algorithms). Andreas 2012/3/19 Steve Darnell : > Hi Andreas, > > We spoke offline about the HMMER/GPL issue this weekend. 
I think it is premature to remove the HMMER option from the GSoC wiki page. I would like to clarify the Sean Eddy blog post linked to by Spencer (_**_ emphasis mine): > > From the LICENSE section: > > The only thing the GPLv3 really blocks is someone forking a derivative copy of HMMER and distributing it under a different license, such as a closed-source proprietary license; to do that, _*you'd need to negotiate a non-GPL license with us first*_. > > From the COPYRIGHT section: > > _*We really don't expect to negotiate any non-GPL licenses, though*_. We want to enable many different people _*to contribute to a single open source HMMER codebase*_, as a shared codebase for bioinformatics and computational biology. > > From the TRADEMARK section: > > _*Did I mention, we want to enable a single open source HMMER codebase?*_ > > ========== > > Sean Eddy and the Howard Hughes Medical Institute are the main copyright holders. The main goal is clear... maintain a single open source HMMER codebase. The choice of license for HMMER (GPL v3) was to persuade people to contribute back. However, OBF might be able to negotiate other arrangements (perhaps a non-GPL library that can only be distributed with BioJava and any contributions made by the GSoC student must be licensed back to HHMI under GPL?). I do not know how hopeful to be about that possibility, but it cannot hurt to ask. > > I would like to dissuade GSoC students from directly contacting Sean Eddy or HHMI about this possibility. This task is most appropriate for a senior BioJava representative and it is up to the "Port an Algorithm to Java" mentors on how to proceed. > > Just my $0.02. > > Regards, > Steve > > -- > Steve Darnell > DNASTAR, Inc. 
> Madison, WI USA > > -----Original Message----- > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of Andreas Prlic > Sent: Monday, March 19, 2012 12:19 AM > To: Spencer Bliven; Hannes Brandst?tter-M?ller > Cc: biojava-l at lists.open-bio.org > Subject: Re: [Biojava-l] GSoC 2012- Port an Algorithm to Java > > A worst case scenario could be to host an independent and GPLed project on the BioJava SVN. However I ?see your point. These licensing issues contribute to the complexity around such a project and make it much more difficult. > > In terms of other algorithms, BioJava already contains a multiple sequence alignment algorithm, as such I would rather see that one getting extended, than a 2nd algorithm being implemented. > > Andreas > > On Sun, Mar 18, 2012 at 1:47 PM, Spencer Bliven wrote: >> Unfortunately, HMMER is licensed as GPL. As such, we can't port it to >> BioJava or even link against it with JNI. A 2009 >> postindicates that >> they are not interested in re-licensing HMMER under a less restrictive >> license. I think we should move away from any HMMER-port project, and >> focus on porting other important algorithms such as BLAST (public >> domain> pts/projects/blast/LICENSE> >> ). >> >> I went ahead and removed HMMER from the GSoC wiki >> page. >> I was trying to think of other LGPL-compatable bioinformatics >> projects> cs_software>which would be nice to port to biojava. Maybe a sequence >> browser, such as incorporating/linking the Integrated Genome >> Browser? >> Anyone have ideas other than BLAST? >> >> -Spencer >> >> On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani >> wrote: >> >>> Hello, >>> >>> I am Ritisha Laungani, a pre-final year student currently persuing >>> *MSc Tech. Information Systems* at Birla Institute of Technology and >>> Science, Goa, India. 
>>>
>>> I would like to apply for the BioJava project as i have worked into
>>> all the
>>> 3 fields this projects requires- C, Java and Bio!
>>>
>>> As far as i understand, in simple terms, the project's goal is to
>>> convert an existing HMMER source code, which is written in C, to a
>>> java code using language processing tools. Do correct me if i am wrong here!
>>>
>>> I must admit here that i am new to open source software development
>>> and also unaware of HMMER. But i would love to learn if given a
>>> chance and the correct resources! :)
>>>
>>> Eagerly awaiting a reply, which could guide me to the next step.
>>>
>>> Regards,
>>>
>>> Ritisha Laungani
>>> _______________________________________________

From HWillis at scripps.edu Mon Mar 19 18:03:12 2012
From: HWillis at scripps.edu (Scooter Willis)
Date: Mon, 19 Mar 2012 18:03:12 -0400
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
Message-ID:

3+ years ago we worked on a port of the reference implementation of the H264 encoder/decoder to Java. Performance was actually good enough that we didn't spend time on optimization. We used Jazillian to do the initial conversion, and it does a nice job on fairly clean code. They are no longer in business, and I plan on contacting the developer to see if they would use their software for the initial conversion.

A JNI wrapper will be fairly easy to do, since it only has to model the input and output parameters of BLAST and/or HMMER. Neither code base is set up as a library, so the number of mappings that need to be performed is minimal. The complexity of porting either from C/C++ to Java is high, but it also introduces a student to the field and thus has potential long-term benefit.

With JNI there is a high chance of success with minimal work. I would advocate that the initial effort be placed on conversion to determine the overall likelihood of success, where much will depend on the student. JNI should be a requirement even if conversion is successful.
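To make the JNI route a little more concrete, here is a minimal sketch in Java. Everything named in it is hypothetical: the class `NativeSearch`, the native method `runSearch`, the library name `searchjni`, and the positional argument layout are made up for illustration and are not an existing BioJava, BLAST, or HMMER API. A small C shim implementing the native method would still have to be written. Because neither tool is set up as a library, the sketch assumes a single command-line-style argument array crossing the JNI boundary instead of many individual function mappings.

```java
public class NativeSearch {

    // Hypothetical native entry point. A small C shim would forward these
    // arguments to the tool's own main routine and return its exit code.
    private static native int runSearch(String[] args);

    // Builds the argument array handed to the native side.
    // The layout (query, database, output) is an assumption made for this
    // sketch, not the real option syntax of BLAST or HMMER.
    static String[] buildArgs(String queryFile, String databaseFile, String outputFile) {
        return new String[] {queryFile, databaseFile, outputFile};
    }

    // Loads the hypothetical shim library and runs the search.
    // Throws UnsatisfiedLinkError if the library is not on java.library.path.
    static int search(String queryFile, String databaseFile, String outputFile) {
        System.loadLibrary("searchjni");
        return runSearch(buildArgs(queryFile, databaseFile, outputFile));
    }

    public static void main(String[] argv) {
        // Only the pure-Java part is exercised here, so this runs without the shim.
        String[] args = buildArgs("query.fasta", "db.fasta", "hits.txt");
        System.out.println(String.join(" ", args));
    }
}
```

Keeping the native surface down to one method is what would keep the number of mappings minimal; the licensing questions around linking are a separate matter and are not addressed by the sketch.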
JNI would also minimize GPL concerns about forking the HMMER codebase.

Scooter

----- Reply message -----
From: "Russ Kepler"
To: "biojava-l at lists.open-bio.org"
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
Date: Tue, Mar 20, 2012 8:06 am

On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote:
> You are right, this is on a high level how such a conversion could be
> performed.

In my experience Java makes a poor C++ (or C) emulator. The "convert C++ to Java" might make an "OK" first pass but in the end you're going to want to recode critical sections in original Java. I've found that the really big performance improvements are in using smarter algorithms in Java vs. some of the 'brute force' approaches that work in C.

_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

From amr_alhossary at hotmail.com Mon Mar 19 21:39:33 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 20 Mar 2012 03:39:33 +0200
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
In-Reply-To:
References:
Message-ID:

I agree with most of Scooter's opinions:

1) JNI can be used as an initial step, to use the current code until we work out the license issues.
2) It can also serve as a reference implementation for the conversion.
3) After that, it would be better for the student to digest the algorithm and rebuild it.

Here I add that:

4) Even the best converters can't map things like a variable carrying an array's length onto a simple array.length.
5) Using the algorithm libraries already built into Java (searching, sorting, data structures) would be easier to maintain later on, and they are optimized for Java data types.
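Points 4 and 5 can be illustrated with a small Java sketch. The method names are made up for this example and are not taken from any converter's actual output. `sumConverted` shows the kind of code a mechanical C-to-Java translation tends to leave behind, with the element count carried in a separate parameter; `sumIdiomatic` relies on the array's own length; `sortedCopy` uses the standard library instead of a hand-ported sort routine.

```java
import java.util.Arrays;

public class ArrayLengthExample {

    // Shape of a mechanically converted C function: the caller must pass
    // the element count separately, exactly as in the C original.
    static int sumConverted(int[] values, int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += values[i];
        }
        return total;
    }

    // Hand-cleaned version: the array knows its own length.
    static int sumIdiomatic(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Uses the built-in sort instead of a ported sorting routine.
    // The input array is left untouched.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] scores = {5, 3, 9, 1};
        System.out.println(sumConverted(scores, scores.length)); // 18
        System.out.println(sumIdiomatic(scores));                // 18
        System.out.println(Arrays.toString(sortedCopy(scores))); // [1, 3, 5, 9]
    }
}
```

The first form still compiles and runs, which is why a converter has no reason to remove the extra parameter; cleaning it up is the kind of manual pass a student would need to do after the automatic conversion.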
Amr -----Original Message----- From: Scooter Willis Sent: Tuesday, March 20, 2012 12:03 AM To: Russ Kepler ; biojava-l at lists.open-bio.org Subject: Re: [Biojava-l] Porting an Algorithm to Java GSoC 2012 3+ years ago we worked on a port of the reference implementation of the H264 encoder/decoder to Java. Performance was actually very good that we didnt spend time on optimization. We used jazillian to do the initial conversion and it does a nice job on fairly clean code. They are no longer in business and I plan on contacting the developer to see if they would use their software for the initial conversion. Doing a JNI conversion will be fairly easy to model the input and ouput paramaters of blast and/or hmmer. Neither code base is setup as a library so the number of mappings that need to be performed is minimal. The complexity of porting either from C/C++ to Java is high but also introduces a student to the field and thus has potential long term benefit. With JNI high degree of success with minimal work. I would advocate that the initial effort is placed on conversion to determine the overall likelyhood of success where much will depend on the student. JNI should be a requirement even if conversion is successful. JNI would also minimize GPL concerns of forking the HMMER codebase. Scooter ----- Reply message ----- From: "Russ Kepler" To: "biojava-l at lists.open-bio.org" Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 Date: Tue, Mar 20, 2012 8:06 am On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote: > You are right, this is on a high level how such a conversion could be > performed. In my experience Java makes a poor C++ (or C) emulator. The "convert C++ to Java" might make an "OK" first pass but in the end you're going to want to recode critical sections in original Java. I've found that the really big performance improvements are in using smarter algorithms in Java vs. some of the 'brute force' approaches that work in C. 
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

From shashank201091 at gmail.com Tue Mar 20 00:57:54 2012
From: shashank201091 at gmail.com (Shashank Gupta)
Date: Tue, 20 Mar 2012 10:27:54 +0530
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
In-Reply-To:
References:
Message-ID:

While porting a piece of code to Java, one can always see what errors the converter is making. Because the converter works mechanically, it will repeat its mistakes: for example, if it can't convert the array-length idiom, it will produce the same error every time that idiom appears. When compiling the Java code we'll get to know the errors the converter is making, and then, by using search and replace, we can reduce the number of errors to a handful, which can then be fixed by hand.

I have one question: if we convert a piece of code to Java using a converter, does the result have performance issues?

Regards
Shashank Gupta

On Tue, Mar 20, 2012 at 7:09 AM, Amr AL-Hossary wrote:
> I agree with most of Scooter's opinions,
>
> 1) JNI can be used as an initial step, to use the current code until we
> work out the License issues.
> 2) Also it can be used as a reference conversion implementation.
> 3) Then it would be better for the student to (digest) the algorithm and
> rebuild it.
> Here I add that
> 4) Even best converters can't map things like variables carrying array
> length into simply array.length.
> 5) Using already built-in algorithm libraries (searching, sorting, data
> structures) in java would be easier to maintain later on, plus being
> optimized for Java data types.
> > Amr > > > -----Original Message----- From: Scooter Willis > Sent: Tuesday, March 20, 2012 12:03 AM > To: Russ Kepler ; biojava-l at lists.open-bio.org > Subject: Re: [Biojava-l] Porting an Algorithm to Java GSoC 2012 > > > 3+ years ago we worked on a port of the reference implementation of the > H264 encoder/decoder to Java. Performance was actually very good that we > didnt spend time on optimization. We used jazillian to do the initial > conversion and it does a nice job on fairly clean code. They are no longer > in business and I plan on contacting the developer to see if they would use > their software for the initial conversion. > > Doing a JNI conversion will be fairly easy to model the input and ouput > paramaters of blast and/or hmmer. Neither code base is setup as a library > so the number of mappings that need to be performed is minimal. The > complexity of porting either from C/C++ to Java is high but also introduces > a student to the field and thus has potential long term benefit. > > With JNI high degree of success with minimal work. I would advocate that > the initial effort is placed on conversion to determine the overall > likelyhood of success where much will depend on the student. JNI should be > a requirement even if conversion is successful. > > JNI would also minimize GPL concerns of forking the HMMER codebase. > > Scooter > > ----- Reply message ----- > From: "Russ Kepler" > To: "biojava-l at lists.open-bio.org" > Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 > Date: Tue, Mar 20, 2012 8:06 am > > > > On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote: > > You are right, this is on a high level how such a conversion could be >> performed. >> > > In my experience Java makes a poor C++ (or C) emulator. The "convert C++ > to > Java" might make an "OK" first pass but in the end you're going to want to > recode critical sections in original Java. 
I've found that the really big
> performance improvements are in using smarter algorithms in Java vs. some
> of
> the 'brute force' approaches that work in C.
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From sbliven at ucsd.edu Tue Mar 20 18:31:20 2012
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Tue, 20 Mar 2012 15:31:20 -0700
Subject: [Biojava-l] AbstractSequence#getUserCollection()
Message-ID:

Sequences contain a 'user collection' of type Collection. Is anybody using this feature? If I want to store data in it, should I add it to the existing userCollection (if any), or is it ok to just set a new value to it?

-Spencer

From andreas.prlic at gmail.com Thu Mar 22 15:00:12 2012
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Thu, 22 Mar 2012 12:00:12 -0700
Subject: [Biojava-l] Project @ Google Summer of Code 2012
In-Reply-To: <1332407594.90121.YahooMailClassic@web125603.mail.ne1.yahoo.com>
References: <1332407594.90121.YahooMailClassic@web125603.mail.ne1.yahoo.com>
Message-ID:

Hi Camelia,

> I write to You regarding the project " Take BioJava into the Cloud" that
> BioJava is mentoring during Google Summer of Code 2012.
> - Who are the Mentors for this project?
>

We assign Mentors after we have received the final project proposals, that way we can make sure that the most suitable mentors will be assigned to the student. Usually we have mentor teams of two mentors per project and we have weekly skype calls throughout the duration of the project to make sure the project stays on track.
> - Which modules of BioJava are subject of this project? The project's > description is "Hadoop-ify and/or Map-Reduce some of the BioJava modules" > That depends upon your interest. There are a number of possible candidates for this, e.g. the structure alignment package. It does not have to be Hadoop, it can be also one of several other available solutions, like Hazelcast, etc. The details of this can be part of the proposal. Andreas From andreas at sdsc.edu Sun Mar 25 22:06:08 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 25 Mar 2012 19:06:08 -0700 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Hi Dragos, There are no formal requirements and everybody has the right to apply. Having said that, it sounds like you might be well qualified for such a project. The algorithm porting project is still on, with the addition that more negotiations might be required, depending on the license of the algorithm. What we are mainly looking for at this stage are realistic project proposals that add useful new features to BioJava. We will provide feedback on such proposals to help improving them before they are submitted at the Google site. Andreas On Sun, Mar 25, 2012 at 10:26 AM, Dragos-Bogdan Sima wrote: > Hello, > My name is Dragos and i am a 2nd year student in Computer Science at > University Politehnica of Bucharest. > I have strong C/C++ and Java lnowledge and I have worked in oher > programming environments such as: Python, Assemblt, MatLab, Octave or > Scheme. > The project idea that got my attention is the one mentioned in the mailing > Subject. Before getting more into the task, I would like to know if this > idea is on for GSoC 2012. > If so, what are the requirements in order to get the right to apply for > this organisation. > > Cheers, > Dragos Sima. 
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

From nick_mihaiu at yahoo.com Tue Mar 27 07:53:59 2012
From: nick_mihaiu at yahoo.com (Mihaiu Nick)
Date: Tue, 27 Mar 2012 04:53:59 -0700 (PDT)
Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava
Message-ID: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com>

Hello,

My name is Mihaiu Nicolae, I'm a 2nd year student in Computer Science at the Politehnica University of Bucharest and I'm very interested in working on the "New File Parsers for BioJava" project. I chose BioJava because it blends two passions of mine: coding and biology. Back in high school, biology was one of my favourite subjects; I had a very good teacher from whom I learned a lot, and I finished every year with a 10.

About my knowledge and experience
- A year and a half of experience with Java; it became my first choice in coding. I currently do all my tasks and homework in Java, and I am also developing a bot for aichallenge [1] in Java as a university project. I am also working on a small personal project, a memory test game, also written in Java.
- 5 years of C/C++
- web: HTML, PHP, CSS, MySQL - made a module for my school's website

Some thoughts and questions about the project
- I took a look at your sources and saw you already have parsers for a lot of file formats, such as FASTA, FASTQ, PDB, mmCIF, etc. What are the priorities for the new parsers? Which is needed most?
- Should we choose only one parser to work on for this project, or is the expectation to implement more than one?

Questions about the "Coding exercise"
- About the "ambiguous characters": let's say we have ambiguous DNA. For these two sequences, "ACTATATCGG" and "ATGKMCGW", should we have the sequence "ACTATATCGG" in one FASTA output file and "ATGKMCGW" in another one?
-?What do you mean by large, ?be capable of reading large files?, because afterwards under ?Submission?? it says ?the test data file named data.fasta up to 10Kb in size? ? Should I understand that 10Kb is the limit for a ?large file? ? ? Best regards, Nicolae [1]?http://aichallenge.org From andreas at sdsc.edu Tue Mar 27 14:00:23 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 27 Mar 2012 11:00:23 -0700 Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava In-Reply-To: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> References: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> Message-ID: Hi Nicolae, > - I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ? We are keeping the answer to this question intentionally open and want students to pick a topic they are interested in. For a list of requests that we have received from users, please take a look at: http://biojava.org/wiki/BioJava3_Feature_Requests, however we welcome other suggestions as well. > - Should we choose only one parser to work on for this project, or the expectations are to implement more than one ? For a start focus on one parser and make sure it integrates well with the rest of biojava. Only propose more than one if you think you can easily do that given the amount of time. We are looking for realistic student proposals, so make sure you come up with a good and realistic plan. We are happy to discuss proposals before they are being submitted to google and will give feedback about how to improve them. > Questions ?about the "Coding exercise" Peter, do you want to answer those? Thanks, Andreas > > - About the "ambiguous characters",?lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence??"ACTATATCGG" and in another one?"ATGKMCGW" ? 
> > -?What do you mean by large, ?be capable of reading large files?, because afterwards under ?Submission?? it says ?the test data file named data.fasta up to 10Kb in > size? ? Should I understand that 10Kb is the limit for a ?large file? ? > > Best regards, > Nicolae > > [1]?http://aichallenge.org > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From andreas at sdsc.edu Tue Mar 27 14:06:27 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 27 Mar 2012 11:06:27 -0700 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Hi Dragos, many years of research have been put into both Blast and Hmmer and it will be highly unlikely that you will be able to come up with something better in a summer. As such best to focus on getting a plain port. If you get ideas how to make the algorithms perform better during that time, probably best to feed back that information to the original developers. > Secondly, converting C/C++ source code into Java would be a very interesting > and challenging task for me. At least for the C++ part I am thinkig to > approach the use oh JNI, but is it possible to occur problems with > portability, building, or JVM's stability? I have also read about NestedVM > which provides binary translation for Java Bycode, and this approach could > be useful. Scooter, do you want to answer this one? Thanks, Andreas From amr_alhossary at hotmail.com Tue Mar 27 14:44:29 2012 From: amr_alhossary at hotmail.com (Amr AL-Hossary) Date: Wed, 28 Mar 2012 02:44:29 +0800 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Dear Dr. Andreas, There exists optimization when converting C code into Java code. But on the level of implementation, not on on the level of algorithm itself. Because java deals with data types in slightly different way, (e.g. 
unsigned numbers, double numbers, etc.) it will need some posttranslational modification to be optimized to work on java. Amr -----Original Message----- From: Andreas Prlic Sent: Wednesday, March 28, 2012 2:06 AM To: Dragos-Bogdan Sima Cc: Biojava ; Scooter Willis Subject: Re: [Biojava-l] [Biojava-dev] Port an Algorithm to Java Hi Dragos, many years of research have been put into both Blast and Hmmer and it will be highly unlikely that you will be able to come up with something better in a summer. As such best to focus on getting a plain port. If you get ideas how to make the algorithms perform better during that time, probably best to feed back that information to the original developers. > Secondly, converting C/C++ source code into Java would be a very > interesting > and challenging task for me. At least for the C++ part I am thinkig to > approach the use oh JNI, but is it possible to occur problems with > portability, building, or JVM's stability? I have also read about NestedVM > which provides binary translation for Java Bycode, and this approach could > be useful. Scooter, do you want to answer this one? Thanks, Andreas _______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l From to.petr at gmail.com Tue Mar 27 16:50:22 2012 From: to.petr at gmail.com (P. Troshin) Date: Tue, 27 Mar 2012 21:50:22 +0100 Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava In-Reply-To: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> References: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> Message-ID: Hi Nick, I agree with Andreas (thank you for coming in!), just a few additions below: > - 1 year and a half experience with Java; it became my first choice in coding; currently I do all my tasks and homework in Java, also developing a bot for aichallenge [1] in Java as a university project. 
And a little personal project I'm working at, a memory test game, also written in Java.
> - 5 years of C/C++
> - web: HTML, PHP, CSS, MySQL - made a module for my school's website

Great, sound Java knowledge is something that would help you a lot on
this project.

>
> Some thoughts and questions about the project
>
> - I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ?

You are right, there are many parsers in BioJava, too many actually; we
only need one parser per file format. However, currently this is not
the case: there are 2 or 3 FASTA parsers, for example. They are all
subtly different, so the task would be to unify these parsers so that
one parser can be used in all cases.

> - Should we choose only one parser to work on for this project, or the expectations are to implement more than one ?

It depends on the parser and on your own abilities. However, if you can
only make one FASTA parser in 3 months, then your application is
unlikely to be competitive.

> Questions about the "Coding exercise"
>
> - About the "ambiguous characters", lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence "ACTATATCGG" and in another one "ATGKMCGW" ?

Correct

>
> - What do you mean by large, "be capable of reading large files", because afterwards under "Submission" it says "the test data file named data.fasta up to 10Kb in
> size" ? Should I understand that 10Kb is the limit for a "large file" ?

For this exercise, assume that a large file is one that does not fit
into the computer's RAM. With a Java programme you can substitute the
amount of memory available to the JVM for the computer's RAM. So let's
say that your parser should be able to work with a 512 MB file with the
JVM setting -Xmx256M. And yes, you do not have to email this file to me.

I hope that helps.
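The "large file" requirement above (a 512 MB input with -Xmx256M) suggests reading one record at a time instead of loading the whole file. What follows is only a minimal sketch of that idea, not BioJava code: the names StreamingFasta, RecordHandler, parse and isAmbiguousDna are hypothetical, and it assumes that "ambiguous" means any residue other than A, C, G or T. Memory use is then bounded by the longest single record rather than by the file size.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/**
 * Illustrative streaming FASTA reader. All names here are hypothetical;
 * none of this is BioJava API.
 */
final class StreamingFasta {

    /** Receives one record at a time, so records never accumulate in memory. */
    interface RecordHandler {
        void handle(String header, String sequence) throws IOException;
    }

    /** Reads records one by one and returns how many were passed to the handler. */
    static int parse(Reader in, RecordHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder sequence = new StringBuilder();
        String header = null;
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.charAt(0) == '>') {
                // A new header closes the previous record, if there was one.
                if (header != null) {
                    handler.handle(header, sequence.toString());
                    count++;
                }
                header = line.substring(1).trim();
                sequence.setLength(0); // reuse the buffer for the next record
            } else if (header != null) {
                sequence.append(line); // sequence data may span several lines
            }
        }
        // Emit the final record, which has no following header to close it.
        if (header != null) {
            handler.handle(header, sequence.toString());
            count++;
        }
        return count;
    }

    /** Assumption: anything other than A, C, G or T (case-insensitive) counts as ambiguous. */
    static boolean isAmbiguousDna(String sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            char c = Character.toUpperCase(sequence.charAt(i));
            if (c != 'A' && c != 'C' && c != 'G' && c != 'T') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // The two sequences from the question, with the second split over two lines.
        String demo = ">seq1\nACTATATCGG\n>seq2\nATGK\nMCGW\n";
        parse(new StringReader(demo), (header, sequence) ->
                System.out.println(header + "\t"
                        + (isAmbiguousDna(sequence) ? "ambiguous" : "unambiguous")));
    }
}
```

For a real file the StringReader would be replaced by a java.io.FileReader, and the handler would append each record to one of two output files instead of printing it. A single sequence larger than the heap would still not fit with this approach, so that case would need a different design.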
Good luck with your application. Regards, Peter From to.petr at gmail.com Tue Mar 27 17:31:39 2012 From: to.petr at gmail.com (P. Troshin) Date: Tue, 27 Mar 2012 22:31:39 +0100 Subject: [Biojava-l] GSOC New File Parsers for BioJava project coding exercise submission deadline change(!) Message-ID: Hello prospective GSOC students, I corrected the deadline to coincide with this year's GSOC student application deadline, which is the 6 of April inclusive. Please make sure to send your solution on or earlier than the 6-th of April 23:59pm GMT. Have fun and good luck. Kind regards, Peter From andreas at sdsc.edu Wed Mar 28 23:35:51 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 28 Mar 2012 20:35:51 -0700 Subject: [Biojava-l] [Biojava-dev] GSoC - BioJava File Parsers question In-Reply-To: References: Message-ID: Hi David, If you take a look at the Sequence interface, this is the central place to represents all sorts of sequences. New parser should fit in with providing instances of objects that implement Sequence. If this principle is kept up we are pretty close to the SeqIO scenario from what I understand. Andreas On Wed, Mar 28, 2012 at 3:46 PM, David Felty wrote: > Thanks for the info, the SeqIO modules are very helpful. In fact, it seems > like they are quite similar to what this project asks for. Could this this > type of implementation work for BioJava? > > David > > On Wed, Mar 28, 2012 at 6:09 PM, Peter Cock wrote: > >> On Wed, Mar 28, 2012 at 10:05 PM, P. Troshin wrote: >> > Well, they all widely used tools, and as a result of analysis they >> > produce files. If you need to process these results further then you'd >> > need to parse the result files. Hence the connection. >> > >> > Regards, >> > Peter >> >> Indeed. It is quite common in Bioinformatics for file formats to >> be named after the tool which introduced them - even if sometimes >> they become much more widely used. 
>> >> And for GenBank and UniProt, people typically mean the GenBank >> plain text 'flat file' format also used by DDBJ (there is a very similar >> format used by EMBL with a common feature table), and for >> UniProt that could refer to the old plain text 'SwissProt' file format >> or the newer UniProt XML format. For background on these an >> other sequence file file formats you might find these pages >> helpful: >> >> http://www.bioperl.org/wiki/HOWTO:SeqIO#Formats >> http://biopython.org/wiki/SeqIO >> >> Peter C. >> > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev From andreas at sdsc.edu Wed Mar 28 23:57:50 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 28 Mar 2012 20:57:50 -0700 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Hi Dragos, > So basically I have to understand how the programs are working while > overviewing the sources, so that I could explain in my aplication how I plan > to port to Java? I would say so. What are possible problem areas, where could things go wrong, and what would you do in that case? Andreas > > And Arn, could you explain me a little bit more the cases where the PTM > would be required or give me some usefull links, beside the wikipage, for > study? > > Thanks you, > Dragos. -- ----------------------------------------------------------------------- Dr. 
Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From amr_alhossary at hotmail.com Thu Mar 29 00:14:29 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Thu, 29 Mar 2012 12:14:29 +0800
Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

I don't have a link to a website in my head right now, but anyway, here
are some examples:

1- Never use variables less than 32 bits wide if you are interested in
performance. Consider that a and b are both short integers: a statement
like a = a+b is valid in C but not in Java, because Java promotes byte,
short and char operands to int (4 bytes) before ANY arithmetic operation
(even comparisons, which only read the values and modify nothing). This
way, both a and b will be promoted to int and summed, then an explicit
cast is required before the result is assigned back to the variable a.
All this overhead is not wanted in processing-oriented applications.
BTW, although a = a+b won't compile, a += b will work (because it
includes an implicit cast).

2- Remember that all integer types in Java are signed (except for char,
which is equivalent to unsigned short in C) and that variables in Java
have different sizes than in C. So comparing the literal 0xFFFF to the
short variable a whose value is 0xFFFF won't return true. Try it. Did
you get why? To overcome this problem you will lose some of the
performance too.

3- Remember that not all operators do the same thing in Java as in C:
review the behaviour of the >> operator in Java versus C.

One final point: my name is "Amr" not "Arn".

Regards
Amr

From: Dragos-Bogdan Sima
Sent: Wednesday, March 28, 2012 7:05 AM
To: Amr AL-Hossary
Cc: Andreas Prlic ; Biojava ; Scooter Willis
Subject: Re: [Biojava-l] [Biojava-dev] Port an Algorithm to Java

Hello Dr.
Andreas Prlic, So basically I have to understand how the programs are working while overviewing the sources, so that I could explain in my aplication how I plan to port to Java? And Arn, could you explain me a little bit more the cases where the PTM would be required or give me some usefull links, beside the wikipage, for study? Thanks you, Dragos. From xixunwu at gmail.com Thu Mar 29 14:26:48 2012 From: xixunwu at gmail.com (Xixun Wu) Date: Thu, 29 Mar 2012 14:26:48 -0400 Subject: [Biojava-l] WORLDCOMP and Hamid Arabnia Message-ID: Defamation campaign against WORLDCOMP and Hamid Arabnia For the last several months, a systematic defamation campaign is going on against the worlds' biggest computer science conference WORLDCOMP, eg. http://www.sites.google.com/site/worlddump1 or http://research.cs.wisc.edu/dbworld/messages/2012-03/1332361790.html WORLDCOMP is addressing this matter legally and a lawsuit has been filed to resolve this matter (visit WORLDCOMP's website http://www.world-academy-of-science.org and click on "news" on right side). Our preliminary investigation found the footprints of the actual persons who are sending all these defamatory comments about WORLDCOMP and its Chair Hamid Arabnia. As of now, these are the persons behind this defamatory campaign: http://www.cs.uga.edu/~thiab http://www.cs.uga.edu/~tliu http://www.cs.uga.edu/~erc http://ktwop.wordpress.com/about http://www.cs.fsu.edu/~tyson http://www.cis.famu.edu/~hchi http://www.scs.gatech.edu/people/mustaque-ahamad http://www.cs.fsu.edu/~xyuan http://www.unf.edu/~ree http://www.johnlevine.com http://curly.cis.unf.edu http://en.wikipedia.org/wiki/Albert_Shiryaev http://www.cse.sc.edu/~jtang http://www.ninaringo.com http://www.cis.famu.edu/~prasad http://www.scs.gatech.edu/people/maria-balcan http://www.f4.htw-berlin.de/~weberwu http://www.iaria.org/speakers/PetreDini.html http://www.eecs.ucf.edu/index.php?id=profiles&link=joseph_laviola (more names will be announced later on?) 
These people formed a team and mailing to different forums, groups,
blogs and individuals, heavily criticizing WORLDCOMP. Some of them have
personal or professional enmity with Professor Hamid Arabnia and some of
them don't like WORLDCOMP for one reason or the other. They are using
proxy servers in Georgia (Athens, Atlanta), Florida (Tallahassee,
Jacksonville, Orlando), Chicago and Texas (Austin, Houston) and sending
the defamatory emails.

I request all of you to submit papers and make WORLDCOMP 2012 a success.
All tracks of WORLDCOMP have received high citations. I assure you that
WORLDCOMP will be held in July 2012 and it will continue for many years
to come. I know Professor Hamid Arabnia well and he is a very nice and
professional person and he is committed to organize WORLDCOMP in 2012,
2013, 2014, 2015,...

With sincere respects,
Mohammad Homayoun
drmhomayoun at gmail.com

Note: This message is sent to help defend my longtime friend Professor
Hamid Arabnia http://www.cs.uga.edu/~hra (chair and coordinator of
WORLDCOMP)

From arthur.oviedo at epfl.ch Fri Mar 30 12:35:54 2012
From: arthur.oviedo at epfl.ch (Arthur Oviedo)
Date: Fri, 30 Mar 2012 18:35:54 +0200
Subject: [Biojava-l] Interested in the "cloudization" of BioJava
Message-ID:

Hello,
My name is Arthur, and I'm a master student at EPFL (École Polytechnique
Fédérale de Lausanne) in computer science.
I worked in different projects that are somewhat related to BioJava and
cloud environments.
I have worked, while I was a research assistant, (briefly) in a project
called UnaCloud (
http://sistemas.uniandes.edu.co/~unacloud/dokuwiki/doku.php?id=recursos:documentacion)
which provides an opportunistic grid/cloud infrastructure for running
scientific experiments and we have used it to help bio-informaticians with
their different jobs like huge BLAST queries, HMMER jobs, etc.
As part of my assistant work in the same university, I developed a cool system called UnaCloud MSA which integrates some existing and mew developed tools to analyze Multiple Sequence Alignments. It even uses the BioJava library to perform some verification about the sequences. All of this is also done employing the UnaCloud infrastructure. This work is still in development and in preparation for publication. http://unacloudmsa.uniandes.edu.co Currently, i'm working on a class project on Hadoop (An implementation of subset of the functionalities of a Database Manager System) using Hadoop (Map-reduce) framework. All of the mentioned projects have been implemented in Java, so i suppose that i meet the java expertise requirement. I would like to know more about this project and to know also the rough dates where the Google Summer of Code would be held (To prepare my schedule). Thanks and best regards, Arthur Oviedo From andreas at sdsc.edu Fri Mar 30 13:57:34 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 30 Mar 2012 10:57:34 -0700 Subject: [Biojava-l] Interested in the "cloudization" of BioJava In-Reply-To: References: Message-ID: Hi Arthur, In short the goal is to take some aspects from BioJava and improve them so they play together well for deployment on the cloud. It sounds like you have a lot of related background for this project. However since you have been working on very similar things, for your proposal it will be important to come up with something independent and not just to propagate your previous work into BioJava. About the dates and other details, please consult the official GSoC - FAQ http://www.google-melange.com/document/show/gsoc_program/google/gsoc2012/faqs Andreas On Fri, Mar 30, 2012 at 9:35 AM, Arthur Oviedo wrote: > Hello, > My name is Arthur, and i'm a master student at EPFL (?cole Polytechnique > F?d?rale de Lausanne) in computer science. > I worked in different project that are somewhat related to BioJava and > cloud environment. 
> I have worked , while i was research assistant, (briefly) in a project > called UnaCloud ( > http://sistemas.uniandes.edu.co/~unacloud/dokuwiki/doku.php?id=recursos:documentacion) > which provides an opportunistic grid/cloud infrastructure for running > scientific experiments and we have used it to help bio-informaticians with > their different jobs like huge BLAST queryes, HMMER jobs, etc. > As part of my assistant work in the same university, I developed a cool > system called UnaCloud MSA which integrates some existing and mew developed > tools to analyze Multiple Sequence Alignments. It even uses the BioJava > library to perform some verification about the sequences. All of this is > also done employing the UnaCloud infrastructure. This work is still in > development and in preparation for publication. > http://unacloudmsa.uniandes.edu.co > Currently, i'm working on a class project on Hadoop (An implementation of > subset of the functionalities of a Database Manager System) using Hadoop > (Map-reduce) framework. > All of the mentioned projects have been implemented in Java, so i suppose > that i meet the java expertise requirement. > I would like to know more about this project and to know also the rough > dates where the Google Summer of Code would be held (To prepare my > schedule). > Thanks and best regards, > Arthur Oviedo > > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From andreas at sdsc.edu Fri Mar 30 14:01:27 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 30 Mar 2012 11:01:27 -0700 Subject: [Biojava-l] GSOC 2012 In-Reply-To: <4F63D36D.1010401@gmail.com> References: <4F63D36D.1010401@gmail.com> Message-ID: Hi Evgeniy, I just noticed, you have not received a response to your mail. Apologies for that, too many emails going back and forth. 
The goal is to add new features to biojava and to extend it so it has support for more file formats. You could take a look at our feature request page to get some ideas... http://biojava.org/wiki/BioJava3_Feature_Requests Andreas On Fri, Mar 16, 2012 at 4:57 PM, superrubiroyd wrote: > Hi, > I am the final year student in Ukraine. I have 2.5 year of Javaexperience. I > would like to work with project 'New File Parsers for BioJava' during GSOC > 2012. Can you explain little more what should be done in this project and > can you give some advices how to make correct application. > Thanks in advance. With best regards, Evgeniy Berlog > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From sharma.dhrv at gmail.com Sat Mar 31 16:46:06 2012 From: sharma.dhrv at gmail.com (Dhruv Sharma) Date: Sun, 1 Apr 2012 02:16:06 +0530 Subject: [Biojava-l] GSoC Application Discussion and Help - Porting BLAST to Java Message-ID: Hi, I am Dhruv Sharma, a senior undergraduate student pursuing B.E.(Hons.) Computer Science at BITS, Pilani, India. I am very much interested in 'porting BLAST algorithm to Java' as a GSoC 2012 project. I am proficient and primarily work using Java and C. Also, I have past experience of working in C++ before migrating to Java. However, I am new to GSoC and haven't used version control in the past. My recent project was based on developing a web application in Java for posting data to remote CS-BLAST web service with FASTA sequence, parse and auto-filter its results using the release date from RCSB PDB and download the PDB files. Since, the project aims at converting the legacy C/C++ code to Java, already suggested approaches on the Bio-Java page and my observations are:- 1) Using C++ to Java converters for 100% conversion. 
I have tried converting the ncbi-blast-2.2.26 source code using a few freely available converters but all of them either crashed or failed to convert even after I resolved certain header file dependency issues that emerged. Most failures occurred at function calls to non-standard C++ libraries. 2) Using JNI as an alternative solution. JNI programming would be a tedious task and would anyway require understanding of the purpose of underlying C++ code. Hence,has little advantage over rewriting the equivalent Java code. A significant advantage can be seen when there is no efficient Java alternative of the C++ code. However, platform dependence would still exist. According to my understanding of the problem, a hybrid approach can be taken up which includes using code converters for simpler files, manual coding for tricky areas and using JNI for typical C++ code involving non-standard libraries. But, I am still not clear about my exact course of action. Can you please tell me if my analysis of the problem is correct? Please also comment on the feasibility of my suggested approach and please make any suggestions as they would help me in improving my application draft that I would soon be sharing for review. As BLAST is a collection of programs, so, keeping in mind the length of code to be ported, can we work on certain selectively critical programs in it from the GSoC's perspective? Thanks. -- *Dhruv Sharma* *Student B.E.(Hons.) Computer Science BITS, Pilani * *India* From andreas at sdsc.edu Sat Mar 31 19:21:36 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Sat, 31 Mar 2012 16:21:36 -0700 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Hi Dragos, we are always looking for volunteers to help with various aspects of the project. The tasks range from answering emails on the mailing list, improve documentation on our wiki, provide patches for bugs and keep developing new features. 
The best way to get started is to pick one of those areas and come up
with an improvement ...

Andreas

On Sat, Mar 31, 2012 at 11:26 AM, Dragos-Bogdan Sima wrote:
> Hello everyone,
>
> I would like to know what are the post-gsoc opportunities. Regardless, the
> summer of coding I wish to continue if possible in this organization.
>
> Cheers,
> Dragos-Bogdan.

From andreas at sdsc.edu Thu Mar 8 19:43:50 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 8 Mar 2012 11:43:50 -0800
Subject: [Biojava-l] BioJava 3.0.3. release plan
Message-ID:

Hi,

We are planning to release BioJava 3.0.3 ( and the legacy BioJava
1.8.2.) next Friday, March 16th.

The code freeze will happen on Thursday March 15th, please commit any
code before that. Also please make sure all documentation is
up-to-date, both in javadoc and on the wiki..

Andreas

From tariq_cp at hotmail.com Mon Mar 12 06:19:52 2012
From: tariq_cp at hotmail.com (Muhammad Tariq Pervez)
Date: Mon, 12 Mar 2012 06:19:52 +0000
Subject: [Biojava-l] Biojava-l Digest, Vol 110, Issue 2
In-Reply-To:
References:
Message-ID:

A great news.
Muhammad Tariq Pervez Ph.D Bioinformatics Scholar Department of Computer Science Virtual University of Pakistan, Lahore Tel: (042) 9203114-7 URL: www.vu.edu.pk Mobile: +923364120541, +923214602694 > From: biojava-l-request at lists.open-bio.org > Subject: Biojava-l Digest, Vol 110, Issue 2 > To: biojava-l at lists.open-bio.org > Date: Fri, 9 Mar 2012 12:00:02 -0500 > > Send Biojava-l mailing list submissions to > biojava-l at lists.open-bio.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://lists.open-bio.org/mailman/listinfo/biojava-l > or, via email, send a message with subject or body 'help' to > biojava-l-request at lists.open-bio.org > > You can reach the person managing the list at > biojava-l-owner at lists.open-bio.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Biojava-l digest..." > > > Today's Topics: > > 1. BioJava 3.0.3. release plan (Andreas Prlic) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 8 Mar 2012 11:43:50 -0800 > From: Andreas Prlic > Subject: [Biojava-l] BioJava 3.0.3. release plan > To: biojava-dev , Biojava > > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Hi, > > We are planning to release BioJava 3.0.3 ( and the legacy BioJava > 1.8.2.) next Friday, March 16th. > > The code freeze will happen on Thursday March 15th, please commit any > code before that. Also please make sure all documentation is > up-to-date, both in javadoc and on the wiki.. 
>
> Andreas

From nickengland at gmail.com Mon Mar 12 15:04:05 2012
From: nickengland at gmail.com (Nick England)
Date: Mon, 12 Mar 2012 15:04:05 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

Hello all,

I am trying to read some .scf files into BioJava3. I have found the
example code for 1.3
(http://biojava.org/wiki/BioJava:Cookbook:SeqIO:ABItoSequence) but I
can't find any classes in the 3.0 API which look at all related to
those ones. Is it possible, or should I downgrade to 1.3 if I want to
be able to read .scf files?

Thanks,

Nick

From andreas at sdsc.edu Mon Mar 12 16:43:43 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 09:43:43 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

Hi Nick,

this feature has not been ported to biojava3 so far and you can get it
via biojava 1.8.1

Andreas

From nickengland at gmail.com Mon Mar 12 17:39:55 2012
From: nickengland at gmail.com (Nick England)
Date: Mon, 12 Mar 2012 17:39:55 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

Thanks for the help, 1.8.1 seems to have the classes I need. But I
haven't been able to get maven to work with biojava 1.8. It works fine
with 3.X, but when I try to use the dependency:

    <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava-legacy</artifactId>
        <version>1.8.1</version>
    </dependency>

it fails, even though there appears to be the corresponding pom on the
repository at http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.1/

I have the repo at:

    <repository>
        <id>biojava-maven-repo</id>
        <name>BioJava repository</name>
        <url>http://www.biojava.org/download/maven/</url>
    </repository>

Am I doing something wrong, or has the release version 1.8.1 somehow
broken on the repository?

Thanks,

Nick

From andreas at sdsc.edu Mon Mar 12 17:54:29 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 10:54:29 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

I am not aware of any issues with that. Does either 1.8 or
1.8.2-SNAPSHOT work? Depending on your IDE, you just might have to
refresh your Maven dependencies?

Andreas

From heuermh at gmail.com Mon Mar 12 20:03:35 2012
From: heuermh at gmail.com (Michael Heuer)
Date: Mon, 12 Mar 2012 15:03:35 -0500
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

Hello Andreas,

I noticed this in the pom for das 1.8.2-SNAPSHOT

    <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava3-structure</artifactId>
        <version>3.0-alpha1</version>
        <scope>compile</scope>
    </dependency>

I don't think that would cause trouble as reported below, but it is
odd in any case. 1.8.x shouldn't have a dependency on 3.x.
Hello Nick,

What kind of maven failure are you seeing?

   michael

From andreas at sdsc.edu Mon Mar 12 20:38:25 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 12 Mar 2012 13:38:25 -0700
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

> I don't think that would cause trouble as reported below, but it is
> odd in any case. 1.8.x shouldn't have a dependency on 3.x.

In principle I agree, however there are historic reasons for this. If
this config causes a problem then I would vote for removing it. The
biojava-legacy project contains no structure sources any more since it
was completely upgraded into the 3.X series. The API is compatible to
a large extent. There is no reason for anybody to still be using a
1.X version of structure.

Andreas

From biojava at hannes.oib.com Tue Mar 13 05:56:29 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Tue, 13 Mar 2012 06:56:29 +0100
Subject: [Biojava-l] From : Chromatagraph reading in BioJava3?

On Mon, Mar 12, 2012 at 16:04, Nick England wrote:
> Hello all,
>
> I am trying to read some .scf files into BioJava3.
> I have found the
> example code for 1.3
> (http://biojava.org/wiki/BioJava:Cookbook:SeqIO:ABItoSequence) but I
> can't find any classes in the 3.0 API which look at all related to
> those ones. Is it possible, or should I downgrade to 1.3 if I want to
> be able to read .scf files?
>
> Thanks,
>
> Nick

Should we port that to 3.0.x after the next release? Or for 3.0.3?
Sounds like another nice addition to org.biojava3.genome.parsers - I'll
take a look at it to see what info is stored in these scf files.

Hannes

From nickengland at gmail.com Tue Mar 13 10:25:26 2012
From: nickengland at gmail.com (Nick England)
Date: Tue, 13 Mar 2012 10:25:26 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

If I try to run maven I got the following:

[INFO] Building sequence analysis 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
[WARNING] The POM for org.biojava:biojava-legacy:jar:1.8.2 is missing, no dependency information available
Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE

I get this for 1.8 and 1.8.1 as well:

Could not find artifact org.biojava:biojava-legacy:jar:1.8 in biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]
Could not find artifact org.biojava:biojava-legacy:jar:1.8.1 in biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]

If I try to install it from source, I'm getting some class cast
exceptions during the unit tests for some tests in core:

Failed tests:
  testGetNoteSet(org.biojavax.bio.SimpleBioEntryTest)
  testGetRankedCrossRefs(org.biojavax.bio.SimpleBioEntryTest)

3.0 works fine, but lacks the class I want!
Presumably BioJava 1.8 hasn't always been called biojava-legacy? Since
maven releases should be immutable, it should still be available from
the original release maven repo?

Thanks for the help,

Nick

From nickengland at gmail.com Tue Mar 13 10:48:11 2012
From: nickengland at gmail.com (Nick England)
Date: Tue, 13 Mar 2012 10:48:11 +0000
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

Aha, solved the test problem, seems the unit tests fail under JDK 7 as
a TreeSet now fails if you add a single non-comparable object, while
before it failed only when you added a second one and a comparison was
actually needed. Since new Object() doesn't implement comparable, the
tests were failing under maven for me.

I can at least locally install 1.8 now, but would be nice to have a
maven repo dependency on 1.8 work in the future!
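The JDK 7 behaviour described above can be reproduced with a few lines of plain Java. This is only a minimal sketch, not the BioJava test code: the class name TreeSetFirstAdd and the helper firstAddRejected are made up for illustration. It assumes a TreeSet created without a Comparator, so elements must implement Comparable.

```java
import java.util.TreeSet;

public class TreeSetFirstAdd {

    /**
     * Tries to add one object that does not implement Comparable to an
     * empty TreeSet and reports whether the add was rejected.
     */
    static boolean firstAddRejected() {
        TreeSet<Object> set = new TreeSet<Object>();
        try {
            set.add(new Object());   // plain Object is not Comparable
            return false;            // accepted: the pre-JDK 7 behaviour
        } catch (ClassCastException e) {
            return true;             // rejected on the very first add: JDK 7 and later
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("first add rejected: " + firstAddRejected());
    }
}
```

A test that puts such objects into a sorted set would presumably need either Comparable elements or a Comparator passed to the TreeSet constructor to pass on both JDK versions.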
Cheers,

Nick

From biojava at hannes.oib.com Tue Mar 13 12:23:37 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Tue, 13 Mar 2012 13:23:37 +0100
Subject: [Biojava-l] Chromatagraph reading in BioJava3?

On Tue, Mar 13, 2012 at 11:25, Nick England wrote:
> If I try to run maven I got the following:
>
> [INFO] Building sequence analysis 0.0.1-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
> Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.pom
> [WARNING] The POM for org.biojava:biojava-legacy:jar:1.8.2 is missing,
> no dependency information available
> Downloading: http://www.biojava.org/download/maven/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
> Downloading: http://repo.maven.apache.org/maven2/org/biojava/biojava-legacy/1.8.2/biojava-legacy-1.8.2.jar
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
>
> I get this for 1.8 and 1.8.1 as well:
>
> Could not find artifact org.biojava:biojava-legacy:jar:1.8 in
> biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]
> Could not find artifact org.biojava:biojava-legacy:jar:1.8.1 in
> biojava-maven-repo (http://www.biojava.org/download/maven/) -> [Help 1]

I just tried it myself, I see the same results. My maven tool usually
helps me filling out the pom.xml file and suggests valid entries. In
the artifactId field, I only get:

biojava
biojava3-alignment
biojava3-core
biojava3-phylo

and when selecting biojava, I only get 3.0 and 3.0.2 in the version
field -> looks like something is seriously wrong with the maven repo.

Note: when I enter it manually, the 1.8.1 pom is downloaded, but the
jar fails.
Hannes

From komalsnehal1991 at gmail.com Wed Mar 14 19:27:50 2012
From: komalsnehal1991 at gmail.com (Komal Sanjeev)
Date: Thu, 15 Mar 2012 00:57:50 +0530
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

Hi Everyone,

I am Komal, an undergraduate student from IT-BHU, India. I'm interested in
working with BioJava for GSoC 2012. I am particularly interested in working
on the 'Porting an Algorithm to Java' project.
Kindly advise me on how I should proceed.

Thanks,
Komal

From andreas at sdsc.edu Wed Mar 14 19:47:32 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 14 Mar 2012 12:47:32 -0700
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

Hi Komal,

stay tuned to this list, we still don't know if we will get funded
from Google this year.

Andreas

From superrubiroyd at gmail.com Fri Mar 16 23:57:33 2012
From: superrubiroyd at gmail.com (superrubiroyd)
Date: Sat, 17 Mar 2012 01:57:33 +0200
Subject: [Biojava-l] GSOC 2012
Message-ID: <4F63D36D.1010401@gmail.com>

Hi,

I am a final-year student in Ukraine. I have 2.5 years of Java
experience. I would like to work on the project 'New File Parsers for
BioJava' during GSOC 2012.
Can you explain a little more about what should be done in this project,
and can you give some advice on how to write a good application?

Thanks in advance.

With best regards,
Evgeniy Berlog

From andreas at sdsc.edu Sat Mar 17 00:41:59 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 16 Mar 2012 17:41:59 -0700
Subject: [Biojava-l] BioJava at Google Summer of Code 2012

Hi All,

The Open Bioinformatics Foundation, as an umbrella organisation for
BioJava, has been accepted to participate in this year's Google Summer
of Code. See the announcement message below. This means we will again
be able to offer mentoring through BioJava this year.

Accepted students will get a stipend of $5,000 from Google.
Participation is possible from most countries in the world, as long as
you are eligible to work in the country in which you'll reside
throughout the duration of the program.

If you are interested in working on a BioJava related project, now is
the time to start preparing and discussing your proposals. For the last
two years we had many applications for the projects proposed by
mentors. If you want to distinguish your application I recommend
proposing your own project. Don't forget to discuss any proposal with
us before you submit it. We will try to provide feedback and match you
with a suitable mentor.

Also see http://biojava.org/wiki/Google_Summer_of_Code and Google's
FAQs: http://www.google-melange.com/document/show/gsoc_program/google/gsoc2012/faqs

The student application deadline is April 6th. Google will announce
which proposals got accepted on April 23rd.

Andreas

---------- Forwarded message ----------
From: Robert Buels
Date: Fri, Mar 16, 2012 at 12:47 PM
Subject: [Open-bio-l] Google Summer of Code is *ON* for OBF projects!
To: Open-Bio List

Hi all,

Great news: Google announced today that the Open Bioinformatics
Foundation has been accepted as a mentoring organization for this
summer's Google Summer of Code!

GSoC is a Google-sponsored student internship program for open-source
projects, open to students from around the world (not just US
residents). Students are paid a $5000 USD stipend to work as a
developer on an open-source project for the summer. For more on GSoC,
see GSoC 2012 FAQ at http://goo.gl/kNv48

Student applications are due April 6, 2012 at 19:00 UTC. Students who
are interested in participating should look at the OBF's GSoC page at
http://open-bio.org/wiki/Google_Summer_of_Code, which lists project
ideas, and whom to contact about applying.

For current developers on OBF projects, please consider volunteering to
be a mentor if you have not already, and contribute project ideas. Just
list your name and project ideas on OBF wiki and on the relevant
project's GSoC wiki page.

Thanks to all who helped make OBF's application to GSoC a success, and
let's have a great, productive summer of code!

Rob Buels
OBF GSoC 2012 Administrator
_______________________________________________
Open-Bio-l mailing list
Open-Bio-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/open-bio-l

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From andreas at sdsc.edu Sat Mar 17 01:30:11 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 16 Mar 2012 18:30:11 -0700
Subject: [Biojava-l] BioJava 3.0.3 released

BioJava 3.0.3 has been released and is available from
http://www.biojava.org/wiki/BioJava:Download as well as from the
BioJava maven repository at http://www.biojava.org/download/maven/ .

BioJava 3.0.3 adds several new features:

- Significant improvements for the web service module (ncbi blast and
  hmmer web services)
- Fastq parser (ported from the biojava 1 series to version 3)
- Support for SIFTS-PDB to UniProt mapping
- Improved support for working with external protein domain definitions
- Protmod module renamed to modfinder
- Numerous improvements all over the place (several hundred commits
  since last release)
- We are also working on an update for the legacy biojava 1.8 series.

This release would not have been possible without contributions from
numerous people, thanks to all for their support!

About BioJava:

BioJava is a mature open-source project that provides a framework for
processing of biological data. BioJava contains powerful analysis and
statistical routines, tools for parsing common file formats, and
packages for manipulating sequences and 3D structures. It enables rapid
bioinformatics application development in the Java programming
language.

Happy BioJava-ing,

Andreas

From anupam.aries19 at gmail.com Sat Mar 17 08:04:53 2012
From: anupam.aries19 at gmail.com (Anupam Singh)
Date: Sat, 17 Mar 2012 13:34:53 +0530
Subject: [Biojava-l] gsoc

Sir,

I am Anupam Singh, a second year student pursuing an M.Sc. Tech degree
in Information Systems from BITS Pilani in India. I went through the
BioJava project details. I found the project very interesting and would
genuinely like to work on those features of BioJava3, like the CATH
parser and GenBank parser, to make it more user friendly and versatile.
I have been coding in Java/C/C++ for the past 4 years and have a good
command of the languages to make this project possible.

Regards,
Anupam Singh

From ritishalaungani at gmail.com Sat Mar 17 09:50:13 2012
From: ritishalaungani at gmail.com (Ritisha Laungani)
Date: Sat, 17 Mar 2012 15:20:13 +0530
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

Hello,

I am Ritisha Laungani, a pre-final year student currently pursuing *MSc Tech.
Information Systems* at Birla Institute of Technology and Science, Goa, India. I would like to apply for the BioJava project as i have worked into all the 3 fields this projects requires- C, Java and Bio! As far as i understand, in simple terms, the project's goal is to convert an existing HMMER source code, which is written in C, to a java code using language processing tools. Do correct me if i am wrong here! I must admit here that i am new to open source software development and also unaware of HMMER. But i would love to learn if given a chance and the correct resources! :) Eagerly awaiting a reply, which could guide me to the next step. Regards, Ritisha Laungani From komalsnehal1991 at gmail.com Sat Mar 17 22:59:06 2012 From: komalsnehal1991 at gmail.com (Komal Sanjeev) Date: Sun, 18 Mar 2012 04:29:06 +0530 Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012 In-Reply-To: References: Message-ID: Hi all, Introducing myself a bit more, I also work as a remote intern for DARNED. DARNED is a database of RNA Editing, and currently new features are being added into the project, one of which is incorporating the BLAST feature for sequence based search. Having worked on a similar project recently, I think I will be comfortable working with the 'Porting an Algorithm to Java' project. The following is what I understood about the project. Please correct me if I am wrong. This(link) is the current method used for BLAST, which accesses the NCBI website each time. The NCBIQBlastService class is currently used. The project aims at replacing this with code which will perform the search within Biojava. I downloaded the source codes of BLAST and HMMER. My job will be to convert these to Java. Regarding the C/C++ to Java converter, i found this on the internet: http://tangiblesoftwaresolutions.com/Product_Details/CPlusPlus_to_Java_Converter_Details.html but it is not free of cost. 
Apart from this, I saw that many people discourage the use of C/C++ to Java tools, saying that they are not efficient. Does anyone know of a better tool which can do this?

Regarding JNI, would it not be better if the whole code was written in Java, rather than a part of it being in C/C++? I haven't used it before, but if it is better than converting the code, I don't have a problem working with it.

Kindly clear my doubts.
Thanks in advance,
Komal

On Thu, Mar 15, 2012 at 1:17 AM, Andreas Prlic wrote:

> Hi Komal,
>
> stay tuned to this list, we still don't know if we will get funded
> from Google this year.
>
> Andreas
>
>
> On Wed, Mar 14, 2012 at 12:27 PM, Komal Sanjeev wrote:
> > Hi Everyone,
> >
> > I am Komal, an undergraduate student from IT-BHU, India. I'm interested in
> > working with BioJava for GSoC 2012. I am particularly interested in working
> > on the 'Porting an Algorithm to Java' project.
> > Kindly help me about how I should proceed.
> >
> > Thanks,
> > Komal

From amr_alhossary at hotmail.com Sun Mar 18 04:05:41 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Sun, 18 Mar 2012 06:05:41 +0200
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

Dear Komal,

As far as I know, the project is about porting the code to Java, not using existing C code through JNI. That means you should first digest the algorithm, then build it in Java from scratch, using the C code as a reference implementation.

Regards

Amr
From komalsnehal1991 at gmail.com Sun Mar 18 07:42:29 2012
From: komalsnehal1991 at gmail.com (Komal Sanjeev)
Date: Sun, 18 Mar 2012 13:12:29 +0530
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

Hi Amr,

The following has been mentioned in the project description:
"Converting C or C++ source code by hand is not a trivial undertaking and it is recommended that a C/C++ to Java conversion tool be used to do as much of the work as possible. It is also an option to consider a JNI approach for integrating these applications into Java."

I am a bit confused. Kindly tell me what exactly has to be done in the project.

Thanks,
Komal
From biojava at hannes.oib.com Sun Mar 18 11:10:15 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Sun, 18 Mar 2012 12:10:15 +0100
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

While the ideal solution would be a clean port of the BLAST algorithm to Java, the complexity of such mature and performance-optimized code might be a bit too much for a GSoC project.

The JNI solution, while not optimal, would be an acceptable fallback. If implemented properly, it could create a base for using existing programs like BLAST, HMMER, ... via JNI.

The project goal is: "Make BLAST work locally, without an internet connection, in addition to the existing NCBIQBlastService method"

Hannes
From amr_alhossary at hotmail.com Sun Mar 18 11:06:01 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Sun, 18 Mar 2012 13:06:01 +0200
Subject: [Biojava-l] Interested in working with BioJava for GSoC 2012

Let me refer to Dr. Andreas and get back to you.

P.S. Please stop sending to both lists.

Amr
From andreas at sdsc.edu Sun Mar 18 16:37:06 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 18 Mar 2012 09:37:06 -0700
Subject: [Biojava-l] GSoC 2012 - how to get started with your proposals

Hi,

It is great to see so much interest in GSoC again this year. To get started with a proposal I would recommend looking at the BioJava project proposals from the last two years (they are on the wiki) and seeing what kind of projects got funded and how those proposals were written.

Think about what you would like to work on. Get a copy of BioJava and see how related features work. Come up with a plan for how to extend them. We are fairly flexible regarding what kind of projects we will run this summer, and this really depends on the submitted project proposals.

All proposals will be compared and ranked together with other projects from the Bio* projects. As such, a good proposal is key to getting funded. A good proposal:

- shows the motivation of the student
- shows that the candidate is qualified to do what they are proposing
- adds useful new functionality to BioJava
- discusses possible risks and what to do about them

It is difficult to answer questions like "how should I perform this or that project?" There is more than one possible path, and the best answer depends on your skills and interests. Overall I recommend picking a project on a topic that is close to your (future?)
thesis, or is of particular interest to you.

Here are a couple more thoughts which are project specific:

- The best projects are those that you come up with yourself. If you want to distinguish yours from every other proposal, suggest something we have not thought of yet.

- File parsers: If you want to work on file parsers, take a look at existing ones. What features do they provide? How can they be extended? For example, if you want to work on the CATH parser, take a look at how the SCOP parser works. What features are available around this (access to domains), and how can something like this be set up for CATH? Look at how the CATH website provides files.

- Porting of algorithms: There are several possible approaches. I recommend that you have some background in both C and Java for this. Get a copy of the algorithm you want to port, compile it, and take a look at the source. There are several ways to proceed with the actual port, and having a good strategy is key for this proposal. Perhaps try your strategy on a simple test case to see how it might work.

- BioJava in the cloud: The goal here is parallelization of existing code. What parts of BioJava are suitable for this? How can they be parallelized and moved to current cloud infrastructure? There is a lot of online material available which will be helpful here.

Andreas

From sbliven at ucsd.edu Sun Mar 18 20:47:10 2012
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Sun, 18 Mar 2012 13:47:10 -0700
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

Unfortunately, HMMER is licensed as GPL. As such, we can't port it to BioJava or even link against it with JNI. A 2009 post indicates that they are not interested in re-licensing HMMER under a less restrictive license. I think we should move away from any HMMER-port project, and focus on porting other important algorithms such as BLAST (public domain).
I went ahead and removed HMMER from the GSoC wiki page. I was trying to think of other LGPL-compatible bioinformatics projects which would be nice to port to BioJava. Maybe a sequence browser, such as incorporating/linking the Integrated Genome Browser? Anyone have ideas other than BLAST?

-Spencer

From biojava at hannes.oib.com Sun Mar 18 21:07:45 2012
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Sun, 18 Mar 2012 22:07:45 +0100
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

What about: http://en.wikipedia.org/wiki/MUSCLE_(alignment_software)

It's public domain. Another multiple sequence alignment program would be T-Coffee, but it's GPL again.

Hannes
From andreas at sdsc.edu Mon Mar 19 05:19:19 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 18 Mar 2012 22:19:19 -0700
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

A worst-case scenario could be to host an independent, GPLed project on the BioJava SVN. However, I see your point. These licensing issues add to the complexity of such a project and make it much more difficult.

In terms of other algorithms, BioJava already contains a multiple sequence alignment algorithm; as such, I would rather see that one extended than a second algorithm implemented.

Andreas

--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From darnells at dnastar.com Mon Mar 19 16:18:02 2012
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 19 Mar 2012 16:18:02 +0000
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

Hi Andreas,

We spoke offline about the HMMER/GPL issue this weekend. I think it is premature to remove the HMMER option from the GSoC wiki page. I would like to clarify the Sean Eddy blog post linked to by Spencer (_**_ emphasis mine):

From the LICENSE section:
The only thing the GPLv3 really blocks is someone forking a derivative copy of HMMER and distributing it under a different license, such as a closed-source proprietary license; to do that, _*you'd need to negotiate a non-GPL license with us first*_.

From the COPYRIGHT section:
_*We really don't expect to negotiate any non-GPL licenses, though*_. We want to enable many different people _*to contribute to a single open source HMMER codebase*_, as a shared codebase for bioinformatics and computational biology.
From the TRADEMARK section:
_*Did I mention, we want to enable a single open source HMMER codebase?*_

==========

Sean Eddy and the Howard Hughes Medical Institute are the main copyright holders. The main goal is clear: maintain a single open source HMMER codebase. The choice of license for HMMER (GPL v3) was to persuade people to contribute back. However, OBF might be able to negotiate other arrangements (perhaps a non-GPL library that can only be distributed with BioJava and any contributions made by the GSoC student must be licensed back to HHMI under GPL?). I do not know how hopeful to be about that possibility, but it cannot hurt to ask.

I would like to dissuade GSoC students from directly contacting Sean Eddy or HHMI about this possibility. This task is most appropriate for a senior BioJava representative, and it is up to the "Port an Algorithm to Java" mentors how to proceed.

Just my $0.02.

Regards,
Steve

--
Steve Darnell
DNASTAR, Inc.
Madison, WI USA

From shashank201091 at gmail.com Mon Mar 19 19:45:57 2012
From: shashank201091 at gmail.com (Shashank Gupta)
Date: Tue, 20 Mar 2012 01:15:57 +0530
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012

Hello

I am a beginner in open source development but I am trying to give my best shot at this year's GSoC. In "Porting an Algorithm to Java", what I infer is that we can export the code from SVN and then use a C++ to Java code converter to convert the whole source code into Java. After that, we open the project in a Java-based IDE, compile the source code, and fix the errors that crept into the code during the conversion. After thorough testing and debugging we clean the code and deliver the final source. I know I am a newbie, and I'll be grateful if you could let me know why this method won't work, if it won't.

Regards
Shashank Gupta

From andreas at sdsc.edu Mon Mar 19 20:10:40 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 19 Mar 2012 13:10:40 -0700
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012

Hi Shashank,

You are right, at a high level this is how such a conversion could be performed.

Andreas

From russ at kepler-eng.com Mon Mar 19 20:48:57 2012
From: russ at kepler-eng.com (Russ Kepler)
Date: Mon, 19 Mar 2012 14:48:57 -0600
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
Message-ID: <2678562.mqEKBcxuqW@main>

On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote:
> You are right, this is on a high level how such a conversion could be
> performed.

In my experience Java makes a poor C++ (or C) emulator. The "convert C++ to Java" approach might make an "OK" first pass, but in the end you're going to want to recode critical sections in original Java. I've found that the really big performance improvements come from using smarter algorithms in Java vs. some of the 'brute force' approaches that work in C.

From andreas at sdsc.edu Mon Mar 19 21:15:13 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 19 Mar 2012 14:15:13 -0700
Subject: [Biojava-l] GSoC 2012- Port an Algorithm to Java

Thanks Steve, good points. Let's conclude this discussion with the take-home message that converting GPLed code to BioJava has licensing issues and requires additional negotiations. Before embarking on such a project the mentors will have a discussion about licensing with HHMI (or any other license holder for other algorithms).

Andreas

2012/3/19 Steve Darnell :
> Hi Andreas,
>
> We spoke offline about the HMMER/GPL issue this weekend.
I think it is premature to remove the HMMER option from the GSoC wiki page. I would like to clarify the Sean Eddy blog post linked to by Spencer (_**_ emphasis mine): > > From the LICENSE section: > > The only thing the GPLv3 really blocks is someone forking a derivative copy of HMMER and distributing it under a different license, such as a closed-source proprietary license; to do that, _*you'd need to negotiate a non-GPL license with us first*_. > > From the COPYRIGHT section: > > _*We really don't expect to negotiate any non-GPL licenses, though*_. We want to enable many different people _*to contribute to a single open source HMMER codebase*_, as a shared codebase for bioinformatics and computational biology. > > From the TRADEMARK section: > > _*Did I mention, we want to enable a single open source HMMER codebase?*_ > > ========== > > Sean Eddy and the Howard Hughes Medical Institute are the main copyright holders. The main goal is clear... maintain a single open source HMMER codebase. The choice of license for HMMER (GPL v3) was to persuade people to contribute back. However, OBF might be able to negotiate other arrangements (perhaps a non-GPL library that can only be distributed with BioJava and any contributions made by the GSoC student must be licensed back to HHMI under GPL?). I do not know how hopeful to be about that possibility, but it cannot hurt to ask. > > I would like to dissuade GSoC students from directly contacting Sean Eddy or HHMI about this possibility. This task is most appropriate for a senior BioJava representative and it is up to the "Port an Algorithm to Java" mentors on how to proceed. > > Just my $0.02. > > Regards, > Steve > > -- > Steve Darnell > DNASTAR, Inc. 
> Madison, WI USA > > -----Original Message----- > From: biojava-l-bounces at lists.open-bio.org [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of Andreas Prlic > Sent: Monday, March 19, 2012 12:19 AM > To: Spencer Bliven; Hannes Brandst?tter-M?ller > Cc: biojava-l at lists.open-bio.org > Subject: Re: [Biojava-l] GSoC 2012- Port an Algorithm to Java > > A worst case scenario could be to host an independent and GPLed project on the BioJava SVN. However I ?see your point. These licensing issues contribute to the complexity around such a project and make it much more difficult. > > In terms of other algorithms, BioJava already contains a multiple sequence alignment algorithm, as such I would rather see that one getting extended, than a 2nd algorithm being implemented. > > Andreas > > On Sun, Mar 18, 2012 at 1:47 PM, Spencer Bliven wrote: >> Unfortunately, HMMER is licensed as GPL. As such, we can't port it to >> BioJava or even link against it with JNI. A 2009 >> postindicates that >> they are not interested in re-licensing HMMER under a less restrictive >> license. I think we should move away from any HMMER-port project, and >> focus on porting other important algorithms such as BLAST (public >> domain> pts/projects/blast/LICENSE> >> ). >> >> I went ahead and removed HMMER from the GSoC wiki >> page. >> I was trying to think of other LGPL-compatable bioinformatics >> projects> cs_software>which would be nice to port to biojava. Maybe a sequence >> browser, such as incorporating/linking the Integrated Genome >> Browser? >> Anyone have ideas other than BLAST? >> >> -Spencer >> >> On Sat, Mar 17, 2012 at 02:50, Ritisha Laungani >> wrote: >> >>> Hello, >>> >>> I am Ritisha Laungani, a pre-final year student currently persuing >>> *MSc Tech. Information Systems* at Birla Institute of Technology and >>> Science, Goa, India. 
>>>
>>> I would like to apply for the BioJava project as i have worked into
>>> all the
>>> 3 fields this projects requires- C, Java and Bio!
>>>
>>> As far as i understand, in simple terms, the project's goal is to
>>> convert an existing HMMER source code, which is written in C, to a
>>> java code using language processing tools. Do correct me if i am wrong here!
>>>
>>> I must admit here that i am new to open source software development
>>> and also unaware of HMMER. But i would love to learn if given a
>>> chance and the correct resources! :)
>>>
>>> Eagerly awaiting a reply, which could guide me to the next step.
>>>
>>> Regards,
>>>
>>> Ritisha Laungani
>>> _______________________________________________

From HWillis at scripps.edu Mon Mar 19 22:03:12 2012
From: HWillis at scripps.edu (Scooter Willis)
Date: Mon, 19 Mar 2012 18:03:12 -0400
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
Message-ID:

3+ years ago we worked on a port of the reference implementation of the H.264 encoder/decoder to Java. Performance was actually good enough that we didn't spend time on optimization. We used Jazillian to do the initial conversion, and it does a nice job on fairly clean code. They are no longer in business, and I plan on contacting the developer to see if they would use their software for the initial conversion.

With a JNI conversion it will be fairly easy to model the input and output parameters of BLAST and/or HMMER. Neither code base is set up as a library, so the number of mappings that need to be performed is minimal. The complexity of porting either from C/C++ to Java is high, but it also introduces a student to the field and thus has potential long-term benefit.

With JNI there is a high degree of success with minimal work. I would advocate that the initial effort be placed on conversion to determine the overall likelihood of success, where much will depend on the student. JNI should be a requirement even if conversion is successful.
JNI would also minimize GPL concerns of forking the HMMER codebase.

Scooter

----- Reply message -----
From: "Russ Kepler"
To: "biojava-l at lists.open-bio.org"
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
Date: Tue, Mar 20, 2012 8:06 am

On Monday, March 19, 2012 01:10:40 PM Andreas Prlic wrote:
> You are right, this is on a high level how such a conversion could be
> performed.

In my experience Java makes a poor C++ (or C) emulator. The "convert C++ to Java" might make an "OK" first pass but in the end you're going to want to recode critical sections in original Java. I've found that the really big performance improvements are in using smarter algorithms in Java vs. some of the 'brute force' approaches that work in C.
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

From amr_alhossary at hotmail.com Tue Mar 20 01:39:33 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 20 Mar 2012 03:39:33 +0200
Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012
In-Reply-To:
References:
Message-ID:

I agree with most of Scooter's opinions,

1) JNI can be used as an initial step, to use the current code until we work out the License issues.
2) Also it can be used as a reference conversion implementation.
3) Then it would be better for the student to (digest) the algorithm and rebuild it.

Here I add that

4) Even best converters can't map things like variables carrying array length into simply array.length.
5) Using already built-in algorithm libraries (searching, sorting, data structures) in java would be easier to maintain later on, plus being optimized for Java data types.
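A minimal sketch of points 4 and 5 above, not taken from HMMER, BLAST, or BioJava: the class `PortingIdioms` and both method names are hypothetical. The first method is what a mechanical C-to-Java conversion might plausibly leave behind (a separate length parameter and a hand-written scan); the second is the hand-ported form that relies on `array.length` semantics and the JDK's own `java.util.Arrays` routines.

```java
import java.util.Arrays;

public class PortingIdioms {

    // Literal translation of a C routine: the element count travels in a
    // separate parameter n, and the search is a hand-written linear scan.
    static int findConverted(int[] scores, int n, int target) {
        for (int i = 0; i < n; i++) {
            if (scores[i] == target) {
                return i;
            }
        }
        return -1;
    }

    // Hand-ported version: the array carries its own length, and the JDK
    // supplies the search. The input must already be sorted ascending.
    static int findIdiomatic(int[] sortedScores, int target) {
        int idx = Arrays.binarySearch(sortedScores, target);
        return idx >= 0 ? idx : -1;
    }

    public static void main(String[] args) {
        int[] scores = {42, 7, 19, 3};
        // Converted style: caller must pass the length alongside the array.
        System.out.println(findConverted(scores, scores.length, 19));

        // Idiomatic style: sort a copy with the JDK, then binary-search it.
        int[] sorted = scores.clone();
        Arrays.sort(sorted);
        System.out.println(findIdiomatic(sorted, 19));
    }
}
```

Both calls locate the value 19; the difference is only in how much bookkeeping the caller and the maintainer have to carry.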
Amr
_______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l _______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l From shashank201091 at gmail.com Tue Mar 20 04:57:54 2012 From: shashank201091 at gmail.com (Shashank Gupta) Date: Tue, 20 Mar 2012 10:27:54 +0530 Subject: [Biojava-l] Porting an Algorithm to Java GSoC 2012 In-Reply-To: References: Message-ID: While porting a piece of code to Java one can always see what errors the converter is making. As the converter is static it'll repeat its mistakes for example if it cant convert the array length function then it will create the same error each time the function appears. When compiling the java code we'll get to know of the errors the converter is making and then by using the replace option we reduce the number of errors to a handful which then can be coded by hand. I had a doubt, if we convert a piece of code to Java by using a converter does it have certain performance issues? Regards Shashank Gupta On Tue, Mar 20, 2012 at 7:09 AM, Amr AL-Hossary wrote: > I agree with most of Scooter's opinions, > > 1) JNI can be used as an initial step, to use the current code until we > work out the License issues. > 2) Also it can be used as a reference conversion implementation. > 3) Then it would be better for the student to (digest) the algorithm and > rebuild it. > Here I add that > 4) Even best converters can't map things like variables carrying array > length into simply array.length. > 5) Using already built-in algorithm libraries (searching, sorting, data > structures) in java would be easier to maintain later on, plus being > optimized for Java data types. 
From sbliven at ucsd.edu Tue Mar 20 22:31:20 2012
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Tue, 20 Mar 2012 15:31:20 -0700
Subject: [Biojava-l] AbstractSequence#getUserCollection()
Message-ID:

Sequences contain a 'user collection' of type Collection. Is anybody using this feature? If I want to store data in it, should I add it to the existing userCollection (if any), or is it ok to just set a new value to it?

-Spencer

From andreas.prlic at gmail.com Thu Mar 22 19:00:12 2012
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Thu, 22 Mar 2012 12:00:12 -0700
Subject: [Biojava-l] Project @ Google Summer of Code 2012
In-Reply-To: <1332407594.90121.YahooMailClassic@web125603.mail.ne1.yahoo.com>
References: <1332407594.90121.YahooMailClassic@web125603.mail.ne1.yahoo.com>
Message-ID:

Hi Camelia,

> I write to You regarding the project " Take BioJava into the Cloud" that
> BioJava is mentoring during Google Summer of Code 2012.
> - Who are the Mentors for this project?
>

We assign Mentors after we have received the final project proposals, that way we can make sure that the most suitable mentors will be assigned to the student. Usually we have mentor teams of two mentors per project and we have weekly skype calls throughout the duration of the project to make sure the project stays on track.

> - Which modules of BioJava are subject of this project? The project's
> description is "Hadoop-ify and/or Map-Reduce some of the BioJava modules"
>

That depends upon your interest. There are a number of possible candidates for this, e.g. the structure alignment package. It does not have to be Hadoop; it can also be one of several other available solutions, like Hazelcast, etc. The details of this can be part of the proposal.

Andreas

From andreas at sdsc.edu Mon Mar 26 02:06:08 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 25 Mar 2012 19:06:08 -0700
Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

Hi Dragos,

There are no formal requirements and everybody has the right to apply. Having said that, it sounds like you might be well qualified for such a project.

The algorithm porting project is still on, with the addition that more negotiations might be required, depending on the license of the algorithm.

What we are mainly looking for at this stage are realistic project proposals that add useful new features to BioJava. We will provide feedback on such proposals to help improve them before they are submitted at the Google site.

Andreas

On Sun, Mar 25, 2012 at 10:26 AM, Dragos-Bogdan Sima wrote:
> Hello,
> My name is Dragos and I am a 2nd year student in Computer Science at
> University Politehnica of Bucharest.
> I have strong C/C++ and Java knowledge and I have worked in other
> programming environments such as: Python, Assembly, MatLab, Octave or
> Scheme.
> The project idea that got my attention is the one mentioned in the mailing
> Subject. Before getting more into the task, I would like to know if this
> idea is on for GSoC 2012.
> If so, what are the requirements in order to get the right to apply for
> this organisation.
>
> Cheers,
> Dragos Sima.
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

From nick_mihaiu at yahoo.com Tue Mar 27 11:53:59 2012
From: nick_mihaiu at yahoo.com (Mihaiu Nick)
Date: Tue, 27 Mar 2012 04:53:59 -0700 (PDT)
Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava
Message-ID: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com>

Hello,

My name is Mihaiu Nicolae, I'm a 2nd year student in Computer Science at the Politehnica University of Bucharest and I'm very interested in working at the "New File Parsers for BioJava" project.

I choose BioJava because it blends two passions of mine: coding and biology. Back in highschool biology was one of my favourite subjects, having a very good teacher from whom I learned a lot, I finished every year with 10.

About my knowledge and experience
- 1 year and a half experience with Java; it became my first choice in coding; currently I do all my tasks and homework in Java, also developing a bot for aichallenge [1] in Java as a university project. And a little personal project I'm working at, a memory test game, also written in Java.
- 5 years of C/C++
- web: HTML, PHP, CSS, MySQL - made a module for my school's website

Some thoughts and questions about the project
- I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ?
- Should we choose only one parser to work on for this project, or the expectations are to implement more than one ?

Questions about the "Coding exercise"
- About the "ambiguous characters", lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence "ACTATATCGG" and in another one "ATGKMCGW" ?
- What do you mean by large, "be capable of reading large files", because afterwards under "Submission" it says "the test data file named data.fasta up to 10Kb in size" ? Should I understand that 10Kb is the limit for a "large file" ?

Best regards,
Nicolae

[1] http://aichallenge.org

From andreas at sdsc.edu Tue Mar 27 18:00:23 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 27 Mar 2012 11:00:23 -0700
Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava
In-Reply-To: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com>
References: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com>
Message-ID:

Hi Nicolae,

> - I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ?

We are keeping the answer to this question intentionally open and want students to pick a topic they are interested in. For a list of requests that we have received from users, please take a look at: http://biojava.org/wiki/BioJava3_Feature_Requests, however we welcome other suggestions as well.

> - Should we choose only one parser to work on for this project, or the expectations are to implement more than one ?

For a start focus on one parser and make sure it integrates well with the rest of biojava. Only propose more than one if you think you can easily do that given the amount of time. We are looking for realistic student proposals, so make sure you come up with a good and realistic plan. We are happy to discuss proposals before they are being submitted to google and will give feedback about how to improve them.

> Questions about the "Coding exercise"

Peter, do you want to answer those?

Thanks,
Andreas

>
> - About the "ambiguous characters", lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence "ACTATATCGG" and in another one "ATGKMCGW" ?
> > -?What do you mean by large, ?be capable of reading large files?, because afterwards under ?Submission?? it says ?the test data file named data.fasta up to 10Kb in > size? ? Should I understand that 10Kb is the limit for a ?large file? ? > > Best regards, > Nicolae > > [1]?http://aichallenge.org > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From andreas at sdsc.edu Tue Mar 27 18:06:27 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 27 Mar 2012 11:06:27 -0700 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Hi Dragos, many years of research have been put into both Blast and Hmmer and it will be highly unlikely that you will be able to come up with something better in a summer. As such best to focus on getting a plain port. If you get ideas how to make the algorithms perform better during that time, probably best to feed back that information to the original developers. > Secondly, converting C/C++ source code into Java would be a very interesting > and challenging task for me. At least for the C++ part I am thinkig to > approach the use oh JNI, but is it possible to occur problems with > portability, building, or JVM's stability? I have also read about NestedVM > which provides binary translation for Java Bycode, and this approach could > be useful. Scooter, do you want to answer this one? Thanks, Andreas From amr_alhossary at hotmail.com Tue Mar 27 18:44:29 2012 From: amr_alhossary at hotmail.com (Amr AL-Hossary) Date: Wed, 28 Mar 2012 02:44:29 +0800 Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java In-Reply-To: References: Message-ID: Dear Dr. Andreas, There exists optimization when converting C code into Java code. But on the level of implementation, not on on the level of algorithm itself. Because java deals with data types in slightly different way, (e.g. 
unsigned numbers, double numbers, etc.) it will need some posttranslational modification to be optimized to work on java. Amr -----Original Message----- From: Andreas Prlic Sent: Wednesday, March 28, 2012 2:06 AM To: Dragos-Bogdan Sima Cc: Biojava ; Scooter Willis Subject: Re: [Biojava-l] [Biojava-dev] Port an Algorithm to Java Hi Dragos, many years of research have been put into both Blast and Hmmer and it will be highly unlikely that you will be able to come up with something better in a summer. As such best to focus on getting a plain port. If you get ideas how to make the algorithms perform better during that time, probably best to feed back that information to the original developers. > Secondly, converting C/C++ source code into Java would be a very > interesting > and challenging task for me. At least for the C++ part I am thinkig to > approach the use oh JNI, but is it possible to occur problems with > portability, building, or JVM's stability? I have also read about NestedVM > which provides binary translation for Java Bycode, and this approach could > be useful. Scooter, do you want to answer this one? Thanks, Andreas _______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l From to.petr at gmail.com Tue Mar 27 20:50:22 2012 From: to.petr at gmail.com (P. Troshin) Date: Tue, 27 Mar 2012 21:50:22 +0100 Subject: [Biojava-l] GSCO 2012: New File Parsers for BioJava In-Reply-To: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> References: <1332849239.6592.YahooMailNeo@web114120.mail.gq1.yahoo.com> Message-ID: Hi Nick, I agree with Andreas (thank you for coming in!), just a few additions below: > - 1 year and a half experience with Java; it became my first choice in coding; currently I do all my tasks and homework in Java, also developing a bot for aichallenge [1] in Java as a university project. 
And a little personal project I'm working at, a memory test game, also written in Java.
> - 5 years of C/C++
> - web: HTML, PHP, CSS, MySQL - made a module for my school's website

Great, sound Java knowledge is something that would help you a lot on this project.

>
> Some thoughts and questions about the project
>
> - I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ?

You are right, there are many parsers in BioJava, too many actually: we only need one parser per file format. However, currently this is not the case; there are 2 or 3 FASTA parsers, for example. They are all subtly different, so the task would be to unify these parsers so that one parser can be used in all cases.

> - Should we choose only one parser to work on for this project, or the expectations are to implement more than one ?

It depends on the parser and on your own abilities. However, if you can only make one FASTA parser in 3 months, then your application is unlikely to be competitive.

> Questions about the "Coding exercise"
>
> - About the "ambiguous characters", lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence "ACTATATCGG" and in another one "ATGKMCGW" ?

Correct

>
> - What do you mean by large, "be capable of reading large files", because afterwards under "Submission" it says "the test data file named data.fasta up to 10Kb in
> size" ? Should I understand that 10Kb is the limit for a "large file" ?

For this exercise, assume that a large file is one that does not fit into the computer's RAM. With a Java programme you can substitute the computer's RAM with the amount of memory available to the JVM. So let's say that your parser should be able to work with a 512Mb file with the JVM setting -Xmx256M. And yes, you do not have to email this file to me.

I hope that helps.
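A minimal sketch of the exercise as described above, under stated assumptions: the input is DNA FASTA, "ambiguous" means any letter other than A, C, G or T, and any single record fits in memory even when the whole file does not. The class `FastaSplitter` and its methods are hypothetical and are not part of BioJava. The input is read line by line, so only one record is held at a time, which is what lets a file larger than the JVM heap be processed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class FastaSplitter {

    // True if the sequence contains only the unambiguous DNA letters A, C, G, T.
    static boolean isUnambiguousDna(CharSequence seq) {
        for (int i = 0; i < seq.length(); i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (c != 'A' && c != 'C' && c != 'G' && c != 'T') {
                return false;
            }
        }
        return true;
    }

    // Streams FASTA records from 'in' and writes each record to 'plain' or
    // 'ambiguous'. Only the current record is buffered in memory.
    static void split(Reader in, Writer plain, Writer ambiguous) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String header = null;
        StringBuilder seq = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // A new header closes the previous record, if there was one.
                flush(header, seq, plain, ambiguous);
                header = line;
                seq.setLength(0);
            } else if (header != null) {
                // Sequence lines of one record are joined into a single string.
                seq.append(line);
            }
        }
        flush(header, seq, plain, ambiguous);
    }

    // Writes one finished record to the matching output; no-op before the first header.
    private static void flush(String header, StringBuilder seq,
                              Writer plain, Writer ambiguous) throws IOException {
        if (header == null) {
            return;
        }
        Writer out = isUnambiguousDna(seq) ? plain : ambiguous;
        out.write(header);
        out.write('\n');
        out.write(seq.toString());
        out.write('\n');
    }

    public static void main(String[] args) throws IOException {
        // The two sequences from the question above; the second spans two lines.
        String fasta = ">s1\nACTATATCGG\n>s2\nATGK\nMCGW\n";
        StringWriter plain = new StringWriter();
        StringWriter ambiguous = new StringWriter();
        split(new StringReader(fasta), plain, ambiguous);
        System.out.print(plain);      // record s1
        System.out.print(ambiguous);  // record s2
    }
}
```

For real files the `StringReader` and `StringWriter` would be replaced by file-backed readers and writers; the in-memory versions are used here only to keep the sketch self-contained.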
Good luck with your application. Regards, Peter From to.petr at gmail.com Tue Mar 27 21:31:39 2012 From: to.petr at gmail.com (P. Troshin) Date: Tue, 27 Mar 2012 22:31:39 +0100 Subject: [Biojava-l] GSOC New File Parsers for BioJava project coding exercise submission deadline change(!) Message-ID: Hello prospective GSOC students, I corrected the deadline to coincide with this year's GSOC student application deadline, which is the 6 of April inclusive. Please make sure to send your solution on or earlier than the 6-th of April 23:59pm GMT. Have fun and good luck. Kind regards, Peter From andreas at sdsc.edu Thu Mar 29 03:35:51 2012 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 28 Mar 2012 20:35:51 -0700 Subject: [Biojava-l] [Biojava-dev] GSoC - BioJava File Parsers question In-Reply-To: References: Message-ID: Hi David, If you take a look at the Sequence interface, this is the central place to represents all sorts of sequences. New parser should fit in with providing instances of objects that implement Sequence. If this principle is kept up we are pretty close to the SeqIO scenario from what I understand. Andreas On Wed, Mar 28, 2012 at 3:46 PM, David Felty wrote: > Thanks for the info, the SeqIO modules are very helpful. In fact, it seems > like they are quite similar to what this project asks for. Could this this > type of implementation work for BioJava? > > David > > On Wed, Mar 28, 2012 at 6:09 PM, Peter Cock wrote: > >> On Wed, Mar 28, 2012 at 10:05 PM, P. Troshin wrote: >> > Well, they all widely used tools, and as a result of analysis they >> > produce files. If you need to process these results further then you'd >> > need to parse the result files. Hence the connection. >> > >> > Regards, >> > Peter >> >> Indeed. It is quite common in Bioinformatics for file formats to >> be named after the tool which introduced them - even if sometimes >> they become much more widely used. 
>>
>> And for GenBank and UniProt, people typically mean the GenBank
>> plain text 'flat file' format also used by DDBJ (there is a very similar
>> format used by EMBL with a common feature table), and for
>> UniProt that could refer to the old plain text 'SwissProt' file format
>> or the newer UniProt XML format. For background on these and
>> other sequence file formats you might find these pages
>> helpful:
>>
>> http://www.bioperl.org/wiki/HOWTO:SeqIO#Formats
>> http://biopython.org/wiki/SeqIO
>>
>> Peter C.
>>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

From andreas at sdsc.edu Thu Mar 29 03:57:50 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 28 Mar 2012 20:57:50 -0700
Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

Hi Dragos,

> So basically I have to understand how the programs are working while
> overviewing the sources, so that I could explain in my application how I plan
> to port to Java?

I would say so. What are possible problem areas, where could things go wrong, and what would you do in that case?

Andreas

>
> And Arn, could you explain me a little bit more the cases where the PTM
> would be required or give me some useful links, beside the wikipage, for
> study?
>
> Thank you,
> Dragos.

--
-----------------------------------------------------------------------
Dr.
Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From amr_alhossary at hotmail.com Thu Mar 29 04:14:29 2012
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Thu, 29 Mar 2012 12:14:29 +0800
Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

I don't have a link to a website in my head right now, but anyway, here are some examples:

1- Never use variables less than 32 bits wide if you are interested in performance: consider that a and b are both short integers. A statement like a = a+b is valid in C but not in Java, because Java promotes byte, short and char operands to 4 bytes before ANY operation (even comparative reading [non-modifying] operations). This way, both a & b will be promoted to int and summed, then an explicit cast is required before the result is assigned back to the variable a. All this overhead is not required in processing-oriented applications. BTW, although a = a+b won't work, a += b will work (because it includes an implicit cast).

2- Remember that all integer variables in Java are signed (except for char, which is equivalent to unsigned short in C) and that variables in Java have different sizes than in C, so comparing the literal 0xFFFF to the short variable a whose value is 0xFFFF won't return true... try it. Did you get why? To overcome this problem you will lose some of the performance too.

3- Remember that not all operators do the same thing in Java as in C: review the function of the >> operator in Java versus C.

One final point: my name is "Amr", not "Arn".

Regards,
Amr

From: Dragos-Bogdan Sima
Sent: Wednesday, March 28, 2012 7:05 AM
To: Amr AL-Hossary
Cc: Andreas Prlic ; Biojava ; Scooter Willis
Subject: Re: [Biojava-l] [Biojava-dev] Port an Algorithm to Java

Hello Dr.
Andreas Prlic,

So basically I have to understand how the programs are working while overviewing the sources, so that I could explain in my application how I plan to port to Java?

And Arn, could you explain me a little bit more the cases where the PTM would be required or give me some useful links, beside the wikipage, for study?

Thank you,
Dragos.

From xixunwu at gmail.com Thu Mar 29 18:26:48 2012
From: xixunwu at gmail.com (Xixun Wu)
Date: Thu, 29 Mar 2012 14:26:48 -0400
Subject: [Biojava-l] WORLDCOMP and Hamid Arabnia
Message-ID:

Defamation campaign against WORLDCOMP and Hamid Arabnia

For the last several months, a systematic defamation campaign has been going on against the world's biggest computer science conference WORLDCOMP, e.g.
http://www.sites.google.com/site/worlddump1 or
http://research.cs.wisc.edu/dbworld/messages/2012-03/1332361790.html

WORLDCOMP is addressing this matter legally and a lawsuit has been filed to resolve this matter (visit WORLDCOMP's website http://www.world-academy-of-science.org and click on "news" on the right side).

Our preliminary investigation found the footprints of the actual persons who are sending all these defamatory comments about WORLDCOMP and its Chair Hamid Arabnia. As of now, these are the persons behind this defamatory campaign:
http://www.cs.uga.edu/~thiab
http://www.cs.uga.edu/~tliu
http://www.cs.uga.edu/~erc
http://ktwop.wordpress.com/about
http://www.cs.fsu.edu/~tyson
http://www.cis.famu.edu/~hchi
http://www.scs.gatech.edu/people/mustaque-ahamad
http://www.cs.fsu.edu/~xyuan
http://www.unf.edu/~ree
http://www.johnlevine.com
http://curly.cis.unf.edu
http://en.wikipedia.org/wiki/Albert_Shiryaev
http://www.cse.sc.edu/~jtang
http://www.ninaringo.com
http://www.cis.famu.edu/~prasad
http://www.scs.gatech.edu/people/maria-balcan
http://www.f4.htw-berlin.de/~weberwu
http://www.iaria.org/speakers/PetreDini.html
http://www.eecs.ucf.edu/index.php?id=profiles&link=joseph_laviola
(more names will be announced later on?)
These people formed a team and are mailing different forums, groups, blogs and individuals, heavily criticizing WORLDCOMP. Some of them have personal or professional enmity with Professor Hamid Arabnia and some of them don't like WORLDCOMP for one reason or the other. They are using proxy servers in Georgia (Athens, Atlanta), Florida (Tallahassee, Jacksonville, Orlando), Chicago and Texas (Austin, Houston) and sending the defamatory emails.

I request all of you to submit papers and make WORLDCOMP 2012 a success. All tracks of WORLDCOMP have received high citations. I assure you that WORLDCOMP will be held in July 2012 and it will continue for many years to come. I know Professor Hamid Arabnia well and he is a very nice and professional person and he is committed to organizing WORLDCOMP in 2012, 2013, 2014, 2015,...

With sincere respects,
Mohammad Homayoun
drmhomayoun at gmail.com

Note: This message is sent to help defend my longtime friend Professor Hamid Arabnia http://www.cs.uga.edu/~hra (chair and coordinator of WORLDCOMP)

From arthur.oviedo at epfl.ch Fri Mar 30 16:35:54 2012
From: arthur.oviedo at epfl.ch (Arthur Oviedo)
Date: Fri, 30 Mar 2012 18:35:54 +0200
Subject: [Biojava-l] Interested in the "cloudization" of BioJava
Message-ID:

Hello,
My name is Arthur, and I'm a master's student at EPFL (École Polytechnique Fédérale de Lausanne) in computer science.
I have worked on different projects that are somewhat related to BioJava and cloud environments.
While I was a research assistant, I worked (briefly) on a project called UnaCloud (http://sistemas.uniandes.edu.co/~unacloud/dokuwiki/doku.php?id=recursos:documentacion), which provides an opportunistic grid/cloud infrastructure for running scientific experiments, and we have used it to help bioinformaticians with their different jobs like huge BLAST queries, HMMER jobs, etc.
As part of my assistant work at the same university, I developed a cool system called UnaCloud MSA which integrates some existing and newly developed tools to analyze Multiple Sequence Alignments. It even uses the BioJava library to perform some verification of the sequences. All of this is also done employing the UnaCloud infrastructure. This work is still in development and in preparation for publication.
http://unacloudmsa.uniandes.edu.co
Currently, I'm working on a class project on Hadoop (an implementation of a subset of the functionalities of a Database Management System) using the Hadoop (Map-Reduce) framework.
All of the mentioned projects have been implemented in Java, so I suppose that I meet the Java expertise requirement.
I would like to know more about this project and also to know the rough dates when the Google Summer of Code will be held (to prepare my schedule).
Thanks and best regards,
Arthur Oviedo

From andreas at sdsc.edu Fri Mar 30 17:57:34 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 30 Mar 2012 10:57:34 -0700
Subject: [Biojava-l] Interested in the "cloudization" of BioJava
In-Reply-To:
References:
Message-ID:

Hi Arthur,

In short, the goal is to take some aspects of BioJava and improve them so they play together well for deployment on the cloud.

It sounds like you have a lot of related background for this project. However, since you have been working on very similar things, for your proposal it will be important to come up with something independent and not just to propagate your previous work into BioJava.

About the dates and other details, please consult the official GSoC FAQ:
http://www.google-melange.com/document/show/gsoc_program/google/gsoc2012/faqs

Andreas

On Fri, Mar 30, 2012 at 9:35 AM, Arthur Oviedo wrote:
> Hello,
> My name is Arthur, and i'm a master student at EPFL (École Polytechnique
> Fédérale de Lausanne) in computer science.
> I worked in different project that are somewhat related to BioJava and
> cloud environment.
> I have worked, while i was research assistant, (briefly) in a project
> called UnaCloud (
> http://sistemas.uniandes.edu.co/~unacloud/dokuwiki/doku.php?id=recursos:documentacion)
> which provides an opportunistic grid/cloud infrastructure for running
> scientific experiments and we have used it to help bio-informaticians with
> their different jobs like huge BLAST queries, HMMER jobs, etc.
> As part of my assistant work in the same university, I developed a cool
> system called UnaCloud MSA which integrates some existing and new developed
> tools to analyze Multiple Sequence Alignments. It even uses the BioJava
> library to perform some verification about the sequences. All of this is
> also done employing the UnaCloud infrastructure. This work is still in
> development and in preparation for publication.
> http://unacloudmsa.uniandes.edu.co
> Currently, i'm working on a class project on Hadoop (An implementation of
> subset of the functionalities of a Database Manager System) using Hadoop
> (Map-reduce) framework.
> All of the mentioned projects have been implemented in Java, so i suppose
> that i meet the java expertise requirement.
> I would like to know more about this project and to know also the rough
> dates where the Google Summer of Code would be held (To prepare my
> schedule).
> Thanks and best regards,
> Arthur Oviedo
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From andreas at sdsc.edu Fri Mar 30 18:01:27 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 30 Mar 2012 11:01:27 -0700
Subject: [Biojava-l] GSOC 2012
In-Reply-To: <4F63D36D.1010401@gmail.com>
References: <4F63D36D.1010401@gmail.com>
Message-ID:

Hi Evgeniy,

I just noticed you have not received a response to your mail. Apologies for that, too many emails going back and forth.
The goal is to add new features to BioJava and to extend it so it has support for more file formats. You could take a look at our feature request page to get some ideas...

http://biojava.org/wiki/BioJava3_Feature_Requests

Andreas

On Fri, Mar 16, 2012 at 4:57 PM, superrubiroyd wrote:
> Hi,
> I am a final-year student in Ukraine. I have 2.5 years of Java experience. I
> would like to work on the project 'New File Parsers for BioJava' during GSOC
> 2012. Can you explain a little more what should be done in this project, and
> can you give some advice on how to make a correct application?
> Thanks in advance. With best regards, Evgeniy Berlog
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From sharma.dhrv at gmail.com Sat Mar 31 20:46:06 2012
From: sharma.dhrv at gmail.com (Dhruv Sharma)
Date: Sun, 1 Apr 2012 02:16:06 +0530
Subject: [Biojava-l] GSoC Application Discussion and Help - Porting BLAST to Java
Message-ID:

Hi,

I am Dhruv Sharma, a senior undergraduate student pursuing B.E.(Hons.) Computer Science at BITS, Pilani, India. I am very much interested in 'porting BLAST algorithm to Java' as a GSoC 2012 project.

I am proficient in, and primarily work with, Java and C. I also have past experience of working in C++ before migrating to Java. However, I am new to GSoC and haven't used version control in the past.

My recent project was a web application in Java that posts a FASTA sequence to the remote CS-BLAST web service, parses and auto-filters its results using the release date from RCSB PDB, and downloads the PDB files.

Since the project aims at converting legacy C/C++ code to Java, the approaches already suggested on the BioJava page, and my observations on them, are:

1) Using C++ to Java converters for 100% conversion.
I have tried converting the ncbi-blast-2.2.26 source code using a few freely available converters, but all of them either crashed or failed to convert, even after I resolved certain header file dependency issues that emerged. Most failures occurred at function calls to non-standard C++ libraries.

2) Using JNI as an alternative solution.
JNI programming would be a tedious task and would anyway require understanding the purpose of the underlying C++ code. Hence, it has little advantage over rewriting the equivalent Java code. A significant advantage can be seen when there is no efficient Java alternative to the C++ code. However, platform dependence would still exist.

According to my understanding of the problem, a hybrid approach can be taken: using code converters for simpler files, manual coding for tricky areas, and JNI for C++ code that involves non-standard libraries. But I am still not clear about my exact course of action.

Can you please tell me if my analysis of the problem is correct? Please also comment on the feasibility of my suggested approach, and please make any suggestions, as they would help me improve the application draft that I will soon be sharing for review.

As BLAST is a collection of programs, and keeping in mind the length of code to be ported, can we work on a selected set of critical programs in it from the GSoC perspective?

Thanks.

--
*Dhruv Sharma*
*Student B.E.(Hons.) Computer Science
BITS, Pilani *
*India*

From andreas at sdsc.edu Sat Mar 31 23:21:36 2012
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sat, 31 Mar 2012 16:21:36 -0700
Subject: [Biojava-l] [Biojava-dev] Port an Algorithm to Java
In-Reply-To:
References:
Message-ID:

Hi Dragos,

we are always looking for volunteers to help with various aspects of the project. The tasks range from answering emails on the mailing list and improving documentation on our wiki to providing patches for bugs and developing new features.
The best way to get started is to pick one of those areas and come up with an improvement ...

Andreas

On Sat, Mar 31, 2012 at 11:26 AM, Dragos-Bogdan Sima wrote:
> Hello everyone,
>
> I would like to know what the post-GSoC opportunities are. Regardless of the
> Summer of Code, I wish to continue in this organization if possible.
>
> Cheers,
> Dragos-Bogdan.