From gmicha at gmail.com Sat Aug 1 11:49:50 2009
From: gmicha at gmail.com (Micha Sammeth)
Date: Sat, 01 Aug 2009 17:49:50 +0200
Subject: [Biojava-dev] apidoc in org.biojava.bio.symbol.SimpleSymbolList
Message-ID: <4A74641E.80104@gmail.com>

Hi,

the class header in my copy (1.7) contains the example

    ..
    FiniteAlphabet dna = (FiniteAlphabet) AlphabetManager.alphabetForName("DNA");
    SymbolParser parser = dna.getParser("token");
    ..

but the version I check out from the CVS does not contain a method
FiniteAlphabet.getParser(). I think it should read

    parser = dna.getTokenization("token");

right? Just wanted to bring to attention..

Best,

micha.

From bugzilla-daemon at portal.open-bio.org Sun Aug 2 13:31:09 2009
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 2 Aug 2009 13:31:09 -0400
Subject: [Biojava-dev] [Bug 2540] RichSequenceIterator does not skip sequence when exception is thrown
In-Reply-To:
Message-ID: <200908021731.n72HV9W4010985@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2540

------- Comment #1 from vdmerwe.karen at gmail.com 2009-08-02 13:31 EST -------
Created an attachment (id=1352)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1352&action=view)
Code to make the RichSequenceIterator skip sequence when exception is thrown

Any feedback regarding the use of this proposed solution will be appreciated.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From gmicha at gmail.com Sun Aug 2 15:28:10 2009
From: gmicha at gmail.com (Micha Sammeth)
Date: Sun, 02 Aug 2009 21:28:10 +0200
Subject: [Biojava-dev] Sequence and Feature
Message-ID: <4A75E8CA.3040904@gmail.com>

Hi,

I am writing a parser for aligned sequencing reads and I plan to
separate the read information (sequence, qualities) from the
alignment information by reasons of redundancy and sortings.
I planned the following classes:

    Read extends AbstractChangeable implements Sequence, Qualitative

    Alignment extends AbstractChangeable implements Feature

Alignment I put directly as inner class of Read, to delegate the
Feature.getSequence() directly via the outer Object. I also have
sort of alignment groups which are inserted as additional Feature in
between these two, but I think for the sketched toy example they are
not important.

One doubt is: Alignment links a subpart of the read with a subpart
of the genomic sequence, which is big and probably I will never hold
an instance of it. So, getSequence() here refers to the subpart of
the read that gets aligned and I have a couple of custom attributes
that annotate the location in the genome. Is this in the philosophy
of the class hierarchy design?

It would be nice if someone with a bit more experience in Biojava
could leave a comment if I go the right direction, or if there is a
more natural way to get my hierarchy into biojava.

Thanks and cheers!

micha.

From holland at eaglegenomics.com Mon Aug 3 04:01:57 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Mon, 3 Aug 2009 09:01:57 +0100
Subject: [Biojava-dev] Sequence and Feature
In-Reply-To: <4A75E8CA.3040904@gmail.com>
References: <4A75E8CA.3040904@gmail.com>
Message-ID: <2DEC4F45-25E2-497B-A0E7-100A2AD1693C@eaglegenomics.com>

Yes, Feature.getSequence() is intended only to return the sequence of
the feature itself - so it would be fine not to store the whole genomic
sequence, and instead just store locations referring to it.

Have you looked into the existing Alignment classes in BioJava? They
might be of some help to you.

cheers,
Richard

On 2 Aug 2009, at 20:28, Micha Sammeth wrote:

> Hi,
>
> I am writing a parser for aligned sequencing reads and I plan to
> separate the read information (sequence, qualities) from the
> alignment information by reasons of redundancy and sortings.
>
> I planned the following classes:
>
> Read extends AbstractChangeable implements Sequence, Qualitative
>
> Alignment extends AbstractChangeable implements Feature
>
> Alignment I put directly as inner class of Read, to delegate the
> Feature.getSequence() directly via the outer Object. I also have
> sort of alignment groups which are inserted as additional Feature in
> between these two, but I think for the sketched toy example they are
> not important.
>
> One doubt is: Alignment links a subpart of the read with a subpart
> of the genomic sequence, which is big and probably I will never hold
> an instance of it. So, getSequence() here refers to the subpart of
> the read that gets aligned and I have a couple of custom attributes
> that annotate the location in the genome. Is this in the philosophy
> of the class hierarchy design?
>
> It would be nice if someone with a bit more experience in Biojava
> could leave a comment if I go the right direction, or if there is a
> more natural way to get my hierarchy into biojava.
>
> Thanks and cheers!
>
> micha.
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From holland at eaglegenomics.com Mon Aug 3 07:51:19 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Mon, 3 Aug 2009 12:51:19 +0100
Subject: [Biojava-dev] Hackathon update
Message-ID:

Hi guys,

10 people responded (including me). 5 of those are in Cambridge, UK, 3
are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the
hackathon with a holiday, and 3 suggested linking the hackathon with a
conference, which would almost certainly increase chances of getting
funding for travel/accommodation from employers.

So, I have two options.
Venues in both cases to be worked out later:

1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle
   of the winter in the UK, but on the bright side, the Cambridge Winter
   Beer Festival runs from the 22nd-24th, so that's something to cheer
   you up at the end of the hackathon.

2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is
   9th-10th (TBC), then ISMB which is 11th-14th).

Both have pros and cons - the Cambridge meeting means 50% of the
delegates could attend for free and we might even be able to get a
free venue, whereas the Boston meeting would be attractive to anyone
already planning to attend BOSC or ISMB who might otherwise not be
able to find funding for travel.

I'm going to stick my neck out and suggest that BOSC/ISMB is the
better choice, simply because of the wider range of potential
delegates to attend the hackathon. We could always have a Cambridge
mini-meeting at some other time. So, unless anyone objects, pencil in
your diary for July 5th-8th in Boston.

Please could all those interested vote yes or no for this plan so that
I can find a suitably sized venue. Attendance will need to be
confirmed by the date the venue sets for final booking/payment.

cheers,
Richard

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From holland at eaglegenomics.com Mon Aug 3 09:29:17 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Mon, 3 Aug 2009 14:29:17 +0100
Subject: [Biojava-dev] Hackathon update
In-Reply-To:
References:
Message-ID: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com>

Good plan - my worry is whether or not people can get 2 weeks off in
the same year for the purposes of a hackathon.

But, if people are willing, I'm happy to set up both. It does mean
extra cost in terms of venue hire etc. - do you have any ideas as to
good sponsors?

On 3 Aug 2009, at 14:10, Scooter Willis wrote:

> Richard
>
> It probably wouldn't hurt to try and do both. Waiting a year delays
> getting started and because the two events are six months apart it
> increases the odds of those who may be able to attend both. This way
> at BOSC/ISMB we can have good momentum and stability for the current
> modules. The BOSC/ISMB can then be focused on recruiting new
> developers with a focus on new modules, code examples, docs etc.
>
> It also probably makes sense to try and identify/recruit Java based
> bioinformatics open source applications that have needed or
> interesting functionality to "biojava" enable the algorithm of the
> application. This could be a good theme for the BOSC/ISMB conference
> to have current Biojava developers work with developers of other
> java bioinformatics applications to port key functionality so that it
> works with Biojava core.
>
> Scooter

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From markjschreiber at gmail.com Mon Aug 3 12:38:32 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Tue, 4 Aug 2009 00:38:32 +0800
Subject: [Biojava-dev] Hackathon update
In-Reply-To:
References:
Message-ID: <93b45ca50908030938j7899572et780fd2ccd0f2f417@mail.gmail.com>

Boston++

From andreas at sdsc.edu Tue Aug 4 02:09:37 2009
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 3 Aug 2009 23:09:37 -0700
Subject: [Biojava-dev] Hackathon update
In-Reply-To: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com>
References: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com>
Message-ID: <59a41c430908032309l7b380c92hf018c12d38dd566f@mail.gmail.com>

Hi Richard,

I think it is a great idea to plan a hackathon prior to next BOSC.
Still, this is almost a year ahead and as such a long time away.
Ideally I would like to have something already earlier than that...
San Diego is far away from the UK, but I would be happy to organize and
host something here, if people would be up for the longish journey...

Andreas

From bugzilla-daemon at portal.open-bio.org Tue Aug 4 13:28:58 2009
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Aug 2009 13:28:58 -0400
Subject: [Biojava-dev] [Bug 2540] RichSequenceIterator does not skip sequence when exception is thrown
In-Reply-To:
Message-ID: <200908041728.n74HSwfd027233@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2540

vdmerwe.karen at gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1352 is|0                           |1
           obsolete|                            |

------- Comment #2 from vdmerwe.karen at gmail.com 2009-08-04 13:28 EST -------
Created an attachment (id=1356)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1356&action=view)
Updated the previous solution

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
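
For readers following Bug 2540 without the attachment at hand, here is a
minimal sketch of the behaviour the bug title asks for: an iterator that
moves on to the next record when parsing one record throws. It is not the
code from attachment 1352 or 1356 and it does not use any BioJava class.
SkippingIterator, RecordParser and skippedRecords() are hypothetical names,
and the input is assumed to be already split into raw records, which a real
stream reader would have to do itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical illustration only: not a BioJava class and not the code
 * from the Bugzilla attachment. Wraps a stream of raw records and skips
 * any record whose parsing throws, instead of getting stuck on it.
 */
final class SkippingIterator<T> implements Iterator<T> {

    /** Hypothetical callback that parses one raw record or throws. */
    interface RecordParser<T> {
        T parse(String raw) throws Exception;
    }

    private final Iterator<String> rawRecords;
    private final RecordParser<T> parser;
    private final List<String> skipped = new ArrayList<>();
    private T nextValue;
    private boolean ready;

    SkippingIterator(Iterator<String> rawRecords, RecordParser<T> parser) {
        this.rawRecords = rawRecords;
        this.parser = parser;
    }

    @Override
    public boolean hasNext() {
        // Consume raw records until one parses or the input is exhausted.
        while (!ready && rawRecords.hasNext()) {
            String raw = rawRecords.next();
            try {
                nextValue = parser.parse(raw);
                ready = true;
            } catch (Exception e) {
                // The broken record is already consumed, so the next loop
                // iteration starts at the following record.
                skipped.add(raw);
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T value = nextValue;
        nextValue = null;
        return value;
    }

    /** Raw records that failed to parse, in input order. */
    List<String> skippedRecords() {
        return skipped;
    }
}
```

The point of the design is that the failing record is consumed before the
exception is handled, so the caller never sees the same broken record twice;
keeping the skipped records lets the caller report them afterwards.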

From florian.mittag at uni-tuebingen.de Wed Aug 5 08:45:41 2009
From: florian.mittag at uni-tuebingen.de (Florian Mittag)
Date: Wed, 5 Aug 2009 14:45:41 +0200
Subject: [Biojava-dev] How to parse large Genbank files?
In-Reply-To:
References: <200907241929.08768.florian.mittag@uni-tuebingen.de> <200907281414.55156.florian.mittag@uni-tuebingen.de>
Message-ID: <200908051445.42345.florian.mittag@uni-tuebingen.de>

On Tuesday, 28. July 2009 14:52, Richard Holland wrote:
> > Btw: Should we move this to Biojava-dev?
>
> probably, yes! :)

done ;)

> If you want to explore my ideas for a replacement Sequence model, the
> code and docs are here (sequence handling is in the 'core' module with
> DNA-specifics in the 'dna' module):
>
> http://biojava.org/wiki/BioJava3:HowTo
> http://www.biojava.org/wiki/BioJava3_project
>
> (Methods such as file parsers would request Strings (or ideally
> CharSequence - more flexible, and String implements it) as parameters
> whenever they don't care about content - if they care about content
> but don't care in advance about size or random access then they should
> request Iterator which can be used to wrap a String and parse
> on demand, and if they need full functionality then they should
> request List which the default implementation of uses
> ArrayLists but there's no reason a String-backed one couldn't be
> written as well).

By now, I was mostly interested in a quick and dirty solution. I first
attempted to create a new class StringSymbolList that would use the
String as representation for the sequence and only convert to Symbols
on demand.
Since SimpleRichSequence uses SimpleSymbolList hard-coded, I wanted to
implement a new RichSequence as well, but I was back-stabbed by
Hibernate, because the bindings are set to SimpleRichSequence and when
retrieving objects from the DB it uses the original BioJava classes
again.

My solution now works and it consists of my own implementation of
GenbankFormat, RichSequenceBuilder, and RichSequence, a new class called
StringSymbolList as described above and a change to SimpleRichSequence,
adding the method:

    @Override
    public String seqString() {
        return seqstring;
    }

which circumvents most of the array copying stuff.

I also noticed that processing the Genbank files became slower with
every file, so I closed the Hibernate session after each chromosome and
opened a new one. (I also tried session.clear(), but somehow this didn't
work).

For now, it seems like everything is fine and I have no more OutOfMemory
exceptions.

- Florian

>
> cheers,
> Richard
>
> > - Florian
> >
> >> On Mon, Jul 27, 2009 at 8:16 PM, Florian Mittag wrote:
> >>> Hi Mark!
> >>>
> >>> On Saturday, 25. July 2009 04:20, Mark Schreiber wrote:
> >>>> I don't think anyone has done much or anything to optimize these
> >>>> parsers. The process you outline sounds extremely inefficient. It
> >>>> is also likely to lead to memory leaks due to the number of copy
> >>>> operations.
> >>>
> >>> I wouldn't necessarily say that it leads to memory leaks, but it
> >>> definitely leads to a high memory consumption (2GB are not enough
> >>> for a 200MB file). Also, my outline of the process is based on only
> >>> 2 hours of viewing the code, so actually I expected to be corrected
> >>> on this. Unfortunately, it seems like I did get the right idea and
> >>> it IS extremely inefficient.
> >>>
> >>> I mean, I understand that this is a high level of abstraction that
> >>> might come in handy in many situations, but it certainly is more of
> >>> an obstacle in my specific case.
> >>>
> >>>> As always with java, don't try and optimize without a profiler
> >>>> which will tell you which methods are taking a long time and which
> >>>> objects take the most memory.
> >>>
> >>> I think we should continue this discussion on the biojava-dev list
> >>> or in a private conversation, as it will probably get very detailed
> >>> and technical.
> >>>
> >>>
> >>> My question to this list again:
> >>> Is there a way to achieve my goal of parsing a 200MB Genbank file
> >>> with the current biojava version without code changes?
> >>>
> >>>
> >>> - Florian
> >>>
> >>>> On 25 Jul 2009, 1:33 AM, "Florian Mittag" wrote:
> >>>>
> >>>> Hi!
> >>>>
> >>>> I think this is a problem worthy of its own thread, so I'll start
> >>>> one:
> >>>>
> >>>> I want to store all human chromosomes in a BioSQL database after I
> >>>> loaded the information from .gbk files. The files I get from NCBI
> >>>> with the following URIs, where the id ranges from nc_000001 to
> >>>> nc_000024 plus nc_001804:
> >>>>
> >>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=nc_000023&rettype=gbwithparts&retmode=text
> >>>>
> >>>> I then try to parse the files as described in
> >>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#Tools_for_reading.2Fwriting_files
> >>>> but it won't work. While there are no problems parsing 1804 and
> >>>> 24, chromosome 23 leads to an OutOfMemory exception although I
> >>>> gave it 2GB of heap space.
> >>>>
> >>>> Here is a stack trace (the line numbers might differ, because I
> >>>> already tried to improve GenbankFormat.java in memory efficiency):
> >>>>
> >>>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> >>>>     at org.biojava.bio.seq.io.ChunkedSymbolListFactory.addSymbols(ChunkedSymbolListFactory.java:222)
> >>>>     at org.biojavax.bio.seq.io.SimpleRichSequenceBuilder.addSymbols(SimpleRichSequenceBuilder.java:256)
> >>>>     at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:535)
> >>>>     at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
> >>>>     at org.prodge.sequence_viewer.db.UpdateDB_Main.updateChromosome(UpdateDB_Main.java:537)
> >>>>     at org.prodge.sequence_viewer.db.UpdateDB_Main.newGenome(UpdateDB_Main.java:468)
> >>>>     at org.prodge.sequence_viewer.db.UpdateDB_Main.main(UpdateDB_Main.java:164)
> >>>>
> >>>> The line in GenbankFormat.java is:
> >>>>
> >>>>     rlistener.addSymbols(
> >>>>         symParser.getAlphabet(),
> >>>>         (Symbol[])(sl.toList().toArray(new Symbol[0])),
> >>>>         0, sl.length());
> >>>>
> >>>> Sometimes it fails at the sl.toList().toArray()-part, sometimes
> >>>> it fails later inside the addSymbols method, but it always fails.
> >>>>
> >>>> How can this be? I mean, the file is only 190MB in size, so 2GB of
> >>>> memory should be more than enough. Browsing through the source
> >>>> code, I discovered what I think of as very inefficient handling of
> >>>> sequences:
> >>>>
> >>>> 1) the sequence string is read from file into a StringBuffer
> >>>> 2) it is converted to a string (with whitespaces removed)
> >>>> 3) a SimpleSymbolList is created out of the string
> >>>> 4) the SymbolList is converted to a List of Symbols
> >>>> 5) the List is converted to an array of Symbols
> >>>> 6) the array is passed to addSymbols
> >>>> 7) there it is added to a ChunkedSymbolListFactory
> >>>> 8) if at some point the sequence is requested, a SymbolList is
> >>>>    created and then converted to a string.
> >>>>
> >>>> You see, there is a lot of copying and converting, but in the end
> >>>> I have the same string I started with. Well, I had the string, if
> >>>> it ever reached the end, because it will crash before completing
> >>>> this process.
> >>>>
> >>>>
> >>>> Am I doing something wrong or is there a great potential of
> >>>> improving parsing of Genbank files?
> >>>>
> >>>>
> >>>> Regards,
> >>>> Florian
> >>>> _______________________________________________
> >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>>
> >>> --
> >>> Dipl. Inf. Florian Mittag
> >>> Universität Tuebingen
> >>> WSI-RA, Sand 1
> >>> 72076 Tuebingen, Germany
> >>> Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091
> >
> > --
> > Dipl. Inf. Florian Mittag
> > Universität Tuebingen
> > WSI-RA, Sand 1
> > 72076 Tuebingen, Germany
> > Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091

--
Dipl. Inf. Florian Mittag
Universität Tuebingen
WSI-RA, Sand 1
72076 Tuebingen, Germany
Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091

From markjschreiber at gmail.com Wed Aug 5 09:16:03 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Wed, 5 Aug 2009 21:16:03 +0800
Subject: [Biojava-dev] How to parse large Genbank files?
In-Reply-To: <200908051445.42345.florian.mittag@uni-tuebingen.de>
References: <200907241929.08768.florian.mittag@uni-tuebingen.de> <200907281414.55156.florian.mittag@uni-tuebingen.de> <200908051445.42345.florian.mittag@uni-tuebingen.de>
Message-ID: <93b45ca50908050616n210bd2a3u8391d9ad7114015a@mail.gmail.com>

Would it be better for the biojava SimpleRichSequence to be backed by a
String and do symbol operations on the fly? Alternatively the default
hibernate mapping could be to a more stringy sequence.

Arguably in the absence of JPA and entity beans Hibernate should
probably be talking to biojava via DTOs. An efficient BioSQL loader
would directly use the DTOs or Entity beans (which could implement
biojava interfaces) and not go through all the symbol hassle.

Might be worth considering for BJ3

- Mark

From florian.mittag at uni-tuebingen.de Wed Aug 5 11:41:24 2009
From: florian.mittag at uni-tuebingen.de (Florian Mittag)
Date: Wed, 5 Aug 2009 17:41:24 +0200
Subject: [Biojava-dev] Error loading Ontology with Hibernate
Message-ID: <200908051741.24367.florian.mittag@uni-tuebingen.de>

Hi, it's me again ;-)

I'm really sorry to bother you with yet another problem, but I seem to
attract those problems.

When I parse Genbank files and store them in a BioSQL DB, all features
like "gap", "mRNA", "gene", etc. are represented by newly created Terms
in the ontology "biojavax" with the comment "autocreated by biojavax".
I searched for an appropriate ontology and found the Sequence Ontology,
which I loaded into the DB using BioPerl's load_ontology.pl.

I tried setting the default ontology using
RichObjectFactory.setDefaultOntologyName("sequence"), but when it comes
to instantiating the SimpleRichSequenceBuilder, a multi-nested exception
is thrown.
I followed it in the code and found the cause in Hibernate: [SEVERE] (): illegal access to loading collection >> org.hibernate.LazyInitializationException: illegal access to loading collection at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:341) at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:86) at org.hibernate.collection.PersistentSet.toString(PersistentSet.java:309) at java.lang.String.valueOf(String.java:2827) at java.lang.StringBuilder.append(StringBuilder.java:115) at java.util.AbstractCollection.toString(AbstractCollection.java:422) at org.hibernate.engine.StatefulPersistenceContext.initializeNonLazyCollections(StatefulPersistenceContext.java:844) probably cause by this exception org.hibernate.PropertyAccessException: Null value was assigned to a property of primitive type setter of org.biojavax.SimpleRankedCrossRef.rank The code to reproduce this: sessionFactory = new Configuration().configure().buildSessionFactory(); session = sessionFactory.openSession(); RichObjectFactory.connectToBioSQL(session); RichObjectFactory.setDefaultOntologyName("sequence"); Ontology onto = RichObjectFactory.getDefaultOntology(); My DB has the following ontologies listed: - biological_process - gene_ontology - molecular_function - cellular_component - sequence - biojavax and only for "gene_ontology" and "biojavax" the above code snippet runs without failure. All ontologies were loaded with the load_ontology.pl script. What might be the cause? Thanks - Florian -- Dipl. Inf. 
Florian Mittag Universität Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 From florian.mittag at uni-tuebingen.de Thu Aug 6 09:16:50 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Thu, 6 Aug 2009 15:16:50 +0200 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <200908051741.24367.florian.mittag@uni-tuebingen.de> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> Message-ID: <200908061516.50183.florian.mittag@uni-tuebingen.de> Found the cause. After importing an ontology (Gene or Sequence Ontology) into the BioSQL using load_ontology.pl, the table "term_dbxref" has only NULL values in the rank column. I tried it with DB2 and MySQL, same results/error. The way I see it, this is not a problem of Hibernate. Can I set the "rank" to an arbitrary value to circumvent this problem? On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me again ;-) > > I'm really sorry to bother you with yet another problem, but I seem to > attract those problems. > > When I parse Genbank files and store them in a BioSQL DB, all features > like "gap", "mRNA", "gene", etc. are represented by newly created Terms in > the ontology "biojavax" with the comment "autocreated by biojavax". I > searched for an appropriate ontology and found the Sequence Ontology, which > I loaded into the DB using BioPerl's load_ontology.pl. > > I tried setting the default ontology using > RichObjectFactory.setDefaultOntologyName("sequence"), but when it comes to > instantiating the SimpleRichSequenceBuilder, a multi-nested exception is > thrown.
I followed it in the code and found the cause in Hibernate: > > [SEVERE] (): illegal access to loading collection >> > org.hibernate.LazyInitializationException: illegal access to loading > collection > at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:341) > at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:86) > at org.hibernate.collection.PersistentSet.toString(PersistentSet.java:309) > at java.lang.String.valueOf(String.java:2827) > at java.lang.StringBuilder.append(StringBuilder.java:115) > at java.util.AbstractCollection.toString(AbstractCollection.java:422) > at org.hibernate.engine.StatefulPersistenceContext.initializeNonLazyCollections(StatefulPersistenceContext.java:844) > > probably caused by this exception: > > org.hibernate.PropertyAccessException: Null value was assigned to a > property of primitive type setter of org.biojavax.SimpleRankedCrossRef.rank > > > The code to reproduce this: > > sessionFactory = new Configuration().configure().buildSessionFactory(); > session = sessionFactory.openSession(); > RichObjectFactory.connectToBioSQL(session); > RichObjectFactory.setDefaultOntologyName("sequence"); > Ontology onto = RichObjectFactory.getDefaultOntology(); > > My DB has the following ontologies listed: > - biological_process > - gene_ontology > - molecular_function > - cellular_component > - sequence > - biojavax > > and only for "gene_ontology" and "biojavax" the above code snippet runs > without failure. All ontologies were loaded with the load_ontology.pl > script. > > > What might be the cause? > > Thanks > > - Florian -- Dipl. Inf.
Florian Mittag Universität Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 From markjschreiber at gmail.com Thu Aug 6 09:48:37 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Thu, 6 Aug 2009 21:48:37 +0800 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <200908061516.50183.florian.mittag@uni-tuebingen.de> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> <200908061516.50183.florian.mittag@uni-tuebingen.de> Message-ID: <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> There shouldn't be an issue with using an arbitrary value. The ranks in biosql are mainly to preserve the order of features etc. during roundtripping. It will affect sorting of ontology terms but this is probably not a problem. - mark On Aug 6, 2009 9:42 PM, "Florian Mittag" wrote: Found the cause. After importing an ontology (Gene or Sequence Ontology) into the BioSQL using load_ontology.pl, the table "term_dbxref" has only NULL values in the rank column. I tried it with DB2 and MySQL, same results/error. The way I see it, this is not a problem of Hibernate. Can I set the "rank" to an arbitrary value to circumvent this problem? On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me again ;-) > > I'm really s... From florian.mittag at uni-tuebingen.de Thu Aug 6 10:14:02 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Thu, 6 Aug 2009 16:14:02 +0200 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> <200908061516.50183.florian.mittag@uni-tuebingen.de> <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> Message-ID: <200908061614.03033.florian.mittag@uni-tuebingen.de> On Thursday, 6. August 2009 15:48, you wrote: > There shouldn't be an issue with using an arbitrary value.
The ranks in > biosql are mainly to preserve the order of features etc. during > roundtripping. It will affect sorting of ontology terms but this is > probably not a problem. Ok, then I will try this as a quick hack until I've found out if the NULL values are a bug and if it can be fixed. Thanks for the quick answer! - Florian > On Aug 6, 2009 9:42 PM, "Florian Mittag" > wrote: > > Found the cause. > > After importing an ontology (Gene or Sequence Ontology) into the BioSQL > using > load_ontology.pl, the table "term_dbxref" has only NULL values in the rank > column. I tried it with DB2 and MySQL, same results/error. > > The way I see it, this is not a problem of Hibernate. Can I set the "rank" > to > an arbitrary value to circumvent this problem? > > On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me > again ;-) > > I'm really s... From holland at eaglegenomics.com Fri Aug 7 13:51:59 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 7 Aug 2009 18:51:59 +0100 Subject: [Biojava-dev] Hackathon update In-Reply-To: References: Message-ID: <0AA4618C-2A99-4ACD-B07D-0AA05FE77665@eaglegenomics.com> Several have said the same. I'll try to get both organised. Watch this space. cheers, Richard On 7 Aug 2009, at 18:23, Michael Heuer wrote: > Richard Holland wrote: > >> 10 people responded (including me). 5 of those are in Cambridge, >> UK, 3 >> are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine >> the >> hackathon with a holiday, and 3 suggested linking the hackathon >> with a >> conference, which would almost certainly increase chances of getting >> funding for travel/accommodation from employers. >> >> So, I have two options. Venues in both cases to be worked out later: >> >> 1. Cambridge, UK, January 18th-22nd 2010. 
I know this is the middle >> of the winter in the UK, but on the bright side, the Cambridge Winter >> Beer Festival runs from the 22nd-24th, so that's something to cheer >> you up at the end of the hackathon. >> >> 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is >> 9th-10th (TBC), then ISMB which is 11th-14th). > > > I would suggest trying for both. Winter in the UK means that a lot of > work would get done. Attendance would probably be better for Boston. > > I would caution that accommodations in Boston are quite expensive, and > that the 4th of July week is the busiest week of the year with > tourists. > Perhaps the hackathon in Boston might be arranged flexibly around the > actual days of the conference, evenings and late nights and so on. > > michael > -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From heuermh at acm.org Fri Aug 7 13:23:53 2009 From: heuermh at acm.org (Michael Heuer) Date: Fri, 7 Aug 2009 13:23:53 -0400 (EDT) Subject: [Biojava-dev] Hackathon update In-Reply-To: Message-ID: Richard Holland wrote: > 10 people responded (including me). 5 of those are in Cambridge, UK, 3 > are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the > hackathon with a holiday, and 3 suggested linking the hackathon with a > conference, which would almost certainly increase chances of getting > funding for travel/accommodation from employers. > > So, I have two options. Venues in both cases to be worked out later: > > 1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle > of the winter in the UK, but on the bright side, the Cambridge Winter > Beer Festival runs from the 22nd-24th, so that's something to cheer > you up at the end of the hackathon. > > 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is > 9th-10th (TBC), then ISMB which is 11th-14th). I would suggest trying for both.
Winter in the UK means that a lot of work would get done. Attendance would probably be better for Boston. I would caution that accommodations in Boston are quite expensive, and that the 4th of July week is the busiest week of the year with tourists. Perhaps the hackathon in Boston might be arranged flexibly around the actual days of the conference, evenings and late nights and so on. michael From andreas at sdsc.edu Sun Aug 16 17:41:03 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 16 Aug 2009 14:41:03 -0700 Subject: [Biojava-dev] plans for next months Message-ID: <59a41c430908161441l3ae3ebao524237a1b7b868fe@mail.gmail.com> Hi, Here is a quick summary of what I propose to be our action plan for the next months for BioJava: * I would like to call for a code-freeze in 2 weeks (or so) in order to finalize the new modularized and mavenized version of biojava for the developers. The current developmental trunk will remain permanently frozen and all future work should continue at a new location in SVN. As such it will be important that all developers commit any changes they are working on before that. * We will update the documentation for how to obtain a new mavenized checkout on the wiki. * After the change the new modules need to be tested and if no major problems are found, the ok will be given to continue working on the new modules (at the new location) * All developers should obtain a new checkout. * We need to identify sub-module leaders who will take over leadership of the sub-modules. In order to come up with a new release of biojava we should continue development on the new modules for a few months. Talking off list with Richard Holland it looks like we will have a hackathon in January in Cambridge, U.K. (details to be finalized and announced). I suggest that we use that opportunity to focus on further developing the modules and make a new public BioJava release shortly after that.
At present I see the following topics that would be great to work on until and during the hackathon in order to prepare a shiny new version of BioJava for public release: + Work on standardizing the organization of the modules (tests, examples, source, docu etc.) + Add new modules + Improve existing modules + Anything the module leaders deem necessary for their modules. + Use OSGI for visualisation related modules I can post a more detailed and specific list of things to work on if people are interested. Andreas From andreas at sdsc.edu Mon Aug 24 00:18:14 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 23 Aug 2009 21:18:14 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules Message-ID: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Hi, In order to push the modularization and migration to Maven, I would like to declare a code freeze on the current developmental trunk. Please commit all new changes by Thursday 27th of August 23:00 GMT. In the week after I would like to refactor the code base and commit the initial set of modules to a new developmental trunk. All future development will happen on that new trunk. You will be able to follow the ongoing status of this at http://biojava.org/wiki/BioJava:MavenMigration Once the modules are in place it is a good moment to hand over the leadership of the sub-modules to the new module-project leaders. It will be up to the module-lead to take the modules into the direction that he/she feels important. I would like to take this opportunity to suggest a couple of people as module-leaders and propose some action items for the modules. Feel free to comment or make additional suggestions...
Here is a list of modules / action items and the people that I would propose to become module leaders: Module: biojava-core Lead: Andreas Prlic - break the new modules out of core - bring up to modern Java standards, use Generics - declare old/unused code obsolete - don't break backwards compatibility Module: biojava-sequence Lead: Richard Holland - Bring in Richard's new code that he started to develop on the biojava-3 branch. - provide a more scalable and efficient basis for dealing with large sequence files Module: biojava-alignment Lead: Andreas Draeger - allow better access to underlying dynamic programming data structures - allow more customizable display of pairwise alignments (HTML/plain text, etc) Module: biojava-blast Lead: still looking for a leader - provide access to all details of the blast output - add support for RPS blast Module: biojava-phylo Lead: Scooter Willis - provide improved NJtree /Jalview Module: biojava-biosql Lead: Richard Holland - merge the new biojava-sequence module with the current biojava-biosql code Module: biojava-structure Lead: Andreas Prlic - add support for SCOP file parsing - add support for easy access of domains (in terms of coordinates) - add secondary structure assignment - improve structure alignments - better integration with 3D viewers (Jmol, RCSB viewers) Module: biojava-web services: The details still seem to be under discussion and perhaps we need multiple modules here? Also, what about REST vs. SOAP? To be discussed. People who expressed interest are: Niall Haslam, Scooter Willis, Sylvain Foisy Module?: biojava-ws-blast Module?: biojava-ws-biolit Module: biojava-sequencing Lead: ??? - support FastQ files - support parsing of output for various new sequencing machines This is only an initial set of modules and I think it is safe to say that more modules will be added after more discussions (and people volunteering to contribute).
Andreas From simpleyrx at 163.com Mon Aug 24 12:48:01 2009 From: simpleyrx at 163.com (simpleyrx) Date: Tue, 25 Aug 2009 00:48:01 +0800 (CST) Subject: [Biojava-dev] Adding profile-profile alignment algorithms to Biojava Message-ID: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> Experts, Profile-profile alignment or HMM-HMM alignments have become more important in the protein bioinformatics field than ever before. So I think, if we can implement Profile-profile alignment and HMM-HMM alignments algorithms in Biojava package, it will be more useful to the researchers who are interested in protein bioinformatics. From holland at eaglegenomics.com Mon Aug 24 13:30:31 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 24 Aug 2009 18:30:31 +0100 Subject: [Biojava-dev] Adding profile-profile alignment algorithms to Biojava In-Reply-To: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> References: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> Message-ID: Contributions of code would be welcome! Are you volunteering? :) cheers, Richard On 24 Aug 2009, at 17:48, simpleyrx wrote: > > Experts, > > Profile-profile alignment or HMM-HMM alignments have > become more important in the protein bioinformatics field than ever > before. So I think, if we can implement Profile-profile alignment > and HMM-HMM alignments algorithms in Biojava package, it will be > more useful to the researchers who are interested in protein > bioinformatics.
> > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From heuermh at acm.org Mon Aug 24 21:19:24 2009 From: heuermh at acm.org (Michael Heuer) Date: Mon, 24 Aug 2009 21:19:24 -0400 (EDT) Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: Andreas Prlic wrote: > In order to push the modularization and migration to Maven, I would like to > declare a code freeze on the current developmental trunk. Please commit all > new changes by > > Thursday 27th of August 23:00 GMT. > > In the week after I would like to refactor the code base and commit the > initial set of modules to a new developmental trunk. All future development > will happen on that new trunk. > > You will be able to follow the ongoing status of this at > > http://biojava.org/wiki/BioJava:MavenMigration > > > Once the modules are in place it is a good moment to hand over the > leadership of the sub-modules to the new module-project leaders. It will be > up to the module-lead to take the modules into the direction that he/she > feels important. I would like to take this opportunity to suggest a couple > of people as module-leaders and propose some action items for the modules. > Feel free to comment or make additional suggestions... Sign me up for help with maven configuration/reporting, unit testing, and generics API matters if you wish. 
> Here is a list of modules / action items and the people that I would propose to > become module leaders: > > Module: biojava-core Lead: Andreas Prlic > - break the new modules out of core > - bring up to modern Java standards, use Generics > - declare old/unused code obsolete > - don't break backwards compatibility Seems to me the last one will greatly hamper the rest of this effort. The next version needs to be binary compatible with 1.7? michael From andreas at sdsc.edu Mon Aug 24 22:17:00 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 24 Aug 2009 19:17:00 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> >> Once the modules are in place it is a good moment to hand over the >> leadership of the sub-modules to the new module-project leaders. It will be >> up to the module-lead to take the modules into the direction that he/she >> feels important. I would like to take this opportunity to suggest a couple >> of people as module-leaders and propose some action items for the modules. >> Feel free to comment or make additional suggestions... > > Sign me up for help with maven configuration/reporting, unit testing, and > generics API matters if you wish. Excellent, I will come back to you on this :-) >> - don't break backwards compatibility > > Seems to me the last one will greatly hamper the rest of this effort. > The next version needs to be binary compatible with 1.7? What I mean is that we should try not to disrupt things as much as is reasonable. I am all for a pragmatic approach. While trying to be conservative I guess refactoring should be discussed on a case by case basis. To give an example: an area where I am supporting re-factoring is the blast parser.
The package name is confusing and we probably need some code changes to expose more details of the parser. Are you thinking of any other situations, where you think breaking backwards compatibility will be inevitable? Andreas From heuermh at acm.org Mon Aug 24 22:50:09 2009 From: heuermh at acm.org (Michael Heuer) Date: Mon, 24 Aug 2009 22:50:09 -0400 (EDT) Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> Message-ID: Andreas Prlic wrote: > >> Once the modules are in place it is a good moment to hand over the > >> leadership of the sub-modules to the new module-project leaders. It will be > >> up to the module-lead to take the modules into the direction that he/she > >> feels important. I would like to take this opportunity to suggest a couple > >> of people as module-leaders and propose some action items for the modules. > >> Feel free to comment or make additional suggestions... > > > > Sign me up for help with maven configuration/reporting, unit testing, and > > generics API matters if you wish. > > Excellent, I will come back to you on this :-) > > >> - don't break backwards compatibility > > > > Seems to me the last one will greatly hamper the rest of this effort. > > The next version needs to be binary compatible with 1.7? > > > What I mean is that we should try not to disrupt things as much as is > reasonable. I am all for a pragmatic approach. While trying to be > conservative I guess refactoring should be discussed on a case by case > basis. To give an example: an area where I am supporting re-factoring > is the blast parser. The package name is confusing and we probably > need some code changes to expose more details of the parser. Are you > thinking of any other situations, where you think breaking backwards > compatibility will be inevitable? Ah yes, pragmatically backwards compatible with 1.7 is a better goal.
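The gap between "pragmatically" (source) compatible and binary compatible can be sketched in a few lines of Java. This is only an illustration under stated assumptions: OldAligner and NewAligner are hypothetical stand-ins for two versions of one class, not real BioJava classes. The JVM links a call by method name plus descriptor, and the descriptor includes the return type, so narrowing a getter from double to short changes what an already-compiled caller has to find.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class BinaryCompatSketch {

    /** Hypothetical "old" shape of a class: the getter returns double. */
    public static class OldAligner {
        public double getMatch() { return 1.0; }
    }

    /** Hypothetical "new" shape: same method name, return type narrowed to short. */
    public static class NewAligner {
        public short getMatch() { return 1; }
    }

    /** JVM descriptor of a public no-argument method, for example "()D". */
    public static String descriptor(Class<?> owner, String name) {
        try {
            Method m = owner.getMethod(name);
            return MethodType.methodType(m.getReturnType()).toMethodDescriptorString();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** True if a no-argument method with exactly this return type can be resolved. */
    public static boolean linksAs(Class<?> owner, String name, Class<?> returnType) {
        try {
            MethodHandles.lookup().findVirtual(owner, name, MethodType.methodType(returnType));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Descriptors differ although the source-level call "a.getMatch()" looks identical.
        System.out.println(descriptor(OldAligner.class, "getMatch"));
        System.out.println(descriptor(NewAligner.class, "getMatch"));
        // An old caller asks for a double-returning getMatch; only the old class has one.
        System.out.println(linksAs(OldAligner.class, "getMatch", double.class));
        System.out.println(linksAs(NewAligner.class, "getMatch", double.class));
    }
}
```

Source such as "double m = aligner.getMatch();" would still compile against either shape, since a short widens to double. A caller that was compiled against the old shape and is run against the new one without recompiling is the case that fails, typically with a NoSuchMethodError.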
Maintaining binary compatibility is very difficult, and something we haven't really done in the past. Consider the following biojava 1.6.1 vs biojava 1.7 clirr [1] report. michael [1] http://clirr.sf.net --- ERROR: 6004: org.biojava.bio.alignment.NeedlemanWunsch: Changed type of field CostMatrix from double[][] to int[][] ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 2 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 3 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 4 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 5 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getDelete()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getEditDistance()' has been changed to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getGapExt()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getInsert()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 
'public double getMatch()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getReplace()' has been changed to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 2 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 3 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'protected double min(double, double, double)' has been changed to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public java.lang.String printCostMatrix(double[][], char[], char[])' has changed its type to int[][] ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setDelete(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setGapExt(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setInsert(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setMatch(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setReplace(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SequenceAlignment: Parameter 11 of 'public java.lang.String formatOutput(java.lang.String, java.lang.String, java.lang.String[], 
java.lang.String, int, int, long, int, int, long, double, long)' has changed its type to int ERROR: 7006: org.biojava.bio.alignment.SequenceAlignment: Return type of method 'public java.lang.String formatOutput(java.lang.String, java.lang.String, java.lang.String[], java.lang.String, int, int, long, int, int, long, double, long)' has been changed to java.lang.StringBuffer ERROR: 7006: org.biojava.bio.alignment.SequenceAlignment: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 2 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 3 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 4 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 5 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7006: org.biojava.bio.alignment.SmithWaterman: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setDelete(double)' has changed its type to short ERROR: 7005: 
org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setGapExt(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setInsert(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setMatch(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setReplace(double)' has changed its type to short ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field matrix from int[][] to short[][] ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field max from int to short ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field min from int to short ERROR: 7005: org.biojava.bio.alignment.SubstitutionMatrix: Parameter 2 of 'public SubstitutionMatrix(org.biojava.bio.symbol.FiniteAlphabet, int, int)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SubstitutionMatrix: Parameter 3 of 'public SubstitutionMatrix(org.biojava.bio.symbol.FiniteAlphabet, int, int)' has changed its type to short INFO: 7011: org.biojava.bio.alignment.SubstitutionMatrix: Method 'public SubstitutionMatrix(java.io.File)' has been added ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getMax()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getMin()' has been changed to short INFO: 7011: org.biojava.bio.alignment.SubstitutionMatrix: Method 'public org.biojava.bio.alignment.SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader)' has been added ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getValueAt(org.biojava.bio.symbol.Symbol, org.biojava.bio.symbol.Symbol)' has been changed to short ERROR: 7005: 
org.biojava.bio.alignment.SubstitutionMatrix: Parameter 1 of 'protected int[][] parseMatrix(java.lang.String)' has changed its type to java.lang.Object ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'protected int[][] parseMatrix(java.lang.String)' has been changed to short[][] ERROR: 7009: org.biojava.bio.alignment.SubstitutionMatrix: Accessibility of method 'protected int[][] parseMatrix(java.lang.String)' has been decreased from protected to private INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public boolean canAdvance()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public org.biojava.bio.symbol.Symbol currentRes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public org.biojava.bio.symbol.Symbol lastRes()' has been removed, but an inherited definition exists. INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public ArrowGlyph(java.awt.Paint, java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public ArrowGlyph(java.awt.geom.Rectangle2D$Float, java.awt.Paint, java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public java.awt.Paint getFillPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public java.awt.Paint getOuterPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setDirection(int)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setFillPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setOuterPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.RectangleGlyph: Method 'public java.awt.Paint getPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.RectangleGlyph: Method 'public void 
setPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.TurnGlyph: Method 'public java.awt.Paint getPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.TurnGlyph: Method 'public void setPaint(java.awt.Paint)' has been added INFO: 6009: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Accessibility of field fList has been increased from private to protected INFO: 6009: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Accessibility of field gList has been increased from private to protected INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public boolean containsFilter(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public org.biojava.bio.seq.FeatureFilter getFeatureFilter(int)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public org.biojava.bio.gui.glyph.Glyph getGlyphForFilter(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public void removeFilterWithGlyph(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public void setGlyphForFilter(org.biojava.bio.seq.FeatureFilter, org.biojava.bio.gui.glyph.Glyph)' has been added INFO: 6009: org.biojava.bio.gui.sequence.SequencePanelWrapper: Accessibility of field seqPanels has been increased from private to protected INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. From holland at eaglegenomics.com Tue Aug 25 00:32:24 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Tue, 25 Aug 2009 05:32:24 +0100 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> Message-ID: <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> > > > What I mean is that we should try not to disrupt things as much as is > reasonable. I am all for a pragmatic approach. 
While trying to be > conservative I guess refactoring should be discussed on a case by case > basis. To give an example: an area where I am supporting re-factoring > is the blast parser. The package name is confusing and we probably > need some code changes to expose more details of the parser. Are you > thinking of any other situations where you think breaking backwards > compatibility will be inevitable? Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). cheers, Richard > Andreas > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From markjschreiber at gmail.com Tue Aug 25 02:58:40 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Tue, 25 Aug 2009 14:58:40 +0800 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> Message-ID: <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> I would agree with Richard on this. 
I think the changes being proposed are not compatible with the current API. There are a couple of things wrong with the current model (such as the Feature, Strand, Location issues). There are also several areas where best-practices of the past (parts of BioJava are 10 years old) are not considered best practices now (some like Singletons are often thought of as anti-patterns these days). Add to that the fact that we have never been truly backwards compatible (except maybe 1.3 and 1.3.1?) and I think we can justifiably try and avoid the claim that BJ1.7 should be backwards compatible. We can continue to make older Jars available for people who need them although most likely people who have a need for legacy support already have the Jars that they need bundled up with their apps. Shared libraries have very much fallen out of favor in recent years in almost all languages and system wide classpaths are asking for trouble. Hard-drives are cheap so it is no big deal to have a dedicated version of the BioJava jar bundled with each app that needs it. We could adopt the idea that backwards compatible builds get minor-version numbers e.g. 1.1 while other builds get major version numbers. I guess this would mean we are at BioJava 7? Backwards compatibility would be great to have but not if the effort required hinders innovation. - Mark On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland wrote: >> >> >> What I mean is that we should try not to disrupt things as much as is >> reasonable. I am all for a pragmatic approach. While trying to be >> conservative I guess refactoring should be discussed on a case by case >> basis. To give an example: an area where I am supporting re-factoring >> is the blast parser. The package name is confusing and we probably >> need some code changes to expose more details of the parser. Are you >> thinking of any other situations where you think breaking backwards >> compatibility will be inevitable? 
> > Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). > > My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). > > cheers, > Richard > >> Andreas >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev > > -- > Richard Holland, BSc MBCS > Operations and Delivery Director, Eagle Genomics Ltd > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev From jacobsen at ebi.ac.uk Tue Aug 25 04:45:52 2009 From: jacobsen at ebi.ac.uk (Jules Jacobsen) Date: Tue, 25 Aug 2009 09:45:52 +0100 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> Message-ID: <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> I think Mark has a good point here - there are certain aspects of BioJava which are considered to be un-necessarily over-complicated and these things have been deal-breakers for the people concerned - I remember a couple of 
cases from the EBI where they have implemented their own system instead of using and supporting BioJava. Fixing areas of confusion, simplifying and moving forwards without maintaining backwards-compatibility might be a good idea for increasing user numbers and elevating the general perception of the project, whilst potentially risking alienating some existing users. I think his idea of maintaining compatibility within point releases and stating that full version releases may not have backwards compatibility would make it clearer for users as to what to expect from a release. It may also help the developers stay on track with the task and general design focus for that release by constraining them to the current system during a point release whilst highlighting confusing areas which can be dealt with in a more satisfactory manner in the next full release. Jules On Tue, Aug 25, 2009 at 7:58 AM, Mark Schreiber wrote: > I would agree with Richard on this. I think the changes being proposed > are not compatible with the current API. There are a couple of things > wrong with the current model (such as the Feature, Strand, Location > issues). There are also several areas where best-practices of the past > (parts of BioJava are 10 years old) are not considered best practices > now (some like Singletons are often thought of as anti-patterns these > days). > > Add to that the fact that we have never been truly backwards > compatible (except maybe 1.3 and 1.3.1?) and I think we can > justifiably try and avoid the claim that BJ1.7 should be backwards > compatible. We can continue to make older Jars available for people > who need them although most likely people who have a need for legacy > support already have the Jars that they need bundled up with their > apps. Shared libraries have very much fallen out of favor in recent > years in almost all languages and system wide classpaths are asking > for trouble. 
Hard-drives are cheap so it is no big deal to have a > dedicated version of the BioJava jar bundled with each app that needs > it. > > We could adopt the idea that backwards compatible builds get > minor-version numbers e.g. 1.1 while other builds get major version > numbers. I guess this would mean we are at BioJava 7? > > Backwards compatibility would be great to have but not if the effort > required hinders innovation. > > - Mark > > On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland > wrote: >>> >>> >>> What I mean is that we should try not to disrupt things as much as is >>> reasonable. I am all for a pragmatic approach. While trying to be >>> conservative I guess refactoring should be discussed on a case by case >>> basis. To give an example: an area where I am supporting re-factoring >>> is the blast parser. The package name is confusing and we probably >>> need some code changes to expose more details of the parser. Are you >>> thinking of any other situations where you think breaking backwards >>> compatibility will be inevitable? >> >> Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). >> >> My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). 
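[Editorial sketch] The compatibility-module idea quoted above (new objects wrapped so that they satisfy the old interface declarations) might look roughly like the following. This is only an illustration under stated assumptions: every type here (Strand, NewLocation, NewFeature, LegacyStrandedFeature, LegacyFeatureFacade) is hypothetical and is not part of any BioJava release. It assumes a new model in which the strand lives on the location, and an old-style interface in which callers ask the feature for its strand.

```java
// Hypothetical types for illustration only; none of these are BioJava classes.

// Strand as a simple enum in the assumed new model.
enum Strand { POSITIVE, NEGATIVE, UNKNOWN }

// Assumed new-style location: the strand is a property of the location.
final class NewLocation {
    private final int min;
    private final int max;
    private final Strand strand;

    NewLocation(int min, int max, Strand strand) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
        this.strand = strand;
    }

    int getMin() { return min; }
    int getMax() { return max; }
    Strand getStrand() { return strand; }
}

// Assumed new-style feature: it has a location but no strand of its own.
final class NewFeature {
    private final String type;
    private final NewLocation location;

    NewFeature(String type, NewLocation location) {
        this.type = type;
        this.location = location;
    }

    String getType() { return type; }
    NewLocation getLocation() { return location; }
}

// Stand-in for the shape of an old API, where the strand is asked of the feature.
interface LegacyStrandedFeature {
    String getType();
    int getMin();
    int getMax();
    Strand getStrand();
}

// Wrapper living in a separate compatibility module: it presents a new-style
// feature through the legacy interface by delegating every call.
final class LegacyFeatureFacade implements LegacyStrandedFeature {
    private final NewFeature delegate;

    LegacyFeatureFacade(NewFeature delegate) {
        this.delegate = delegate;
    }

    @Override public String getType() { return delegate.getType(); }
    @Override public int getMin() { return delegate.getLocation().getMin(); }
    @Override public int getMax() { return delegate.getLocation().getMax(); }

    // The legacy caller asks the feature; the answer comes from the location.
    @Override public Strand getStrand() { return delegate.getLocation().getStrand(); }
}
```

With a wrapper of this kind, legacy code would keep compiling against the old interface while new code uses the new model directly; the cost is one delegating class per legacy interface that has to be kept alive.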
>> >> cheers, >> Richard >> >>> Andreas >>> >>> _______________________________________________ >>> biojava-dev mailing list >>> biojava-dev at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> -- >> Richard Holland, BSc MBCS >> Operations and Delivery Director, Eagle Genomics Ltd >> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com >> http://www.eaglegenomics.com/ >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > Jules Jacobsen UniProt-PDB Integration EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UK From andreas at sdsc.edu Tue Aug 25 13:36:45 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 25 Aug 2009 10:36:45 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> Message-ID: <59a41c430908251036s616ab5f3m825d95223e758d85@mail.gmail.com> I agree with all that has been said so far. The Sequence/Feature model is definitely not good enough and well, also does not work for protein structures. (There can be alternate positions and the numbering can be non-sequential and have negative positions.) Still the question is, do we need to throw away the backwards compatibility? 
The new modularization will allow a plug and play architecture and we could easily have two generations of code in different modules. That way legacy code could depend on the older "core" (perhaps we should find a different name) while newly written code will be based on biojava-sequence, which would contain Richard's new code. That way we could prepare the code for the future, while still embracing the past. One example that heavily uses the Sequence and Distributions APIs is NestedMica. It is a pretty cool machine learning software and I was hoping that we could bring that closer to biojava. (a machine learning module in BJ would be cool, no?) Andreas On Tue, Aug 25, 2009 at 1:45 AM, Jules Jacobsen wrote: > I think Mark has a good point here - there are certain aspects of > BioJava which are considered to be un-necessarily over-complicated and > these things have been deal-breakers for the people concerned - I > remember a couple of cases from the EBI where they have implemented > their own system instead of using and supporting BioJava. > > Fixing areas of confusion, simplifying and moving forwards without > maintaining backwards-compatibility might be a good idea for > increasing user numbers and elevating the general perception of the > project, whilst potentially risking alienating some existing users. > > I think his idea of maintaining compatibility within point releases > and stating that full version releases may not have backwards > compatibility would make it clearer for users as to what to expect > from a release. It may also help the developers stay on track with the > task and general design focus for that release by constraining them to > the current system during a point release whilst highlighting > confusing areas which can be dealt with in a more satifsfactory manner > in the next full release. > > ?Jules > > On Tue, Aug 25, 2009 at 7:58 AM, Mark Schreiber wrote: >> I would agree with Richard on this. 
I think the changes being proposed >> are not compatible with the current API. There are a couple of things >> wrong with the current model (such as the Feature, Strand, Location >> issues). There are also several areas where best-practices of the past >> (parts of BioJava are 10 years old) are not considered best practices >> now (some like Singletons are often thought of as anti-patterns these >> days). >> >> Add to that the fact that we have never been truly backwards >> compatible (expept maybe 1.3 and 1.3.1 ?) ?and I think we can >> justifiably try and avoid the claim that BJ1.7 should be backwards >> compatible. ?We can continue to make older Jars available for people >> who need them although most likely people who have a need for legacy >> support already have the Jars that they need bundled up with their >> apps. Shared libraries have very much fallen out of favor in recent >> years in almost all languages and system wide classpaths are asking >> for trouble. ?Hard-drives are cheap so it is no big deal to have a >> dedicated version of the BioJava jar bundled with each app that needs >> it. >> >> We could adopt the idea that backwards compatible builds get >> minor-version numbers eg 1.1 while other builds get major version >> numbers. I guess this would mean we are at BioJava 7 ? >> >> Backwards compatibility would be great to have but not if the effort >> required hinders innovation. >> >> - Mark >> >> On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland >> wrote: >>>> >>>> >>>> What I mean is that we should try not to disrupt things as much as is >>>> reasonable. I am all for a pragmatic approach. While trying to be >>>> conservative I guess refactoring should be discussed on a case by case >>>> basis. To give an example: an area where I am supporting re-factoring >>>> is the blast parser. The package name is confusing and we probably >>>> need some code changes to expose more details of the parser. 
Are you >>>> thinking of any other situtations, where you think breaking backwards >>>> compatibility will be inevitable? >>> >>> Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). >>> >>> My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). >>> >>> cheers, >>> Richard >>> >>>> Andreas >>>> >>>> _______________________________________________ >>>> biojava-dev mailing list >>>> biojava-dev at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >>> >>> -- >>> Richard Holland, BSc MBCS >>> Operations and Delivery Director, Eagle Genomics Ltd >>> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com >>> http://www.eaglegenomics.com/ >>> >>> _______________________________________________ >>> biojava-dev mailing list >>> biojava-dev at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> > > > > Jules Jacobsen > > UniProt-PDB Integration > EMBL-EBI > Wellcome Trust Genome Campus > Hinxton > Cambridge > CB10 1SD > UK > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > From cmasak at gmail.com Wed Aug 26 09:29:24 2009 From: cmasak at gmail.com (=?ISO-8859-1?Q?Carl_M=E4sak?=) Date: Wed, 26 Aug 2009 15:29:24 +0200 Subject: [Biojava-dev] [BUG] Infinite regress when 
calling DNATools.createDNASequence with a DNA string containing a '~' char
Message-ID: <16d769b70908260629m15512bc4tb8798d41d53fad0f@mail.gmail.com>

Hello,

Two things:

1. The BioJava wiki links to a Bugzilla instance, saying bugs should be
posted there ([1]). As I write this, that Bugzilla instance gives a 500
Internal Server Error ([2]).

[1]
[2]

2. In the face of this, I hope you don't mind I leave my bug report
here for the time being.

We're wrapping BioJava in the Bioclipse project. We've found what
appears to be a logical bug causing an infinite regress and a stack
overflow.

Let's call DNATools.createDNASequence("~", ""). The following code in
that method (org/biojava/bio/seq/DNATools.java:188) will be executed.

  public static Sequence createDNASequence(String dna, String name)
  throws IllegalSymbolException {
    //should I be calling createGappedDNASequence?
    if(dna.indexOf('-') != -1 || dna.indexOf('~') != -1){//there is a gap
        return createGappedDNASequence(dna, name);
    }

The following code in createGappedDNASequence (DNATools.java:207) will
be executed:

  /** Get a new dna as a GappedSequence */
  public static GappedSequence createGappedDNASequence(String dna, String name)
  throws IllegalSymbolException{
    String dna1 = dna.replaceAll("-", "");
    Sequence dnaSeq = createDNASequence(dna1, name);

The infinite regress is caused by these two methods calling each
other, for ever. There is no bottoming-out, because none of these
lines removes '~' characters.

We experience this problem in Biojava 1.6, but the above code and line
numbers are from 1.7, where the issue remains.
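
For illustration only, here is a minimal standalone sketch of the call
cycle described above and of one possible way to break it. It does not
use BioJava: plain strings stand in for Sequence and GappedSequence, and
the names GapRegressSketch, hasGap, stripGaps, createPlain and
createGapped are hypothetical. The assumption is that the gapped path
should remove every character that the gap check recognises ('-' and
'~') before delegating back; whether that is the fix the maintainers
would choose is not stated in this thread.

```java
// Hypothetical stand-in for the two DNATools methods quoted above.
// Plain strings replace the BioJava types so this runs without the library.
class GapRegressSketch {

    // Mirrors the quoted gap check: both '-' and '~' count as gaps.
    static boolean hasGap(String dna) {
        return dna.indexOf('-') != -1 || dna.indexOf('~') != -1;
    }

    // Assumed fix: strip every character that hasGap() treats as a gap,
    // so the delegated call can never be routed back to the gapped path.
    static String stripGaps(String dna) {
        return dna.replace("-", "").replace("~", "");
    }

    // Stand-in for createDNASequence: delegates when a gap is present.
    static String createPlain(String dna) {
        if (hasGap(dna)) {
            return createGapped(dna);
        }
        return dna;
    }

    // Stand-in for createGappedDNASequence: strips gaps, then delegates back.
    // (The real method would also re-insert the gaps; omitted here.)
    static String createGapped(String dna) {
        return createPlain(stripGaps(dna));
    }

    public static void main(String[] args) {
        System.out.println(createPlain("AC-G~T")); // prints ACGT
        System.out.println(createPlain("~").isEmpty()); // prints true
    }
}
```

With only the '-' removed, as in the quoted code, createPlain("~") would
bounce between the two methods until the stack overflows; stripping both
characters guarantees the second call sees no gap.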
Regards,
// Carl Mäsak

From heuermh at acm.org Thu Aug 27 13:01:31 2009
From: heuermh at acm.org (Michael Heuer)
Date: Thu, 27 Aug 2009 13:01:31 -0400 (EDT)
Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules
In-Reply-To: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com>
Message-ID:

Andreas Prlic wrote:

> Here a list of modules / action items and the people that I would propose to
> become module leaders:
> ...
>
> Module: biojava-sequencing Lead: Michael Heuer
> - support FastQ files
> - support parsing of output for various new sequencing machines

I have volunteered on the open-bio mailing list to implement FASTQ
support. A nice collection of test data is being created in collaboration
with the other open-bio projects. If anyone has interest in a particular
data set, please let me know, as I will also need data for performance
tuning.

   michael

From andreas at sdsc.edu Thu Aug 27 13:30:08 2009
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 27 Aug 2009 10:30:08 -0700
Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules
In-Reply-To:
References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com>
Message-ID: <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com>

Great, thanks for "volunteering", Michael.

To add another Module:

biojava-das : Lead: Jonathan Warren
probably deprecate the old DAS code in BJ and replace it with
the up to date Dasobert library

Thanks to Jonathan for volunteering as well.

Andreas

On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote:
> Andreas Prlic wrote:
>
>> Here a list of modules / action items and the people that I would propose to
>> become module leaders:
>> ...
>>
>> Module: biojava-sequencing Lead: Michael Heuer
>>   - support FastQ files
>>   - support parsing of output for various new sequencing machines
>
> I have volunteered on the open-bio mailing list to implement FASTQ
> support. A nice collection of test data is being created in collaboration
> with the other open-bio projects. If anyone has interest in a particular
> data set, please let me know, as I will also need data for performance
> tuning.
>
>    michael
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>

From markjschreiber at gmail.com Fri Aug 28 01:37:59 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Fri, 28 Aug 2009 13:37:59 +0800
Subject: [Biojava-dev] [Biojava-l] BioJava code freeze, modularization and action items for sub modules
In-Reply-To: <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com>
References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com>
Message-ID: <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com>

I'm happy to volunteer code for:

1. BLASTXML parser as long as I can change the ssbind APIs (other
parsers could go into a legacy module??). Actually I would prefer to
completely decouple from the sequence/ feature module as many people
would like a blast parser without the rest of biojava thrown in.

2. BioSQL/ JPA bindings. I have already generated JPA compliant entity
beans for mapping to BioSQL as well as JPA handler code that makes sure
modifications persist properly. Currently the object model very closely
follows the BioSQL table structure. Also the current beans are what
people call Anaemic beans in that they hold data and provide getters and
setters but no biological behaviour. I can easily provide bio-smarts to
the beans but it might be better to hold off until there is a module
that contains sequence/feature interfaces which the beans could
implement.

3. Happy to provide code for an enterprise module if there is sufficient interest.
This would probably take the form of SessionBeans and WebServices that
can be deployed to Glassfish/ JBoss etc to provide biological services
for people who want to make client server or SOA apps.

- Mark

On Fri, Aug 28, 2009 at 1:30 AM, Andreas Prlic wrote:

> Great, thanks for "volunteering", Michael.
>
> To add another Module:
>
> biojava-das : Lead: Jonathan Warren
> probably deprecate the old DAS code in BJ and replace it with
> the up to date Dasobert library
>
> Thanks to Jonathan for volunteering as well.
>
> Andreas
>
>
>
>
> On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote:
> > Andreas Prlic wrote:
> >
> >> Here a list of modules / action items and the people that I would
> propose to
> >> become module leaders:
> >> ...
> >>
> >> Module: biojava-sequencing Lead: Michael Heuer
> >> - support FastQ files
> >> - support parsing of output for various new sequencing machines
> >
> > I have volunteered on the open-bio mailing list to implement FASTQ
> > support. A nice collection of test data is being created in
> collaboration
> > with the other open-bio projects. If anyone has interest in a particular
> > data set, please let me know, as I will also need data for performance
> > tuning.
> >
> > michael
> >
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From andreas at sdsc.edu Fri Aug 28 11:10:03 2009
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 28 Aug 2009 08:10:03 -0700
Subject: [Biojava-dev] [Biojava-l] BioJava code freeze, modularization and action items for sub modules
In-Reply-To: <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com>
References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com> <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com>
Message-ID: <59a41c430908280810s1720cfckbc36168f2fbc73a8@mail.gmail.com>

Thanks, Mark.

Guess we should start collecting all this info on a wiki page. I
started to edit http://biojava.org/wiki/BioJava:Modules

module leaders: feel free to edit the plans for your module...

Andreas

On Thu, Aug 27, 2009 at 10:37 PM, Mark Schreiber wrote:
> I'm happy to volunteer code for:
>
> BLASTXML parser as long as I can change the ssbind APIs (other parsers could
> go into a legacy module??). Actually I would prefer to completely decouple
> from the sequence/ feature module as many people would like a blast parser
> without the rest of biojava thrown in.
> BioSQL/ JPA bindings. I have already generated JPA compliant entity beans
> for mapping to BioSQL as well as JPA handler code that makes sure
> modifications persist properly. Currently the object model very closely
> follows the BioSQL table structure. Also the current beans are what people
> call Anaemic beans in that they hold data and provide getters and setters
> but no biological behaviour. I can easily provide bio-smarts to the beans
> but it might be better to hold off until there is a module that contains
> sequence/feature interfaces which the beans could implement.
> Happy to provide code for an enterprise module if there is sufficient
> interest. This would probably take the form of SessionBeans and WebServices
> that can be deployed to Glassfish/ JBoss etc to provide biological services
> for people who want to make client server or SOA apps.
>
> - Mark
>
>
> On Fri, Aug 28, 2009 at 1:30 AM, Andreas Prlic wrote:
>>
>> Great, thanks for "volunteering", Michael.
>>
>> To add another Module:
>>
>> biojava-das : Lead: Jonathan Warren
>> probably deprecate the old DAS code in BJ and replace it with
>> the up to date Dasobert library
>>
>> Thanks to Jonathan for volunteering as well.
>>
>> Andreas
>>
>>
>>
>>
>> On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote:
>> > Andreas Prlic wrote:
>> >
>> >> Here a list of modules / action items and the people that I would
>> >> propose to
>> >> become module leaders:
>> >> ...
>> >>
>> >> Module: biojava-sequencing Lead: Michael Heuer
>> >>   - support FastQ files
>> >>   - support parsing of output for various new sequencing machines
>> >
>> > I have volunteered on the open-bio mailing list to implement FASTQ
>> > support. A nice collection of test data is being created in
>> > collaboration
>> > with the other open-bio projects. If anyone has interest in a
>> > particular
>> > data set, please let me know, as I will also need data for performance
>> > tuning.
>> >
>> >   michael
>> >
>> > _______________________________________________
>> > biojava-dev mailing list
>> > biojava-dev at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
>> >
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>

From andreas at sdsc.edu Sun Aug 30 21:23:03 2009
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 30 Aug 2009 18:23:03 -0700
Subject: [Biojava-dev] maven progress
Message-ID: <59a41c430908301823s6e2e3d7fi6caffc47e1a8c0ff@mail.gmail.com>

Hi,

I started to split up biojava into submodules and am mavenizing the
build process. The new SVN location is emerging here:

http://dev.open-bio.org/home/svn-repositories/biojava/biojava-live/biojava

or in your browser:

http://code.open-bio.org/svnweb/index.cgi/biojava/browse/biojava-live/biojava

A few questions so far from my side.

1) bytecode.jar: at the present the core module depends on this. So far
it is in the /jars subfolder of the module and needs to be installed by
hand. What is the best way to deal with this in SVN?

2) Sequence module (Richard's original biojava v.3 branch)
Since this consists of sub-modules I have set it up as a few
hierarchically organized submodules. There is some biosql code there as
well. Richard/Mark not sure how to arrange this. I think it would be
good to have a biosql module. Shall I refactor the current biosql code
out of core into a new biosql module or will the current code be
obsoleted and replaced with the new code in the sequence module?

Andreas

From gmicha at gmail.com Sat Aug 1 15:49:50 2009
From: gmicha at gmail.com (Micha Sammeth)
Date: Sat, 01 Aug 2009 17:49:50 +0200
Subject: [Biojava-dev] apidoc in org.biojava.bio.symbol.SimpleSymbolList
Message-ID: <4A74641E.80104@gmail.com>

Hi,

the class header in my copy (1.7) contains the example

..
FiniteAlphabet dna = (FiniteAlphabet) AlphabetManager.alphabetForName("DNA"); SymbolParser parser = dna.getParser("token"); .. but the version I check out from the CVS does not contain a method FiniteAlphabet.getParser(). I think it should read parser = dna.getTokenization("token"); right? Just wanted to bring to attention.. Best, micha. From bugzilla-daemon at portal.open-bio.org Sun Aug 2 17:31:09 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 2 Aug 2009 13:31:09 -0400 Subject: [Biojava-dev] [Bug 2540] RichSequenceIterator does not skip sequence when exception is thrown In-Reply-To: Message-ID: <200908021731.n72HV9W4010985@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2540 ------- Comment #1 from vdmerwe.karen at gmail.com 2009-08-02 13:31 EST ------- Created an attachment (id=1352) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1352&action=view) Code to make the RichSequenceIterator skip sequence when exception is thrown Any feedback regarding the use of this proposed solution will be appreciated. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From gmicha at gmail.com Sun Aug 2 19:28:10 2009 From: gmicha at gmail.com (Micha Sammeth) Date: Sun, 02 Aug 2009 21:28:10 +0200 Subject: [Biojava-dev] Sequence and Feature Message-ID: <4A75E8CA.3040904@gmail.com> Hi, I am writing a parser for aligned sequencing reads and I plan to separate the read information (sequence, qualities) from the alignment information by reasons of redundancy and sortings. I planned the following classes: Read extends AbstractChangeable implements Sequence, Qualitative Alignment extends AbstractChangeable implements Feature Alignment I put directly as inner class of Read, to delegate the Feature.getSequence() directly via the outer Object. 
I also have sort of alignment groups which are inserted as additional Feature in between these two, but I think for the sketched toy example they are not important. One doubt is: Alignment links a subpart of the read with a subpart of the genomic sequence, which is big and probably I will never hold an instance of it. So, getSequence() here refers to the subpart of the read that gets aligned and I have a couple of custom attributes that annotate the location in the genome. Is this in the philosophy of the class hierachy design? It would be nice if someone with a bit more experience in Biojava could leave a comment if I go the right direction, or if there is a more natural way to get my hierachy into biojava. Thanks and cheers! micha. From holland at eaglegenomics.com Mon Aug 3 08:01:57 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 3 Aug 2009 09:01:57 +0100 Subject: [Biojava-dev] Sequence and Feature In-Reply-To: <4A75E8CA.3040904@gmail.com> References: <4A75E8CA.3040904@gmail.com> Message-ID: <2DEC4F45-25E2-497B-A0E7-100A2AD1693C@eaglegenomics.com> Yes, Feature.getSequence() is intended only to return the sequence of the feature itself - so it would be fine not to store the whole genomic sequence, and instead just store locations referring to it. Have you looked into the existing Alignment classes in BioJava? They might be of some help to you. cheers, Richard On 2 Aug 2009, at 20:28, Micha Sammeth wrote: > Hi, > > I am writing a parser for aligned sequencing reads and I plan to > separate the read information (sequence, qualities) from the > alignment information by reasons of redundancy and sortings. > > I planned the following classes: > > Read extends AbstractChangeable implements Sequence, Qualitative > > Alignment extends AbstractChangeable implements Feature > > Alignment I put directly as inner class of Read, to delegate the > Feature.getSequence() directly via the outer Object. 
I also have > sort of alignment groups which are inserted as additional Feature in > between these two, but I think for the sketched toy example they are > not important. > > One doubt is: Alignment links a subpart of the read with a subpart > of the genomic sequence, which is big and probably I will never hold > an instance of it. So, getSequence() here refers to the subpart of > the read that gets aligned and I have a couple of custom attributes > that annotate the location in the genome. Is this in the philosophy > of the class hierachy design? > > It would be nice if someone with a bit more experience in Biojava > could leave a comment if I go the right direction, or if there is a > more natural way to get my hierachy into biojava. > > Thanks and cheers! > > micha. > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From holland at eaglegenomics.com Mon Aug 3 11:51:19 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 3 Aug 2009 12:51:19 +0100 Subject: [Biojava-dev] Hackathon update Message-ID: Hi guys, 10 people responded (including me). 5 of those are in Cambridge, UK, 3 are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the hackathon with a holiday, and 3 suggested linking the hackathon with a conference, which would almost certainly increase chances of getting funding for travel/accommodation from employers. So, I have two options. Venues in both cases to be worked out later: 1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle of the winter in the UK, but on the bright side, the Cambridge Winter Beer Festival runs from the 22nd-24th, so that's something to cheer you up at the end of the hackathon. 2. 
Boston, USA, July 5th-8th 2010 (immediately before BOSC which is 9th-10th (TBC), then ISMB which is 11th-14th). Both have pros and cons - the Cambridge meeting means 50% of the delegates could attend for free and we might even be able to get a free venue, whereas the Boston meeting would be attractive to anyone already planning to attend BOSC or ISMB who might otherwise not be able to find funding for travel. I'm going to stick my neck out and suggest that BOSC/ISMB is the better choice, simply because of the wider range of potential delegates to attend the hackathon. We could always have a Cambridge mini-meeting at some other time. So, unless anyone objects, pencil in your diary for July 5th-8th in Boston. Please could all those interested vote yes or no for this plan so that I can find a suitably sized venue. Attendance will need to be confirmed by the date the venue sets for final booking/payment. cheers, Richard -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From holland at eaglegenomics.com Mon Aug 3 13:29:17 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 3 Aug 2009 14:29:17 +0100 Subject: [Biojava-dev] Hackathon update In-Reply-To: References: Message-ID: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com> Good plan - my worry is whether or not people can get 2 weeks off in the same year for the purposes of a hackathon. But, if people are willing, I'm happy to set up both. It does mean extra cost in terms of venue hire etc. - do you have any ideas as to good sponsors? On 3 Aug 2009, at 14:10, Scooter Willis wrote: > Richard > > It probably wouldn?t hurt to try and do both. Waiting a year delays > getting started and because the two events are six months apart it > increases the odds of those who may be able to attend both. 
This way > at BOSC/ISMB we can have good momentum and stability for the current > modules. The BOSC/ISMB can then be focused on recruiting new > developers with a focus on new modules, code examples, docs etc. > > It also probably makes sense to try and identify/recruit Java based > bioinformatics open source applications that have needed or > interesting functionality to ?biojava? enable the algorithm of the > application. This could be a good theme for the BOSC/ISMB conference > to have current Biojava developers work with developers of other > java bioinformatics application to port key functionality so that it > works with Biojava core. > > Scooter > > > On 8/3/09 7:51 AM, "Richard Holland" > wrote: > > Hi guys, > > 10 people responded (including me). 5 of those are in Cambridge, UK, 3 > are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the > hackathon with a holiday, and 3 suggested linking the hackathon with a > conference, which would almost certainly increase chances of getting > funding for travel/accommodation from employers. > > So, I have two options. Venues in both cases to be worked out later: > > 1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle > of the winter in the UK, but on the bright side, the Cambridge Winter > Beer Festival runs from the 22nd-24th, so that's something to cheer > you up at the end of the hackathon. > > 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is > 9th-10th (TBC), then ISMB which is 11th-14th). > > Both have pros and cons - the Cambridge meeting means 50% of the > delegates could attend for free and we might even be able to get a > free venue, whereas the Boston meeting would be attractive to anyone > already planning to attend BOSC or ISMB who might otherwise not be > able to find funding for travel. > > I'm going to stick my neck out and suggest that BOSC/ISMB is the > better choice, simply because of the wider range of potential > delegates to attend the hackathon. 
We could always have a Cambridge > mini-meeting at some other time. So, unless anyone objects, pencil in > your diary for July 5th-8th in Boston. > > Please could all those interested vote yes or no for this plan so that > I can find a suitably sized venue. Attendance will need to be > confirmed by the date the venue sets for final booking/payment. > > cheers, > Richard > > -- > Richard Holland, BSc MBCS > Operations and Delivery Director, Eagle Genomics Ltd > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From markjschreiber at gmail.com Mon Aug 3 16:38:32 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Tue, 4 Aug 2009 00:38:32 +0800 Subject: [Biojava-dev] Hackathon update In-Reply-To: References: Message-ID: <93b45ca50908030938j7899572et780fd2ccd0f2f417@mail.gmail.com> Boston++ On 3 Aug 2009, 8:52 PM, "Richard Holland" wrote: Hi guys, 10 people responded (including me). 5 of those are in Cambridge, UK, 3 are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the hackathon with a holiday, and 3 suggested linking the hackathon with a conference, which would almost certainly increase chances of getting funding for travel/accommodation from employers. So, I have two options. Venues in both cases to be worked out later: 1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle of the winter in the UK, but on the bright side, the Cambridge Winter Beer Festival runs from the 22nd-24th, so that's something to cheer you up at the end of the hackathon. 2. 
Boston, USA, July 5th-8th 2010 (immediately before BOSC which is 9th-10th (TBC), then ISMB which is 11th-14th). Both have pros and cons - the Cambridge meeting means 50% of the delegates could attend for free and we might even be able to get a free venue, whereas the Boston meeting would be attractive to anyone already planning to attend BOSC or ISMB who might otherwise not be able to find funding for travel. I'm going to stick my neck out and suggest that BOSC/ISMB is the better choice, simply because of the wider range of potential delegates to attend the hackathon. We could always have a Cambridge mini-meeting at some other time. So, unless anyone objects, pencil in your diary for July 5th-8th in Boston. Please could all those interested vote yes or no for this plan so that I can find a suitably sized venue. Attendance will need to be confirmed by the date the venue sets for final booking/payment. cheers, Richard -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ _______________________________________________ biojava-dev mailing list biojava-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-dev From andreas at sdsc.edu Tue Aug 4 06:09:37 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 3 Aug 2009 23:09:37 -0700 Subject: [Biojava-dev] Hackathon update In-Reply-To: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com> References: <0BD11B39-1695-4C07-9695-20D095172A9C@eaglegenomics.com> Message-ID: <59a41c430908032309l7b380c92hf018c12d38dd566f@mail.gmail.com> Hi Richard, I think it is a great idea to plan a hackaton prior to next BOSC. Still this is still almost a year ahead and as such a long time away. Ideally I would like to have something already earlier than that... 
San Diego is far away from the UK, but I would be happy to organize and host something here, if people would be up for the longish-journey... Andreas On Mon, Aug 3, 2009 at 6:29 AM, Richard Holland wrote: > Good plan - my worry is whether or not people can get 2 weeks off in the > same year for the purposes of a hackathon. > > But, if people are willing, I'm happy to set up both. It does mean extra > cost in terms of venue hire etc. - do you have any ideas as to good > sponsors? > > > On 3 Aug 2009, at 14:10, Scooter Willis wrote: > > Richard >> >> It probably wouldn?t hurt to try and do both. Waiting a year delays >> getting started and because the two events are six months apart it increases >> the odds of those who may be able to attend both. This way at BOSC/ISMB we >> can have good momentum and stability for the current modules. The BOSC/ISMB >> can then be focused on recruiting new developers with a focus on new >> modules, code examples, docs etc. >> >> It also probably makes sense to try and identify/recruit Java based >> bioinformatics open source applications that have needed or interesting >> functionality to ?biojava? enable the algorithm of the application. This >> could be a good theme for the BOSC/ISMB conference to have current Biojava >> developers work with developers of other java bioinformatics application to >> port key functionality so that it works with Biojava core. >> >> Scooter >> >> >> >> On 8/3/09 7:51 AM, "Richard Holland" wrote: >> >> Hi guys, >> >> 10 people responded (including me). 5 of those are in Cambridge, UK, 3 >> are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the >> hackathon with a holiday, and 3 suggested linking the hackathon with a >> conference, which would almost certainly increase chances of getting >> funding for travel/accommodation from employers. >> >> So, I have two options. Venues in both cases to be worked out later: >> >> 1. Cambridge, UK, January 18th-22nd 2010. 
I know this is the middle >> of the winter in the UK, but on the bright side, the Cambridge Winter >> Beer Festival runs from the 22nd-24th, so that's something to cheer >> you up at the end of the hackathon. >> >> 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is >> 9th-10th (TBC), then ISMB which is 11th-14th). >> >> Both have pros and cons - the Cambridge meeting means 50% of the >> delegates could attend for free and we might even be able to get a >> free venue, whereas the Boston meeting would be attractive to anyone >> already planning to attend BOSC or ISMB who might otherwise not be >> able to find funding for travel. >> >> I'm going to stick my neck out and suggest that BOSC/ISMB is the >> better choice, simply because of the wider range of potential >> delegates to attend the hackathon. We could always have a Cambridge >> mini-meeting at some other time. So, unless anyone objects, pencil in >> your diary for July 5th-8th in Boston. >> >> Please could all those interested vote yes or no for this plan so that >> I can find a suitably sized venue. Attendance will need to be >> confirmed by the date the venue sets for final booking/payment. 
>> >> cheers, >> Richard >> >> -- >> Richard Holland, BSc MBCS >> Operations and Delivery Director, Eagle Genomics Ltd >> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com >> http://www.eaglegenomics.com/ >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> > -- > Richard Holland, BSc MBCS > Operations and Delivery Director, Eagle Genomics Ltd > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > From bugzilla-daemon at portal.open-bio.org Tue Aug 4 17:28:58 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 4 Aug 2009 13:28:58 -0400 Subject: [Biojava-dev] [Bug 2540] RichSequenceIterator does not skip sequence when exception is thrown In-Reply-To: Message-ID: <200908041728.n74HSwfd027233@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2540 vdmerwe.karen at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #1352 is|0 |1 obsolete| | ------- Comment #2 from vdmerwe.karen at gmail.com 2009-08-04 13:28 EST ------- Created an attachment (id=1356) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1356&action=view) Updated the previous solution -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From florian.mittag at uni-tuebingen.de Wed Aug 5 12:45:41 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Wed, 5 Aug 2009 14:45:41 +0200 Subject: [Biojava-dev] How to parse large Genbank files? 
In-Reply-To: References: <200907241929.08768.florian.mittag@uni-tuebingen.de> <200907281414.55156.florian.mittag@uni-tuebingen.de> Message-ID: <200908051445.42345.florian.mittag@uni-tuebingen.de> On Tuesday, 28. July 2009 14:52, Richard Holland wrote: > > Btw: Should we move this to Biojava-dev? >> probably, yes! :) done ;) > If you want to explore my ideas for a replacement Sequence model, the > code and docs are here (sequence handling is in the 'core' module with > DNA-specifics in the 'dna' module): > > http://biojava.org/wiki/BioJava3:HowTo > http://www.biojava.org/wiki/BioJava3_project > > (Methods such as file parsers would request Strings (or ideally > CharSequence - more flexible, and String extends it) as parameters > whenever they don't care about content - if they care about content > but don't care in advance about size or random access then they should > request Iterator which can be used to wrap a String and parse > on demand, and if they need full functionality then they should > request List which the default implementation of uses > ArrayLists but there's no reason a String-backed one could be written > as well). By now, I was mostly interested in a quick and dirty solution. I first attempted to create a new class StringSymbolList that would use the String as representation for the sequence and only convert to Symbols on demand. Since SimpleRichSequence uses SimpleSymbolList hard-coded, I wanted to implement a new RichSequence as well, but I was back-stabbed by Hibernate, because the bindings are set to SimpleRichSequence and when retrieving objects from the DB it uses the original BioJava classes again My solution now works and it consists out of my own implementation of GenbankFormat, RichSequenceBuilder, and RichSequence, a new class called StringSymbolList as described above and a change to SimpleRichSequence, adding the method: @Override public String seqString() { return seqstring; } which circumvents most of the array copying stuff. 
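A minimal sketch of the String-backed idea described above, written against plain java.util rather than the BioJava API. Everything here is an assumption made for illustration: the class name StringBackedList is hypothetical, and Character merely stands in for BioJava's Symbol type. It is not the StringSymbolList implementation mentioned in this message, only the general technique of keeping one String and converting on demand.

```java
import java.util.AbstractList;

/**
 * Hypothetical illustration only: a read-only list view over a String that
 * converts characters to "symbols" on access instead of copying them into
 * an array up front. Character stands in for BioJava's Symbol type.
 */
final class StringBackedList extends AbstractList<Character> {
    private final String seq; // the only copy of the sequence data

    StringBackedList(String seq) {
        this.seq = seq;
    }

    @Override
    public Character get(int index) {
        // Conversion happens per access; nothing is materialized in advance.
        return seq.charAt(index);
    }

    @Override
    public int size() {
        return seq.length();
    }

    /** Returns the backing string directly, with no copying or conversion. */
    String seqString() {
        return seq;
    }
}
```

With this shape, subList(from, to) is inherited from AbstractList and returns a view rather than a copy, so slicing a large sequence does not duplicate it either.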
I also noticed that processing the Genbank files became slower with every file, so I closed the Hibernate session after each chromosome and opened a new one. (I also tried session.clean(), but somehow this didn't work). For now, it seems like everything is fine and I have no more OutOfMemory exceptions. - Florian > > cheers, > Richard > > > - Florian > > > >> On Mon, Jul 27, 2009 at 8:16 PM, Florian > >> > >> Mittag wrote: > >>> Hi Mark! > >>> > >>> On Saturday, 25. July 2009 04:20, Mark Schreiber wrote: > >>>> I don't think anyone has done much or anything to optimize these > >>>> parsers. The process you outline sounds extremely inefficient. It > >>>> is > >>>> also likely to lead to memory leaks due to the number of copy > >>>> operations. > >>> > >>> I wouldn't necessarily say that it leads to memory leaks, but it > >>> definitively leads to a high memory consumption (2GB are not > >>> enough for a > >>> 200MB file). Also, my outline of the process is based on only 2 > >>> hours of > >>> viewing the code, so actually I expected to be corrected on this. > >>> Unfortunately, it seems like I did get the right idea and it IS > >>> extremely > >>> inefficient. > >>> > >>> I mean, I understand that this is a high level of abstraction that > >>> might > >>> come in handy in many situations, but it certainly is more of an > >>> obstacle > >>> in my specific case. > >>> > >>>> As always with java, don't try and optimize without a profiler > >>>> which > >>>> will tell you which methods are taking a long time and which > >>>> objects > >>>> take the most memory. > >>> > >>> I think we should continue this discussion on the biojava-dev list > >>> or in > >>> a private conversation, as it will probably get very detailed and > >>> technical. > >>> > >>> > >>> My question to this list again: > >>> Is there a way to achieve my goal of parsing a 200MB Genbank file > >>> with > >>> the current biojava version without code changes? 
> >>> > >>> > >>> - Florian > >>> > >>>> On 25 Jul 2009, 1:33 AM, "Florian Mittag" > >>>> wrote: > >>>> > >>>> Hi! > >>>> > >>>> I think this is a problem worth of its own thread, so I'll start > >>>> one: > >>>> > >>>> I want to store all human chromosomes in a BioSQL database after I > >>>> loaded the > >>>> information from .gbk files. The files I get from NCBI with the > >>>> following URIs, where the id ranges from nc_000001 to nc_000024 > >>>> plus > >>>> nc_001804: > >>>> > >>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id > >>>>=n c_0 00023&rettype=gbwithparts&retmode=text > >>>> > >>>> I then try to parse the files as described in > >>>> http://biojava.org/wiki/BioJava:BioJavaXDocs#Tools_for_reading.2Fwriti > >>>>ng _fi les but it wont work. While there are no problems parsing 1804 > >>>> and > >>>> 24, chromosome > >>>> 23 leads to a OutOfMemory exception although I gave it 2GB of heap > >>>> space. > >>>> > >>>> Here is a stack trace (the line numbers might differ, because I > >>>> already > >>>> tried > >>>> to improve GenbankFormat.java in memory efficiency): > >>>> > >>>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap > >>>> space > >>>> at > >>>> org > >>>> .biojava > >>>> .bio.seq.io.ChunkedSymbolListFactory.addSymbols(ChunkedSymbol > >>>> Lis tFactory.java:222) at > >>>> org > >>>> .biojavax > >>>> .bio.seq.io.SimpleRichSequenceBuilder.addSymbols(SimpleRichS > >>>> equ enceBuilder.java:256) at > >>>> org > >>>> .biojavax > >>>> .bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.jav > >>>> a:5 35) at > >>>> org > >>>> .biojavax > >>>> .bio.seq.io.RichStreamReader.nextRichSequence(RichStreamRead > >>>> er. java:110) at > >>>> org > >>>> .prodge > >>>> .sequence_viewer.db.UpdateDB_Main.updateChromosome(UpdateDB_Ma > >>>> in. 
java:537) at > >>>> org > >>>> .prodge > >>>> .sequence_viewer.db.UpdateDB_Main.newGenome(UpdateDB_Main.java > >>>> > >>>> :46 8) at > >>>> > >>>> org > >>>> .prodge.sequence_viewer.db.UpdateDB_Main.main(UpdateDB_Main.java: > >>>> 164) > >>>> > >>>> The line in GenbankFormat.java is: > >>>> > >>>> rlistener.addSymbols( > >>>> symParser.getAlphabet(), > >>>> (Symbol[])(sl.toList().toArray(new Symbol[0])), > >>>> 0, sl.length()); > >>>> > >>>> Sometimes it fails at the sl.toList().toArray()-part, sometimes > >>>> it fails > >>>> later > >>>> inside the addSymbols method, but it always fails. > >>>> > >>>> How can this be? I mean, the file is only 190MB in size, so 2GB of > >>>> memory should be more than enough. Browsing through the source > >>>> code, I > >>>> discovered what I think of as very inefficient handling of > >>>> sequences: > >>>> > >>>> 1) the sequence string is read from file into a StringBuffer > >>>> 2) it is converted to a string (with whitespaces removed) > >>>> 3) a SimpleSymbolList is created out of the string > >>>> 4) the SymbolList is converted to a List of Symbols > >>>> 5) the List is converted to an array of Symbols > >>>> 6) the array is passed to addSymbols > >>>> 7) there it is added to a ChunkedSymbolListFactory > >>>> 8) if at some point the sequence is requested, a SymbolList is > >>>> created > >>>> and then converted to a string. > >>>> > >>>> You see, there is a lot of copying and converting, but in the end > >>>> I have > >>>> the same string I started with. Well, I had the string, if it ever > >>>> reached the end, because it will crash before completing this > >>>> process. > >>>> > >>>> > >>>> Am I doing something wrong or is there a great potential of > >>>> improving > >>>> parsing > >>>> of Genbank files? 
> >>>> > >>>> > >>>> Regards, > >>>> Florian > >>>> _______________________________________________ > >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l > >>> > >>> -- > >>> Dipl. Inf. Florian Mittag > >>> Universität Tuebingen > >>> WSI-RA, Sand 1 > >>> 72076 Tuebingen, Germany > >>> Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 > > > > -- > > Dipl. Inf. Florian Mittag > > Universität Tuebingen > > WSI-RA, Sand 1 > > 72076 Tuebingen, Germany > > Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 -- Dipl. Inf. Florian Mittag Universität Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 From markjschreiber at gmail.com Wed Aug 5 13:16:03 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Wed, 5 Aug 2009 21:16:03 +0800 Subject: [Biojava-dev] How to parse large Genbank files? In-Reply-To: <200908051445.42345.florian.mittag@uni-tuebingen.de> References: <200907241929.08768.florian.mittag@uni-tuebingen.de> <200907281414.55156.florian.mittag@uni-tuebingen.de> <200908051445.42345.florian.mittag@uni-tuebingen.de> Message-ID: <93b45ca50908050616n210bd2a3u8391d9ad7114015a@mail.gmail.com> Would it be better for the biojava SimpleRichSequence to be backed by a String and do symbol operations on the fly? Alternatively the default hibernate mapping could be to a more stringy sequence. Arguably in the absence of JPA and entity beans Hibernate should probably be talking to biojava via DTOs. An efficient BioSQL loader would directly use the DTOs or Entity beans (which could implement biojava interfaces) and not go through all the symbol hassle. Might be worth considering for BJ3 - Mark On Aug 5, 2009 8:45 PM, "Florian Mittag" wrote: On Tuesday, 28. July 2009 14:52, Richard Holland wrote: > > Btw: Should we move this to Biojava-dev?... done ;) > If you want to explore my ideas for a replacement Sequence model, the > code and docs are here (...
By now, I was mostly interested in a quick and dirty solution. I first attempted to create a new class StringSymbolList that would use the String as representation for the sequence and only convert to Symbols on demand. Since SimpleRichSequence uses SimpleSymbolList hard-coded, I wanted to implement a new RichSequence as well, but I was back-stabbed by Hibernate, because the bindings are set to SimpleRichSequence and when retrieving objects from the DB it uses the original BioJava classes again My solution now works and it consists out of my own implementation of GenbankFormat, RichSequenceBuilder, and RichSequence, a new class called StringSymbolList as described above and a change to SimpleRichSequence, adding the method: @Override public String seqString() { return seqstring; } which circumvents most of the array copying stuff. I also noticed that processing the Genbank files became slower with every file, so I closed the Hibernate session after each chromosome and opened a new one. (I also tried session.clean(), but somehow this didn't work). For now, it seems like everything is fine and I have no more OutOfMemory exceptions. - Florian > > cheers, > Richard > > > - Florian > > > >> On Mon, Jul 27, 2009 at 8:16 PM, Florian > >> > >> ... > >>>>ng _fi les but it wont work. While there are no problems parsing 1804 > >>>> and > >>>> 24, chromosome > >>>> 23 leads to a OutOfMemory exception although I gave it 2GB o... -- Dipl. Inf. Florian Mittag Universit?t Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7... From florian.mittag at uni-tuebingen.de Wed Aug 5 15:41:24 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Wed, 5 Aug 2009 17:41:24 +0200 Subject: [Biojava-dev] Error loading Ontology with Hibernate Message-ID: <200908051741.24367.florian.mittag@uni-tuebingen.de> Hi, it's me again ;-) I'm really sorry to bother you with yet another problem, but I seem to attract those problems. 
When I parse Genbank files and store them in a BioSQL DB, all features like "gap", "mRNA", "gene", etc. are represented by newly created Terms in the ontology "biojavax" with the comment "autocreated by biojavax". I searched for an appropriate ontology and found the Sequence Ontology, which I loaded into the DB using BioPerl's load_ontology.pl I tried setting the default ontology using RichObjectBuilder.setDefaultOntology("sequence"), but when it comes to instantiation the SimpleRichSequenceBuilder, a multi-nested exception is thrown. I followed it in the code and found the cause in Hibernate: [SEVERE] (): illegal access to loading collection >> org.hibernate.LazyInitializationException: illegal access to loading collection at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:341) at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:86) at org.hibernate.collection.PersistentSet.toString(PersistentSet.java:309) at java.lang.String.valueOf(String.java:2827) at java.lang.StringBuilder.append(StringBuilder.java:115) at java.util.AbstractCollection.toString(AbstractCollection.java:422) at org.hibernate.engine.StatefulPersistenceContext.initializeNonLazyCollections(StatefulPersistenceContext.java:844) probably cause by this exception org.hibernate.PropertyAccessException: Null value was assigned to a property of primitive type setter of org.biojavax.SimpleRankedCrossRef.rank The code to reproduce this: sessionFactory = new Configuration().configure().buildSessionFactory(); session = sessionFactory.openSession(); RichObjectFactory.connectToBioSQL(session); RichObjectFactory.setDefaultOntologyName("sequence"); Ontology onto = RichObjectFactory.getDefaultOntology(); My DB has the following ontologies listed: - biological_process - gene_ontology - molecular_function - cellular_component - sequence - biojavax and only for "gene_ontology" and "biojavax" the above code snippet runs 
without failure. All ontologies were loaded with the load_ontology.pl script. What might be the cause? Thanks - Florian -- Dipl. Inf. Florian Mittag Universität Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 From florian.mittag at uni-tuebingen.de Thu Aug 6 13:16:50 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Thu, 6 Aug 2009 15:16:50 +0200 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <200908051741.24367.florian.mittag@uni-tuebingen.de> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> Message-ID: <200908061516.50183.florian.mittag@uni-tuebingen.de> Found the cause. After importing an ontology (Gene or Sequence Ontology) into the BioSQL using load_ontology.pl, the table "term_dbxref" has only NULL values in the rank column. I tried it with DB2 and MySQL, same results/error. The way I see it, this is not a problem of Hibernate. Can I set the "rank" to an arbitrary value to circumvent this problem? On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me again ;-) > > I'm really sorry to bother you with yet another problem, but I seem to > attract those problems. > > When I parse Genbank files and store them in a BioSQL DB, all features > like "gap", "mRNA", "gene", etc. are represented by newly created Terms in > the ontology "biojavax" with the comment "autocreated by biojavax". I > searched for an appropriate ontology and found the Sequence Ontology, which > I loaded into the DB using BioPerl's load_ontology.pl > > I tried setting the default ontology using > RichObjectBuilder.setDefaultOntology("sequence"), but when it comes to > instantiation the SimpleRichSequenceBuilder, a multi-nested exception is > thrown.
I followed it in the code and found the cause in Hibernate: > > [SEVERE] (): illegal access to loading collection >> > org.hibernate.LazyInitializationException: illegal access to loading > collection > at > org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPe >rsistentCollection.java:341) at > org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersiste >ntCollection.java:86) at > org.hibernate.collection.PersistentSet.toString(PersistentSet.java:309) at > java.lang.String.valueOf(String.java:2827) > at java.lang.StringBuilder.append(StringBuilder.java:115) > at java.util.AbstractCollection.toString(AbstractCollection.java:422) > at > org.hibernate.engine.StatefulPersistenceContext.initializeNonLazyCollection >s(StatefulPersistenceContext.java:844) > > probably cause by this exception > > org.hibernate.PropertyAccessException: Null value was assigned to a > property of primitive type setter of org.biojavax.SimpleRankedCrossRef.rank > > > The code to reproduce this: > > sessionFactory = new Configuration().configure().buildSessionFactory(); > session = sessionFactory.openSession(); > RichObjectFactory.connectToBioSQL(session); > RichObjectFactory.setDefaultOntologyName("sequence"); > Ontology onto = RichObjectFactory.getDefaultOntology(); > > My DB has the following ontologies listed: > - biological_process > - gene_ontology > - molecular_function > - cellular_component > - sequence > - biojavax > > and only for "gene_ontology" and "biojavax" the above code snippet runs > without failure. All ontologies were loaded with the load_ontology.pl > script. > > > What might be the cause? > > Thanks > > - Florian -- Dipl. Inf. 
Florian Mittag Universität Tuebingen WSI-RA, Sand 1 72076 Tuebingen, Germany Phone: +49 7071 / 29 78985 Fax: +49 7071 / 29 5091 From markjschreiber at gmail.com Thu Aug 6 13:48:37 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Thu, 6 Aug 2009 21:48:37 +0800 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <200908061516.50183.florian.mittag@uni-tuebingen.de> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> <200908061516.50183.florian.mittag@uni-tuebingen.de> Message-ID: <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> There shouldn't be an issue with using an arbitrary value. The ranks in biosql are mainly to preserve the order of features etc. during roundtripping. It will affect sorting of ontology terms but this is probably not a problem. - mark On Aug 6, 2009 9:42 PM, "Florian Mittag" wrote: Found the cause. After importing an ontology (Gene or Sequence Ontology) into the BioSQL using load_ontology.pl, the table "term_dbxref" has only NULL values in the rank column. I tried it with DB2 and MySQL, same results/error. The way I see it, this is not a problem of Hibernate. Can I set the "rank" to an arbitrary value to circumvent this problem? On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me again ;-) > > I'm really s... From florian.mittag at uni-tuebingen.de Thu Aug 6 14:14:02 2009 From: florian.mittag at uni-tuebingen.de (Florian Mittag) Date: Thu, 6 Aug 2009 16:14:02 +0200 Subject: [Biojava-dev] Error loading Ontology with Hibernate In-Reply-To: <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> References: <200908051741.24367.florian.mittag@uni-tuebingen.de> <200908061516.50183.florian.mittag@uni-tuebingen.de> <93b45ca50908060648p2451096ax46a179e058a09551@mail.gmail.com> Message-ID: <200908061614.03033.florian.mittag@uni-tuebingen.de> On Thursday, 6. August 2009 15:48, you wrote: > There shouldn't be an issue with using an arbitrary value.
The ranks in > biosql are mainly to preserve the order of features etc. during > roundtripping. It will affect sorting of ontology terms but this is > probably not a problem. Ok, then I will try this as a quick hack until I've found out if the NULL values are a bug and if it can be fixed. Thanks for the quick answer! - Florian > On Aug 6, 2009 9:42 PM, "Florian Mittag" > wrote: > > Found the cause. > > After importing an ontology (Gene or Sequence Ontology) into the BioSQL > using > load_ontology.pl, the table "term_dbxref" has only NULL values in the rank > column. I tried it with DB2 and MySQL, same results/error. > > The way I see it, this is not a problem of Hibernate. Can I set the "rank" > to > an arbitrary value to circumvent this problem? > > On Wednesday, 5. August 2009 17:41, Florian Mittag wrote: > Hi, it's me > again ;-) > > I'm really s... From holland at eaglegenomics.com Fri Aug 7 17:51:59 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 7 Aug 2009 18:51:59 +0100 Subject: [Biojava-dev] Hackathon update In-Reply-To: References: Message-ID: <0AA4618C-2A99-4ACD-B07D-0AA05FE77665@eaglegenomics.com> Several have said the same. I'll try to get both organised. Watch this space. cheers, Richard On 7 Aug 2009, at 18:23, Michael Heuer wrote: > Richard Holland wrote: > >> 10 people responded (including me). 5 of those are in Cambridge, >> UK, 3 >> are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine >> the >> hackathon with a holiday, and 3 suggested linking the hackathon >> with a >> conference, which would almost certainly increase chances of getting >> funding for travel/accommodation from employers. >> >> So, I have two options. Venues in both cases to be worked out later: >> >> 1. Cambridge, UK, January 18th-22nd 2010. 
I know this is the middle >> of the winter in the UK, but on the bright side, the Cambridge Winter >> Beer Festival runs from the 22nd-24th, so that's something to cheer >> you up at the end of the hackathon. >> >> 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is >> 9th-10th (TBC), then ISMB which is 11th-14th). > > > I would suggest trying for both. Winter in the UK means that a lot of > work would get done. Attendance would probably be better for Boston. > > I would caution that accomodations in Boston are quite expensive, and > that the 4th of July week is the busiest week of the year with > tourists. > Perhaps the hackathon in Boston might be arranged flexibly around the > actual days of the conference, evenings and late nights and so on. > > michael > -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From heuermh at acm.org Fri Aug 7 17:23:53 2009 From: heuermh at acm.org (Michael Heuer) Date: Fri, 7 Aug 2009 13:23:53 -0400 (EDT) Subject: [Biojava-dev] Hackathon update In-Reply-To: Message-ID: Richard Holland wrote: > 10 people responded (including me). 5 of those are in Cambridge, UK, 3 > are in the US, 1 in Spain, and 1 in Singapore. 2 wanted to combine the > hackathon with a holiday, and 3 suggested linking the hackathon with a > conference, which would almost certainly increase chances of getting > funding for travel/accommodation from employers. > > So, I have two options. Venues in both cases to be worked out later: > > 1. Cambridge, UK, January 18th-22nd 2010. I know this is the middle > of the winter in the UK, but on the bright side, the Cambridge Winter > Beer Festival runs from the 22nd-24th, so that's something to cheer > you up at the end of the hackathon. > > 2. Boston, USA, July 5th-8th 2010 (immediately before BOSC which is > 9th-10th (TBC), then ISMB which is 11th-14th). I would suggest trying for both. 
Winter in the UK means that a lot of work would get done. Attendance would probably be better for Boston. I would caution that accommodations in Boston are quite expensive, and that the 4th of July week is the busiest week of the year with tourists. Perhaps the hackathon in Boston might be arranged flexibly around the actual days of the conference, evenings and late nights and so on. michael From andreas at sdsc.edu Sun Aug 16 21:41:03 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 16 Aug 2009 14:41:03 -0700 Subject: [Biojava-dev] plans for next months Message-ID: <59a41c430908161441l3ae3ebao524237a1b7b868fe@mail.gmail.com> Hi, Here is a quick summary of what I propose to be our action plan for the next months for BioJava: * I would like to call for a code-freeze in 2 weeks (or so) in order to finalize the new modularized and mavenized version of biojava for the developers. The current developmental trunk will remain permanently frozen and all future work should continue at a new location in SVN. As such it will be important that all developers commit any changes they are working on before that. * We will update the documentation for how to obtain a new mavenized checkout on the wiki. * After the change the new modules need to be tested and if no major problems are found, the ok will be given to continue working on the new modules (at the new location) * All developers should obtain a new checkout. * We need to identify sub-module leaders who will take over leadership of the sub-modules. In order to come up with a new release of biojava we should continue development on the new modules for a few months. Talking off list with Richard Holland it looks like we will have a hackathon in January in Cambridge, U.K. (details to be finalized and announced). I suggest that we use that opportunity to focus on further developing the modules and make a new public BioJava release shortly after that.
At present I see the following topics that would be great to work on until and during the hackathon in order to prepare a shiny new version of BioJava for public release: + Work on standardizing the organization of the modules (tests, examples, source, documentation etc.) + Add new modules + Improve existing modules + Anything the module leaders deem necessary for their modules. + Use OSGI for visualisation related modules I can post a more detailed and specific list of things to work on if people are interested. Andreas From andreas at sdsc.edu Mon Aug 24 04:18:14 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 23 Aug 2009 21:18:14 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules Message-ID: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Hi, In order to push the modularization and migration to Maven, I would like to declare a code freeze on the current developmental trunk. Please commit all new changes by Thursday 27th of August 23:00 GMT. In the week after I would like to refactor the code base and commit the initial set of modules to a new developmental trunk. All future development will happen on that new trunk. You will be able to follow the ongoing status of this at http://biojava.org/wiki/BioJava:MavenMigration Once the modules are in place it is a good moment to hand over the leadership of the sub-modules to the new module-project leaders. It will be up to the module-lead to take the modules into the direction that he/she feels important. I would like to take this opportunity to suggest a couple of people as module-leaders and propose some action items for the modules. Feel free to comment or make additional suggestions...
Here is a list of modules / action items and the people that I would propose to become module leaders:

Module: biojava-core
Lead: Andreas Prlic
- break the new modules out of core
- bring up to modern Java standards, use Generics
- declare old/unused code obsolete
- don't break backwards compatibility

Module: biojava-sequence
Lead: Richard Holland
- Bring in Richard's new code that he started to develop on the biojava-3 branch.
- provide a more scalable and efficient basis for dealing with large sequence files

Module: biojava-alignment
Lead: Andreas Draeger
- allow better access to underlying dynamic programming data structures
- allow more customizable display of pairwise alignments (HTML/plain text, etc)

Module: biojava-blast
Lead: still looking for a leader
- provide access to all details of the blast output
- add support for RPS blast

Module: biojava-phylo
Lead: Scooter Willis
- provide improved NJtree/Jalview

Module: biojava-biosql
Lead: Richard Holland
- merge the new biojava-sequence module with the current biojava-biosql code

Module: biojava-structure
Lead: Andreas Prlic
- add support for SCOP file parsing
- add support for easy access of domains (in terms of coordinates)
- add secondary structure assignment
- improve structure alignments
- better integration with 3D viewers (Jmol, RCSB viewers)

Module: biojava-web services:
The details seem still to be under discussion and perhaps we need multiple modules here? Also, what about REST vs. SOAP? To be discussed. People who expressed interest are: Niall Haslam, Scooter Willis, Sylvain Foisy
Module?: biojava-ws-blast
Module?: biojava-ws-biolit

Module: biojava-sequencing
Lead: ???
- support FastQ files
- support parsing of output for various new sequencing machines

This is only an initial set of modules and I think it is safe to say that more modules will be added after more discussions (and people volunteering to contribute).
Andreas From simpleyrx at 163.com Mon Aug 24 16:48:01 2009 From: simpleyrx at 163.com (simpleyrx) Date: Tue, 25 Aug 2009 00:48:01 +0800 (CST) Subject: [Biojava-dev] Adding profile-profile alignment algorithms to Biojava Message-ID: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> Experts, Profile-profile and HMM-HMM alignments have become more important in the protein bioinformatics field than ever before. So I think that if we can implement profile-profile and HMM-HMM alignment algorithms in the BioJava package, it will be more useful to researchers who are interested in protein bioinformatics. From holland at eaglegenomics.com Mon Aug 24 17:30:31 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 24 Aug 2009 18:30:31 +0100 Subject: [Biojava-dev] Adding profile-profile alignment algorithms to Biojava In-Reply-To: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> References: <9551386.424471251132481047.JavaMail.coremail@app180.163.com> Message-ID: Contributions of code would be welcome! Are you volunteering? :) cheers, Richard On 24 Aug 2009, at 17:48, simpleyrx wrote: > > Experts, > > Profile-profile alignment or HMM-HMM alignments have > become more important in protein bioinformation field than ever > before. So I think, if we can implement Profile-profile alignment > and HMM-HMM alignments algorithms in Biojava package, it will be > more useful to the researchers who interested in protein > bioinformatics.
> > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From heuermh at acm.org Tue Aug 25 01:19:24 2009 From: heuermh at acm.org (Michael Heuer) Date: Mon, 24 Aug 2009 21:19:24 -0400 (EDT) Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: Andreas Prlic wrote: > In order to push the modularization and migration to Maven, I would like to > declare a code freeze on the current developmental trunk. Please commit all > new changes by > > Thursday 27th of August 23:00 GMT. > > In the week after I would like to refactor the code base and commit the > initial set of modules to a new developmental trunk. All future development > will happen on that new trunk. > > You will be able to follow the ongoing status of this at > > http://biojava.org/wiki/BioJava:MavenMigration > > > Once the modules are in place it is a good moment to hand over the > leadership of the sub-modules to the new module-project leaders. It will be > up to the module-lead to take the modules into the direction that he/she > feels important. I would like to take this opportunity to suggest a couple > of people as module-leaders and propose some action items for the modules. > Feel free to comment or make additional suggestions... Sign me up for help with maven configuration/reporting, unit testing, and generics API matters if you wish. 
> Here is a list of modules / action items and the people that I would propose to > become module leaders: > > Module: biojava-core Lead: Andreas Prlic > - break the new modules out of core > - bring up to modern Java standards, use Generics > - declare old/unused code obsolete > - don't break backwards compatibility Seems to me the last one will greatly hamper the rest of this effort. The next version needs to be binary compatible with 1.7? michael From andreas at sdsc.edu Tue Aug 25 02:17:00 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 24 Aug 2009 19:17:00 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> >> Once the modules are in place it is a good moment to hand over the >> leadership of the sub-modules to the new module-project leaders. It will be >> up to the module-lead to take the modules into the direction that he/she >> feels important. I would like to take this opportunity to suggest a couple >> of people as module-leaders and propose some action items for the modules. >> Feel free to comment or make additional suggestions... > > Sign me up for help with maven configuration/reporting, unit testing, and > generics API matters if you wish. Excellent, I will come back to you on this :-) >> - don't break backwards compatibility > > Seems to me the last one will greatly hamper the rest of this effort. > The next version needs to be binary compatible with 1.7? What I mean is that we should try, as far as is reasonable, not to disrupt things. I am all for a pragmatic approach. While trying to be conservative, I guess refactoring should be discussed on a case-by-case basis. To give an example: an area where I am supporting refactoring is the blast parser.
The package name is confusing and we probably need some code changes to expose more details of the parser. Are you thinking of any other situations where you think breaking backwards compatibility will be inevitable? Andreas From heuermh at acm.org Tue Aug 25 02:50:09 2009 From: heuermh at acm.org (Michael Heuer) Date: Mon, 24 Aug 2009 22:50:09 -0400 (EDT) Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> Message-ID: Andreas Prlic wrote: > >> Once the modules are in place it is a good moment to hand over the > >> leadership of the sub-modules to the new module-project leaders. It will be > >> up to the module-lead to take the modules into the direction that he/she > >> feels important. I would like to take this opportunity to suggest a couple > >> of people as module-leaders and propose some action items for the modules. > >> Feel free to comment or make additional suggestions... > > > > Sign me up for help with maven configuration/reporting, unit testing, and > > generics API matters if you wish. > > Excellent, I will come back to you on this :-) > > >> - don't break backwards compatibility > > > > Seems to me the last one will greatly hamper the rest of this effort. > > The next version needs to be binary compatible with 1.7? > > > What I mean is that we should try, as far as is reasonable, not to > disrupt things. I am all for a pragmatic approach. While trying to be > conservative, I guess refactoring should be discussed on a case-by-case > basis. To give an example: an area where I am supporting refactoring > is the blast parser. The package name is confusing and we probably > need some code changes to expose more details of the parser. Are you > thinking of any other situations where you think breaking backwards > compatibility will be inevitable? Ah yes, pragmatically backwards compatible with 1.7 is a better goal.
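As a rough illustration of what binary (as opposed to source) compatibility means, here is a minimal sketch of the case where a getter's return type changes from double to short. The JVM links a call by method name plus full descriptor, return type included, so a caller compiled against the old signature would no longer find the method, even though the same caller source would still compile after the change. The Aligner class and the links helper are hypothetical stand-ins, not BioJava code, and the lookup through java.lang.invoke (Java 7 or later) is only an assumption-laden way to imitate that exact-descriptor match.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BinaryCompatSketch {

    // Hypothetical stand-in for a class AFTER the change:
    // getMatch() used to return double and now returns short.
    public static class Aligner {
        private final short match;

        public Aligner(short match) {
            this.match = match;
        }

        public short getMatch() {
            return match;
        }
    }

    // True if 'owner' has a no-argument virtual method with exactly this
    // name and return type, i.e. the descriptor an old caller would link to.
    public static boolean links(Class<?> owner, String name, Class<?> returnType) {
        try {
            MethodHandles.lookup()
                    .findVirtual(owner, name, MethodType.methodType(returnType));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Source level: this line compiles before and after the change,
        // because short widens to double implicitly.
        double score = new Aligner((short) 5).getMatch();
        System.out.println("score = " + score);

        // Binary level: only the new descriptor can be resolved.
        System.out.println("short getMatch() links:  "
                + links(Aligner.class, "getMatch", short.class));
        System.out.println("double getMatch() links: "
                + links(Aligner.class, "getMatch", double.class));
    }
}
```

The first lookup should succeed and the second should fail, which mirrors the NoSuchMethodError an already-compiled client would hit at run time until it is recompiled.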
Maintaining binary compatibility is very difficult, and something we haven't really done in the past. Consider the following biojava 1.6.1 vs biojava 1.7 clirr [1] report. michael [1] http://clirr.sf.net --- ERROR: 6004: org.biojava.bio.alignment.NeedlemanWunsch: Changed type of field CostMatrix from double[][] to int[][] ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 2 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 3 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 4 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 5 of 'public NeedlemanWunsch(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getDelete()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getEditDistance()' has been changed to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getGapExt()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getInsert()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 
'public double getMatch()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double getReplace()' has been changed to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 2 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 3 of 'protected double min(double, double, double)' has changed its type to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'protected double min(double, double, double)' has been changed to int ERROR: 7006: org.biojava.bio.alignment.NeedlemanWunsch: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public java.lang.String printCostMatrix(double[][], char[], char[])' has changed its type to int[][] ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setDelete(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setGapExt(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setInsert(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setMatch(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.NeedlemanWunsch: Parameter 1 of 'public void setReplace(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SequenceAlignment: Parameter 11 of 'public java.lang.String formatOutput(java.lang.String, java.lang.String, java.lang.String[], 
java.lang.String, int, int, long, int, int, long, double, long)' has changed its type to int ERROR: 7006: org.biojava.bio.alignment.SequenceAlignment: Return type of method 'public java.lang.String formatOutput(java.lang.String, java.lang.String, java.lang.String[], java.lang.String, int, int, long, int, int, long, double, long)' has been changed to java.lang.StringBuffer ERROR: 7006: org.biojava.bio.alignment.SequenceAlignment: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 2 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 3 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 4 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 5 of 'public SmithWaterman(double, double, double, double, double, org.biojava.bio.alignment.SubstitutionMatrix)' has changed its type to short ERROR: 7006: org.biojava.bio.alignment.SmithWaterman: Return type of method 'public double pairwiseAlignment(org.biojava.bio.symbol.SymbolList, org.biojava.bio.symbol.SymbolList)' has been changed to int ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setDelete(double)' has changed its type to short ERROR: 7005: 
org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setGapExt(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setInsert(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setMatch(double)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SmithWaterman: Parameter 1 of 'public void setReplace(double)' has changed its type to short ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field matrix from int[][] to short[][] ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field max from int to short ERROR: 6004: org.biojava.bio.alignment.SubstitutionMatrix: Changed type of field min from int to short ERROR: 7005: org.biojava.bio.alignment.SubstitutionMatrix: Parameter 2 of 'public SubstitutionMatrix(org.biojava.bio.symbol.FiniteAlphabet, int, int)' has changed its type to short ERROR: 7005: org.biojava.bio.alignment.SubstitutionMatrix: Parameter 3 of 'public SubstitutionMatrix(org.biojava.bio.symbol.FiniteAlphabet, int, int)' has changed its type to short INFO: 7011: org.biojava.bio.alignment.SubstitutionMatrix: Method 'public SubstitutionMatrix(java.io.File)' has been added ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getMax()' has been changed to short ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getMin()' has been changed to short INFO: 7011: org.biojava.bio.alignment.SubstitutionMatrix: Method 'public org.biojava.bio.alignment.SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader)' has been added ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'public int getValueAt(org.biojava.bio.symbol.Symbol, org.biojava.bio.symbol.Symbol)' has been changed to short ERROR: 7005: 
org.biojava.bio.alignment.SubstitutionMatrix: Parameter 1 of 'protected int[][] parseMatrix(java.lang.String)' has changed its type to java.lang.Object ERROR: 7006: org.biojava.bio.alignment.SubstitutionMatrix: Return type of method 'protected int[][] parseMatrix(java.lang.String)' has been changed to short[][] ERROR: 7009: org.biojava.bio.alignment.SubstitutionMatrix: Accessibility of method 'protected int[][] parseMatrix(java.lang.String)' has been decreased from protected to private INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public boolean canAdvance()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public org.biojava.bio.symbol.Symbol currentRes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.dp.onehead.SmallCursor: Method 'public org.biojava.bio.symbol.Symbol lastRes()' has been removed, but an inherited definition exists. INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public ArrowGlyph(java.awt.Paint, java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public ArrowGlyph(java.awt.geom.Rectangle2D$Float, java.awt.Paint, java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public java.awt.Paint getFillPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public java.awt.Paint getOuterPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setDirection(int)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setFillPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.ArrowGlyph: Method 'public void setOuterPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.RectangleGlyph: Method 'public java.awt.Paint getPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.RectangleGlyph: Method 'public void 
setPaint(java.awt.Paint)' has been added INFO: 7011: org.biojava.bio.gui.glyph.TurnGlyph: Method 'public java.awt.Paint getPaint()' has been added INFO: 7011: org.biojava.bio.gui.glyph.TurnGlyph: Method 'public void setPaint(java.awt.Paint)' has been added INFO: 6009: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Accessibility of field fList has been increased from private to protected INFO: 6009: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Accessibility of field gList has been increased from private to protected INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public boolean containsFilter(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public org.biojava.bio.seq.FeatureFilter getFeatureFilter(int)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public org.biojava.bio.gui.glyph.Glyph getGlyphForFilter(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public void removeFilterWithGlyph(org.biojava.bio.seq.FeatureFilter)' has been added INFO: 7011: org.biojava.bio.gui.sequence.GlyphFeatureRenderer: Method 'public void setGlyphForFilter(org.biojava.bio.seq.FeatureFilter, org.biojava.bio.gui.glyph.Glyph)' has been added INFO: 6009: org.biojava.bio.gui.sequence.SequencePanelWrapper: Accessibility of field seqPanels has been increased from private to protected INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.BlastLikeSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.ClustalWAlignmentSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSearchSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.FastaSequenceSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.PdbSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void addPrefixMapping(java.lang.String, java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.ContentHandler getContentHandler()' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.DTDHandler getDTDHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.EntityResolver getEntityResolver()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public org.xml.sax.ErrorHandler getErrorHandler()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getFeature(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String getNamespacePrefix()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getNamespacePrefixes()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public boolean getNamespaces()' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.Object getProperty(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String getURIFromPrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void parse(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public java.lang.String prefix(java.lang.String)' has been removed, but an inherited definition exists. 
INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setContentHandler(org.xml.sax.ContentHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setDTDHandler(org.xml.sax.DTDHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setEntityResolver(org.xml.sax.EntityResolver)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setErrorHandler(org.xml.sax.ErrorHandler)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setFeature(java.lang.String, boolean)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setNamespacePrefix(java.lang.String)' has been removed, but an inherited definition exists. INFO: 7003: org.biojava.bio.program.sax.SequenceAlignmentSAXParser: Method 'public void setProperty(java.lang.String, java.lang.Object)' has been removed, but an inherited definition exists. From holland at eaglegenomics.com Tue Aug 25 04:32:24 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Tue, 25 Aug 2009 05:32:24 +0100 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> Message-ID: <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> > > > What I mean is that we should try not to disrupt things as much as is > reasonable. I am all for a pragmatic approach. 
While trying to be > conservative I guess refactoring should be discussed on a case by case > basis. To give an example: an area where I am supporting re-factoring > is the blast parser. The package name is confusing and we probably > need some code changes to expose more details of the parser. Are you > thinking of any other situations where you think breaking backwards > compatibility will be inevitable? Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). cheers, Richard > Andreas > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev -- Richard Holland, BSc MBCS Operations and Delivery Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From markjschreiber at gmail.com Tue Aug 25 06:58:40 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Tue, 25 Aug 2009 14:58:40 +0800 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> Message-ID: <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> I would agree with Richard on this.
I think the changes being proposed are not compatible with the current API. There are a couple of things wrong with the current model (such as the Feature, Strand, Location issues). There are also several areas where best-practices of the past (parts of BioJava are 10 years old) are not considered best practices now (some like Singletons are often thought of as anti-patterns these days). Add to that the fact that we have never been truly backwards compatible (except maybe 1.3 and 1.3.1?) and I think we can justifiably try and avoid the claim that BJ1.7 should be backwards compatible. We can continue to make older Jars available for people who need them although most likely people who have a need for legacy support already have the Jars that they need bundled up with their apps. Shared libraries have very much fallen out of favor in recent years in almost all languages and system wide classpaths are asking for trouble. Hard-drives are cheap so it is no big deal to have a dedicated version of the BioJava jar bundled with each app that needs it. We could adopt the idea that backwards compatible builds get minor-version numbers e.g. 1.1 while other builds get major version numbers. I guess this would mean we are at BioJava 7? Backwards compatibility would be great to have but not if the effort required hinders innovation. - Mark On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland wrote: >> >> >> What I mean is that we should try not to disrupt things as much as is >> reasonable. I am all for a pragmatic approach. While trying to be >> conservative I guess refactoring should be discussed on a case by case >> basis. To give an example: an area where I am supporting re-factoring >> is the blast parser. The package name is confusing and we probably >> need some code changes to expose more details of the parser. Are you >> thinking of any other situations where you think breaking backwards >> compatibility will be inevitable?
> > Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). > > My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). > > cheers, > Richard > >> Andreas >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev > > -- > Richard Holland, BSc MBCS > Operations and Delivery Director, Eagle Genomics Ltd > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev From jacobsen at ebi.ac.uk Tue Aug 25 08:45:52 2009 From: jacobsen at ebi.ac.uk (Jules Jacobsen) Date: Tue, 25 Aug 2009 09:45:52 +0100 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> Message-ID: <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> I think Mark has a good point here - there are certain aspects of BioJava which are considered to be un-necessarily over-complicated and these things have been deal-breakers for the people concerned - I remember a couple of 
cases from the EBI where they have implemented their own system instead of using and supporting BioJava. Fixing areas of confusion, simplifying and moving forwards without maintaining backwards-compatibility might be a good idea for increasing user numbers and elevating the general perception of the project, whilst potentially risking alienating some existing users. I think his idea of maintaining compatibility within point releases and stating that full version releases may not have backwards compatibility would make it clearer for users as to what to expect from a release. It may also help the developers stay on track with the task and general design focus for that release by constraining them to the current system during a point release whilst highlighting confusing areas which can be dealt with in a more satisfactory manner in the next full release. Jules On Tue, Aug 25, 2009 at 7:58 AM, Mark Schreiber wrote: > I would agree with Richard on this. I think the changes being proposed > are not compatible with the current API. There are a couple of things > wrong with the current model (such as the Feature, Strand, Location > issues). There are also several areas where best-practices of the past > (parts of BioJava are 10 years old) are not considered best practices > now (some like Singletons are often thought of as anti-patterns these > days). > > Add to that the fact that we have never been truly backwards > compatible (except maybe 1.3 and 1.3.1?) and I think we can > justifiably try and avoid the claim that BJ1.7 should be backwards > compatible. We can continue to make older Jars available for people > who need them although most likely people who have a need for legacy > support already have the Jars that they need bundled up with their > apps. Shared libraries have very much fallen out of favor in recent > years in almost all languages and system wide classpaths are asking > for trouble.
Hard-drives are cheap so it is no big deal to have a > dedicated version of the BioJava jar bundled with each app that needs > it. > > We could adopt the idea that backwards compatible builds get > minor-version numbers e.g. 1.1 while other builds get major version > numbers. I guess this would mean we are at BioJava 7? > > Backwards compatibility would be great to have but not if the effort > required hinders innovation. > > - Mark > > On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland > wrote: >>> >>> >>> What I mean is that we should try not to disrupt things as much as is >>> reasonable. I am all for a pragmatic approach. While trying to be >>> conservative I guess refactoring should be discussed on a case by case >>> basis. To give an example: an area where I am supporting re-factoring >>> is the blast parser. The package name is confusing and we probably >>> need some code changes to expose more details of the parser. Are you >>> thinking of any other situations where you think breaking backwards >>> compatibility will be inevitable? >> >> Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). >> >> My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model).
>> >> cheers, >> Richard >> >>> Andreas >>> >>> _______________________________________________ >>> biojava-dev mailing list >>> biojava-dev at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> -- >> Richard Holland, BSc MBCS >> Operations and Delivery Director, Eagle Genomics Ltd >> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com >> http://www.eaglegenomics.com/ >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > Jules Jacobsen UniProt-PDB Integration EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UK From andreas at sdsc.edu Tue Aug 25 17:36:45 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 25 Aug 2009 10:36:45 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908241917r6beb5329wb862ce8913ac74d7@mail.gmail.com> <459AAD48-B5F5-4725-9142-287726BBB931@eaglegenomics.com> <93b45ca50908242358x4181df07ye61197a2d23b6a0@mail.gmail.com> <12c279870908250145waf21d9fmed256a3573a9ee1d@mail.gmail.com> Message-ID: <59a41c430908251036s616ab5f3m825d95223e758d85@mail.gmail.com> I agree with all that has been said so far. The Sequence/Feature model is definitely not good enough and well, also does not work for protein structures. (There can be alternate positions and the numbering can be non-sequential and have negative positions.) Still the question is, do we need to throw away the backwards compatibility? 
The new modularization will allow a plug and play architecture and we could easily have two generations of code in different modules. That way legacy code could depend on the older "core" (perhaps we should find a different name) while newly written code will be based on biojava-sequence, which would contain Richard's new code. That way we could prepare the code for the future, while still embracing the past. One example that heavily uses the Sequence and Distributions APIs is NestedMica. It is a pretty cool machine learning software and I was hoping that we could bring that closer to biojava. (a machine learning module in BJ would be cool, no?) Andreas On Tue, Aug 25, 2009 at 1:45 AM, Jules Jacobsen wrote: > I think Mark has a good point here - there are certain aspects of > BioJava which are considered to be un-necessarily over-complicated and > these things have been deal-breakers for the people concerned - I > remember a couple of cases from the EBI where they have implemented > their own system instead of using and supporting BioJava. > > Fixing areas of confusion, simplifying and moving forwards without > maintaining backwards-compatibility might be a good idea for > increasing user numbers and elevating the general perception of the > project, whilst potentially risking alienating some existing users. > > I think his idea of maintaining compatibility within point releases > and stating that full version releases may not have backwards > compatibility would make it clearer for users as to what to expect > from a release. It may also help the developers stay on track with the > task and general design focus for that release by constraining them to > the current system during a point release whilst highlighting > confusing areas which can be dealt with in a more satisfactory manner > in the next full release. > > Jules > > On Tue, Aug 25, 2009 at 7:58 AM, Mark Schreiber wrote: >> I would agree with Richard on this.
I think the changes being proposed >> are not compatible with the current API. There are a couple of things >> wrong with the current model (such as the Feature, Strand, Location >> issues). There are also several areas where best-practices of the past >> (parts of BioJava are 10 years old) are not considered best practices >> now (some like Singletons are often thought of as anti-patterns these >> days). >> >> Add to that the fact that we have never been truly backwards >> compatible (except maybe 1.3 and 1.3.1?) and I think we can >> justifiably try and avoid the claim that BJ1.7 should be backwards >> compatible. We can continue to make older Jars available for people >> who need them although most likely people who have a need for legacy >> support already have the Jars that they need bundled up with their >> apps. Shared libraries have very much fallen out of favor in recent >> years in almost all languages and system wide classpaths are asking >> for trouble. Hard-drives are cheap so it is no big deal to have a >> dedicated version of the BioJava jar bundled with each app that needs >> it. >> >> We could adopt the idea that backwards compatible builds get >> minor-version numbers e.g. 1.1 while other builds get major version >> numbers. I guess this would mean we are at BioJava 7? >> >> Backwards compatibility would be great to have but not if the effort >> required hinders innovation. >> >> - Mark >> >> On Tue, Aug 25, 2009 at 12:32 PM, Richard Holland >> wrote: >>>> >>>> >>>> What I mean is that we should try not to disrupt things as much as is >>>> reasonable. I am all for a pragmatic approach. While trying to be >>>> conservative I guess refactoring should be discussed on a case by case >>>> basis. To give an example: an area where I am supporting re-factoring >>>> is the blast parser. The package name is confusing and we probably >>>> need some code changes to expose more details of the parser.
Are you >>>> thinking of any other situations where you think breaking backwards >>>> compatibility will be inevitable? >>> >>> Almost all the parsers would fit this category, as would any realistic attempt to 'fix' the sequence model by moving bits of the APIs around (for instance, Sequences have Features which have Strands, but Locations do _not_ have Strands - which is all wrong, because Strand is a Location-level concept, not a Feature-level concept). >>> >>> My original plan was to not even attempt to make new versions backward compatible, and instead to have a separate module which coerced the new objects into complying with the old API interface declarations (by using the facade model). >>> >>> cheers, >>> Richard >>> >>>> Andreas >>>> >>>> _______________________________________________ >>>> biojava-dev mailing list >>>> biojava-dev at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >>> >>> -- >>> Richard Holland, BSc MBCS >>> Operations and Delivery Director, Eagle Genomics Ltd >>> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com >>> http://www.eaglegenomics.com/ >>> >>> _______________________________________________ >>> biojava-dev mailing list >>> biojava-dev at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> > > > > Jules Jacobsen > > UniProt-PDB Integration > EMBL-EBI > Wellcome Trust Genome Campus > Hinxton > Cambridge > CB10 1SD > UK > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > From cmasak at gmail.com Wed Aug 26 13:29:24 2009 From: cmasak at gmail.com (=?ISO-8859-1?Q?Carl_M=E4sak?=) Date: Wed, 26 Aug 2009 15:29:24 +0200 Subject: [Biojava-dev] [BUG] Infinite regress when
calling DNATools.createDNASequence with a DNA string containing a '~' char Message-ID: <16d769b70908260629m15512bc4tb8798d41d53fad0f@mail.gmail.com>

Hello,

Two things:

1. The BioJava wiki links to a Bugzilla instance, saying bugs should be posted there ([1]). As I write this, that Bugzilla instance gives a 500 Internal Server Error ([2]).

[1] [2]

2. In the face of this, I hope you don't mind I leave my bug report here for the time being.

We're wrapping BioJava in the Bioclipse project. We've found what appears to be a logical bug causing an infinite regress and a stack overflow.

Let's call DNATools.createDNASequence("~", ""). The following code in that method (org/biojava/bio/seq/DNATools.java:188) will be executed.

  public static Sequence createDNASequence(String dna, String name)
      throws IllegalSymbolException {
    //should I be calling createGappedDNASequence?
    if(dna.indexOf('-') != -1 || dna.indexOf('~') != -1){//there is a gap
      return createGappedDNASequence(dna, name);
    }

The following code in createGappedDNASequence (DNATools.java:207) will be executed:

  /** Get a new dna as a GappedSequence */
  public static GappedSequence createGappedDNASequence(String dna, String name)
      throws IllegalSymbolException{
    String dna1 = dna.replaceAll("-", "");
    Sequence dnaSeq = createDNASequence(dna1, name);

The infinite regress is caused by these two methods calling each other, for ever. There is no bottoming-out, because none of these lines removes '~' characters.

We experience this problem in Biojava 1.6, but the above code and line numbers are from 1.7, where the issue remains.
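The mutual recursion described above can be reproduced without BioJava on the classpath. What follows is only a minimal sketch, not the DNATools source: the class GapRecursionSketch and all of its methods are hypothetical stand-ins that use plain strings in place of Sequence and GappedSequence. It mirrors the reported control flow, and then shows one possible remedy under the assumption that both '-' and '~' are meant to be treated as gap characters and stripped before the ungapped sequence is built.

```java
// Hypothetical stand-in for the two DNATools methods; plain strings replace
// the BioJava Sequence and GappedSequence types.
final class GapRecursionSketch {

    /** True if the string contains either character treated as a gap. */
    static boolean hasGap(String dna) {
        return dna.indexOf('-') != -1 || dna.indexOf('~') != -1;
    }

    // Reported control flow: any gap character delegates to the gapped variant.
    static String createReported(String dna) {
        if (hasGap(dna)) {
            return createGappedReported(dna);
        }
        return dna;
    }

    // Reported control flow: only '-' is removed, so a '~' survives and the
    // call below re-enters createGappedReported without ever terminating.
    static String createGappedReported(String dna) {
        String stripped = dna.replaceAll("-", "");
        return "gapped(" + createReported(stripped) + ")";
    }

    static String createFixed(String dna) {
        if (hasGap(dna)) {
            return createGappedFixed(dna);
        }
        return dna;
    }

    // Possible fix (an assumption, not the project's actual patch): remove
    // both gap characters so the nested call always sees an ungapped string.
    static String createGappedFixed(String dna) {
        String stripped = dna.replaceAll("[-~]", "");
        return "gapped(" + createFixed(stripped) + ")";
    }

    public static void main(String[] args) {
        System.out.println(createFixed("AC~G-T"));
        try {
            createReported("~");
        } catch (StackOverflowError e) {
            System.out.println("reported flow overflowed the stack");
        }
    }
}
```

A string containing only '-' gaps terminates in both variants; only input with a '~' triggers the overflow, which matches the single-character reproduction given in the report.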
Regards, // Carl Mäsak From heuermh at acm.org Thu Aug 27 17:01:31 2009 From: heuermh at acm.org (Michael Heuer) Date: Thu, 27 Aug 2009 13:01:31 -0400 (EDT) Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: Andreas Prlic wrote: > Here a list of modules / action items and the people that I would propose to > become module leaders: > ... > > Module: biojava-sequencing Lead: Michael Heuer > - support FastQ files > - support parsing of output for various new sequencing machines I have volunteered on the open-bio mailing list to implement FASTQ support. A nice collection of test data is being created in collaboration with the other open-bio projects. If anyone has interest in a particular data set, please let me know, as I will also need data for performance tuning. michael From andreas at sdsc.edu Thu Aug 27 17:30:08 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 27 Aug 2009 10:30:08 -0700 Subject: [Biojava-dev] BioJava code freeze, modularization and action items for sub modules In-Reply-To: References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> Message-ID: <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com> Great, thanks for "volunteering", Michael. To add another Module: biojava-das : Lead: Jonathan Warren probably deprecate the old DAS code in BJ and replace it with the up to date Dasobert library Thanks to Jonathan for volunteering as well. Andreas On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote: > Andreas Prlic wrote: > >> Here a list of modules / action items and the people that I would propose to >> become module leaders: >> ... >> >> Module: biojava-sequencing Lead: Michael Heuer >> - support FastQ files >> - support parsing of output for various new sequencing machines > > I have volunteered on the open-bio mailing list to implement FASTQ > support.
A nice collection of test data is being created in collaboration > with the other open-bio projects. If anyone has interest in a particular > data set, please let me know, as I will also need data for performance > tuning. > > michael > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > From markjschreiber at gmail.com Fri Aug 28 05:37:59 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Fri, 28 Aug 2009 13:37:59 +0800 Subject: [Biojava-dev] [Biojava-l] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com> Message-ID: <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com> I'm happy to volunteer code for: 1. BLASTXML parser as long as I can change the ssbind APIs (other parsers could go into a legacy module??). Actually I would prefer to completely decouple from the sequence/ feature module as many people would like a blast parser without the rest of biojava thrown in. 2. BioSQL/ JPA bindings. I have already generated JPA compliant entity beans for mapping to BioSQL as well as JPA handler code that makes sure modifications persist properly. Currently the object model very closely follows the BioSQL table structure. Also the current beans are what people call Anaemic beans in that they hold data and provide getters and setters but no biological behaviour. I can easily provide bio-smarts to the beans but it might be better to hold off until there is a module that contains sequence/feature interfaces which the beans could implement. 3. Happy to provide code for an enterprise module if there is sufficient interest.
This would probably take the form of SessionBeans and WebServices that can be deployed to Glassfish/ JBoss etc to provide biological services for people who want to make client server or SOA apps. - Mark On Fri, Aug 28, 2009 at 1:30 AM, Andreas Prlic wrote: > Great, thanks for "volunteering", Michael. > > To add another Module: > > biojava-das : Lead: Jonathan Warren > probably deprecate the old DAS code in BJ and replace it with > the up to date Dasobert library > > Thanks to Jonathan for volunteering as well. > > Andreas > > > > > On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote: > > Andreas Prlic wrote: > > > >> Here a list of modules / action items and the people that I would > propose to > >> become module leaders: > >> ... > >> > >> Module: biojava-sequencing Lead: Michael Heuer > >> - support FastQ files > >> - support parsing of output for various new sequencing machines > > > > I have volunteered on the open-bio mailing list to implement FASTQ > > support. A nice collection of test data is being created in > collaboration > > with the other open-bio projects. If anyone has interest in a particular > > data set, please let me know, as I will also need data for performance > > tuning. 
> > > > michael > > > > _______________________________________________ > > biojava-dev mailing list > > biojava-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-dev > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas at sdsc.edu Fri Aug 28 15:10:03 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 28 Aug 2009 08:10:03 -0700 Subject: [Biojava-dev] [Biojava-l] BioJava code freeze, modularization and action items for sub modules In-Reply-To: <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com> References: <59a41c430908232118k2fff9564of1a45fba447eb922@mail.gmail.com> <59a41c430908271030p7318c468u8d145f5750369cb3@mail.gmail.com> <93b45ca50908272237k2485a1d8le343a8b1dc10ae12@mail.gmail.com> Message-ID: <59a41c430908280810s1720cfckbc36168f2fbc73a8@mail.gmail.com> Thanks, Mark. Guess we should start collecting all this info on a wiki page. I started to edit http://biojava.org/wiki/BioJava:Modules Module leaders: feel free to edit the plans for your module... Andreas On Thu, Aug 27, 2009 at 10:37 PM, Mark Schreiber wrote: > I'm happy to volunteer code for: > > BLASTXML parser as long as I can change the ssbind APIs (other parsers could > go into a legacy module??). Actually I would prefer to completely decouple > from the sequence/ feature module as many people would like a blast parser > without the rest of biojava thrown in. > BioSQL/ JPA bindings. I have already generated JPA compliant entity beans > for mapping to BioSQL as well as JPA handler code that makes sure > modifications persist properly. Currently the object model very closely > follows the BioSQL table structure.
Also the current beans are what people > call Anaemic beans in that they hold data and provide getters and setters > but no biological behaviour. I can easily provide bio-smarts to the beans > but it might be better to hold off until there is a module that contains > sequence/feature interfaces which the beans could implement. > Happy to provide code for an enterprise module if there is sufficient > interest. This would probably take the form of SessionBeans and WebServices > that can be deployed to Glassfish/ JBoss etc to provide biological services > for people who want to make client server or SOA apps. > > - Mark > > > On Fri, Aug 28, 2009 at 1:30 AM, Andreas Prlic wrote: >> >> Great, thanks for "volunteering", Michael. >> >> To add another Module: >> >> biojava-das : Lead: Jonathan Warren >> probably deprecate the old DAS code in BJ and replace it with >> the up to date Dasobert library >> >> Thanks to Jonathan for volunteering as well. >> >> Andreas >> >> >> >> >> On Thu, Aug 27, 2009 at 10:01 AM, Michael Heuer wrote: >> > Andreas Prlic wrote: >> > >> >> Here a list of modules / action items and the people that I would >> >> propose to >> >> become module leaders: >> >> ... >> >> >> >> Module: biojava-sequencing Lead: Michael Heuer >> >> - support FastQ files >> >> - support parsing of output for various new sequencing machines >> > >> > I have volunteered on the open-bio mailing list to implement FASTQ >> > support. A nice collection of test data is being created in >> > collaboration >> > with the other open-bio projects. If anyone has interest in a >> > particular >> > data set, please let me know, as I will also need data for performance >> > tuning. >> > >> >
michael >> > >> > _______________________________________________ >> > biojava-dev mailing list >> > biojava-dev at lists.open-bio.org >> > http://lists.open-bio.org/mailman/listinfo/biojava-dev >> > >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l > > From andreas at sdsc.edu Mon Aug 31 01:23:03 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Sun, 30 Aug 2009 18:23:03 -0700 Subject: [Biojava-dev] maven progress Message-ID: <59a41c430908301823s6e2e3d7fi6caffc47e1a8c0ff@mail.gmail.com> Hi, I started to split up biojava into submodules and am mavenizing the build process. The new SVN location is emerging here: http://dev.open-bio.org/home/svn-repositories/biojava/biojava-live/biojava or in your browser: http://code.open-bio.org/svnweb/index.cgi/biojava/browse/biojava-live/biojava A few questions so far from my side. 1) bytecode.jar: at present the core module depends on this. So far it is in the /jars subfolder of the module and needs to be installed by hand. What is the best way to deal with this in SVN? 2) Sequence module (Richard's original biojava v.3 branch) Since this consists of sub-modules I have set it up as a few hierarchically organized submodules. There is some biosql code there as well. Richard/Mark, not sure how to arrange this. I think it would be good to have a biosql module. Shall I refactor the current biosql code out of core into a new biosql module or will the current code be obsoleted and replaced with the new code in the sequence module? Andreas