From bradford.powell at gmail.com Fri Jul 8 21:26:15 2005
From: bradford.powell at gmail.com (bradford powell)
Date: Fri Jul 8 21:17:14 2005
Subject: [Biojava-dev] misplaced close() in BioSQLSequenceDB.java
Message-ID: <5418df3e05070818263a943110@mail.gmail.com>

I put off tracking this down for a while, but whenever I loaded a new
database, the first attempt to access it would load the ontology and
then throw an exception. A second run would just continue.

The problem is that lines 267 and 268 of
org.biojava.bio.seq.db.biosql.BioSQLSequenceDB are:

    conn.close()
    dbid = getDBHelper().getInsertID(conn, "biodatabase", "biodatabase_id");

The second line throws an exception because 'conn' has already been
closed. Control then passes to the catch block at line 274, where,
because 'conn' has been closed but is not null, an attempt is made to
close conn again, leading to the BioException which halts the program.

So lines 267 and 268 should be swapped. It may also be useful to change
the test in the catch block at line 274 to
'if (conn != null && !conn.isClosed())'. The latter change only matters
if other SQLExceptions may be thrown after the connection has been
closed.

From mark.schreiber at novartis.com Sun Jul 10 21:48:12 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Jul 10 21:39:29 2005
Subject: [Biojava-dev] Region & Junction
Message-ID:

I think this is similar to what Matthew was proposing for his BJv2. Not
sure if that ever got very far, but it is a good idea. The problem is,
who would want to reinvent the entire code base to correct a design flaw
we have had from the start? Actually, it's not really even a design
flaw, more of an inelegance.

If you really want this behaviour you could wrap SymbolList with a view
that translates interbase coordinates into biological coordinates. There
is also the rarely used BetweenLocation, which sounds like your Junction
class.
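As a rough illustration of the wrapping idea (this is not BioJava code:
the class InterbaseView and its method names are hypothetical, and a
plain String stands in for a SymbolList), interbase position i names the
gap just before the symbol at 1-based position i + 1. An interbase
region [start, end) therefore covers biological positions start + 1
through end:

```java
/**
 * Hypothetical sketch: a read-only view that accepts interbase
 * (zero-based, half-open) coordinates and translates them to the
 * 1-based, inclusive coordinates used for biological sequences.
 * A String stands in for a SymbolList; none of these names are BioJava's.
 */
final class InterbaseView {
    private final String symbols;

    InterbaseView(String symbols) {
        this.symbols = symbols;
    }

    /** First 1-based position covered by an interbase region starting at interbaseStart. */
    static int toBiologicalStart(int interbaseStart) {
        return interbaseStart + 1;
    }

    /** Last 1-based position covered by an interbase region ending at interbaseEnd. */
    static int toBiologicalEnd(int interbaseEnd) {
        return interbaseEnd;
    }

    /** Symbols in the interbase region [start, end); empty when start == end. */
    String region(int start, int end) {
        if (start < 0 || end < start || end > symbols.length()) {
            throw new IndexOutOfBoundsException(
                    "bad interbase region [" + start + ", " + end + ")");
        }
        int bioStart = toBiologicalStart(start);
        int bioEnd = toBiologicalEnd(end);
        // 1-based inclusive [bioStart, bioEnd] is substring(bioStart - 1, bioEnd).
        return symbols.substring(bioStart - 1, bioEnd);
    }

    public static void main(String[] args) {
        InterbaseView view = new InterbaseView("ACGTAC");
        // Interbase [1, 4) is biological 2..4.
        System.out.println(view.region(1, 4));
    }
}
```

A zero-length region (start == end) is what the proposal calls a
junction, which is why the view returns an empty string for it rather
than rejecting it.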
One thing I came across which actually is a design flaw, and is related
to this, is that Strand needs to be specified at the level of Location,
not Feature. This is because some GenBank records contain assemblies of
different chunks of sequence from different records and (believe it or
not) different strands (complement)! This cannot be done without
breaking biojava. Maybe it could be done with something like a compound
feature, which is like a CompoundLocation but combines StrandedFeatures
as a whole. So I guess it could be done with the current API, but it
would not be pretty.

- Mark


Michael Heuer
Sent by: biojava-dev-bounces@portal.open-bio.org
06/24/2005 06:32 AM

To: biojava-dev@biojava.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-dev] Region & Junction

Hello,

Just floating an idea that's been in my head for a bit.

What if for a "biojava next" we replace Sequence and typed Features etc.
with classes extending the two basic SOFA classes _region_ and
_junction_. A region is a length of symbols and a junction is the space
between two symbols [1, 2].

At the API level Region would look something like a SymbolList but with
interbase (zero-based) coordinates.

Subclasses of _region_, such as _exon_ and _transposable_element_, and
of _junction_, like _insertion_site_, might be generated as java classes
from SOFA itself (or maybe an OWL representation thereof). Methods could
be generated between these classes representing the various SOFA
relationship types (_kind_of_, _derives_from_, _part_of_).

Of course, in java land these class names would take on proper
JavaClassNameCapitalization instead of lowercase_with_underscores.
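A minimal sketch of how such a hierarchy might look, purely to make the
idea concrete. Everything here is an assumption: the classes are written
by hand rather than generated from SOFA, the edge and partOf methods are
invented for illustration, and none of it exists in BioJava.

```java
/**
 * Hypothetical sketch only. Coordinates are interbase (zero-based), so
 * position p names the gap just before the symbol at index p.
 */
class Region {
    private final int start; // junction at the left edge
    private final int end;   // junction at the right edge

    Region(int start, int end) {
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("need 0 <= start < end");
        }
        this.start = start;
        this.end = end;
    }

    /** Number of symbols covered by this region. */
    int length() {
        return end - start;
    }

    Junction leftEdge() {
        return new Junction(start);
    }

    Junction rightEdge() {
        return new Junction(end);
    }

    /** Stand-in for a part_of relationship: true if this region lies within other. */
    boolean partOf(Region other) {
        return other.start <= start && end <= other.end;
    }
}

/** The space between two symbols; it covers no symbols itself. */
class Junction {
    private final int position;

    Junction(int position) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be >= 0");
        }
        this.position = position;
    }

    int position() {
        return position;
    }
}

/** kind_of relationships map onto ordinary Java inheritance. */
class Exon extends Region {
    Exon(int start, int end) {
        super(start, end);
    }
}

class InsertionSite extends Junction {
    InsertionSite(int position) {
        super(position);
    }
}
```

With these definitions, new Exon(20, 50).partOf(new Region(10, 100)) is
true, and the exon's left edge is the junction at position 20.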
   michael

[1] > http://song.sf.net
[2] > http://genomebiology.com/2005/6/5/R44

_______________________________________________
biojava-dev mailing list
biojava-dev@biojava.org
http://biojava.org/mailman/listinfo/biojava-dev

From mark.schreiber at novartis.com Mon Jul 11 05:35:45 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jul 11 05:26:58 2005
Subject: [Biojava-dev] Announce: Biojava1.4 released
Message-ID:

BioJava 1.4 has been officially released.

This represents a major new step in biojava's development. It has been
about two years in the making and offers considerably more functionality
and stability than the previous official release (biojava 1.3). We
highly recommend you upgrade as soon as possible.

Thanks to the entire biojava community for making this possible!

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From rpudimat at informatik.uni-jena.de Mon Jul 11 07:41:40 2005
From: rpudimat at informatik.uni-jena.de (Rainer Pudimat)
Date: Mon Jul 11 07:33:06 2005
Subject: [Biojava-dev] SymbolTokenizer for Meme class
Message-ID: <42D25AF4.5090506@informatik.uni-jena.de>

Hello,

once again, I am trying to use the Meme class for reading ASCII results
of the meme software. However, I don't know which SymbolTokenizer should
be used in the arguments of the Meme constructor.

I tried to use

SymbolTokenization st = new CharacterTokenization(DNATools.getDNA(),true);

An exception was thrown:
"This tokenization doesn't contain character: 'A'."

So, which one should I use instead?

Thank you,
Bye, Rainer.
--
Rainer Pudimat
Bioinformatics
Institute for Computer Science
Friedrich-Schiller-University
Ernst-Abbe-Platz 2 / Room 3427
D - 07743 - Jena
Phone: +49 3641 946 456
Web: http://www.bio.inf.uni-jena.de/~rpudimat/

From mark.schreiber at novartis.com Mon Jul 11 21:04:36 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jul 11 20:56:19 2005
Subject: [Biojava-dev] SymbolTokenizer for Meme class
Message-ID:

Probably depends on which type of sequence you ran meme on. If it was
DNA, use DNATools.getDNA(); if protein, use ProteinTools.getAlphabet().

Let us know if this still doesn't work. It would be good if you could
tell us the version of biojava and paste in some of the meme file.

- Mark

From mark.schreiber at novartis.com Tue Jul 12 04:51:30 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jul 12 04:42:44 2005
Subject: [Biojava-dev] SymbolTokenizer for Meme class
Message-ID:

> I have used MEME for DNA sequences and produced text output (no html).
> BioJava version: 1.3

I would strongly recommend upgrading to biojava 1.4 (now the official
release) unless you have a strong attachment to version 1.3; that
version is over 2 years old now. Looking in CVS, at least one change was
made to update the file to read meme v3 output. That should fix the bug
you see with "log"; I believe I made the same change you did.

> I don't know how Java's StreamTokenizer works, but the Meme constructor
> seems to look for the keyword "ALPHABET". Then I guess it looks for the
> first TT_WORD after that keyword, which is ACGT
> (ALPHABET: ACGT)
> It breaks when trying to build a SimpleSymbolList from ACGT using the
> SymbolTokenization I gave as parameter.
>
> However it works when I construct the parser in another way:
> SymbolTokenization ct = DNATools.getDNA().getTokenization("token");
>
> instead of
>
> SymbolTokenization ct = new CharacterTokenization(DNATools.getDNA(),true);

Sorry, I didn't read your email carefully. As you have discovered, the
technique you use is the best way to get a SymbolTokenization. I should
put this in Biojava in Anger.

> There is another thing that does not work.
> The column distributions of the weight matrix class
> are not allowed to get negative values. On the one hand this is
> semantically correct since it is a probability distribution.
> On the other hand the Meme constructor tries to read the
> log-odds-score matrix (looks for keyword "log"). I've changed the
> constructor (at my local installation) to look for keyword "letter".
> Now it reads the letter-probability matrix which is also given in the
> result files.

I believe this is fixed in biojava 1.4 (see above). Let me know if this
doesn't work.

> Is there a class for log-odds matrices?

Not really. WeightMatrices are backed by Distributions, which are not
log-odds. However, WeightMatrices can use a log-odds ScoreType, which
calculates the log odds of a Distribution versus its Null Distribution.

- Mark

From maillist at roomity.com Fri Jul 15 13:04:30 2005
From: maillist at roomity.com (Rex)
Date: Fri Jul 15 12:55:07 2005
Subject: [Biojava-dev] Test Mesg
Message-ID: <9938272.01121447070495.JavaMail.agent1@slave1.roomity.com>

Can't see any message in here

--------------------------------------------------------
Mail posted through roomity.com id - 1121447070316
--------------------------------------------------------

From kalle.naslund at genpat.uu.se Fri Jul 15 13:12:32 2005
From: kalle.naslund at genpat.uu.se (Kalle Näslund)
Date: Fri Jul 15 13:02:34 2005
Subject: [Biojava-dev] Test Mesg
In-Reply-To: <9938272.01121447070495.JavaMail.agent1@slave1.roomity.com>
References: <9938272.01121447070495.JavaMail.agent1@slave1.roomity.com>
Message-ID: <42D7EE80.2010908@genpat.uu.se>

Rex wrote:

> Can't see any message in here

I can see your mail, if that is what you are asking.
From heuermh at acm.org Thu Jul 28 23:47:17 2005
From: heuermh at acm.org (Michael Heuer)
Date: Thu Jul 28 23:36:24 2005
Subject: [Biojava-dev] [proposal] Commons Exec (fwd)
Message-ID:

Forwarding this proposal from the Apache commons-dev mailing list for a
commons component to support executing external processes. Chances are
this could be something good to build on for support in biojava in
dealing with external bioinformatics applications.

The code will initially live in the commons sandbox, see

> http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/exec/

   michael

---------- Forwarded message ----------
Date: Fri, 29 Jul 2005 03:32:38 +0200
From: Niklas Gustavsson
Reply-To: Jakarta Commons Developers List
To: commons-dev@jakarta.apache.org
Subject: [proposal] Commons Exec

Proposal for Exec Package

Rationale
------------------------------------
Executing external processes from Java is a well-known problem area. It
is inherently platform dependent and requires the developer to know and
test for platform-specific behaviors, for example using cmd.exe on
Windows or limited buffer sizes causing deadlocks. The JRE support for
this is very limited, albeit better with the new Java SE 1.5
ProcessBuilder class.

Reliably executing external processes can also require knowledge of the
environment variables before or after the command is executed. In J2SE
1.1-1.4 there is no support for this, since the method for retrieving
environment variables, System.getenv(), is deprecated.

There are currently several different libraries that, for their own
purposes, have implemented frameworks around Runtime.exec() to handle
the various issues outlined above.
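To make the buffer deadlock concrete, here is a minimal sketch in
present-day Java. It is not the Commons Exec API (which did not exist
yet when this was written); the class ExecSketch and its run method are
hypothetical names. It relies only on java.lang.ProcessBuilder: stderr
is merged into stdout, and the combined output is read to the end before
waiting for the exit status, so the child never blocks on a full pipe.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, for illustration only. */
final class ExecSketch {

    /** Runs a command, drains its combined output, and returns it after the process exits. */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // One stream to drain instead of two; avoids needing a second reader thread.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            // Read until end of stream so the child cannot stall on a full buffer.
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // Uses the running JVM's own launcher so the example needs no other program.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        System.out.print(run(Arrays.asList(java, "-version")));
    }
}
```

Calling waitFor() before draining the stream is the usual way to hit the
deadlock: a child that writes more than the pipe buffer holds will block
forever while the parent waits for it to exit.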
The proposed project should aim at coordinating and learning from these
initiatives to create and maintain a simple, reusable and well-tested
package. Since some of the more problematic platforms are not readily
available, it is my hope that the broad Apache community can be a great
help.

Scope of the package
------------------------------------
The package shall create and maintain a process execution package
written in the Java language to be distributed under the ASF license.
The Java code might also be complemented with scripts (e.g. Perl
scripts) to fully enable execution on some operating systems. The
package should aim to support a wide range of operating systems while
still having a consistent API for all platforms.

Interaction with other packages
------------------------------------
This package will use Commons Logging for logging debug and error
information.

Identify the initial source for the package
------------------------------------
Several implementations exist and should be researched before
finalizing the design:

* Ant 1.X contains probably the most mature code within the exec task.
  This code has been stripped of the Ant specifics and cleaned up by
  Niklas Gustavsson and can be donated under the ASF license.

* Ant 2.X contains a new exec implementation, especially targeted for
  reusability (see http://ant.apache.org/ant2/actionlist.html#exec).

* plexus-utils has a similar but slimmer implementation than Ant and
  has also indicated interest, through Trygve Laugstøl, to participate
  in the development.

Identify the base name for the package
------------------------------------
org.apache.commons.exec

Identify the coding conventions for this package
------------------------------------
Sun conventions.

Identify any Jakarta-Commons resources to be created
------------------------------------
* Mailing list
  Until traffic justifies, the package will use the Jakarta-Commons
  lists for communications.
* SVN repositories
  A new SVN directory under Commons Sandbox

* Bugzilla
  The package should be listed as a component under the Jakarta-Commons
  Bugzilla entry.

* Integration test builds
  If possible, some form of integration test builds on various platforms
  (like the SourceForge compile farm) would be invaluable. I'm unsure of
  what, for example, Gump and the current Apache infrastructure have to
  offer in this area.

Identify the initial set of committers to be listed in the Status File
------------------------------------
Brett Porter
Stefan Bodewig
Niklas Gustavsson (I'm not currently an Apache committer so I don't know
if this is possible)

/niklas

---------------------
Niklas Gustavsson
niklas@protocol7.com
http://www.protocol7.com

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

From mark.schreiber at novartis.com Fri Jul 29 02:44:07 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Fri Jul 29 02:34:40 2005
Subject: [Biojava-dev] [proposal] Commons Exec (fwd)
Message-ID:

Worth watching. It would be very useful when combined with EMBOSS or
similar.