From autobuilder at derkholm.net Mon Sep 1 00:18:46 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Mon Sep 1 00:23:44 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <12577309.1062389932068.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030901

Binary build: OK
Javadocs build: OK
Core test suite: OK

A snapshot release has been made at:

  http://www.derkholm.net/autobuild/

The following files were modified in the last 24 hours:

 * biojava-live/src/org/biojava/bio/program/formats/Ligand.java
 * biojava-live/src/org/biojava/bio/seq/db/IndexStore.java
 * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java

A patch file reflecting these changes is available from

  http://www.derkholm.net/autobuild/patches/

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From autobuilder at derkholm.net Tue Sep 2 00:19:03 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Tue Sep 2 00:28:18 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <12577309.1062476344845.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030902

Binary build: OK
Javadocs build: OK
Core test suite: OK

A snapshot release has been made at:

  http://www.derkholm.net/autobuild/

The following files were modified in the last 24 hours:

 * biojava-live/src/org/biojava/bio/program/formats/Ligand.java
 * biojava-live/src/org/biojava/bio/symbol/LocationTools.java

A patch file reflecting these changes is available from

  http://www.derkholm.net/autobuild/patches/

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From lmorris at ebi.ac.uk Tue Sep 2 11:56:25 2003
From: lmorris at ebi.ac.uk (Lorna Morris)
Date: Tue Sep 2 11:55:13 2003
Subject: [Biojava-dev] EmblFileFormer
Message-ID: <3F54BDA9.7090903@ebi.ac.uk>

Hi

I'm using biojava to parse an EMBLFlatFile, add extra annotation, and dump the new file out at
the end. The parser seems to be really useful for this. However, the file created using SeqIOTools.writeEmbl contains errors: the RN, RP, RX, RA, RT and RL lines aren't correctly nested. These lines should occur in repeated sets, but the output has all the RN lines, followed by all the RP lines etc., so they are merged rather than nested.

I've looked at the javadoc for the class EmblFileFormer and there is a comment that might relate to this problem:

 * EmblFileFormer performs the detailed formatting of
 * EMBL entries for writing to a PrintStream. Currently
 * the formatting of the header is not correct. This really needs to
 * be addressed in the parser which is merging fields which should
 * remain separate.

I've tried to address the problem by modifying the class SeqIOEventEmitter, but have run into difficulties, because I cannot untangle which RN, RP, RX, RA, RT, RL lines 'belong' together in a single block, as the annotation is just in an ArrayList. Maybe I should take note of the javadoc comment above and address the problem in the parser. If so, could you give me some pointers on which classes I should focus on in order to fix this, and whether you think it will be a difficult problem to solve?

Hope this makes sense.

Many thanks,

Lorna

-------------------------------------------------------------------
Lorna Morris
EMBL-European Bioinformatics Institute     Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus, Hinxton      Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK
email:lmorris@ebi.ac.uk

From matthew.pocock at ncl.ac.uk Tue Sep 2 12:09:57 2003
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Tue Sep 2 12:10:33 2003
Subject: [Biojava-dev] new seq searching classes
Message-ID: <3F54C0D5.9060201@ncl.ac.uk>

Hi,

I've added a couple of classes in org.biojava.bio.search for finding regions of sequence content. They are SeqContentPattern and SeqContentMatcher - the API is loosely based upon KMPSearch and the 1.4 regex libs. These classes aren't javadocked yet.

SeqContentPattern encapsulates the rules about what regions to select - the length, and the minimum and maximum number of occurrences for each nucleotide. SeqContentMatcher is a cursor produced by scp.matcher(SymbolList) and can be used to find the next match, get the matching sub-sequence and to discover the offset of that match.

E.g.
to find regions of length 10 with at least 8 As, no G or T and at most 2 Cs, you could do something like:

  SeqContentPattern scp = new SeqContentPattern(DNATools.getDNA());
  scp.setLength(10);
  scp.setMinCounts(DNATools.a(), 8);
  scp.setMaxCounts(DNATools.g(), 0);
  scp.setMaxCounts(DNATools.c(), 2);
  scp.setMaxCounts(DNATools.t(), 0);

Then to search with this you'd do something like:

  SeqContentMatcher scm = scp.matcher(symList);
  while(scm.find()) {
    System.out.println("Hit at: " + scm.pos());
  }

Anybody think this is useful?

Matthew

From matthew_pocock at yahoo.co.uk Tue Sep 2 12:22:57 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Tue Sep 2 12:21:43 2003
Subject: [Biojava-dev] EmblFileFormer
In-Reply-To: <3F54BDA9.7090903@ebi.ac.uk>
Message-ID: <20030902162257.14672.qmail@web14911.mail.yahoo.com>

Hi Lorna,

Yes - the fault goes back to the embl parser, not the writer. The parser should be keeping track of RN, RP, etc. lines, and whenever a complete set goes through, it should be spitting out a single annotation event (perhaps called REFERENCE?) with all the data from a single block in it. This would then be sensibly put into a list, with one element for each reference block. The file former would then need to be modified to unpack the REFERENCE list, but this would not be a big deal.

If you are keen to do this, then we can talk you through it, either here or on chat (irc.freenode.net, #biojava).

Matthew

From autobuilder at derkholm.net Wed Sep 3 00:11:51 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Wed Sep 3 00:28:54 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <11945013.1062562311802.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030903

Binary build: FAILED!
Javadocs build: OK
Core test suite: NOT RUN

Problems occurred during this build cycle -- please investigate as soon as possible!

The following files were modified in the last 24 hours:

 * biojava-live/src/org/biojava/bio/program/gff/GFFParser.java
 * biojava-live/src/org/biojava/bio/search/SeqContentMatcher.java
 * biojava-live/src/org/biojava/bio/search/SeqContentPattern.java
 * biojava-live/tests/org/biojava/bio/search/SeqContentPatternTest.java

A patch file reflecting these changes is available from

  http://www.derkholm.net/autobuild/patches/

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From matthew_pocock at yahoo.co.uk Wed Sep 3 05:42:24 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Wed Sep 3 05:41:09 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
In-Reply-To: <11945013.1062562311802.JavaMail.thomas@firechild.derkholm.net>
Message-ID: <20030903094224.94189.qmail@web14902.mail.yahoo.com>

My bad. Could these messages include the compiler error (or a link to it)? It would help track down those things that fail for the autobuild but work for the developer.

Matthew

From mark.schreiber at agresearch.co.nz Wed Sep 3 18:32:59 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Wed Sep 3 18:32:04 2003
Subject: [Biojava-dev] new seq searching classes
Message-ID:

Hey,

That sounds really cool.

_______________________________________________
biojava-dev mailing list
biojava-dev@biojava.org
http://biojava.org/mailman/listinfo/biojava-dev

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From len at reeltwo.com Wed Sep 3 23:57:05 2003
From: len at reeltwo.com (Len Trigg)
Date: Wed Sep 3 23:56:15 2003
Subject: [Biojava-dev] New RDBMS support: hsqldb
Message-ID:

Hi guys,

I just committed support for hsqldb to the biosql package. At the moment I'm the only one with the appropriate schema files, but once I've got them looking/working pretty I'll post them to the biosql people.

Why is hsqldb handy? Well, it's lightweight and fully java, which means the following possibilities are much easier (and things that I intend to have a crack at):

1) Make a fully self-contained multiplatform sequence database application on a CD.

2) Make some unit tests for the biosql packages. I've started on this, because I've been experiencing problems with sequence features not being removed properly. Step 1 is to make a unit test to reproduce the problem, Step 2 is to fix it. hsqldb has a nice mode where you can run it entirely in memory, which is nice for unit tests.

Cheers,
Len.

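For readers unfamiliar with the in-memory mode mentioned above, here is a minimal sketch of how a test might open such a database through plain JDBC. It is not taken from the biosql package: the class InMemoryHsqldb and its methods are hypothetical names, the "jdbc:hsqldb:mem:" URL form is the one documented for HSQLDB 1.7.2 and later (older releases may differ), and the hsqldb jar is assumed to be on the classpath when open() is called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical test helper; not part of BioJava or its biosql package. */
final class InMemoryHsqldb {
    private InMemoryHsqldb() {}

    /** Builds the JDBC URL of a named in-memory database (assumed HSQLDB 1.7.2+ URL form). */
    static String url(String dbName) {
        return "jdbc:hsqldb:mem:" + dbName;
    }

    /**
     * Opens a connection to the in-memory database.
     * Throws SQLException if no HSQLDB driver is registered, e.g. when the jar is missing.
     */
    static Connection open(String dbName) throws SQLException {
        // "sa" with an empty password is HSQLDB's default administrator account.
        return DriverManager.getConnection(url(dbName), "sa", "");
    }
}
```

Each test could then call InMemoryHsqldb.open("some_name"), load the schema, run its checks and simply drop the connection, with nothing written to disk. With older driver jars an explicit Class.forName on the driver class may be needed before open().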
From mark.schreiber at agresearch.co.nz Thu Sep 4 00:03:06 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Thu Sep 4 00:01:59 2003
Subject: [Biojava-dev] New RDBMS support: hsqldb
Message-ID:

Sounds great Len,

Unit tests on BioSQL will be great. It would also make for a nice demo in the demos section if you felt like putting one in there. Is the HSQLDB free to distribute?

- Mark

From len at reeltwo.com Thu Sep 4 00:24:10 2003
From: len at reeltwo.com (Len Trigg)
Date: Thu Sep 4 00:23:17 2003
Subject: [Biojava-dev] RE: New RDBMS support: hsqldb
In-Reply-To: <200309040402.h84424sX021148@portal.>
Message-ID:

Mark,

> Unit tests on BioSQL will be great. It would also make for a nice demo
> in the demos section if you felt like putting one in there. Is the
> HSQLDB free to distribute?

Pretty much -- you just have to include their copyright notice and have an acknowledgment in any product advertising materials.

So, when the time comes to commit the tests/demos, should I add the hsqldb.jar (and copyright notices etc) to cvs, or just make the build scripts detect whether you've got it installed (like they do for the junit classes)?

Cheers,
Len.

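The "detect whether you've got it installed" option can also be approximated from inside Java. The sketch below is only an illustration of the idea, not how the BioJava build scripts do it: OptionalDependency and isAvailable are hypothetical names, and org.hsqldb.jdbcDriver is assumed to be the driver class name in the hsqldb jar in use (it is for the 1.x series).

```java
/** Hypothetical helper mirroring an "is the jar installed?" check in a build script. */
final class OptionalDependency {
    private OptionalDependency() {}

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, OptionalDependency.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed driver class name for HSQLDB 1.x jars.
        boolean haveHsqldb = isAvailable("org.hsqldb.jdbcDriver");
        System.out.println(haveHsqldb
                ? "hsqldb found: BioSQL tests can run"
                : "hsqldb not found: skipping BioSQL tests");
    }
}
```

A test suite could use the same check to skip the BioSQL tests quietly when the jar is absent, which keeps the jar out of cvs at the cost of those tests not running everywhere.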
From autobuilder at derkholm.net Thu Sep 4 00:18:48 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Thu Sep 4 00:23:49 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <12577309.1062649131870.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030904

Binary build: OK
Javadocs build: OK
Core test suite: OK

A snapshot release has been made at:

  http://www.derkholm.net/autobuild/

The following files were modified in the last 24 hours:

 * biojava-live/src/org/biojava/bio/dist/DistributionTools.java
 * biojava-live/src/org/biojava/bio/search/SeqContentMatcher.java
 * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java
 * biojava-live/src/org/biojava/bio/seq/db/biosql/DBHelper.java
 * biojava-live/src/org/biojava/bio/seq/db/biosql/HypersonicDBHelper.java
 * biojava-live/src/org/biojava/bio/seq/db/biosql/MySQLDBHelper.java
 * biojava-live/src/org/biojava/bio/symbol/CircularLocation.java

A patch file reflecting these changes is available from

  http://www.derkholm.net/autobuild/patches/

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From mark.schreiber at agresearch.co.nz Thu Sep 4 04:02:39 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Thu Sep 4 04:01:24 2003
Subject: [Biojava-dev] RE: New RDBMS support: hsqldb
Message-ID:

Either is probably OK; it depends on the size of the jar, I suspect. If it's a big hit to download then I suppose we can make it an optional part of the ANT build.

- Mark

From matthew_pocock at yahoo.co.uk Thu Sep 4 06:03:26 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Sep 4 06:02:08 2003
Subject: [Biojava-dev] meta data in javadocs
Message-ID: <20030904100326.44551.qmail@web14901.mail.yahoo.com>

Hi,

I've been playing with getting javadoc to dump pretty-printed object graphs into its output. E.g., we could instantiate the FeatureFilter that is the default schema for a feature type & dump this out into the javadoc.

The good news - it basically works. The bad news - I don't seem to be able to access biojava classes from within javadoc. Basically, I think the classpath setting for javadoc goes into resolving documentation, not into providing classes to the runtime. Anybody got an answer to this or know what is going on?
I know this is turning into deep voodoo, but then that's why I posted to -dev :)

Matthew

From len at reeltwo.com Thu Sep 4 17:14:40 2003
From: len at reeltwo.com (Len Trigg)
Date: Thu Sep 4 17:13:27 2003
Subject: [Biojava-dev] Feature Templates
Message-ID:

Hi guys,

I had assumed that feature templates could be reused, but as part of my unit tests I discovered that this is not the case. My question is, should it be valid to reuse the same template for the creation of multiple features (in which case there is a bug to be fixed), or are they single-use-only objects?

(Either way, the javadocs for Template should specify this -- I'll do that once I get an answer).

Cheers,
Len.

From mark.schreiber at agresearch.co.nz Thu Sep 4 18:39:05 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Thu Sep 4 18:38:14 2003
Subject: [Biojava-dev] Feature Templates
Message-ID:

Hi Len,

The "standard" way to "reuse" a template is to make a feature with the template and then generate another template from that feature.

See http://www.biojava.org/docs/bj_in_anger/feature.htm for an example.

This is not really reuse, you are basically copying the template, but the effect is the same.

- Mark

From mark.schreiber at agresearch.co.nz Thu Sep 4 19:23:33 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Thu Sep 4 19:22:24 2003
Subject: [Biojava-dev] Musings on Locations Features and Strands
Message-ID:

Hi all,

As some of you will know, the CircularLocation has been problematic for a number of reasons. However, one thing that it has highlighted for me is that Locations can be polar. For example, the compound CircularLocation [(1..10), (40..60)] is not the same as [(40..60), (1..10)]. There is some concept of the leftmost end which gives a kind of polarity. Polarity in this case comes about because of the need to specify an arbitrary origin on a circular molecule.

Polarity can also be caused by a Location being assigned to a particular strand. Currently in biojava the latter is handled by the StrandedFeature (i.e. the polarity of the Location is managed at the Feature level). This doesn't work in the case of CircularLocation, and the lesson learned is that polarity is better managed at the Location level than at the Feature level.

In a future release of BioJava (or its descendant BioJava2 or whatever we call it), I think either Location or a subinterface should handle polarity. I personally prefer that Location does, so you don't need to check object types all the time.

What then to do about StrandedLocation? I think that Strand could in many cases be derived from the Location's polarity, except in the case of CircularLocations, which are polar regardless. In that case an overloaded constructor (possibly hidden in a factory object?) could deal with CircularLocations.

Enough ranting. Does this sound sensible? Do you think we could roll it into bj1.4, or will it break lots of stuff, in which case we should wait for BJ2?

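To make the order-sensitivity concrete, here is a small self-contained sketch. It deliberately does not use the BioJava Location classes: PolarLocationSketch, Range and CompoundLocation are hypothetical stand-ins, and the only point shown is that a compound location kept as an ordered list of blocks compares unequal when the same blocks are given in a different order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-ins for illustration; not the BioJava Location API. */
final class PolarLocationSketch {
    private PolarLocationSketch() {}

    /** A contiguous block min..max. */
    static final class Range {
        final int min;
        final int max;
        Range(int min, int max) { this.min = min; this.max = max; }
        @Override public boolean equals(Object o) {
            return o instanceof Range && ((Range) o).min == min && ((Range) o).max == max;
        }
        @Override public int hashCode() { return 31 * min + max; }
        @Override public String toString() { return "(" + min + ".." + max + ")"; }
    }

    /** A compound location whose polarity is carried by the order of its blocks. */
    static final class CompoundLocation {
        private final List<Range> blocks;

        CompoundLocation(Range... blocks) {
            if (blocks.length == 0) throw new IllegalArgumentException("need at least one block");
            this.blocks = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(blocks)));
        }

        /** First coordinate in reading order, which need not be the numerically smallest. */
        int start() { return blocks.get(0).min; }

        /** List equality is order-sensitive, so reordered blocks give a different location. */
        @Override public boolean equals(Object o) {
            return o instanceof CompoundLocation && ((CompoundLocation) o).blocks.equals(blocks);
        }
        @Override public int hashCode() { return blocks.hashCode(); }
        @Override public String toString() { return blocks.toString(); }
    }
}
```

With this representation, [(1..10), (40..60)] starts at 1 while [(40..60), (1..10)] starts at 40, and the two are not equal. Whether strand could be derived from such an ordering is exactly the open question.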
- Mark From gerardkilmartin at hotmail.com Thu Sep 4 20:07:45 2003 From: gerardkilmartin at hotmail.com (Gerard Kilmartin) Date: Thu Sep 4 20:06:26 2003 Subject: [Biojava-dev] Query Message-ID: Hi, I'm using the biojava blastlikedataset and I am unable to retrieve implied values for HitSummary such as readingFrame, although they are in the blast document. Have you any idea what my problem might be? I would appreciate any help you can provide. Many thanks, Gerard _________________________________________________________________ Get Hotmail on your mobile phone http://www.msn.co.uk/msnmobile From len at reeltwo.com Thu Sep 4 21:47:48 2003 From: len at reeltwo.com (Len Trigg) Date: Thu Sep 4 21:46:35 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: References: Message-ID: Mark Schreiber wrote: > The "standard" way to "reuse" a template is to make a feature with the template and then generate another template from that feature. > > See http://www.biojava.org/docs/bj_in_anger/feature.htm for an example. > > This is not really reuse, you are basically copying the template but the effect is the same. I'm not too worried about how to achieve the effect, more how to avoid the misuse (by people who don't know better ;-)) of the template object.
I think that either it should be valid to reuse the template objects themselves, or they should somehow be marked as unusable by the createFeature process, so that any attempt to reuse them results in an exception being thrown. What say you? Cheers, Len. From mark.schreiber at agresearch.co.nz Thu Sep 4 21:52:53 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Thu Sep 4 21:51:53 2003 Subject: [Biojava-dev] Feature Templates Message-ID: Either option sounds OK. I'm not sure why they are not reusable. The bowels of the feature machinery are a dark and scary place. My preference would be that they can be reused, but if they cannot then an exception (or at least some strong documentation) would be the way to go. Hopefully Thomas or Matthew can shed some light on why Feature templates behave this way. - Mark
From autobuilder at derkholm.net Fri Sep 5 00:18:52 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Fri Sep 5 00:23:57 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1062735536280.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030905 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/gui/sequence/HeadlessRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/PairwiseSequencePanel.java * biojava-live/src/org/biojava/bio/gui/sequence/SequencePanel.java * biojava-live/src/org/biojava/bio/gui/sequence/SequencePoster.java * biojava-live/src/org/biojava/bio/gui/sequence/SequenceRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SubPairwiseRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SubSequenceRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SymbolSequenceRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/TranslatedSequencePanel.java * biojava-live/src/org/biojava/bio/program/formats/Ligand.java * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLAllFeatures.java *
biojava-live/src/org/biojava/bio/seq/io/EmblFileFormer.java * biojava-live/src/org/biojava/bio/seq/io/EmblLikeFormat.java * biojava-live/src/org/biojava/bio/taxa/EbiFormat.java * biojava-live/tests/files/biosqldb-hsqldb.sql * biojava-live/tests/files/drop-biosqldb-hsqldb.sql * biojava-live/tests/org/biojava/bio/seq/db/AbstractSequenceDBTest.java * biojava-live/tests/org/biojava/bio/seq/db/HashSequenceDBTest.java * biojava-live/tests/org/biojava/bio/seq/db/biosql/BioSQLSequenceDBTest.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From matthew_pocock at yahoo.co.uk Fri Sep 5 05:25:18 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Fri Sep 5 05:24:02 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: Message-ID: <20030905092518.8950.qmail@web14902.mail.yahoo.com> Hi Mark, Len Feature templates should be able to be re-used. In fact, it's considered good style to re-use them where multiple features are being created with essentially the same data, e.g. varying by just the location. The new features should then have properties that vary 100% independently of the field values of the template. One potential gotcha is the Annotation - this may be referenced by the new feature or be copied. Not sure which implementations do what. What is the problem you are seeing? Matthew ________________________________________________________________________ Want to chat instantly with your online friends? Get the FREE Yahoo! 
Messenger http://mail.messenger.yahoo.co.uk From postmaster at ebi.ac.uk Fri Sep 5 14:42:21 2003 From: postmaster at ebi.ac.uk (MailScanner) Date: Fri Sep 5 14:40:59 2003 Subject: [Biojava-dev] Warning: E-mail viruses detected Message-ID: <200309051842.h85IgLK25619@maui.ebi.ac.uk> Our virus detector has just been triggered by a message you sent:- To: senger@maui.ebi.ac.uk Subject: Thank you! Date: Fri Sep 5 19:42:20 2003 One or more of the attachments (movie0045.pif) are on the list of unacceptable attachments for this site and will not have been delivered. Consider renaming the files or putting them into a "zip" file to avoid this constraint. The virus detector said this about the message: Report: Shortcuts to MS-Dos programs are very dangerous in email (movie0045.pif) -- MailScanner Email Virus Scanner www.mailscanner.info Mailscanner thanks transtec Computers for their support From autobuilder at derkholm.net Sat Sep 6 00:18:50 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Sat Sep 6 00:23:56 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1062821932239.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030906 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From davuluri-1 at medctr.osu.edu Thu Sep 4 18:28:47 2003 From: davuluri-1 at medctr.osu.edu (Ramana Davuluri) Date: Sat Sep 6 12:25:26 2003 Subject: [Biojava-dev] Java toolkit for visualization of regulatory regions Message-ID: Dear BioJava Developers, We have recently developed a Java visualization tool kit for visualizing gene regulatory region annotations. We want to make it available as an open source software to Bioinformatics community. 
Would it be possible to integrate this library into your BioJava modules. You may have a look at this library, called GCVTK, available at : http://bioinformatics.med.ohio-state.edu/GDVTK We also welcome your suggestions. Regards - ramana Ramana Davuluri, Ph.D. Assistant Professor (Bioinformatics & Computational Biology) Human Cancer Genetics Program Department of Molecular Virology, Immunology and Medical Genetics The Ohio State University 420 West 12th Avenue, TMRF 524 Columbus, OH 43210 Email: Davuluri-1@medctr.osu.edu Tel: 614-688-3088 (off) Tel: 614-688-4776 (lab) Fax: 614-688-4006 http://www.cancergenetics.med.ohio-state.edu/FacultyDavuluri.html http://bioinformatics.med.ohio-state.edu From autobuilder at derkholm.net Sun Sep 7 00:18:50 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Sun Sep 7 00:24:00 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1062908335866.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030907 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From len at reeltwo.com Sun Sep 7 18:26:44 2003 From: len at reeltwo.com (Len Trigg) Date: Sun Sep 7 18:25:30 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: <20030905092518.8950.qmail@web14902.mail.yahoo.com> References: <20030905092518.8950.qmail@web14902.mail.yahoo.com> Message-ID: Matthew Pocock wrote: > One potential gotcha is the Annotation - this may be > refferenced by the new feature or be coppied. Not sure > which impementations do what. > > What is the problem you are seeing? Yeah the problem I was having was with the annotation. I just had a quick squizz at the code now, and it looks to be just setting a reference to the template annotation. 
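To make the gotcha concrete, here is a minimal sketch in plain Java of what happens when a feature keeps a reference to its template's annotation instead of a copy. It assumes nothing about the real BioJava classes: `SketchTemplate` and `SketchFeature` are hypothetical stand-ins, and a `java.util.Map` plays the role of the Annotation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, cut-down template: just a type and an annotation map. */
final class SketchTemplate {
    String type;
    Map<String, Object> annotation = new HashMap<String, Object>();
}

/** Hypothetical feature; a plain Map stands in for BioJava's Annotation. */
final class SketchFeature {
    final String type;
    final Map<String, Object> annotation;

    private SketchFeature(String type, Map<String, Object> annotation) {
        this.type = type;
        this.annotation = annotation;
    }

    /** Keeps a reference to the template's annotation, so the two are shared. */
    static SketchFeature byReference(SketchTemplate t) {
        return new SketchFeature(t.type, t.annotation);
    }

    /** Takes a shallow copy, so later changes to the template do not leak in. */
    static SketchFeature byShallowCopy(SketchTemplate t) {
        return new SketchFeature(t.type, new HashMap<String, Object>(t.annotation));
    }
}

final class TemplateReuseDemo {
    public static void main(String[] args) {
        SketchTemplate template = new SketchTemplate();
        template.type = "exon";
        template.annotation.put("note", "first");

        SketchFeature shared = SketchFeature.byReference(template);
        SketchFeature copied = SketchFeature.byShallowCopy(template);

        // Reuse the same template for a second feature with a different note.
        template.annotation.put("note", "second");

        System.out.println(shared.annotation.get("note")); // second
        System.out.println(copied.annotation.get("note")); // first
    }
}
```

With the copying variant the template can be changed and reused freely, because each feature's annotation varies independently of the template once the feature exists.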
Now that I know what the expected behaviour is, I'll sort it. Cheers, Len. From endo at genetix-h.com Sun Sep 7 20:19:04 2003 From: endo at genetix-h.com (Takaho A. Endo) Date: Sun Sep 7 20:17:42 2003 Subject: [Biojava-dev] Developing LD visualizer Message-ID: <0DAAF02F-E192-11D7-9809-000A956A6950@genetix-h.com> Dear all, I am studying in the medical faculty of Tokai University, Japan. I have developed an application for visualizing linkage disequilibrium. The name of the application is ALDER (autonomic linkage disequilibrium (LD) expression resources). This is a GOLD(1)-like but finer application written in Java, which calculates LD by itself and shows LD graphs reflecting marker distance. Following the current trend, papers on linkage/association studies are expected to show haplotype blocks. I think my application is useful for such purposes. You can download the binary, sample data and source code from my personal site (2). I would like to contribute my program to the BioJava project if you will accept it, although it requires JRE 1.4 or later, which differs from the other code in your project. I would welcome any opinions and comments on my program. Best regards. 1) http://www.sph.umich.edu/csg/abecasis/GOLD/download/index.html 2) http://homepage.mac.com/takaho_e/biology/alder/index.html --- Takaho A. Endo From mark.schreiber at agresearch.co.nz Sun Sep 7 22:15:21 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Sep 7 22:14:15 2003 Subject: [Biojava-dev] Feature Templates Message-ID: I think that a copy is better than a reference. The question is would a shallow copy be OK or should it be a deep copy? - Mark From mark.schreiber at agresearch.co.nz Sun Sep 7 22:17:45 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Sep 7 22:16:31 2003 Subject: [Biojava-dev] Developing LD visualizer Message-ID: Hi - The development version of biojava is targeting Java 1.4, so this shouldn't be a problem. If you take a look at the core Symbol and Sequence APIs you'll probably get some idea of how you might be able to integrate your LD work. If you get stuck, just ask some questions on the list. - Mark From mark.schreiber at agresearch.co.nz Sun Sep 7 22:21:12 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Sep 7 22:19:53 2003 Subject: [Biojava-dev] Java toolkit for visualization of regulatory regions Message-ID: Hi - Looking briefly at your APIs, it seems that you use a track-based display system which you may be able to integrate with the org.biojava.bio.gui.sequence package.
It would be particularly nice to see some of your JSP-oriented stuff in biojava. If you have any questions on how best to integrate it, just ask on the list. - Mark From len at reeltwo.com Sun Sep 7 23:55:43 2003 From: len at reeltwo.com (Len Trigg) Date: Sun Sep 7 23:54:40 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: References: Message-ID: Mark Schreiber wrote: > I think that a copy is better than a reference. The question is > would a shallow copy be OK or should it be a deep copy? I have changed the SimpleFeature constructor so that rather than directly referencing the template annotation, it uses the existing SimpleAnnotation(Annotation) constructor. If deep copies are the way to go, it should probably happen in that constructor. BTW: SimpleFeature could do with a pretty printer running over it... Are there BioJava coding conventions? (I've been going with the flow of whichever file I've been editing.) Cheers, Len. From endo at genetix-h.com Mon Sep 8 00:24:15 2003 From: endo at genetix-h.com (Takaho A. Endo) Date: Mon Sep 8 00:22:52 2003 Subject: [Biojava-dev] Developing LD visualizer In-Reply-To: Message-ID: <4E748212-E1B4-11D7-8019-000A956A6950@genetix-h.com> Hi Mark, Thanks for your reply despite my poor English. Another problem with my code is that I did not follow BioJava conventions and did not use its libraries, because at the time I was developing it I felt they were too general for my specific purpose. Do I have to change my code to use the Symbol and Sequence APIs in order to avoid code duplication? It is easy to change the package names of my source code to match BioJava's, and I intend to. However, I will ask you for advice on applying the BioJava APIs if required. Thanks. -- Takaho A. Endo, Tokai University, Japan. From autobuilder at derkholm.net Mon Sep 8 00:18:51 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Mon Sep 8 00:24:04 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1062994737156.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030908 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLFeatureAnnotation.java * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceAnnotation.java * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java * biojava-live/src/org/biojava/bio/seq/db/biosql/FeaturesSQL.java * biojava-live/src/org/biojava/bio/seq/db/biosql/TaxonSQL.java * biojava-live/src/org/biojava/bio/seq/impl/SimpleFeature.java * biojava-live/src/org/biojava/bio/seq/io/EmblFileFormer.java * biojava-live/src/org/biojava/bio/seq/io/EmblLikeFormat.java * biojava-live/src/org/biojava/bio/taxa/EbiFormat.java * biojava-live/tests/files/biosqldb-hsqldb.sql A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From mark.schreiber at agresearch.co.nz Mon Sep 8 00:45:25 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Mon Sep 8 00:44:09 2003 Subject: [Biojava-dev] Feature Templates Message-ID: Hi - There are not really any
strict coding standards, well none that are enforceable anyway :-) A lot seems to depend on the developer's IDE or editor. Mostly the code looks OK. I have tidied SimpleFeature a bit. One possibility would be to run all of the code through a template from time to time, which would not be hard with a modern IDE but would make a bit of a hit on the CVS when it happened. Matthew did the last one where he changed the code to use explicit imports. - Mark From mark.schreiber at agresearch.co.nz Mon Sep 8 00:54:23 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Mon Sep 8 00:53:07 2003 Subject: [Biojava-dev] Developing LD visualizer Message-ID: Hi - It really depends on how easily your API can interact with biojava. Possibly all you need are some bridge classes. Do you have javadocs for your API? - Mark From matthew_pocock at yahoo.co.uk Mon Sep 8 05:05:19 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Mon Sep 8 05:03:53 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: Message-ID: <20030908090519.60761.qmail@web14911.mail.yahoo.com> --- Len Trigg wrote: > Mark Schreiber wrote: > > I think that a copy is better than a reference. The question is > > would a shallow copy be OK or should it be a deep copy? > > I have changed the SimpleFeature constructor so that rather than > directly referencing the template annotation, it uses the existing > SimpleAnnotation(Annotation) constructor. If deep copies are the way > to go, it should probably happen in that constructor. IMHO, the only case where you would want to enforce a deep copy would be if a property of an annotation is itself an annotation. Deep copying of other properties will lead to badness, I think. > > BTW: SimpleFeature could do with a pretty printer running over > it... Are there BioJava coding conventions? (I've been going with the > flow of whichever file I've been editing.)
The one thing that mucks things up is mixing in tabs with spaces. I know things like emacs like to put them in, but they cause the most formatting hastle of all. I tend to use 2 char indent, spaces after casts & commas, [] after type and before variable, one declaration per line, things that get broken across lines get lined up on +1 or +2 indents or alignmed with '(' - depending, and attempt to wrap at 80 char. In truth, I hit the 'format this code' button in my IDE when files look too uglee, and accept that (but then I've had a bit of a discussion with my IDE over what is and isn't acceptable). If coding style is an issue, could we get all checkins pretified by cvs prior to commit? Not sure if CVS can do this kind of thing. > > > Cheers, > Len. > Matthew ________________________________________________________________________ Want to chat instantly with your online friends? Get the FREE Yahoo! Messenger http://mail.messenger.yahoo.co.uk From kdj at sanger.ac.uk Mon Sep 8 06:29:40 2003 From: kdj at sanger.ac.uk (Keith James) Date: Mon Sep 8 06:29:41 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: <20030908090519.60761.qmail@web14911.mail.yahoo.com> References: <20030908090519.60761.qmail@web14911.mail.yahoo.com> Message-ID: >>>>> "Matthew" == Matthew Pocock writes: [...] Matthew> The one thing that mucks things up is mixing in tabs with Matthew> spaces. I know things like emacs like to put them in, but Matthew> they cause the most formatting hastle of all. Only if you tell it to use tabs, though (mine uses spaces). Matthew> I tend to use 2 char indent, spaces after casts & commas, Matthew> [] after type and before variable, one declaration per Matthew> line, things that get broken across lines get lined up on Matthew> +1 or +2 indents or alignmed with '(' - depending, and Matthew> attempt to wrap at 80 char. 
In truth, I hit the 'format Matthew> this code' button in my IDE when files look too uglee, Matthew> and accept that (but then I've had a bit of a discussion Matthew> with my IDE over what is and isn't acceptable). Matthew> If coding style is an issue, could we get all checkins Matthew> pretified by cvs prior to commit? Not sure if CVS can do Matthew> this kind of thing. I've been doing a survey of available Java formatters because I'm setting up a new codebase here, from scratch. Useful features would be 1. Platform independent 2. Standalone version available (& maybe an Ant task) 3. IDE plugin versions available 4. Extendable/fixable (preferably open source) None that I can find fit all of these criteria. Those that can be re-distributed: Jalopy: 1, 2, 3, 4 (but a fairly buggy beta, no longer supported by the developer - I wouldn't use it myself) astyle: 2, 4 (limited features, in C++, but the jedit plugin is Java so it may go cross-platform without requiring a compiler) Anyone know of others? There are several commercial or otherwise non-distributable packages e.g. jindent, jacobe, jpretty, trita. Keith -- - Keith James Microarray Facility, Team 65 - - The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK - From len at reeltwo.com Mon Sep 8 18:50:26 2003 From: len at reeltwo.com (Len Trigg) Date: Mon Sep 8 18:49:08 2003 Subject: [Biojava-dev] Feature Templates In-Reply-To: <20030908090519.60761.qmail@web14911.mail.yahoo.com> References: <20030908090519.60761.qmail@web14911.mail.yahoo.com> Message-ID: Matthew Pocock wrote: > The one thing that mucks things up is mixing in tabs > with spaces. I know things like emacs like to put them > in, but they cause the most formatting hastle of all. Yeah, that's pretty pesky. I've got (setq indent-tabs-mode nil) in my emacs config. Maybe in the style section on the docs page include a link to: http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html plus any areas of deviation/clarification (e.g. the @author stuff). 
I'm of the opinion that for open source projects in particular, coding conventions are important because you want the code and documentation of high quality and consistency to not put off potential contributors. Conversely, open source projects are probably the hardest ones in which to enforce these types of issues. :-) > If coding style is an issue, could we get all checkins > pretified by cvs prior to commit? Not sure if CVS can > do this kind of thing. The commitinfo file could be used to run a style checker and disallow the checkin if there are too many style warnings, if you wanted to get draconian. I haven't seen any mechanism in cvs that actually lets you postprocess the file though. In our company we use an automated checker along the lines of tinderbox/autobuild that runs regularly (our main one is every 1/2 hr), and it emails committers of any problems with their checkin (e.g. compile failures/warnings, test failures, code style warnings, spelling mistakes in javadocs). It uses jikes in pedantic mode when compiling, and uses checkstyle as the style checker. That approach seems to work pretty well. Cheers, Len. From len at reeltwo.com Mon Sep 8 19:11:42 2003 From: len at reeltwo.com (Len Trigg) Date: Mon Sep 8 19:10:18 2003 Subject: [Biojava-dev] Re: Feature Templates In-Reply-To: <200309082249.h88MnasX025237@portal.> Message-ID: > Anyone know of others? There are several commercial or otherwise > non-distributable packages e.g. jindent, jacobe, jpretty, trita. For pretty printing, I use jrefactory. For style checking, I use checkstyle (I've heard that PMD is also pretty good). Cheers, Len. From endo at genetix-h.com Mon Sep 8 20:31:56 2003 From: endo at genetix-h.com (Takaho A. Endo) Date: Mon Sep 8 20:30:34 2003 Subject: [Biojava-dev] Developing LD visualizer In-Reply-To: Message-ID: <04A7E3A4-E25D-11D7-B1D0-000A956A6950@genetix-h.com> Hi Mark, I did not know such sites and I felt it is very useful for me. 
I realize how to use ambiguous characters of nucleotide but it might be a little inefficient for calculation in my classes (my application deals them as not objects but 'char' variables for efficiency). Therefore I will modify my codes to utilize them, estimate the efficiency, and write javadocs more. After these processes I will post again to ask you wether my codes are acceptable or not. Thanks. --- Takaho A. Endo, Tokai University, Japan. On 2003.9.9, at 08:09 Asia/Tokyo, Schreiber, Mark wrote: > Hi - > > Probably the main areas you will want to look at are the Distribution > and Symbol API's. You will probably also want to look at the biojava > in Anger documentation pages http://www.biojava.org/docs/bj_in_anger/ > there is a link to a Japanese version as well. > > Mark > From len at reeltwo.com Mon Sep 8 20:42:32 2003 From: len at reeltwo.com (Len Trigg) Date: Mon Sep 8 20:41:06 2003 Subject: [Biojava-dev] RE: New RDBMS support: hsqldb In-Reply-To: References: Message-ID: Mark Schreiber wrote: > Either is probably OK, depends on the size of the jar I suspect. If > its a big hit to download then I suppose we can make it an optional > part of the ANT build. I've committed the unit tests and the biosql schema files I ported. I have not added the hsqldb jarfile - people wanting to run these tests need to download it (I am using the stable 1.7.1 release - there is a problem with the current bleeding edge version, so don't use that) and add it to their classpath before executing the tests. If you don't have the driver in your classpath, the tests are skipped, e.g.: [junit] Running org.biojava.bio.seq.db.HashSequenceDBTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 5.273 sec [junit] Running org.biojava.bio.seq.db.biosql.BioSQLSequenceDBTest [junit] No hsqldb driver found. [junit] Tests run: 0, Failures: 0, Errors: 0, Time elapsed: 0.008 sec I found and fixed a bug in BioSQLAllFeatures.java. 
The bug also exists in BioSQLTiledFeatures.java, but I don't understand what that code is doing -- if someone could take a look at it, it'd be appreciated. To trigger it, the easiest way is to run the unit tests after editing BioSQLSequence.getFeatures() to always use BioSQLTiledFeatures rather than only using that implementation for long sequences. Cheers, Len. From russell.smithies at xtra.co.nz Mon Sep 8 23:17:17 2003 From: russell.smithies at xtra.co.nz (Russell Smithies) Date: Mon Sep 8 23:15:06 2003 Subject: [Biojava-dev] Feature Templates References: <200309082249.h88Mnbsa025242@portal.> Message-ID: <000501c37680$e0d012d0$e63e56d2@lex> How about writing a Java template for "tidy" http://tidy.sourgeforge.net/ It's built in with most UNIX/Linux systems and may save a bit of work? It's designed for HTML, XHTML and XML but might be worth a go? There's also jtidy, the Java port of tidy that may be tweakable. http://lempinen.net/sami/jtidy/ Russell > Message: 7 > Date: 08 Sep 2003 11:31:05 +0100 > From: Keith James > Subject: Re: [Biojava-dev] Feature Templates > To: Matthew Pocock > Cc: biojava-dev@biojava.org > Message-ID: > Content-Type: text/plain; charset=us-ascii > > >>>>> "Matthew" == Matthew Pocock writes: > > [...] > > Matthew> The one thing that mucks things up is mixing in tabs with > Matthew> spaces. I know things like emacs like to put them in, but > Matthew> they cause the most formatting hastle of all. > > Only if you tell it to use tabs, though (mine uses spaces). > > Matthew> I tend to use 2 char indent, spaces after casts & commas, > Matthew> [] after type and before variable, one declaration per > Matthew> line, things that get broken across lines get lined up on > Matthew> +1 or +2 indents or alignmed with '(' - depending, and > Matthew> attempt to wrap at 80 char. 
In truth, I hit the 'format > Matthew> this code' button in my IDE when files look too uglee, > Matthew> and accept that (but then I've had a bit of a discussion > Matthew> with my IDE over what is and isn't acceptable). > > Matthew> If coding style is an issue, could we get all checkins > Matthew> pretified by cvs prior to commit? Not sure if CVS can do > Matthew> this kind of thing. > > I've been doing a survey of available Java formatters because I'm > setting up a new codebase here, from scratch. Useful features would be > > 1. Platform independent > 2. Standalone version available (& maybe an Ant task) > 3. IDE plugin versions available > 4. Extendable/fixable (preferably open source) > > None that I can find fit all of these criteria. Those that can be > re-distributed: > > Jalopy: 1, 2, 3, 4 (but a fairly buggy beta, no longer supported by > the developer - I wouldn't use it myself) > > astyle: 2, 4 (limited features, in C++, but the jedit plugin is Java > so it may go cross-platform without requiring a compiler) > > Anyone know of others? There are several commercial or otherwise > non-distributable packages e.g. jindent, jacobe, jpretty, trita. 
> > Keith > From autobuilder at derkholm.net Tue Sep 9 00:18:54 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Tue Sep 9 00:24:05 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063081134995.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030909 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/tests/org/biojava/bio/seq/db/AbstractSequenceDBTest.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From mark.schreiber at agresearch.co.nz Tue Sep 9 00:43:46 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Tue Sep 9 00:42:26 2003 Subject: [Biojava-dev] offtopic Message-ID: Hi - Apologies for the offtopic post but I'm running java version "1.4.1_02" Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_02-b06) Java HotSpot(TM) Client VM (build 1.4.1_02-b06, mixed mode) under windowsXP and whenever I run a program the VM spits out count = 0, total = 27 This is getting very annoying, does anyone know of a way to prevent it happening? - Mark ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. 
======================================================================= From thomas at derkholm.net Tue Sep 9 02:44:13 2003 From: thomas at derkholm.net (Thomas Down) Date: Tue Sep 9 02:49:22 2003 Subject: [Biojava-dev] offtopic In-Reply-To: References: Message-ID: <20030909064413.GA26064@firechild> Once upon a time, Schreiber, Mark wrote: > Hi - > > Apologies for the offtopic post but I'm running java version "1.4.1_02" Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_02-b06) Java HotSpot(TM) Client VM (build 1.4.1_02-b06, mixed mode) under windowsXP and whenever I run a program the VM spits out count = 0, total = 27 > > This is getting very annoying, does anyone know of a way to prevent it happening? I can't help directly with this (the corresponding Linux version works fine for me), but it's probably worth pointing out that 1.4.2 has been out for a while now, and includes a fair number of bug fixes -- it might be worth upgrading. Thomas. From autobuilder at derkholm.net Wed Sep 10 00:18:52 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Wed Sep 10 00:24:07 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063167534915.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030910 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/seq/db/GenbankSequenceDB.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From matthew_pocock at yahoo.co.uk Wed Sep 10 05:47:14 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 10 05:47:34 2003 Subject: [Biojava-dev] circular sequences Message-ID: <3F5EF322.8080105@yahoo.co.uk> 
Hi, Anybody got code to render circular sequences as circles? Apparently plasmids and bacteria and things don't have linear genomes :/ If nobody shouts in the next 24 hours, I will hit cvs with a preliminary API. Matthew From matthew_pocock at yahoo.co.uk Wed Sep 10 11:57:48 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 10 11:58:09 2003 Subject: [Biojava-dev] process tools test Message-ID: <3F5F49FC.9000401@yahoo.co.uk> Hi, who wrote the demo ProcessToolsTest? I've just modified it so that it compiles - bit rot against ProcessTools I guess. Matthe From autobuilder at derkholm.net Thu Sep 11 00:18:53 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Thu Sep 11 00:24:09 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063253935175.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030911 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/symbol/AbstractAlphabet.java * biojava-live/src/org/biojava/bio/symbol/CodonPrefFilter.java * biojava-live/src/org/biojava/bio/symbol/CodonPrefTools.java * biojava-live/src/org/biojava/utils/SmallMap.java * biojava-live/tests/org/biojava/bio/seq/db/AbstractSequenceDBTest.java * biojava-live/tests/org/biojava/utils/SmallMapTest.java * biojava-live/demos/process/ProcessToolsTest.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From len at reeltwo.com Thu Sep 11 01:11:52 2003 From: len at reeltwo.com (Len Trigg) Date: Thu Sep 11 01:10:30 2003 Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql Message-ID: Hi guys, I'm attempting to remove a sequence in a biosql database (both mysql 
and hsqldb - have yet to try oracle), and I get the following error:

Caused by: java.sql.SQLException: Syntax error or access violation,
message from server: "You have an error in your SQL syntax
near 'using location, seqfeature where location.seqfeature_id =
seqfeature.seqfeature' at line 1"
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:1651)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:889)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:956)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:1874)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1700)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1569)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._removeSequence(BioSQLSequenceDB.java:650)
        ... 5 more

When I look at the code, the relevant bit is:

PreparedStatement delete_locs;
if (dstyle == DBHelper.DELETE_POSTGRESQL) {
    delete_locs = conn.prepareStatement("delete from location " +
                                        " where location.seqfeature_id = seqfeature.seqfeature_id and " +
                                        " seqfeature.bioentry_id = ?");
} else {
    delete_locs = conn.prepareStatement("delete from location " +
                                        " using location, seqfeature " +
                                        " where location.seqfeature_id = seqfeature.seqfeature_id and " +
                                        " seqfeature.bioentry_id = ?");
}
delete_locs.setInt(1, bioentry_id);
delete_locs.executeUpdate();
delete_locs.close();

Now, if I comment out the " using location, seqfeature " line (and
other examples of the same), the sequence removal is super green. Is
the "using" clause part of standard SQL? The clause was obviously put
there for a reason, so could someone explain when it should be used?
Or, if it's not actually needed anymore, should I remove the whole
DeleteStyle special casing altogether?

Cheers,
Len.
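
On the question above of whether "using" is standard SQL: the multi-table "delete ... using" form is not part of the standard DELETE statement but a vendor extension, and the portable way to express the same deletion is a subquery. The sketch below puts the two statement shapes side by side. It assumes the target database accepts a subquery inside a delete (MySQL releases before 4.1 do not, so they would still need a case of their own). The DeleteSql class and its Style enum are hypothetical stand-ins and are not part of DBHelper; only the table and column names are taken from the code above.

```java
// Hypothetical helper for illustration only; not part of BioJava's DBHelper.
final class DeleteSql {

    // Assumed set of dialect choices; the real DBHelper constants may differ.
    enum Style { MYSQL4_USING, SUBQUERY }

    // Returns the SQL used to delete every location row that belongs to
    // a feature of one bioentry. The single "?" is the bioentry_id.
    static String deleteLocations(Style style) {
        switch (style) {
            case MYSQL4_USING:
                // MySQL 4.0-style multi-table delete (vendor extension).
                return "delete from location"
                     + " using location, seqfeature"
                     + " where location.seqfeature_id = seqfeature.seqfeature_id"
                     + " and seqfeature.bioentry_id = ?";
            default:
                // Subquery form: only the table being deleted from is named
                // in the outer statement, so no join syntax is needed.
                return "delete from location"
                     + " where seqfeature_id in"
                     + " (select seqfeature_id from seqfeature where bioentry_id = ?)";
        }
    }

    public static void main(String[] args) {
        for (Style s : Style.values()) {
            System.out.println(s + ": " + deleteLocations(s));
        }
    }
}
```

Either string can be passed to conn.prepareStatement() and bound with setInt(1, bioentry_id) in the same way as the original snippet.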
From thomas at derkholm.net Thu Sep 11 05:27:27 2003
From: thomas at derkholm.net (Thomas Down)
Date: Thu Sep 11 05:32:41 2003
Subject: [Biojava-dev] process tools test
In-Reply-To: <3F5F49FC.9000401@yahoo.co.uk>
References: <3F5F49FC.9000401@yahoo.co.uk>
Message-ID: <20030911092727.GA29831@firechild>

Once upon a time, Matthew Pocock wrote:
> Hi,
>
> who wrote the demo ProcessToolsTest? I've just modified it so that it
> compiles - bit rot against ProcessTools I guess.

Sorry, that was me forgetting a checkin (again...). I'll add the
compile-demos target to the autobuilder's script...

Thomas.

From kdj at sanger.ac.uk Thu Sep 11 05:38:26 2003
From: kdj at sanger.ac.uk (Keith James)
Date: Thu Sep 11 05:38:27 2003
Subject: [Biojava-dev] Thread safety question
Message-ID:

I was looking at lazy instantiation and thread safety issues. There
are a few well-publicised articles about the double checked locking
idiom in Java and how it doesn't quite work e.g.

http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

This rang a bell regarding AbstractChangeable which does this

protected ChangeSupport getChangeSupport(ChangeType ct) {
    if(changeSupport != null) {
        return changeSupport;
    }

    synchronized(this) {
        if(changeSupport == null) {
            changeSupport = generateChangeSupport();
        }
    }

    return changeSupport;
}

The example given in the reference is

// Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo {
    private Helper helper = null;
    public Helper getHelper() {
        if (helper == null)
            synchronized(this) {
                if (helper == null)
                    helper = new Helper();
            }
        return helper;
    }
    // other functions and members...
}

where the initial test is reversed wrt the biojava example. However,
it looks to me like there might be a (theoretical) danger here (of
ChangeSupport being initialised twice). Can someone convince me
otherwise?
cheers, Keith -- - Keith James Microarray Facility, Team 65 - - The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK - From thomas at derkholm.net Thu Sep 11 05:39:03 2003 From: thomas at derkholm.net (Thomas Down) Date: Thu Sep 11 05:44:17 2003 Subject: [Biojava-dev] Thread safety question In-Reply-To: References: Message-ID: <20030911093903.GB29831@firechild> Once upon a time, Keith James wrote: > > I was looking at lazy instantiation and thread safety issues. There > are a few well-publicised articles about the double checked locking > idiom in Java and how it doesn't quite work e.g. > > http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html > > This rang a bell regarding AbstractChangeable which does this > > protected ChangeSupport getChangeSupport(ChangeType ct) { > if(changeSupport != null) { > return changeSupport; > } > > synchronized(this) { > if(changeSupport == null) { > changeSupport = generateChangeSupport(); > } > } > > return changeSupport; > } > > The example given in the reference is > > // Broken multithreaded version > // "Double-Checked Locking" idiom > class Foo { > private Helper helper = null; > public Helper getHelper() { > if (helper == null) > synchronized(this) { > if (helper == null) > helper = new Helper(); > } > return helper; > } > // other functions and members... > } > > where the initial test is reversed wrt the biojava example. However, > it looks to me like there might be a (theoretical) danger here (of > ChangeSupport being initialised twice). Can someone convince me > otherwise? Yes, I think the problem could occur here. That pattern, or similar, gets used quite a bit in BioJava. Many of the cases (for example, in the Ensembl wrappers) are harmless because the worst that can happen is that data gets loaded twice. But for ChangeSupport there is a real problem, if a listener gets added to the first ChangeSupport, which is then overwritten by a second. 
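
The fully synchronized alternative to the getChangeSupport() pattern quoted above can be sketched as follows. This is a minimal illustration rather than the BioJava code: LazyChangeable and Support are hypothetical stand-ins for AbstractChangeable and ChangeSupport, and listeners are plain strings to keep the example self-contained. Because every read and write of the field happens while holding the same lock, there is no unsynchronized fast path that could observe a stale or half-initialised value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ChangeSupport; it only records listener names.
final class Support {
    private final List<String> listeners = new ArrayList<String>();

    synchronized void addListener(String name) {
        listeners.add(name);
    }

    synchronized int listenerCount() {
        return listeners.size();
    }
}

// Hypothetical stand-in for AbstractChangeable.
class LazyChangeable {
    // Guarded by "this": never read or written outside a synchronized method.
    private Support support;

    // The whole accessor is synchronized, so the null check and the
    // assignment form one atomic step and the field is created at most once.
    public synchronized Support getSupport() {
        if (support == null) {
            support = new Support();
        }
        return support;
    }

    public static void main(String[] args) {
        LazyChangeable c = new LazyChangeable();
        c.getSupport().addListener("first");
        // The same instance comes back, so the listener is still registered.
        System.out.println(c.getSupport().listenerCount());
    }
}
```

The cost is one lock acquisition per call. Whether that matters on any of BioJava's hot paths is an open question that would need measuring.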
I seem to remember that it's possible to fix this by declaring the relevant field volatile -- is this true? [I think there are also some memory model changes proposed which are supposed to fix this, but I don't know if/when they'll actually happen] Thomas. From matthew_pocock at yahoo.co.uk Thu Sep 11 05:59:58 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Thu Sep 11 06:00:43 2003 Subject: [Biojava-dev] Thread safety question In-Reply-To: References: Message-ID: <3F60479E.8060207@yahoo.co.uk> Hi Keith, Yes - I think you're right. So - we need to discard the first check for == null? M Keith James wrote: >I was looking at lazy instantiation and thread safety issues. There >are a few well-publicised articles about the double checked locking >idiom in Java and how it doesn't quite work e.g. > >http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html > >This rang a bell regarding AbstractChangeable which does this > >protected ChangeSupport getChangeSupport(ChangeType ct) { > if(changeSupport != null) { > return changeSupport; > } > > synchronized(this) { > if(changeSupport == null) { > changeSupport = generateChangeSupport(); > } > } > > return changeSupport; > } > >The example given in the reference is > >// Broken multithreaded version >// "Double-Checked Locking" idiom >class Foo { > private Helper helper = null; > public Helper getHelper() { > if (helper == null) > synchronized(this) { > if (helper == null) > helper = new Helper(); > } > return helper; > } > // other functions and members... > } > >where the initial test is reversed wrt the biojava example. However, >it looks to me like there might be a (theoretical) danger here (of >ChangeSupport being initialised twice). Can someone convince me >otherwise? 
>
>cheers,
>
>Keith
>
>
>

From kdj at sanger.ac.uk Thu Sep 11 06:41:14 2003
From: kdj at sanger.ac.uk (Keith James)
Date: Thu Sep 11 06:41:16 2003
Subject: [Biojava-dev] Thread safety question
In-Reply-To: <20030911093903.GB29831@firechild>
References: <20030911093903.GB29831@firechild>
Message-ID:

>>>>> "Thomas" == Thomas Down writes:

[...]

Thomas> I seem to remember that it's possible to fix this by
Thomas> declaring the relevant field volatile -- is this true?

Yes, I saw this mentioned. Apparently it will only work under certain
circumstances:

http://www.javaworld.com/javaworld/jw-02-2001/jw-0209-double-p2.html

It appears that the only cast-iron approach is to synchronize the
operation.

cheers,

Keith

--
- Keith James Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From matthew_pocock at yahoo.co.uk Thu Sep 11 06:45:46 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Sep 11 06:46:16 2003
Subject: [Biojava-dev] gui speed
Message-ID: <3F60525A.1000200@yahoo.co.uk>

Hi,

I just checked in some fixes for the gui code that drastically improve
rendering speed. I think this should be ported to the 1.3 release. The
altered classes are all in org.biojava.bio.gui.sequence. I've added
GUITools, which has (at the moment) a single method for computing the
visible indices, using the clip. My app goes from being sluggish to
being smooth. I hope it generally helps.

We need to document properly how the getRange() property should be
used, and how that interacts with rendering and layout.

Matthew

From simon.foote at nrc-cnrc.gc.ca Thu Sep 11 08:39:29 2003
From: simon.foote at nrc-cnrc.gc.ca (Simon Foote)
Date: Thu Sep 11 08:40:13 2003
Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql
In-Reply-To:
References:
Message-ID: <3F606D01.8020401@nrc-cnrc.gc.ca>

Hi Len,

The "using" syntax is valid as it pertains to cascading deletes.
The deletes function correctly with MySQL version 4.0.14 using InnoDB
tables in my test cases. The code needs the following added to check
the MySQL version.

PreparedStatement delete_locs;
if (dstyle == DBHelper.DELETE_POSTGRESQL) {
    delete_locs = conn.prepareStatement("delete from location " +
                                        " where location.seqfeature_id = seqfeature.seqfeature_id and " +
                                        " seqfeature.bioentry_id = ?");
} else if (dstyle == DBHelper.DELETE_MYSQL4) {
    delete_locs = conn.prepareStatement("delete from location " +
                                        " using location, seqfeature " +
                                        " where location.seqfeature_id = seqfeature.seqfeature_id and " +
                                        " seqfeature.bioentry_id = ?");
} else {
    delete_locs = conn.prepareStatement("delete from location " +
                                        " where location.seqfeature_id = seqfeature.seqfeature_id and " +
                                        " seqfeature.bioentry_id = ?");
}

Cheers,
Simon

--
Bioinformatics Programmer
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561
[F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

Len Trigg wrote:

>Hi guys,
>
>I'm attempting to remove a sequence in a biosql database (both mysql
>and hsqldb - have yet to try oracle), and I get the following error:
>
>Caused by: java.sql.SQLException: Syntax error or access violation,
>message from server: "You have an error in your SQL syntax
>near 'using location, seqfeature where location.seqfeature_id =
>seqfeature.seqfeature' at line 1"
>        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:1651)
>        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:889)
>        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:956)
>        at com.mysql.jdbc.Connection.execSQL(Connection.java:1874)
>        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1700)
>        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1569)
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._removeSequence(BioSQLSequenceDB.java:650)
>        ...
5 more > >When I look at the code, the relevant bit is: > >PreparedStatement delete_locs; >if (dstyle == DBHelper.DELETE_POSTGRESQL) { > delete_locs = conn.prepareStatement("delete from location " + > " where location.seqfeature_id = >seqfeature.seqfeature_id and " + > " seqfeature.bioentry_id = >?"); >} else { > delete_locs = conn.prepareStatement("delete from location " + > " using location, seqfeature " + > " where location.seqfeature_id = >seqfeature.seqfeature_id and " + > " seqfeature.bioentry_id = >?"); >} >delete_locs.setInt(1, bioentry_id); >delete_locs.executeUpdate(); >delete_locs.close(); > > >Now, if I comment out the " using location, seqfeature " line (and >other examples of the same), the sequence removal is super green. Is >the "using" clause part of standard SQL? The clause was obviously put >there for a reason, so could someone explain when it should be used? >Or, if it's not actually needed anymore, should I remove the whole >DeleteStyle special casing altogether? > > >Cheers, >Len. >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org >http://biojava.org/mailman/listinfo/biojava-dev > > -- Bioinformatics Specialist Institute for Biological Sciences National Research Council of Canada [T] 613-990-0561 [F] 613-952-9092 simon.foote@nrc-cnrc.gc.ca From jko1 at cdc.gov Thu Sep 11 14:20:26 2003 From: jko1 at cdc.gov (Osborne, John) Date: Thu Sep 11 14:22:32 2003 Subject: [Biojava-dev] Near Matches Message-ID: <73A63BD78FEAD5118678006008CC2DD706D534BD@mcdc-atl-8.ncid.cdc.gov> Hi, I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases. The mismatches can occur in ANY position in a sequence. Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this? 
I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job. Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n... -John
From len at reeltwo.com Thu Sep 11 15:56:58 2003 From: len at reeltwo.com (Len Trigg) Date: Thu Sep 11 16:01:12 2003 Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql In-Reply-To: <3F606D01.8020401@nrc-cnrc.gc.ca> References: <3F606D01.8020401@nrc-cnrc.gc.ca> Message-ID: <3F60D38A.8050700@reeltwo.com> Simon Foote wrote: > The "using" syntax is valid as it pertains to cascading deletes. The > deletes function correctly with MySQL version 4.0.14 using InnoDb tables > in my test cases. Thanks for the clarification. I'll also attempt to make getDBHelperForURL smarter and detect the specific mysql version then (I've been using 3.23.56). From a quick browse through the javadocs for DatabaseMetaData, it looks like the required information is there. Cheers, Len. From mark.schreiber at agresearch.co.nz Thu Sep 11 17:58:40 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Thu Sep 11 17:57:16 2003 Subject: [Biojava-dev] Near Matches Message-ID: Hi - I think Matthew checked in something that does what you want a few days back. Have a look at the list archives, or cvs records for that last few weeks. - Mark -----Original Message----- From: Osborne, John [mailto:jko1@cdc.gov] Sent: Friday, 12 September 2003 6:20 a.m.
To: biojava-dev@biojava.org Subject: [Biojava-dev] Near Matches Hi, I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases. The mismatches can occur in ANY position in a sequence. Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this? I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job. Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n... -John _______________________________________________ biojava-dev mailing list biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From len at reeltwo.com Thu Sep 11 21:35:37 2003 From: len at reeltwo.com (Len Trigg) Date: Thu Sep 11 21:34:14 2003 Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql In-Reply-To: <3F606D01.8020401@nrc-cnrc.gc.ca> References: <3F606D01.8020401@nrc-cnrc.gc.ca> Message-ID: Simon Foote wrote: > The "using" syntax is valid as it pertains to cascading deletes. > The deletes function correctly with MySQL version 4.0.14 using InnoDb > tables in my test cases. 
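The MySQL version check discussed in this thread could look roughly like the sketch below. It is only an illustration: `MySqlVersionCheck` and `supportsMultiTableDelete` are made-up names, not part of BioJava's DBHelper code, and the major-version-4 cutoff is an assumption based on the versions reported here (4.0.14 works, 3.23.56 is the older server in use). `DatabaseMetaData.getDatabaseProductName()` and `getDatabaseProductVersion()` are standard JDBC calls.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not BioJava API.
final class MySqlVersionCheck {
    // First run of digits in a version string such as "4.0.14-standard".
    private static final Pattern FIRST_NUMBER = Pattern.compile("(\\d{1,4})");

    // Assumption: servers with major version >= 4 accept the
    // multi-table "delete ... using ..." form; older ones do not.
    static boolean supportsMultiTableDelete(String productVersion) {
        if (productVersion == null) {
            return false;
        }
        Matcher m = FIRST_NUMBER.matcher(productVersion);
        if (!m.find()) {
            return false; // unparseable version: fall back to the generic path
        }
        return Integer.parseInt(m.group(1)) >= 4;
    }

    // Reads product name and version from the JDBC metadata of a live connection.
    static boolean supportsMultiTableDelete(Connection conn) throws SQLException {
        DatabaseMetaData md = conn.getMetaData();
        String product = md.getDatabaseProductName();
        return product != null
            && product.toLowerCase(Locale.ROOT).contains("mysql")
            && supportsMultiTableDelete(md.getDatabaseProductVersion());
    }
}
```

Anything that does not look like a recent MySQL server falls through to `false`, so the caller would pick the slower one-row-at-a-time delete by default.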
I put in the version checking, and both mysql and hsqldb fail on the deletion statement with (hsqldb): Caused by: java.sql.SQLException: Column not found: SEQFEATURE_ID in statement [delete from location where location.seqfeature_id = seqfeature.seqfeature_id and seqfeature.bioentry_id = 19] or (mysql 3.23.56): Caused by: java.sql.SQLException: General error, message from server: "Unknown table 'seqfeature' in where clause" Does this mean that the DELETE_GENERIC implementation is going to have to lookup the relevant seqfeature_id's and remove each location separately? My SQL knowledge is pretty basic, so I'm learning as I go along... (A quick two line explanation of what the "on delete cascade" stuff means also wouldn't go astray :-)). Cheers, Len. From autobuilder at derkholm.net Fri Sep 12 00:18:55 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Fri Sep 12 00:24:19 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063340339260.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030912 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/gui/sequence/CircularFeatureFilteringRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularFeatureRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularFeaturesRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularMLR.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularPaddedRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularRendererContext.java * biojava-live/src/org/biojava/bio/gui/sequence/CircularRendererPanel.java * biojava-live/src/org/biojava/bio/gui/sequence/FeatureBlockSequenceRenderer.java * 
biojava-live/src/org/biojava/bio/gui/sequence/FeatureRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/GUITools.java * biojava-live/src/org/biojava/bio/gui/sequence/HeadlessRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/PairwiseSequencePanel.java * biojava-live/src/org/biojava/bio/gui/sequence/RulerRenderer.java * biojava-live/src/org/biojava/bio/gui/sequence/SequencePanel.java * biojava-live/src/org/biojava/bio/gui/sequence/SequencePoster.java * biojava-live/src/org/biojava/bio/gui/sequence/SequenceRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SubCircularRendererContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SubSequenceRenderContext.java * biojava-live/src/org/biojava/bio/gui/sequence/SymbolSequenceRenderer.java * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java * biojava-live/src/org/biojava/bio/seq/db/biosql/DBHelper.java * biojava-live/src/org/biojava/bio/seq/db/biosql/MySQLDBHelper.java * biojava-live/tests/org/biojava/bio/seq/db/AbstractSequenceDBTest.java * biojava-live/tests/org/biojava/bio/seq/db/biosql/BioSQLSequenceDBTest.java * biojava-live/demos/files/AF438419.embl * biojava-live/demos/seqviewer/CircularEmblViewer.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From matthew_pocock at yahoo.co.uk Fri Sep 12 05:07:17 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Fri Sep 12 05:10:15 2003 Subject: [Biojava-dev] Near Matches In-Reply-To: References: Message-ID: <3F618CC5.1020508@yahoo.co.uk> Hi John, I committed code that does a slightly different job - it accepts or rejects regions based upon sequence content. However, it will be easy to write an approximate matcher that allows at most n mismatches.
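A minimal sketch of such a matcher, assuming plain Java strings stand in for BioJava SymbolLists. `MismatchSketch`, `countMismatches`, `nearMatch` and `findApproximate` are illustrative names only, not the classes that were checked in. The counter gives up as soon as the allowed number of mismatches is exceeded, which covers the equal-length 25-mer case (23 of 25 bases means `maxMismatches = 2`), and the scan over a longer text is the straightforward O(mn) approach.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Strings instead of SymbolLists, not BioJava API.
final class MismatchSketch {

    // Counts mismatches between pattern and text[offset .. offset + pattern.length()),
    // stopping early once the count exceeds maxMismatches.
    static int countMismatches(String text, int offset, String pattern, int maxMismatches) {
        int mismatches = 0;
        for (int i = 0; i < pattern.length(); i++) {
            if (text.charAt(offset + i) != pattern.charAt(i) && ++mismatches > maxMismatches) {
                break; // already over the limit, no need to look further
            }
        }
        return mismatches;
    }

    // Equal-length comparison, e.g. two 25-mers.
    static boolean nearMatch(String a, String b, int maxMismatches) {
        return a.length() == b.length()
            && countMismatches(a, 0, b, maxMismatches) <= maxMismatches;
    }

    // O(mn) scan: 0-based start offsets where pattern occurs in text
    // with at most maxMismatches substitutions (no insertions or deletions).
    static List<Integer> findApproximate(String text, String pattern, int maxMismatches) {
        List<Integer> hits = new ArrayList<Integer>();
        int m = pattern.length();
        for (int start = 0; start + m <= text.length(); start++) {
            if (countMismatches(text, start, pattern, maxMismatches) <= maxMismatches) {
                hits.add(Integer.valueOf(start));
            }
        }
        return hits;
    }
}
```

For a list of N-mers compared all against all, `nearMatch` would still be called O(n^2) times; the early exit only reduces the cost of each comparison.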
The algorithms that suggest themselves to me are all O(mn) - not sure if we can do better without a math genius, or a good string algorithms book. Matthew Schreiber, Mark wrote: >Hi - > >I think Matthew checked in something that does what you want a few days back. Have a look at the list archives, or cvs records for that last few weeks. > >- Mark > > >-----Original Message----- >From: Osborne, John [mailto:jko1@cdc.gov] >Sent: Friday, 12 September 2003 6:20 a.m. >To: biojava-dev@biojava.org >Subject: [Biojava-dev] Near Matches > > >Hi, > >I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases. The mismatches can occur in ANY position in a sequence. Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this? I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job. > >Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n... > > -John >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev >======================================================================= >Attention: The information contained in this message and/or attachments >from AgResearch Limited is intended only for the persons or entities >to which it is addressed and may contain confidential and/or privileged >material. Any review, retransmission, dissemination or other use of, or >taking of any action in reliance upon, this information by persons or >entities other than the intended recipients is prohibited by AgResearch >Limited. If you have received this message in error, please notify the >sender immediately.
>=======================================================================
>
>_______________________________________________
>biojava-dev mailing list
>biojava-dev@biojava.org
>http://biojava.org/mailman/listinfo/biojava-dev
>
>
>

From foote at nrcbsa.bio.nrc.ca Fri Sep 12 07:53:10 2003
From: foote at nrcbsa.bio.nrc.ca (S. Foote)
Date: Fri Sep 12 07:51:39 2003
Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql
In-Reply-To: from Len Trigg at "Sep 11, 2003 09:35:37 pm"
Message-ID: <200309121153.HAA09021@nrcbsa.bio.nrc.ca>

Hi Len,

The "on delete cascade" option lets you delete a row from a table together with any rows in other tables that reference that row.

Hence,
"delete from location using location, seqfeature where location.seqfeature_id=seqfeature.seqfeature_id and seqfeature.bioentry_id=?"
means: delete rows from location by finding the relevant rows using the seqfeature table id, which is referenced in the location table, for the given bioentry_id.

Thus, it automatically determines the list of seqfeature ids for the given bioentry_id that need deleting and in turn finds all the locations that relate to those features.

So, with mysql < 4.0.0, you will need to find each seqfeature_id and remove it individually. Luckily, this had popped up before and someone had submitted a hack that does just this. You could probably find it in the list archives somewhere, but I just found a copy. I'll put it below.

Cheers,
Simon

    private void _removeSequence(String id)
        throws BioException, IllegalIDException, ChangeVetoException
    {
        Sequence seq = (Sequence) sequencesByName.get(id);
        if (seq != null) {
            seq = null; // Don't want to be holding the reference ourselves!
            try {
                Thread.sleep(100L);
                System.gc();
            } catch (Exception ex) {
                ex.printStackTrace();
            }
            seq = (Sequence) sequencesByName.get(id);
            if (seq != null) {
                throw new BioException("There are still references to sequence with ID " + id + " from this database.");
            }
        }

        Connection conn = null;
        try {
            conn = pool.takeConnection();
            conn.setAutoCommit(false);
            int bioentry_id = -1;
            int biosequence_id = -1;
            ArrayList featureIdList = null;

            // Look up the bioentry and biosequence ids for this accession
            PreparedStatement get_sequence = conn.prepareStatement(
                "select bioentry.bioentry_id, biosequence.biosequence_id " +
                "from bioentry, biosequence " +
                "where bioentry.accession = ? and " +
                " biosequence.bioentry_id = bioentry.bioentry_id");
            get_sequence.setString(1, id);
            ResultSet rs = get_sequence.executeQuery();
            boolean exists;
            if ((exists = rs.next())) {
                bioentry_id = rs.getInt(1);
                biosequence_id = rs.getInt(2);
            }
            System.out.println("Simon: " + bioentry_id);
            get_sequence.close();

            if ( bioentry_id != -1) {
                // Now get all the seqfeature_ids
                PreparedStatement getFeatureId = conn.prepareStatement("select seqfeature_id from seqfeature where bioentry_id = ?");
                getFeatureId.setInt(1, bioentry_id);
                ResultSet rs1 = getFeatureId.executeQuery();
                while ( rs1.next() ) {
                    if ( featureIdList == null )
                        featureIdList = new ArrayList();
                    Integer anInt = new Integer(rs1.getInt(1));
                    featureIdList.add(anInt);
                }
                getFeatureId.close();

                PreparedStatement delete_taxa = conn.prepareStatement("delete from bioentry_taxa where bioentry_id = ?");
                delete_taxa.setInt(1, bioentry_id);
                delete_taxa.executeUpdate();
                delete_taxa.close();

                PreparedStatement delete_reference = conn.prepareStatement("delete from bioentry_reference where bioentry_id = ?");
                delete_reference.setInt(1, bioentry_id);
                delete_reference.executeUpdate();
                delete_reference.close();

                // PreparedStatement delete_comment = conn.prepareStatement("delete from comment where bioentry_id = ?");
                // REMOVED -- Oracle does not like the name "Comment" for a
                // table, so re-name this table to Biocomment
                PreparedStatement delete_comment = conn.prepareStatement("delete from comment where bioentry_id = ?");
                delete_comment.setInt(1, bioentry_id);
                delete_comment.executeUpdate();
                delete_comment.close();

                PreparedStatement delete_qv = conn.prepareStatement("delete from bioentry_qualifier_value where bioentry_id = ?");
                delete_qv.setInt(1, bioentry_id);
                delete_qv.executeUpdate();
                delete_qv.close();

                // Remove the rows hanging off each feature, one feature at a time
                if ( featureIdList != null ) {
                    Iterator iter = featureIdList.iterator();
                    while ( iter.hasNext() ) {
                        Integer anInt = (Integer) iter.next();
                        int sf_id = anInt.intValue();

                        PreparedStatement delete_locs = conn.prepareStatement("delete from seqfeature_location where seqfeature_id = ? ");
                        delete_locs.setInt(1, sf_id);
                        delete_locs.executeUpdate();
                        delete_locs.close();

                        PreparedStatement delete_fqv = conn.prepareStatement("delete from seqfeature_qualifier_value where seqfeature_id = ?");
                        delete_fqv.setInt(1, sf_id);
                        delete_fqv.executeUpdate();
                        delete_fqv.close();

                        PreparedStatement delete_rel = conn.prepareStatement("delete from seqfeature_relationship where parent_seqfeature_id = ? ");
                        delete_rel.setInt(1, sf_id);
                        delete_rel.executeUpdate();
                        delete_rel.close();
                    }
                }

                PreparedStatement delete_features = conn.prepareStatement("delete from seqfeature " +
                                                                          " where bioentry_id = ?");
                delete_features.setInt(1, bioentry_id);
                delete_features.executeUpdate();
                delete_features.close();

                if ( biosequence_id != -1 ) {
                    PreparedStatement delete_biosequence = conn.prepareStatement("delete from biosequence where biosequence_id = ?");
                    delete_biosequence.setInt(1, biosequence_id);
                    delete_biosequence.executeUpdate();
                    delete_biosequence.close();
                }

                PreparedStatement delete_entry = conn.prepareStatement("delete from bioentry where bioentry_id = ?");
                delete_entry.setInt(1, bioentry_id);
                delete_entry.executeUpdate();
                delete_entry.close();
            }
            // } CHECK THIS -- WHY COMMENTED OUT ???
            // get_sequence.close();

            conn.commit();
            pool.putConnection(conn);

            if (!exists) {
                throw new IllegalIDException("Sequence " + id + " didn't exist");
            }
        } catch (SQLException ex) {
            System.err.println( ex.toString());
            boolean rolledback = false;
            if (conn != null) {
                try {
                    conn.rollback();
                    rolledback = true;
                } catch (SQLException ex2) {}
            }
            throw new BioException(ex, "Error removing from BioSQL tables" + (rolledback ? " (rolled back successfully)" : ""));
        }
    }

According to Len Trigg:
> Simon Foote wrote:
> > The "using" syntax is valid as it pertains to cascading deletes.
> > The deletes function correctly with MySQL version 4.0.14 using InnoDb
> > tables in my test cases.
>
> I put in the version checking, and both mysql and hsqldb fail on the
> deletion statement with (hsqldb):
>
> Caused by: java.sql.SQLException: Column not found: SEQFEATURE_ID in
> statement [delete from location where location.seqfeature_id =
> seqfeature.seqfeature_id and seqfeature.bioentry_id = 19]
>
> or (mysql 3.23.56):
>
> Caused by: java.sql.SQLException: General error, message from server:
> "Unknown table 'seqfeature' in where clause"
>
>
> Does this mean that the DELETE_GENERIC implementation is going to have
> to lookup the relevant seqfeature_id's and remove each location
> separately? My SQL knowledge is pretty basic, so I'm learning as I go
> along... (A quick two line explanation of what the "on delete cascade"
> stuff means also wouldn't go astray :-)).
>
>
> Cheers,
> Len.
>

From len at reeltwo.com Fri Sep 12 15:49:03 2003
From: len at reeltwo.com (Len Trigg)
Date: Fri Sep 12 15:47:57 2003
Subject: [Biojava-dev] DeleteStyle in BioSQL using mysql
In-Reply-To: <200309121153.HAA09021@nrcbsa.bio.nrc.ca>
References: <200309121153.HAA09021@nrcbsa.bio.nrc.ca>
Message-ID:

S. Foote wrote:
> So, with mysql < 4.0.0, you will need to find each seqfeature_id and remove it individually.
> Luckily, this had popped up before and someone had submitted a hack that does just this.
You could > probably find it in the list archives somewhere, but I just found a copy. I'll put it below. Out of interest, when was that submitted (before the days of DeleteStyle by the looks)? Those differences look pretty similar to the changes I made and checked in yesterday (I even included changes to support an alternatively named comment table for Oracle). Cheers, Len. From mark.schreiber at agresearch.co.nz Sat Sep 13 00:48:50 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sat Sep 13 00:47:23 2003 Subject: [Biojava-dev] Near Matches Message-ID: Hi - I'm pretty sure you could hack the KnuthMorrisPrattSearch class to allow it to tolerate a certain number of mismatches. KMP searches are very fast, although they will slow considerably if you allow more than a few mismatches. - Mark -----Original Message----- From: Matthew Pocock [mailto:matthew_pocock@yahoo.co.uk] Sent: Fri 12/09/2003 9:07 p.m. To: Schreiber, Mark Cc: biojava-dev@biojava.org Subject: Re: [Biojava-dev] Near Matches Hi John, I commited code that does a slightly different job - accpets or rejects regions based upon sequence content. However, it will be easy to write an aproximate matcher that does at most n miss-matches. The algorithms that sudgest themselves to me are all O(mn) - not sure if we can do better without a math genius, or a good string algorithms book. Matthew Schreiber, Mark wrote: >Hi - > >I think Matthew checked in something that does what you want a few days back. Have a look at the list archives, or cvs records for that last few weeks. > >- Mark > > >-----Original Message----- >From: Osborne, John [mailto:jko1@cdc.gov] >Sent: Friday, 12 September 2003 6:20 a.m. >To: biojava-dev@biojava.org >Subject: [Biojava-dev] Near Matches > > >Hi, > >I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases. The mismatches can occur in ANY position in a sequence.
Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this? I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job. > >Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n... > > -John >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev >======================================================================= >Attention: The information contained in this message and/or attachments >from AgResearch Limited is intended only for the persons or entities >to which it is addressed and may contain confidential and/or privileged >material. Any review, retransmission, dissemination or other use of, or >taking of any action in reliance upon, this information by persons or >entities other than the intended recipients is prohibited by AgResearch >Limited. If you have received this message in error, please notify the >sender immediately. >======================================================================= > >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org >http://biojava.org/mailman/listinfo/biojava-dev > > > _______________________________________________ biojava-dev mailing list biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. 
Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From len at reeltwo.com Sat Sep 13 21:30:36 2003 From: len at reeltwo.com (Len Trigg) Date: Sat Sep 13 21:30:37 2003 Subject: [Biojava-dev] biojava / Security In-Reply-To: <200308201549.h7KFnqvT023123@localhost.localdomain> References: <200308201549.h7KFnqvT023123@localhost.localdomain> Message-ID: <3F43D373.7020803@reeltwo.com> > Subject: Re: [Biojava-dev] biojava / Security > From: Thomas Down > Date: Fri, 15 Aug 2003 09:58:37 +0100 > To: Matthew Pocock > > > On Fri, Aug 15, 2003 at 09:33:31AM +0100, Matthew Pocock wrote: > >>Last time I checked, CVS notification sent messages >>/per file/ rather than per commit. That's painfull. >>Perhaps a daily single cvs mail would be better? I just saw this updated on freshmeat a few days ago: http://www.nongnu.org/cvsreport/ It can produce some nice summary reports that you might want to use. Cheers, Len. From autobuilder at derkholm.net Sun Sep 14 00:18:59 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Sun Sep 14 00:24:21 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063513139988.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030914 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. 
-- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From jko1 at cdc.gov Sun Sep 14 15:59:20 2003 From: jko1 at cdc.gov (Osborne, John) Date: Sun Sep 14 16:01:37 2003 Subject: [Biojava-dev] Near Matches Message-ID: <73A63BD78FEAD5118678006008CC2DD706D534C0@mcdc-atl-8.ncid.cdc.gov> Hi Mark, I actually had to look up Knuth-Morris-Pratt searches (thanks for the bioinformatics lesson!) but I don't think it is going to work in my case because I am solving an easier problem(!) - the query sequence and target sequence are ALWAYS the same size. So I am actually just performing the last step of the KMP pattern matching, which is checking each base pair for an exact match. I just bail out early once the mismatch threshold is exceeded; I never do ANY pattern sliding, so something like a modified Boyer-Moore search isn't going to help either. :( One thing I thought of is to check for the longest exact substring within the 25mer, since if it is small (say less than 8 for 2 exact matches) then I know this won't cause hybridization problems. I don't think this is going to be much faster though, and I'm almost finished with my stupid implementation (which takes about 20 minutes to run) anyway. -John -----Original Message----- From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz] Sent: Saturday, September 13, 2003 12:49 AM To: Matthew Pocock Cc: biojava-dev@biojava.org Subject: RE: [Biojava-dev] Near Matches Hi - I'm pretty sure you could hack the KnuthMorrisPrattSearch class to allow it to tolerate a certain number of missmatches. KMP searches are very fast although they will slow considerabley if you allow more than a few missmatches. - Mark -----Original Message----- From: Matthew Pocock [mailto:matthew_pocock@yahoo.co.uk] Sent: Fri 12/09/2003 9:07 p.m.
To: Schreiber, Mark Cc: biojava-dev@biojava.org Subject: Re: [Biojava-dev] Near Matches Hi John, I commited code that does a slightly different job - accpets or rejects regions based upon sequence content. However, it will be easy to write an aproximate matcher that does at most n miss-matches. The algorithms that sudgest themselves to me are all O(mn) - not sure if we can do better without a math genius, or a good string algorithms book. Matthew Schreiber, Mark wrote: >Hi - > >I think Matthew checked in something that does what you want a few days back. Have a look at the list archives, or cvs records for that last few weeks. > >- Mark > > >-----Original Message----- >From: Osborne, John [mailto:jko1@cdc.gov] >Sent: Friday, 12 September 2003 6:20 a.m. >To: biojava-dev@biojava.org >Subject: [Biojava-dev] Near Matches > > >Hi, > >I am looking for a way in Biojava to iterate quickly through a list of DNA N-mers for sequences that are almost an exact match, like 23 of 25 bases. The mismatches can occur in ANY position in a sequence. Other than iterating through a SymbolList and keeping track of the number of mismatches, is there a better (read faster) way to do this? I was thinking maybe the SuffixTree class, but since sequence order is unimportant it doesn't see like the right tool for the job. > >Right now it is going to be a little bit ugly, since I am putting this into a O(n^2) function with a big n... > > -John >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev >======================================================================= >Attention: The information contained in this message and/or attachments >from AgResearch Limited is intended only for the persons or entities >to which it is addressed and may contain confidential and/or privileged >material. 
Any review, retransmission, dissemination or other use of, or >taking of any action in reliance upon, this information by persons or >entities other than the intended recipients is prohibited by AgResearch >Limited. If you have received this message in error, please notify the >sender immediately. >======================================================================= > >_______________________________________________ >biojava-dev mailing list >biojava-dev@biojava.org >http://biojava.org/mailman/listinfo/biojava-dev > > > _______________________________________________ biojava-dev mailing list biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= _______________________________________________ biojava-dev mailing list biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev From autobuilder at derkholm.net Mon Sep 15 00:18:53 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Mon Sep 15 00:24:24 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063599538264.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030915 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. 
-- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From autobuilder at derkholm.net Tue Sep 16 00:18:59 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Tue Sep 16 00:24:26 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063685940694.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030916 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/gui/sequence/CircularRendererPanel.java * biojava-live/src/org/biojava/bio/search/BioMatcher.java * biojava-live/src/org/biojava/bio/search/BioPattern.java * biojava-live/src/org/biojava/bio/search/MaxMissmatchMatcher.java * biojava-live/src/org/biojava/bio/search/MaxMissmatchPattern.java * biojava-live/src/org/biojava/bio/search/SeqContentMatcher.java * biojava-live/src/org/biojava/bio/search/SeqContentPattern.java * biojava-live/tests/org/biojava/bio/search/MaxMissmatchPatternTest.java * biojava-live/tests/org/biojava/bio/search/SeqContentPatternTest.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From autobuilder at derkholm.net Wed Sep 17 00:18:59 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Wed Sep 17 00:24:32 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063772345455.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030917 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. 
-- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net
From autobuilder at derkholm.net Thu Sep 18 00:18:56 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Thu Sep 18 00:24:33 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063858742402.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030918 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From mark.schreiber at agresearch.co.nz Thu Sep 18 13:10:02 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Thu Sep 18 13:09:12 2003 Subject: [Biojava-dev] RE: [Biojava-l] position weight matrix Message-ID: Not wanting to argue but i'd love to hear why :) - Mark -----Original Message----- From: Matthew Pocock [mailto:matthew_pocock@yahoo.co.uk] Sent: Thu 18/09/2003 9:14 p.m. To: Schreiber, Mark Cc: Brian Cox; biojava-l@biojava.org Subject: Re: [Biojava-l] position weight matrix Schreiber, Mark wrote: >Brian, > >I think we should change WeightMatrix so that N only scores a quater match of A where as R would score a half match. > >Do others feel this is sensible? > >- Mark > > We should be taking the odds score between the weight matrix matching at that pos and the null model - that way, ambiguity symbols devide out. I'm tied up today (writing lectures - damn those students), but it should be easy to modify WeightMatrixAnnotator to accept a ScoreType instance.
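One way to read the odds-score suggestion above, as a rough sketch rather than what WeightMatrixAnnotator or the ScoreType classes actually do: for an ambiguity symbol, sum the column probabilities over the bases it stands for, sum the null-model probabilities over the same bases, and take the ratio. `AmbiguityOddsSketch` and `odds` are hypothetical names, and the base order A, C, G, T for both arrays is an assumption of this example.

```java
// Hypothetical illustration, not BioJava API.
final class AmbiguityOddsSketch {
    // Assumed index order for both probability arrays.
    private static final String BASES = "ACGT";

    // Odds of emitting "one of the bases in members" under a weight matrix
    // column versus under the null model.
    // members lists the bases an ambiguity symbol stands for,
    // e.g. "A" for A, "AG" for R, "ACGT" for N.
    static double odds(double[] column, double[] nullModel, String members) {
        double pColumn = 0.0;
        double pNull = 0.0;
        for (int k = 0; k < members.length(); k++) {
            int i = BASES.indexOf(members.charAt(k));
            if (i < 0) {
                throw new IllegalArgumentException("not a base: " + members.charAt(k));
            }
            pColumn += column[i];
            pNull += nullModel[i];
        }
        return pColumn / pNull;
    }
}
```

With a column of (A 0.7, C 0.1, G 0.1, T 0.1) and a uniform null model, `odds(column, nullModel, "ACGT")` comes out at about 1 (log-odds about 0), so under this reading an N neither helps nor hurts a match, while `"AG"` (R) gives about 1.6 and `"A"` about 2.8.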
For reasons that we can argue about on -dev or off-line, you can't in the general case just divide scores for columns containing an N by 4, or portion them out relative to the null model. That's why the ScoreType objects were added to the DP package. Matthew > -----Original Message----- > From: Brian Cox [mailto:cox@mshri.on.ca] > Sent: Thu 18/09/2003 2:35 p.m. > To: Schreiber, Mark > Cc: > Subject: RE: [Biojava-l] position weight matrix > > > > Thanks, > That sounds like what I was looking for, I wanted to penalize the use of an > N in the sequence. Not sure yet how to implement this but I'll give it a > shot. > thanks for the reply, > BRian > > -----Original Message----- > From: Schreiber, Mark > To: Brian Cox; biojava-l@biojava.org > Sent: 9/17/03 7:45 PM > Subject: RE: [Biojava-l] position weight matrix > > Hi Brian, > > Technically this is correct as N or X do actually match everything. Are > wanting to rule out any motif with an N or are you wanting to penalize a > motif with an N (or other ambiguity)? > > If you are working with DNA you could use > org.biojava.bio.seq.NucleotideTools, this class can be used to access > the nucleotide alphabet that treats all symbols as Atomic, even if they > are normally IUPAC ambiguity symbols. If you did this and set the weight > of N in the marix to 0.0 it would exclude those motifs. > > - Mark > > > > -----Original Message----- > From: Brian Cox [mailto:cox@mshri.on.ca] > Sent: Friday, 12 September 2003 11:07 a.m. > To: biojava-l@biojava.org > Subject: [Biojava-l] position weight matrix > > > I wrote a program to find TF binding sites using a > WeightMatrixAnnotator, but when I try to annotate a sequence if the > sequences has any N or X then everything matches. How do I get the > WeightMatrixAnnotator to ignore the Ns or Xs? 
> thanks, > Brian Cox > Samuel Lunenfeld Research Institute > Mount Sinai Hospital, Rm 884 > Toronto, Ontario > Canada > > 416-586-8266 > ======================================================================= > Attention: The information contained in this message and/or attachments > from AgResearch Limited is intended only for the persons or entities > to which it is addressed and may contain confidential and/or privileged > material. Any review, retransmission, dissemination or other use of, or > taking of any action in reliance upon, this information by persons or > entities other than the intended recipients is prohibited by AgResearch > Limited. If you have received this message in error, please notify the > sender immediately. > ======================================================================= > > > > >_______________________________________________ >Biojava-l mailing list - Biojava-l@biojava.org >http://biojava.org/mailman/listinfo/biojava-l > > > From autobuilder at derkholm.net Fri Sep 19 00:19:12 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Fri Sep 19 00:24:35 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1063945153290.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030919 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From matthew_pocock at yahoo.co.uk Fri Sep 19 05:47:21 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Fri Sep 19 05:46:27 2003 Subject: [Biojava-dev] Re: [Biojava-l] position weight matrix In-Reply-To: References: Message-ID: <3F6AD0A9.9030305@yahoo.co.uk> Schreiber, Mark wrote: >Not wanting to argue but I'd love to hear why :) > >- Mark > > Ok - I'll try to give a coherent explanation, with each point in no particular order. Sometimes I wish my maths was better. To a biologist, an n means that one of the four nucleotides could be present, and Y means one of two could be present, and so on. To a statistician, I guess they would want to fit some probability or expectation (depending on classical or Bayesian) to those possibilities. HMMs are generative models.
We use probabilistic HMMs for modeling the sequences, but really what we are doing is comparing the sequences to all of those that the HMM generates, and getting joint probabilities that the HMM made a sequence like that and that we had a sequence like that in the first place, and that HMM (which is where all those pesky priors and posteriors come from). If we like the idea of generative grammars, we can treat a sequence containing ambiguity symbols as a generative model. We could if we wish iterate over all possible matching sequences composed entirely of atomic symbols. So: antt can be expanded to the four sequences aatt actt agtt attt In fact, this is exactly what we do for some of the sequence searching objects that provide regular-expression functionality. When dealing with the distribution objects, the probability of observing either a, g, c or t is going to be 1 - we must observe something, and these are the only possibilities. This means that an n is uninformative in allowing us to compare the likelihood of one distribution against another - they will both produce 1. However, if we look at the log odds for this - dividing out the null model, we get the number 0. The odds scores for the simple symbols will be rather more interesting - positive when the distribution fits the data better, and negative when the null model fits it better. Here, the 0 value is doing something useful - it's just saying that that symbol is uninformative - neither giving support to the model nor the null model. So - to cut a long story short, since our distribution objects are really PDFs over sets of symbols, and HMMs are PDFs over sequences, and we can use log odds to make ambiguities turn into sane numbers, taking into account a null model, it is easier to make sequences containing ambiguity symbols behave as generative grammars, and sum over all sequences generated by these grammars for all HMM-related math (including HMMs and distributions).
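As a rough illustration of the argument above, here is a plain-Java sketch that expands an ambiguous sequence into the atomic sequences it generates and computes per-symbol log odds against a null model. It does not use the BioJava Distribution or ScoreType classes: the class AmbiguityOdds, its methods and the example probabilities are all hypothetical, and treating the probability of an ambiguity symbol as the sum over its atomic members is an assumption taken from the description in this message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain Java, not the BioJava Distribution/ScoreType API. */
class AmbiguityOdds {
    /** The few IUPAC codes used in this sketch, mapped to the atomic bases they stand for. */
    static final Map<Character, String> IUPAC = new LinkedHashMap<>();
    static {
        IUPAC.put('a', "a");
        IUPAC.put('c', "c");
        IUPAC.put('g', "g");
        IUPAC.put('t', "t");
        IUPAC.put('r', "ag");
        IUPAC.put('y', "ct");
        IUPAC.put('n', "acgt");
    }

    /** Expand a sequence with ambiguity codes into every atomic sequence it can generate. */
    static List<String> expand(String seq) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (char c : seq.toCharArray()) {
            List<String> next = new ArrayList<>();
            for (String prefix : out) {
                for (char base : IUPAC.get(c).toCharArray()) {
                    next.add(prefix + base);
                }
            }
            out = next;
        }
        return out;
    }

    /** Build a distribution over a, c, g, t (hypothetical helper). */
    static Map<Character, Double> dist(double a, double c, double g, double t) {
        Map<Character, Double> d = new LinkedHashMap<>();
        d.put('a', a);
        d.put('c', c);
        d.put('g', g);
        d.put('t', t);
        return d;
    }

    /** Probability of a (possibly ambiguous) symbol: the sum over its atomic members. */
    static double prob(Map<Character, Double> d, char symbol) {
        double p = 0.0;
        for (char base : IUPAC.get(symbol).toCharArray()) {
            p += d.get(base);
        }
        return p;
    }

    /** Log odds of one symbol under a matrix column versus the null model. */
    static double logOdds(Map<Character, Double> column, Map<Character, Double> nullModel, char symbol) {
        return Math.log(prob(column, symbol) / prob(nullModel, symbol));
    }

    public static void main(String[] args) {
        Map<Character, Double> column = dist(0.5, 0.25, 0.125, 0.125);
        Map<Character, Double> nullModel = dist(0.25, 0.25, 0.25, 0.25);
        System.out.println(expand("antt"));
        System.out.println("n: " + logOdds(column, nullModel, 'n'));
        System.out.println("a: " + logOdds(column, nullModel, 'a'));
        System.out.println("g: " + logOdds(column, nullModel, 'g'));
    }
}
```

With a column that sums to 1 and a uniform null model, n scores 0, a scores above 0 and g scores below 0, which is the behaviour the message describes.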
This way we have one world view, and all the sums work out without fudge-factor code being shot-gunned across the project. Is that clear, or have I garbled it again? Matthew From autobuilder at derkholm.net Sat Sep 20 00:18:56 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Sat Sep 20 00:24:22 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064031538810.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030920 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/dp/DP.java * biojava-live/src/org/biojava/bio/dp/WeightMatrixAnnotator.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From autobuilder at derkholm.net Sun Sep 21 00:19:03 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Sun Sep 21 00:24:33 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064117947005.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030921 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. 
-- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From autobuilder at derkholm.net Mon Sep 22 00:19:01 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Mon Sep 22 00:24:29 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064204342255.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030922 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From autobuilder at derkholm.net Tue Sep 23 00:18:55 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Tue Sep 23 00:24:32 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064290741586.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030923 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From hywang at scbit.org Tue Sep 23 22:55:18 2003 From: hywang at scbit.org (hywang@scbit.org) Date: Tue Sep 23 22:49:45 2003 Subject: [Biojava-dev] About the six frame renderer again. Message-ID: <20030924025518.4728.qmail@scbit.org> Hi, I have written a renderer for a six-frame view based on your SixFrameRenderer; it can render a frame with the input parameters: moduloFrame, which can only be 0, 1 or 2 for the different start base of translation; strand, which stands for the positive or negative strand. I can get the expected effect of six lines of amino acid arrays in zoom-in mode, and different sticks drawn for stop codons in zoom-out mode. But if a sequence is very long, for example about 100k, the scroll action in zoom-in mode will be a bit slow. I think that is probably because all the amino acid strings are repainted in full, so I have tried the getClip() function in the paint() method, but the improvement was not satisfactory.
Any help would be greatly appreciated and would stop me pulling any more hair out! Thanks. Best Wishes! Hywang

import org.biojava.utils.AbstractChangeable;
import org.biojava.bio.gui.sequence.SequenceRenderer;
import org.biojava.bio.gui.sequence.SequenceRenderContext;
import org.biojava.bio.gui.sequence.SequenceViewerEvent;
import org.biojava.bio.seq.StrandedFeature;
import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.RNATools;
import org.biojava.bio.seq.ProteinTools;
import org.biojava.bio.seq.io.SymbolTokenization;
import org.biojava.bio.symbol.*;
import org.biojava.bio.BioRuntimeException;

import java.awt.*;
import java.awt.event.MouseEvent;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.util.List;

/**
 * Created by IntelliJ IDEA.
 * User: administrator
 * Date: 2003-9-5
 * Time: 10:03:58
 * To change this template use Options | File Templates.
 */
public class OneFrameRenderer extends AbstractChangeable implements SequenceRenderer {
    private double depth = 14.0;
    private double blockWidth = 12.0;
    private Paint fontcolor = Color.black;
    private Paint linecolor = Color.black;
    private int moduloFrame;
    private StrandedFeature.Strand strand;

    public OneFrameRenderer(int moduloFrame, StrandedFeature.Strand strand) {
        this.moduloFrame = moduloFrame;
        this.strand = strand;
    }

    public double getDepth(SequenceRenderContext context) {
        return depth + 1.0;
    }

    public double getMinimumLeader(SequenceRenderContext context) {
        return 0.0;
    }

    public double getMinimumTrailer(SequenceRenderContext context) {
        return 0.0;
    }

    public void paint(Graphics2D g2, SequenceRenderContext context) {
        //Rectangle2D prevClip = g2.getClipBounds(); //seems no use
        AffineTransform prevTransform = g2.getTransform();
        g2.setPaint(fontcolor);
        Font font = context.getFont();
        Rectangle2D maxCharBounds = font.getMaxCharBounds(g2.getFontRenderContext());
        double scale = context.getScale();
        if (scale >= (maxCharBounds.getWidth() * 0.2) && scale >= (maxCharBounds.getHeight() * 0.2)) {
            double xFontOffset = 0.0;
            double yFontOffset = 0.0;
            // These offsets are not set quite correctly yet. The
            // Rectangle2D from getMaxCharBounds() seems slightly
            // off. The "correct" application of translations based on
            // the Rectangle2D seem to give the wrong results. The
            // values below are mostly fudges.
            if (context.getDirection() == SequenceRenderContext.HORIZONTAL) {
                xFontOffset = maxCharBounds.getCenterX() * 0.25;
                yFontOffset = -maxCharBounds.getCenterY() + (depth * 0.5);
            } else {
                xFontOffset = -maxCharBounds.getCenterX() + (depth * 0.5);
                yFontOffset = -maxCharBounds.getCenterY() * 3.0;
            }
            SymbolList seq1 = context.getSymbols();
            if (strand == StrandedFeature.NEGATIVE) {
                try {
                    seq1 = DNATools.reverseComplement(seq1);
                } catch (Exception ex) {
                    throw new BioRuntimeException(ex);
                }
            }
            SymbolList seq = seq1.subList(1 + moduloFrame, seq1.length() - (seq1.length() - moduloFrame) % 3);
            //int min = context.getRange().getMin();
            int min = 1;
            //int max = context.getRange().getMax();
            int max = seq.length();
            //transcribe to RNA
            SymbolTokenization toke = null;
            TranslationTable eup = RNATools.getGeneticCode(TranslationTable.UNIVERSAL);
            Alphabet protein_al = ProteinTools.getAlphabet();
            SymbolList protein = null;
            try {
                toke = protein_al.getTokenization("token");
                seq = RNATools.transcribe(seq);
                //view the RNA sequence as codons, this is done internally by RNATools.translate()
                seq = SymbolListViews.windowedSymbolList(seq, 3);
                //translate
                protein = SymbolListViews.translate(seq, eup);
            } catch (Exception ex) {
                throw new BioRuntimeException(ex);
            }
            //System.out.println(protein.seqString());
            //g2.drawString(protein.seqString(),(float)context.sequenceToGraphics(1),(float)yFontOffset); //---the same slow
            for (int sPos = min; sPos <= max; sPos++) {
                if (context.getDirection() == SequenceRenderContext.HORIZONTAL && sPos % 3 == 0) {
                    double gPos = context.sequenceToGraphics(sPos - 1 + moduloFrame);
                    String s = "*";
                    try {
                        s = toke.tokenizeSymbol(protein.symbolAt(sPos / 3));
                        //s = protein.symbolAt(sPos).toString();
                    } catch (Exception ex) {
                        // We'll ignore the case of not being able to tokenize it
                    }
                    g2.drawString(s,(float)(gPos+ xFontOffset),(float)yFontOffset);
                    /* char [] tmpc = s.toCharArray();
                    g2.drawChars(tmpc,0,1, (int) (gPos),// + xFontOffset),
                            (int) yFontOffset);*/
                }
            }
        } else {
            renderOneFrame(g2, context, context.getRange(), false);
        }
        //g2.setClip(prevClip);// ---seems no use
        g2.setTransform(prevTransform);
    }

    public SequenceViewerEvent processMouseEvent(SequenceRenderContext context, MouseEvent me, List path) {
        path.add(this);
        int sPos = context.graphicsToSequence(me.getPoint());
        return new SequenceViewerEvent(this, null, sPos, me, path);
    }

    private boolean isStop(SymbolList seq, int base, StrandedFeature.Strand strand) {
        // tests whether there is a stop at given location.
        // the triplet is either base, +1, +2 or -1, -2
        // depending on the strand searched
        if (strand == StrandedFeature.POSITIVE) {
            // check that search does not exceed bounds
            if (base + 2 > seq.length()) return false;
            // search top strand
            // first base must be t
            if (seq.symbolAt(base) != DNATools.t()) return false;
            // second base cannot be c or t
            if (seq.symbolAt(base + 1) == DNATools.c()) return false;
            if (seq.symbolAt(base + 1) == DNATools.t()) return false;
            // if second base is g, the third must be a
            if (seq.symbolAt(base + 1) == DNATools.g()) {
                if (seq.symbolAt(base + 2) != DNATools.a()) return false;
            } else {
                // second base is a: third must be a or g.
                if (seq.symbolAt(base + 2) == DNATools.c()) return false;
                if (seq.symbolAt(base + 2) == DNATools.t()) return false;
            }
            // oh well, must be a stop, innit?
            return true;
        } else {
            // check bounds
            if (base - 2 < 1) return false;
            // search bottom strand
            // (bases are read off the top strand, so each test below
            // uses the complement of the base named in its comment)
            // first base must be t
            if (seq.symbolAt(base) != DNATools.a()) return false;
            // second base cannot be c or t on reverse strand
            if (seq.symbolAt(base - 1) == DNATools.a()) return false;
            if (seq.symbolAt(base - 1) == DNATools.g()) return false;
            // if second base is g, the third must be a
            if (seq.symbolAt(base - 1) == DNATools.c()) {
                if (seq.symbolAt(base - 2) != DNATools.t()) return false;
            } else {
                // second base is a: third must be a or g.
                if (seq.symbolAt(base - 2) == DNATools.a()) return false;
                if (seq.symbolAt(base - 2) == DNATools.g()) return false;
            }
            // ach! a stop!
            return true;
        }
    }

    private void renderOneFrame(Graphics2D g, SequenceRenderContext src, RangeLocation range, boolean onceOnly) {
        // method to draw by checking succeeding triplets for
        // stop codons.
        // write it for horizontal rendering first.
        SymbolList seq = src.getSymbols();
        // get extent of sequence to render
        // hope it agrees with clip region!
        int minS = range.getMin();
        int maxS = range.getMax();
        // we start at the first triplet whose first base is within
        // the range.
        /* if (minS % 3 > moduloFrame) {
            // first triplet of my frame is in next mod-zero triplet
            minS = (minS / 3 + 1) * 3 + moduloFrame;
        } else if (minS % 3 != moduloFrame) {
            // first triplet is in current mod-zero triplet
            minS = (minS / 3) * 3 + moduloFrame;
        }*/
        // now we search every triplet from minS upward seeking stops.
        for (int base = minS+moduloFrame; base <= maxS; base += 3) {
            // check for stop
            if (!isStop(seq, base, strand)) continue;
            // we have a stop, render a line
            drawLine(g, src, base, strand);
            // do I call it quits now?
            if (onceOnly) return;
        }
    }

    public void drawLine(Graphics2D g, SequenceRenderContext src, int base, StrandedFeature.Strand strand) {
        Paint prevPaint = g.getPaint();
        g.setPaint(linecolor);
        // compute the frame to use.
        //int moduloFrame = base%3;
        // System.out.println("drawLine: base,strand,modulo" + base + " " + strand + " " + moduloFrame);
        // get required offset for frame
        double offset = 0;//modi by hywang
        // compute position of line to be drawn
        int lineP = (int) src.sequenceToGraphics(base);
        // draw the line
        if (src.getDirection() == src.HORIZONTAL) {
            g.drawLine(lineP, (int) offset, lineP, (int) (offset + blockWidth));
        } else {
            g.drawLine((int) offset, lineP, (int) (offset + blockWidth), lineP);
        }
        g.setPaint(prevPaint);
    }
}

From autobuilder at derkholm.net Wed Sep 24 00:19:06 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Wed Sep 24 00:24:38 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064377146460.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030924 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ No changes were made in the last 24 hours. -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net
From matthew_pocock at yahoo.co.uk Wed Sep 24 06:03:23 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 24 06:03:28 2003 Subject: [Biojava-dev] About the six frame renderer again. In-Reply-To: <20030924025518.4728.qmail@scbit.org> References: <20030924025518.4728.qmail@scbit.org> Message-ID: <3F716BEB.6010107@yahoo.co.uk> Hi, I recently added code for optimizing rendering to the current clip. If you get a recent biojava (preferably source-code - try cvs or the nightly builds), then you will see some new-in-1.4 code in the .gui.sequence package. Take a look at SymbolSequenceRenderer, around line #117, for how to get the minimally visible range.
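A minimal sketch of the idea of restricting work to the visible range, in plain Java. It assumes a simple mapping in which sequence position p occupies the pixels from (p - 1) * scale up to p * scale; the real SequenceRenderContext mapping and the code in SymbolSequenceRenderer may well differ, and the class and method names below are hypothetical.

```java
/** Illustrative only: the pixel-to-sequence mapping used here is an assumption. */
class VisibleRange {
    /**
     * Returns {min, max}: the 1-based, inclusive sequence positions that
     * overlap the clip, clamped to the sequence bounds.
     */
    static int[] visibleRange(double clipMinX, double clipMaxX, double scale, int seqLength) {
        int min = (int) Math.floor(clipMinX / scale) + 1;
        int max = (int) Math.floor(clipMaxX / scale) + 1;
        return new int[] {Math.max(1, min), Math.min(seqLength, max)};
    }

    /**
     * Start of the codon in the given frame (0, 1 or 2) that covers position min,
     * or the first codon of the frame if min lies before it.
     */
    static int firstCodonStart(int min, int moduloFrame) {
        int start = min - Math.floorMod(min - 1 - moduloFrame, 3);
        return Math.max(1 + moduloFrame, start);
    }

    public static void main(String[] args) {
        int[] range = visibleRange(1200.0, 2000.0, 12.0, 100000);
        System.out.println(range[0] + ".." + range[1]);
        // Only codons from this start up to range[1] need translating and drawing.
        System.out.println(firstCodonStart(range[0], 0));
    }
}
```

A paint() method could then translate and draw only the codons inside this range rather than the whole 100k sequence, which is the point of working from the clip.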
Matthew hywang@scbit.org wrote: > Hi, > I have written a renderer for a six-frame view based on your > SixFrameRenderer; it can render a frame with the input > parameters: > moduloFrame > which can only be 0, 1 or 2 for the different start base of translation. > strand > which stands for the positive or negative strand. > I can get the expected effect of six lines of amino acid arrays in zoom-in > mode, > and different sticks drawn for stop codons in zoom-out mode. > But if a sequence is very long, for example about 100k, > the scroll action in zoom-in mode will be a bit slow. > I think that is probably because all the amino acid > strings are repainted in full, so I have tried the getClip() function in the paint() method, > but the improvement was not satisfactory. > Any help would be greatly appreciated and would stop me pulling > any more hair out! > Thanks. > Best Wishes! > Hywang From matthew_pocock at yahoo.co.uk Wed Sep 24 12:44:51 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 24 12:44:56 2003 Subject: [Biojava-dev] problems with biosql Message-ID: <3F71CA03.3020809@yahoo.co.uk> Hi, We have been adding sequences (from EMBL files) into biosql (using the latest CVS biojava, and the latest CVS biosql schema) using the normal addSequence() method. We get an exception loading data into biosql for just some sequences, which I've tracked back to BioSQLSequenceDB.intern_ontology_term() and OntologySQL.termID(), but this code is beyond me - it's not documented and I'm not sure what the contract for these methods is meant to be. Thomas, did you write this? Any idea what is going wrong? In the meantime I will try to track down exactly what ontology term is causing the problems.
Matthew

org.biojava.bio.BioRuntimeException: Error adding BioSQL tables (rolled back successfully)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:464)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.addSequence(BioSQLSequenceDB.java:315)
        at mbsample.EmblBase.main(EmblBase.java:68)
Caused by: java.sql.SQLException: Couldn't create term in legacy ontology namespace
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:903)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.persistBioentryProperty(BioSQLSequenceDB.java:857)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:450)

From matthew_pocock at yahoo.co.uk Wed Sep 24 13:19:08 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 24 13:19:23 2003 Subject: [Biojava-dev] problems with biosql In-Reply-To: <3F71CA03.3020809@yahoo.co.uk> References: <3F71CA03.3020809@yahoo.co.uk> Message-ID: <3F71D20C.6070108@yahoo.co.uk>
And here's another stack-trace we get after getting the SQLException to use initCause() - it's a monster trace :) Apparently, part of the code thinks that a term needs adding to the database, but another part thinks it already exists. At least that's the best I can figure from the undocumented code & exceptions that don't say very much. Sorry - long day.

Matthew

org.biojava.bio.BioRuntimeException: Error adding BioSQL tables (rolled back successfully)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:464)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.addSequence(BioSQLSequenceDB.java:315)
        at mbsample.EmblBase.main(EmblBase.java:68)
Caused by: java.sql.SQLException: Couldn't create term in legacy ontology namespace
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:903)
        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistProperty(FeaturesSQL.java:1016)
        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistFeature(FeaturesSQL.java:880)
        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistFeatures(FeaturesSQL.java:739)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:434)
        ... 2 more
Caused by: org.biojava.bio.BioRuntimeException: Error removing from BioSQL tables (rolled back successfully)
        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:449)
        at org.biojava.bio.seq.db.biosql.OntologySQL.access$200(OntologySQL.java:58)
        at org.biojava.bio.seq.db.biosql.OntologySQL$OntologyMonitor.postChange(OntologySQL.java:416)
        at org.biojava.utils.ChangeSupport.firePostChangeEvent(ChangeSupport.java:302)
        at org.biojava.ontology.Ontology$Impl.addTerm(Ontology.java:312)
        at org.biojava.ontology.Ontology$Impl.createTerm(Ontology.java:321)
        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:899)
        ... 6 more
Caused by: java.sql.SQLException: ERROR: current transaction is aborted, queries ignored until end of transaction block
        at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:131)
        at org.postgresql.jdbc1.AbstractJdbc1Connection.ExecSQL(AbstractJdbc1Connection.java:505)
        at org.postgresql.jdbc1.AbstractJdbc1Statement.execute(AbstractJdbc1Statement.java:320)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:48)
        at org.postgresql.jdbc1.AbstractJdbc1Statement.executeUpdate(AbstractJdbc1Statement.java:197)
        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:464)
        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:437)
        ... 12 more

From simon.foote at nrc-cnrc.gc.ca Wed Sep 24 14:03:26 2003 From: simon.foote at nrc-cnrc.gc.ca (Simon Foote) Date: Wed Sep 24 14:02:57 2003 Subject: [Biojava-dev] problems with biosql In-Reply-To: <3F71D20C.6070108@yahoo.co.uk> References: <3F71D20C.6070108@yahoo.co.uk> Message-ID: <3F71DC6E.5050206@nrc-cnrc.gc.ca>
Hi Matthew, Which database server are you using? I ran into a similar problem importing Genbank files into a MySQL database as Genbank can contain terms that are the same, but have different cases. Off the top of my head, I traced this to either the unique indexing of the term table in the database doesn't take the case into account or the persistent storage of the terms in a map doesn't take the case into account. Although, I can't remember whether the keys in a map take case into account. There is a comment line in BioSQLSequenceDB.java around line 850 //System.err.println("Term: " + ts + " " + ex.getMessage()); That will give you the offending term if uncommented. The hack I added above that code block worked in all my Genbank cases, but I never did test it with EMBL files. It may be time to think about removing this legacy ontology code and using whatever is supposed to replace it. I think Thomas coded the original stuff.
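On the question of whether map keys take case into account: with Java String keys, a HashMap is case-sensitive, so an in-memory cache can treat two terms as different while a case-insensitive unique index treats them as one. The sketch below shows only that mismatch in plain Java. The class name and the example terms are hypothetical, the TreeMap merely stands in for a case-insensitive index, and whether this is what actually happens inside BioSQLSequenceDB is not confirmed by this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: not BioJava code. */
class TermCaseDemo {
    /** Number of distinct keys when terms go into a case-sensitive HashMap. */
    static int distinctCaseSensitive(String... terms) {
        Map<String, Integer> cache = new HashMap<>();
        for (String term : terms) {
            cache.put(term, cache.size());
        }
        return cache.size();
    }

    /** Number of distinct keys under a case-insensitive ordering (stand-in for such an index). */
    static int distinctCaseInsensitive(String... terms) {
        Map<String, Integer> index = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String term : terms) {
            index.put(term, index.size());
        }
        return index.size();
    }

    public static void main(String[] args) {
        String[] terms = {"Note", "note", "NOTE"};
        // The cache sees three terms; the case-insensitive view sees one.
        System.out.println(distinctCaseSensitive(terms));
        System.out.println(distinctCaseInsensitive(terms));
    }
}
```

If the two layers disagree like this, the cache would report a term as new and the database would then reject the insert as a duplicate, which would fit the symptom described in the trace.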
Simon
--
Bioinformatics Programmer
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561 [F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

Matthew Pocock wrote:
>And here's another stack-trace we get after getting the SQLException to use initCause() - it's a monster trace :) Apparently, part of the code thinks that a term needs adding to the database, but another part thinks it already exists. At least that's the best I can figure from the undocumented code & exceptions that don't say very much. Sorry - long day.
>
>Matthew
>
>org.biojava.bio.BioRuntimeException: Error adding BioSQL tables (rolled back successfully)
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:464)
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.addSequence(BioSQLSequenceDB.java:315)
>        at mbsample.EmblBase.main(EmblBase.java:68)
>Caused by: java.sql.SQLException: Couldn't create term in legacy ontology namespace
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:903)
>        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistProperty(FeaturesSQL.java:1016)
>        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistFeature(FeaturesSQL.java:880)
>        at org.biojava.bio.seq.db.biosql.FeaturesSQL.persistFeatures(FeaturesSQL.java:739)
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB._addSequence(BioSQLSequenceDB.java:434)
>        ... 2 more
>Caused by: org.biojava.bio.BioRuntimeException: Error removing from BioSQL tables (rolled back successfully)
>        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:449)
>        at org.biojava.bio.seq.db.biosql.OntologySQL.access$200(OntologySQL.java:58)
>        at org.biojava.bio.seq.db.biosql.OntologySQL$OntologyMonitor.postChange(OntologySQL.java:416)
>        at org.biojava.utils.ChangeSupport.firePostChangeEvent(ChangeSupport.java:302)
>        at org.biojava.ontology.Ontology$Impl.addTerm(Ontology.java:312)
>        at org.biojava.ontology.Ontology$Impl.createTerm(Ontology.java:321)
>        at org.biojava.bio.seq.db.biosql.BioSQLSequenceDB.intern_ontology_term(BioSQLSequenceDB.java:899)
>        ... 6 more
>Caused by: java.sql.SQLException: ERROR: current transaction is aborted, queries ignored until end of transaction block
>        at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:131)
>        at org.postgresql.jdbc1.AbstractJdbc1Connection.ExecSQL(AbstractJdbc1Connection.java:505)
>        at org.postgresql.jdbc1.AbstractJdbc1Statement.execute(AbstractJdbc1Statement.java:320)
>        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:48)
>        at org.postgresql.jdbc1.AbstractJdbc1Statement.executeUpdate(AbstractJdbc1Statement.java:197)
>        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:464)
>        at org.biojava.bio.seq.db.biosql.OntologySQL.persistTerm(OntologySQL.java:437)
>        ... 12 more
>
>_______________________________________________
>biojava-dev mailing list
>biojava-dev@biojava.org
>http://biojava.org/mailman/listinfo/biojava-dev

From matthew_pocock at yahoo.co.uk Wed Sep 24 14:11:09 2003 From: matthew_pocock at yahoo.co.uk (Matthew Pocock) Date: Wed Sep 24 14:11:17 2003 Subject: [Biojava-dev] problems with biosql In-Reply-To: <3F71DC6E.5050206@nrc-cnrc.gc.ca> References: <3F71D20C.6070108@yahoo.co.uk> <3F71DC6E.5050206@nrc-cnrc.gc.ca> Message-ID: <3F71DE3D.7080007@yahoo.co.uk>
OK.
I'm using postgresql and loading in complete genome EMBL files. It's possible that it is due to case clashes. The method BioSQLSequenceDB.intern_ontology_term() does seem to be doing case tricks - will the world explode if I disable this? Anyway, I've modified the exception messages now so that they tell me more stuff. Hopefully that should make this easier to track. Matthew Simon Foote wrote: > Hi Matthew, > > Which database server are you using? > I ran into a similar problem importing Genbank files into a MySQL > database as Genbank can contain terms that are the same, but have > different cases. Off the top of my head, I traced this to either the > unique indexing of the term table in the database doesn't take the > case into account or the persistent storage of the terms in a map > doesn't take the case into account. Although, I can't remember > whether the keys in a map take case into account. > > There is a comment line in BioSQLSequenceDB.java around line 850 > //System.err.println("Term: " + ts + " " + ex.getMessage()); > That will give you the offending term if uncommented. > > The hack I added above that code block worked in all my Genbank cases, > but I never did test it with EMBL files. > > It may be time to think about removing this legacy ontology code and > using whatever is supposed to replace it. > I think Thomas coded the original stuff.
> > Simon > From autobuilder at derkholm.net Thu Sep 25 00:18:55 2003 From: autobuilder at derkholm.net (autobuilder@derkholm.net) Date: Thu Sep 25 00:24:32 2003 Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report Message-ID: <12577309.1064463537553.JavaMail.thomas@firechild.derkholm.net> BioJava automatic build system, run 20030925 Binary build: OK Javadocs build: OK Core test suite: OK A snapshot release has been made at: http://www.derkholm.net/autobuild/ The following files were modified in the last 24 hours: * biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java * biojava-live/src/org/biojava/bio/seq/db/biosql/OntologySQL.java * biojava-live/src/org/biojava/ontology/ReasoningDomain.java * biojava-live/tests/org/biojava/ontology/ReasoningDomainTest.java A patch file reflecting these changes is available from http://www.derkholm.net/autobuild/patches/ -- BioJava Autobuilder, maintained by Thomas Down If you notice any problems, contact autobuilder@derkholm.net From matthew.pocock at ncl.ac.uk Thu Sep 25 10:10:13 2003 From: matthew.pocock at ncl.ac.uk (Matthew Pocock) Date: Thu Sep 25 10:10:32 2003 Subject: [Biojava-dev] build.xml Message-ID: <3F72F745.9070302@ncl.ac.uk> Hi, The ant script has got really big and complex. The options seem to be a) try to fix the current mess in a single file b) have one build.xml file which builds and tests everything, with other files e.g. build/core.xml, build/test.xml for each logical module. I'm happy to do the file munging - please tell me which way I should go. Matthew From td2 at sanger.ac.uk Thu Sep 25 10:35:00 2003 From: td2 at sanger.ac.uk (Thomas Down) Date: Thu Sep 25 10:33:00 2003 Subject: [Biojava-dev] build.xml In-Reply-To: <3F72F745.9070302@ncl.ac.uk> References: <3F72F745.9070302@ncl.ac.uk> Message-ID: <20030925143459.GC374956@jabba.sanger.ac.uk> On Thu, Sep 25, 2003 at 03:10:13PM +0100, Matthew Pocock wrote: > Hi, > > The ant script has got really big and complex.
The options seem to be
>
> a) try to fix the current mess in a single file
>
> b) have one build.xml file which builds and tests everything, with other
> files e.g. build/core.xml, build/test/xml for each logical module.
>
> I'm happy to do the file munging - please tell me which way I should go.

A large cause of bulkiness in this file seems to be the
"run some subset of tests" targets, which are mostly cut-and-paste
jobs from the original runtest target. We have

    runtests
    runmosttests
    seqtests
    symboltests
    biotests
    ontotests
    filtertests
    dptests
    searchtests

At least a couple of these are my fault...

Is there any way we can neatly run subsets of tests without having
all this cut-and-paste cruft (at least one of which won't work
at all any more, since the ontology APIs moved...)?

We do need some reasonably clean way to run test subsets, because
even on a fast machine, a complete test run is frustratingly
slow when debugging something. What about removing everything
except `runtests' from build.xml, but adding a script which
autogenerates a big ANT file with a target for every package?

    Thomas.

From chrisa at espressosoftware.com Thu Sep 25 13:04:59 2003
From: chrisa at espressosoftware.com (Chris Abajian)
Date: Thu Sep 25 13:05:01 2003
Subject: [Biojava-dev] build.xml
In-Reply-To: <3F72F745.9070302@ncl.ac.uk>
References: <3F72F745.9070302@ncl.ac.uk>
Message-ID: <1064509615.3119.3.camel@suzinak.abajian.net>

My $0.02:

> a) try to fix the current mess in a single file

A losing battle.

> b) have one build.xml file which builds and tests everything, with other
> files e.g. build/core.xml, build/test/xml for each logical module.

Standard Operating Procedure.

--
Chris Abajian
Espresso Software Development, L.L.C.
http://espressosoftware.com
206.910.4903

Espresso Software Development provides software development and
consulting services. We develop, deploy and support scalable,
multi-tiered, high-availability web, e-commerce and data-processing
applications.

From matthew.pocock at ncl.ac.uk Thu Sep 25 13:31:16 2003
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Thu Sep 25 13:31:36 2003
Subject: [Biojava-dev] build.xml
In-Reply-To: <20030925143459.GC374956@jabba.sanger.ac.uk>
References: <3F72F745.9070302@ncl.ac.uk> <20030925143459.GC374956@jabba.sanger.ac.uk>
Message-ID: <3F732664.1060509@ncl.ac.uk>

I have aggressively re-factored build.xml so that now it is at least
readable. I am happy for tests to move into a build-tests.xml file.

Matthew

Thomas Down wrote:

>On Thu, Sep 25, 2003 at 03:10:13PM +0100, Matthew Pocock wrote:
>
>
>>Hi,
>>
>>The ant script has got really big and complex. The options seem to be
>>
>>a) try to fix the current mess in a single file
>>
>>b) have one build.xml file which builds and tests everything, with other
>>files e.g. build/core.xml, build/test/xml for each logical module.
>>
>>I'm happy to do the file munging - please tell me which way I should go.
>>
>>
>
>A large cause of bulkiness in this file seems to be the
>"run some subset of tests" targets, which are mostly cut-and-paste
>jobs from the original runtest target. We have
>
>    runtests
>    runmosttests
>    seqtests
>    symboltests
>    biotests
>    ontotests
>    filtertests
>    dptests
>    searchtests
>
>At least a couple of these are my fault...
>
>Is there any way we can neatly run subsets of tests without having
>all this cut-and-paste cruft (at least one of which won't work
>at all any more, since the ontology APIs moved...)?
>
>We do need some reasonably clean way to run test subsets, because
>even on a fast machine, a complete test run is frustratingly
>slow when debugging something. What about removing everything
>except `runtests' from build.xml, but adding a script which
>autogenerates a big ANT file with a target for every package?
>
>    Thomas.
>
>
>
>
>
>
>

From matthew_pocock at yahoo.co.uk Thu Sep 25 13:42:04 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Sep 25 13:42:26 2003
Subject: [Biojava-dev] build.xml
In-Reply-To: <3F72F745.9070302@ncl.ac.uk>
References: <3F72F745.9070302@ncl.ac.uk>
Message-ID: <3F7328EC.2020003@yahoo.co.uk>

Ok - I've given build.xml a full spring-clean. You probably want to do
an ant clean after your next cvs update. The exact build-structure under
ant-build is now regularised, as are many of the target names.

Unfortunately, I couldn't get docbook documentation to build - it gave a
confusing error. Could somebody who knows check this?

Matthew

Matthew Pocock wrote:

> Hi,
>
> The ant script has got really big and complex. The options seem to be
>
> a) try to fix the current mess in a single file
>
> b) have one build.xml file which builds and tests everything, with
> other files e.g. build/core.xml, build/test/xml for each logical module.
>
> I'm happy to do the file munging - please tell me which way I should go.
>
> Matthew
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev@biojava.org
> http://biojava.org/mailman/listinfo/biojava-dev
>

From autobuilder at derkholm.net Fri Sep 26 00:17:09 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Fri Sep 26 00:24:43 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <28637909.1064549830159.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030926

Binary build: FAILED!
Javadocs build: FAILED!
Core test suite: OK

Problems occurred during this build cycle -- please investigate as soon as possible!

The following files were modified in the last 24 hours:

* biojava-live/src/org/biojava/bio/seq/db/biosql/BioSQLSequenceDB.java

A patch file reflecting these changes is available from
http://www.derkholm.net/autobuild/patches/

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From len at reeltwo.com Fri Sep 26 01:12:33 2003
From: len at reeltwo.com (Len Trigg)
Date: Fri Sep 26 01:25:40 2003
Subject: [Biojava-dev] problems with biosql
In-Reply-To: <200309260425.h8Q4PUdb026306@portal.open-bio.org>
References: <200309260425.h8Q4PUdb026306@portal.open-bio.org>
Message-ID:

> From: Matthew Pocock
> Date: Wed, 24 Sep 2003 18:19:08 +0100
>
> And here's another stack-trace we get after getting the SQLException
> to use initCause() - it's a monster trace :) Apparently, part of the
> code thinks that a term needs adding to the database, but another
> part thinks it already exists. At least that's the best I can figure
> from the undocumented code & exceptions that don't say very
> much. Sorry - long day.

I've seen the exact same message with Oracle, which uses "SEQUENCES"
associated with triggers to auto-assign row ids. Someone had manually
added a term to the table, and the sequence got confused when the id it
wanted to assign next had already been used. My workaround was to
recreate the sequence starting from the next id to be assigned. Maybe
the case for postgres is similar.

Cheers,
Len.

From len at reeltwo.com Fri Sep 26 01:16:40 2003
From: len at reeltwo.com (Len Trigg)
Date: Fri Sep 26 01:25:55 2003
Subject: [Biojava-dev] build.xml
In-Reply-To: <200309260425.h8Q4PUdb026306@portal.open-bio.org>
References: <200309260425.h8Q4PUdb026306@portal.open-bio.org>
Message-ID:

> From: Thomas Down
> Date: Thu, 25 Sep 2003 15:35:00 +0100
>
> A large cause of bulkiness in this file seems to be the
> "run some subset of tests" targets, which are mostly cut-and-paste
> jobs from the original runtest target.
We have
>
> Is there any way we can neatly run subsets of tests without having
> all this cut-and-paste cruft (at least one of which won't work
> at all any more, since the ontology APIs moved...).

The way we do it here at work is to have a class per package called
AllTests that simply has a suite composed of the suites of all test
classes in the package plus the AllTests suite for each subpackage. To
run a subset of tests, just run the corresponding AllTests suite. Works
nicely.

Cheers,
Len.

From td2 at sanger.ac.uk Fri Sep 26 03:41:35 2003
From: td2 at sanger.ac.uk (Thomas Down)
Date: Fri Sep 26 03:39:35 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
In-Reply-To: <28637909.1064549830159.JavaMail.thomas@firechild.derkholm.net>
References: <28637909.1064549830159.JavaMail.thomas@firechild.derkholm.net>
Message-ID: <20030926074134.GA380604@jabba.sanger.ac.uk>

On Fri, Sep 26, 2003 at 05:17:09AM +0100, autobuilder@derkholm.net wrote:
> BioJava automatic build system, run 20030926
>
> Binary build: FAILED!
> Javadocs build: FAILED!
> Core test suite: OK
>
> Problems occurred during this build cycle -- please investigate as soon as possible!

Sorry about this: it looks like it was really a problem with the
builder rather than BioJava. I'll sort it out tonight.

Thomas

From kdj at sanger.ac.uk Fri Sep 26 04:50:00 2003
From: kdj at sanger.ac.uk (Keith James)
Date: Fri Sep 26 04:50:02 2003
Subject: [Biojava-dev] build.xml
In-Reply-To: <3F7328EC.2020003@yahoo.co.uk>
References: <3F72F745.9070302@ncl.ac.uk> <3F7328EC.2020003@yahoo.co.uk>
Message-ID:

>>>>> "Matthew" == Matthew Pocock writes:

[...]

    Matthew> Unfortunately, I couldn't get docbook documentation to
    Matthew> build - it gave a confusing error. Could somebody who
    Matthew> knows check this?

Is it an error about running out of something-or-other IDs? If so, I
think it's the buggy xslt transformer shipped in 1.4 (which is why
we're using an old set of stylesheets as a workaround).

However, the error I get is

BUILD FAILED
file:/hgs2/team65/kdj/dev/biojava-live/build.xml:645:
java.io.FileNotFoundException:
/hgs2/team65/kdj/dev/biojava-live/ant-build/src/docs/biojava-doc-main.xml
(No such file or directory)

I'll have a look later.

One foolproof solution to docbook transformation errors is to include
and use the Saxon xslt processor. Pros: it works flawlessly. Cons:
it's a dependency and it's fairly big (630k jar).

cheers,

Keith

--
- Keith James Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From autobuilder at derkholm.net Sat Sep 27 00:17:12 2003
From: autobuilder at derkholm.net (autobuilder@derkholm.net)
Date: Sat Sep 27 00:23:08 2003
Subject: [Biojava-dev] [biojava-builder] BioJava nightly build report
Message-ID: <28637909.1064636233149.JavaMail.thomas@firechild.derkholm.net>

BioJava automatic build system, run 20030927

Binary build: FAILED!
Javadocs build: FAILED!
Core test suite: OK

Problems occurred during this build cycle -- please investigate as soon as possible!

No changes were made in the last 24 hours.

--
BioJava Autobuilder, maintained by Thomas Down
If you notice any problems, contact autobuilder@derkholm.net

From matthew_pocock at yahoo.co.uk Tue Sep 30 14:06:48 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Tue Sep 30 14:08:25 2003
Subject: [Biojava-dev] ontologies
Message-ID: <3F79C638.3050306@yahoo.co.uk>

Hi,

I've spent the last 2 days working on the ontology support. Nothing is
in CVS yet as I'm not sure if any of it is correct. I've done the
following:

* TripleTerm now refers to a Term, not a subject, object, predicate
  triple.
* I'm incrementally bulking out OntologyOps as I identify bottlenecks.
* I've added a sablecc grammar and a parser for a simple sop language. I
  will paste an example below.
  People may object to sablecc, so this is one of the major reasons for
  not committing yet.
* I've defined the core rules of sets, lists, inheritance, binary
  relations and aggregate functions (e.g. for_each) in a core ontology.
  This made my head hurt, and is bound to be buggy.
* There is now an ontology of all integers - we will need to mix in
  operators over these. I will add an ontology of all strings soon.

Next, I will start work on the interpreter / inference engine / theorem
prover. This is where everything may go wrong - but what can you do.

If you are interested in this stuff, please tell me - it's getting
lonely working on this alone.

Matthew

* Comments are in {} and can follow any identifier.
* Concrete entities are given names starting with a letter.
* Terms are introduced as expressions containing just the term name.
* Triples (and possibly triple terms) are introduced as
  predicate(subject,object) where subject or object may be triples
  themselves.
* Variables (things we will pattern-match) start with an underscore, and
  are implicitly scoped to the expression they are in.
* All expressions resolve to boolean values - there is no way to
  'return' a value.
* All expressions are statements of fact. They are axioms that do hold.
  All data is entered as axioms.

------ how isa & instanceof relate to one another
------ in reality this probably lives
------ in the core ontology definitions

isa { inheritance of types - equivalent to sub-set I guess }
instanceof { an instance of a type - equivalent to set membership }

implies { Transitive closure }
  (and(isa(_X, _Y), isa(_Y, _Z)), isa(_X, _Z))

implies { How instanceof and isa play with each other }
  (and(instanceof(_x, _X), isa(_X, _Y)), instanceof(_x, _Y))

------ this is some biology - we use isa and
------ instanceof here to represent some knowledge

haemoglobin { Proteins that bind haem and are used to transport oxygen }
globin { Proteins that are globular-ish }
protein { Poly-peptides with some structure }
human-alpha-haemoglobin { The human alpha-haemoglobin protein }

isa(haemoglobin, globin)
isa(globin, protein)
instanceof(human-alpha-haemoglobin, haemoglobin)

------ Now we can assert the truth of statements

isa(haemoglobin, protein) -> true
instanceof(human-alpha-haemoglobin, protein) -> true
isa(protein, human-alpha-haemoglobin) -> false

------ Or perhaps pattern-match

isa(_thing, protein) ->
  isa(protein, protein),
  isa(globin, protein),
  isa(haemoglobin, protein);

_relation(human-alpha-haemoglobin, globin) ->
  instanceof(human-alpha-haemoglobin, globin);

_relation(globin, human-alpha-haemoglobin) ->
  ;
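
For anyone who wants to poke at the two implies rules without the
parser, here is a rough sketch in plain Java. It is only an
illustration under my own assumptions and is not the BioJava ontology
API: the class name IsaSketch and all of its methods are made up for
this example. It treats the asserted isa facts as a graph and checks
reachability, which gives the transitive closure, and it derives
instanceof by following isa upwards from the directly asserted type.
It does not derive reflexive results such as isa(protein, protein),
since neither rule states reflexivity.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; none of these names exist in BioJava.
class IsaSketch {
    // Asserted facts only: subtype -> direct supertypes, instance -> direct types.
    private final Map<String, Set<String>> isaFacts = new HashMap<>();
    private final Map<String, Set<String>> instanceFacts = new HashMap<>();

    public void assertIsa(String sub, String sup) {
        isaFacts.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    public void assertInstanceOf(String instance, String type) {
        instanceFacts.computeIfAbsent(instance, k -> new HashSet<>()).add(type);
    }

    private static Set<String> lookup(Map<String, Set<String>> facts, String key) {
        Set<String> found = facts.get(key);
        return found == null ? Collections.<String>emptySet() : found;
    }

    // Rule 1: isa(_X, _Y) and isa(_Y, _Z) implies isa(_X, _Z).
    // Implemented as a graph search over the asserted isa facts.
    public boolean isa(String sub, String sup) {
        Deque<String> todo = new ArrayDeque<>(lookup(isaFacts, sub));
        Set<String> seen = new HashSet<>();
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (!seen.add(current)) {
                continue; // already visited, guards against cycles
            }
            if (current.equals(sup)) {
                return true;
            }
            todo.addAll(lookup(isaFacts, current));
        }
        return false;
    }

    // Rule 2: instanceof(_x, _X) and isa(_X, _Y) implies instanceof(_x, _Y).
    public boolean instanceOf(String instance, String type) {
        for (String direct : lookup(instanceFacts, instance)) {
            if (direct.equals(type) || isa(direct, type)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IsaSketch kb = new IsaSketch();
        kb.assertIsa("haemoglobin", "globin");
        kb.assertIsa("globin", "protein");
        kb.assertInstanceOf("human-alpha-haemoglobin", "haemoglobin");

        System.out.println(kb.isa("haemoglobin", "protein"));                    // true
        System.out.println(kb.instanceOf("human-alpha-haemoglobin", "protein")); // true
        System.out.println(kb.isa("protein", "human-alpha-haemoglobin"));        // false
    }
}
```

A real inference engine would presumably apply arbitrary implies rules
by pattern-matching rather than hard-coding these two, so this is just
a way to sanity-check the expected true/false answers above.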