From len at reeltwo.com Thu Oct 2 21:43:50 2003
From: len at reeltwo.com (Len Trigg)
Date: Thu Oct 2 21:41:36 2003
Subject: [Biojava-dev] ProcessTools / ExecRunner
Message-ID:

Hi guys,

I think it would be a good idea for someone who knows the pros and cons of ExecRunner and ProcessTools to remove one of them from CVS. Having two classes that do virtually the same thing is counterproductive. One of the guys here wasn't aware of the duplication and enhanced ExecRunner to meet his requirements, only to discover that ProcessTools already had the functionality he needed. Finding out your effort is redundant like this doesn't make a happy contributor.

Also, he switched to using ProcessTools, and has noticed that occasionally a job will just hang (I'm guessing there might be a bug in the way ProcessTools extracts the stdout/stderr of the process). If it doesn't ring a bell with anyone, I'll look into it.

Cheers,
Len.

From td2 at sanger.ac.uk Fri Oct 3 05:45:02 2003
From: td2 at sanger.ac.uk (Thomas Down)
Date: Fri Oct 3 05:42:50 2003
Subject: [Biojava-dev] ProcessTools / ExecRunner
In-Reply-To:
References:
Message-ID: <20031003094502.GA426617@jabba.sanger.ac.uk>

Sorry about the mix up here -- two solutions to the same problem got checked in within a couple of minutes of one another. I think we're going to rationalize on ProcessTools before BioJava 1.4, though.

On Fri, Oct 03, 2003 at 01:43:50PM +1200, Len Trigg wrote:
>
> Also, he switched to using ProcessTools, and has noticed that
> occasionally a job will just hang (I'm guessing there might be a bug
> in the way ProcessTools extracts the stdout/stderr of the process). If
> it doesn't ring a bell with anyone, I'll look into it.

Have you got any examples of this? Also, do you know what version of Java is installed? I believe that some Process bugs got fixed in JDK1.4.2, so it's probably worth upgrading if you haven't already.

Thomas.
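
A note on the hang described above: Len's guess concerns how the child process's stdout/stderr are read. Nothing in this thread confirms the cause, but one common source of such hangs in Java generally is reading one stream to completion while the child is blocked writing to the other, whose pipe buffer has filled. The sketch below drains both streams on separate threads. StreamDrainer and exec are hypothetical names used for illustration only, not the ProcessTools or ExecRunner API, and the code assumes a current JDK rather than 1.4.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of BioJava): drains one stream on its own thread. */
final class StreamDrainer extends Thread {
    private final InputStream in;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private volatile IOException failure;

    StreamDrainer(InputStream in) {
        this.in = in;
    }

    @Override
    public void run() {
        byte[] chunk = new byte[4096];
        try {
            int n;
            // Keep reading until end of stream so the child never blocks on a full pipe.
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            failure = e;
        }
    }

    /** Waits for the drain to finish and returns everything that was read. */
    String result() throws IOException, InterruptedException {
        join();
        if (failure != null) {
            throw failure;
        }
        return buffer.toString();
    }

    /** Runs a command, reading stdout and stderr concurrently; returns {stdout, stderr, exit code}. */
    static String[] exec(String[] command) throws IOException, InterruptedException {
        Process p = Runtime.getRuntime().exec(command);
        StreamDrainer out = new StreamDrainer(p.getInputStream());
        StreamDrainer err = new StreamDrainer(p.getErrorStream());
        out.start();
        err.start();
        p.getOutputStream().close(); // this sketch sends nothing to the child's stdin
        int exit = p.waitFor();
        return new String[] { out.result(), err.result(), String.valueOf(exit) };
    }
}
```

The point of the design is only that neither stream is left unread while the caller waits on the other or on waitFor(); whether this matches the behaviour seen with ProcessTools is an open question in the thread.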
From silvere at digitalbiosphere.com Sun Oct 5 18:54:37 2003
From: silvere at digitalbiosphere.com (Silvère Martin-Michiellot)
Date: Sun Oct 5 18:52:20 2003
Subject: [Biojava-dev] bioJava, jsci
Message-ID: <5.1.0.14.0.20031005171733.00b9f830@pop.digitalbiosphere.com>

Hi,

We are a group of people building the JSci package at jsci.sourceforge.net. We are looking to extend/implement biology features. We need support from biologists to help us extend the features of the API, as we are not experts ourselves. Thus far we have implemented a basic package with amino acids, DNA, bases, mRNA, proteins.

We would like to receive suggestions, ideas about possible extensions, API review, etc. It seems to us that we should work in the same direction as you do to minimize coding effort.

We are also thinking about implementing XML support, but from http://www.pasteur.fr/cgi-bin/biology/bnb_s.pl?english=1&query=xml it appears we are a bit stuck on what is cool, working, and standard and what is not (see for example the comparison at http://www.cse.ucsc.edu/~douglas/proximl/proximl.html).

Finally, we are wondering how to organise our packages to help users who would like to use BioJava along with JSci.

Let us know.

__________________________________________________________________________________________
Silvere Martin-Michiellot builds the Internet future on www.digitalbiosphere.com
__________________________________________________________________________________________

From matthew_pocock at yahoo.co.uk Mon Oct 6 10:34:06 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Mon Oct 6 10:37:22 2003
Subject: [Biojava-dev] sorry
Message-ID: <3F817D5E.4040109@yahoo.co.uk>

Hi,

I've checked in the initial ontology reasoning code. It doesn't work, but it also doesn't blow up.

It's more than likely that this commit has done bad things to cvs & the build process. Can people help sort this out? Symptoms would be great.

Matthew

From kdj at sanger.ac.uk Mon Oct 6 11:58:18 2003
From: kdj at sanger.ac.uk (Keith James)
Date: Mon Oct 6 11:58:19 2003
Subject: [Biojava-dev] sorry
In-Reply-To: <3F817D5E.4040109@yahoo.co.uk>
References: <3F817D5E.4040109@yahoo.co.uk>
Message-ID:

>>>>> "Matthew" == Matthew Pocock writes:

    Matthew> Hi, I've checked in the initial ontology reasoning
    Matthew> code. It doesn't work, but it also doesn't blow up.

    Matthew> It's more than likely that this commit has done bad
    Matthew> things to cvs & the build process. Can people help sort
    Matthew> this out? Symptoms would be great.

Mate, you've forgotten to check in org.biojava.ontology.format.*
cvs update -dP only gives me an io subdir under ontology:

compile-biojava:
    [javac] Compiling 1099 source files to /hgs2/team65/kdj/dev/biojava-live/ant-build/classes/biojava
    [javac] depend attribute is not supported by the modern compiler
    [javac] /hgs2/team65/kdj/dev/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:9: package org.biojava.ontology.format.triples.lexer does not exist
    [javac] import org.biojava.ontology.format.triples.lexer.Lexer;
    [javac]                                                 ^
    [javac] /hgs2/team65/kdj/dev/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:10: package org.biojava.ontology.format.triples.lexer does not exist
    [javac] import org.biojava.ontology.format.triples.lexer.LexerException;

-- 
- Keith James Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From mark.schreiber at agresearch.co.nz Mon Oct 6 21:46:44 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Mon Oct 6 21:45:02 2003
Subject: [Biojava-dev] AlphabetManager problem?
Message-ID:

OK -

The serialization problem is caused by a readResolve method using a call to AlphabetManager.symbolForName(String name) which works fine for DNA and RNA but apparently barfs on Protein?

I think the problem is caused in the AlphabetManager.xml file where the DNA/RNA Symbols are declared outside of the tag but they are declared inside the tag for the protein alphabets. I'm not brave enough to mess with that file myself (it may not even be the cause of the problem).

The bug can be replicated with the following code which I was going to add to AlphabetManager.test but CVS appears to be screwed up just now:

  public void testSymbolForName(){
    // DNA: every symbol should resolve back to the same instance by name
    FiniteAlphabet alpha = DNATools.getDNA();
    for( int i = 0; i < alpha.size(); i++){
      Symbol s = DNATools.forIndex(i);
      Symbol test = AlphabetManager.symbolForName(s.getName());
      assertTrue(s == test);
    }

    // RNA: same check
    alpha = RNATools.getRNA();
    for( int i = 0; i < alpha.size(); i++){
      Symbol s = RNATools.forIndex(i);
      Symbol test = AlphabetManager.symbolForName(s.getName());
      assertTrue(s == test);
    }

    // Protein: this is the case that fails
    alpha = ProteinTools.getAlphabet();
    AlphabetIndex ind = AlphabetManager.getAlphabetIndex(alpha);
    for( int i = 0; i < alpha.size(); i++){
      Symbol s = ind.symbolForIndex(i);
      Symbol test = AlphabetManager.symbolForName(s.getName());
      assertTrue(s == test);
    }
  }

Any ideas on how to fix this?

- Mark

-----Original Message-----
From: Schreiber, Mark
Sent: Tue 7/10/2003 1:01 p.m.
To: msouthern@exsar.com; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception -AlphabetManagerproblem?

Hi -

I am not getting the exception you have below (which looks like one from before we fixed the EmblFileFormer). I am however getting an InvalidObjectException which is coming from something odd in the AlphabetManager. I'll have a look into it.

java.io.InvalidObjectException: Couldn't resolve symbol:MET
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 7/10/2003 3:50 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi Mark,

I have downloaded and tested the latest EmblFileFormer.java (1.24.2.1) and it can now successfully write out a swissprot format file after first having read it in (Thank you).

However, I am still seeing an exception when attempting to read in a serialized sequence. Test code and exception below.

Best regards,
Mark.

//------------------------------------------
public static void main(String[] args) throws Exception{
    File file = new File("c:\\temp\\sequence.ser");
    String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";

    SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
        SeqIOConstants.SWISSPROT, new BufferedReader( new FileReader( seqFile ) ) );
    Sequence seq = iter.nextSequence();

    // this now works
    //SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq );

    ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) );
    out.writeObject( seq );
    out.flush();
    out.close();

    ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) );
    // still get an error deserializing
    seq = (Sequence) in.readObject();
    in.close();
}
//------------------------------------------

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:36)

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Wednesday, October 01, 2003 1:54 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

OK -

I tracked it down to a bug in the EMBLFileFormer (which gets coopted for SwissProt writing). It assumed a DNA alphabet and therefore couldn't write protein in SwissProt format.

I have checked the fix into CVS, and I will port it back to the 1.3 branch of CVS shortly.

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Wed 1/10/2003 1:25 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi Mark,

I did also have an error with binary serialization. I was just trying to approach the problem from a different direction. Because that was also a problem with finding / determining a protein symbol, I wondered if it was coming from AlphabetManager rather than the Swissprot writing.
I include below the code fragment along with the serialization error.

Best regards,
Mark.

public static void main(String[] args) throws Exception{
    File file = new File("c:\\temp\\sequence.ser");
    String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";

    SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
        SeqIOConstants.SWISSPROT, new BufferedReader( new FileReader( seqFile ) ) );
    Sequence seq = iter.nextSequence();

    System.out.println("\nWriting Sequence object");
    ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) );
    out.writeObject( seq );
    out.flush();
    out.close();

    System.out.println("\nReading Sequence object");
    ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) );
    seq = (Sequence) in.readObject();
    in.close();
}

Writing Sequence object

Reading Sequence object
java.io.InvalidObjectException: Couldn't resolve symbol:ALA
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:38)
Exception in thread "main"

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Tuesday, September 30, 2003 6:28 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi -

I was a bit thrown off at first because I thought you meant there was an error in binary serialization. There seems to be a problem with SwissProt writing. I've committed an addition to SeqIOToolsTest in biojava live that replicates the error but I haven't got time to track it down just yet. If someone else doesn't get it I'll probably find it tomorrow.

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 30/09/2003 10:45 a.m.
To: biojava-l@biojava.org
Cc:
Subject: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Apologies for following up on my own post. What follows is a simpler test than the serialization I attempted before.
Consider the bit of code below and the corresponding error message. For some reason, the protein sequence is being treated as a DNA sequence. Is there something I am missing with respect to how AlphabetManager treats DNA and protein alphabets?

Any explanations would be most welcome.

Thanks again,
Mark.

//------------------------------------------------------------------------
public static void main(String[] args) throws Exception{
    String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";

    SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
        SeqIOConstants.SWISSPROT, new BufferedReader( new FileReader( seqFile ) ) );
    Sequence seq = iter.nextSequence();

    SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq );
}

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:24)

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Monday, September 29, 2003 2:01 PM
Cc: 'biojava-l@biojava.org'
Subject: Sequence serialization exception

I am getting the following exception when trying to serialize a protein sequence. I am using biojava 1.3.

Can anyone please explain to me why?

Many thanks,
Mark.

java.io.InvalidObjectException: Couldn't resolve symbol:SER
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1441)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.ViewSequence.readObject(ViewSequence.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at java.util.HashMap.readObject(HashMap.java:985)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at java.util.HashMap.readObject(HashMap.java:986)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.hdex.model.calc.Test.main(Test.java:104)

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From matthew_pocock at yahoo.co.uk Tue Oct 7 05:16:58 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Tue Oct 7 05:20:25 2003
Subject: [Biojava-dev] sorry
In-Reply-To:
References: <3F817D5E.4040109@yahoo.co.uk>
Message-ID: <3F82848A.10701@yahoo.co.uk>

Ah. Could everyone go to www.sablecc.org/home and download sablecc (3.*) - drop sablecc.jar into your ant/lib directory and try building again.

Matthew

Keith James wrote:

>>>>>>"Matthew" == Matthew Pocock writes:
>>>>>>
>>>>>>
>
>    Matthew> Hi, I've checked in the initial ontology reasoning
>    Matthew> code. It doesn't work, but it also doesn't blow up.
>
>    Matthew> It's more than likely that this commit has done bad
>    Matthew> things to cvs & the build process. Can people help sort
>    Matthew> this out? Symptoms would be great.
>
>Mate, you've forgotten to check in org.biojava.ontology.format.*
>cvs update -dP only gives me an io subdir under ontology:
>
>compile-biojava:
>    [javac] Compiling 1099 source files to /hgs2/team65/kdj/dev/biojava-live/ant-build/classes/biojava
>    [javac] depend attribute is not supported by the modern compiler
>    [javac] /hgs2/team65/kdj/dev/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:9: package org.biojava.ontology.format.triples.lexer does not exist
>    [javac] import org.biojava.ontology.format.triples.lexer.Lexer;
>    [javac]                                                 ^
>    [javac] /hgs2/team65/kdj/dev/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:10: package org.biojava.ontology.format.triples.lexer does not exist
>    [javac] import org.biojava.ontology.format.triples.lexer.LexerException;
>
>
>

From td2 at sanger.ac.uk Tue Oct 7 05:33:01 2003
From: td2 at sanger.ac.uk (Thomas Down)
Date: Tue Oct 7 05:30:51 2003
Subject: [Biojava-dev] Re: AlphabetManager problem?
In-Reply-To:
References:
Message-ID: <20031007093300.GA451938@jabba.sanger.ac.uk>

On Tue, Oct 07, 2003 at 02:46:44PM +1300, Schreiber, Mark wrote:
> OK -
>
> The serialization problem is caused by a readResolve method using a call to AlphabetManager.symbolForName(String name) which works fine for DNA and RNA but apparently barfs on Protein?
>
> I think the problem is caused in the AlphabetManager.xml file where the DNA/RNA Symbols are declared outside of the tag but they are declared inside the tag for the protein alphabets. I'm not brave enough to mess with that file myself (it may not even be the cause of the problem).

This has been a persistent problem. AlphabetManager changed radically several times in the 1.3 timeframe to try and sort this out, but we never quite hit a solution that worked all the time.

My personal favourite (and the one which came closest to working) was to keep *all* well-known symbols scoped by alphabet. This got serialization working perfectly, but was shot down on the basis that a symbol in the `pure' PROTEIN alphabet was not identical to the corresponding symbol in the PROTEIN+TERMINATION alphabet. Arguably, the problem here is the existence of PROTEIN-TERM, but it's probably too late to change that...

How about we give all well known alphabets and symbols totally unambiguous textual identifiers (perhaps LSIDs scoped in the open-bio.org namespace) which are specified explicitly in AlphabetManager.xml. Then we just need one LSID->alphabet and one LSID->symbol map in AlphabetManager, and resolve everything through that.

The LSIDs can be stored in the Annotation objects which are already attached to alphabets and symbols (but not used for very much).

Serialization/deserialization code can go on the standard implementations of Symbol and Alphabet. I think we can probably ditch the magic implementations for the well-known case.

The serialization support code throws an exception if it can't find an LSID in the appropriate place in the symbol/alphabet. This means it ought to be possible to have user-created alphabets which serialize sensibly, without things getting messy -- just specify LSIDs and it will serialize safely. No LSIDs == error on serialization.

Does this sound doable? What's going to break if we try this?

Thomas.

From len at reeltwo.com Tue Oct 7 23:18:25 2003
From: len at reeltwo.com (Len Trigg)
Date: Tue Oct 7 23:16:09 2003
Subject: [Biojava-dev] ProcessTools / ExecRunner
In-Reply-To: <20031003094502.GA426617@jabba.sanger.ac.uk>
References: <20031003094502.GA426617@jabba.sanger.ac.uk>
Message-ID:

Thomas Down wrote:
> On Fri, Oct 03, 2003 at 01:43:50PM +1200, Len Trigg wrote:
> > Also, he switched to using ProcessTools, and has noticed that
> > occasionally a job will just hang (I'm guessing there might be a bug
> > in the way ProcessTools extracts the stdout/stderr of the process). If
> > it doesn't ring a bell with anyone, I'll look into it.
> > Have you got any examples of this? Not at the moment, but we'll see if we can reproduce it reliably. I think our code is using ExecRunner at the moment, but we'll convert the troublesome application to ProcessTools (which seems to make more efficient use of the CPU than ExecRunner). We are using jdk 1.4.2, so it doesn't seem to be version related. One minor thing, should ProcessTools.java:518 be calling close() on the writer? I work by the general rule that the code responsible for opening a stream should also be responsible for closing it. Cheers, Len. From jfmadu01 at tiscali.co.uk Wed Oct 8 13:14:28 2003 From: jfmadu01 at tiscali.co.uk (Johnson Frank Madu) Date: Wed Oct 8 13:12:07 2003 Subject: [Biojava-dev] BE THE NEXT OF KIN TO LATE Mr. Richard Burson Message-ID: <200306210147.h5L1lNBa018763@gawab.com>" Dear Sir, My name is Barrister Johnson Frank Madu. I am a Nigerian a country in West Africa. I am a lawyer by profession with office address at 210 Broad Street, Lagos Island. Lagos. Please forgive my indignation if this message comes to you as surprise or if writing you through this channel offends you without your prior consent. Please, I have a problem at hand now which I will appreciate your advise, comments and decision. The problem is that there is one foreigner who lived in my country long ago. His name is Mr.Richard Burson. He is from United States of America. Richard Burson was one of my clients. In short, I was Mr. Raymond Beck's personal attorney who handles all his legal matters in Nigeria. Regrettable, Mr. Richard Burson and his family were among the victims of EGYPT AIR BOEING 767 FLIGHT NO.990 that Crashed in USA. Http://www.greatdreams.com/PassEAir990.htm In the year 2000 immediately after the sad news, the decease (Late Richard Burson)'s bank in Nigeria here wrote me to come forward with the decease relatives so that they (the bank) will release the decease balance with the bank to the relatives. 
From gtkacik at Princeton.EDU Wed Oct 8 23:49:09 2003
From: gtkacik at Princeton.EDU (Gasper Tkacik)
Date: Wed Oct 8 23:46:42 2003
Subject: [Biojava-dev] EMBL access
Message-ID: <3F84DAB5.9030508@princeton.edu>

Hi everybody! I am new to biojava, so this may be a trivial question from a beginner. Using the following code snippet:

    public static void main(String args[]) {
        try {
            Registry reg = SystemRegistry.instance();
            SequenceDBLite sdb = reg.getDatabase("embl");
            Sequence ecoli = sdb.getSequence("U00096");
            SeqIOTools.writeEmbl(System.out, ecoli);
        } catch (Exception e) {
            System.out.println(e);
        }
    }

I wanted to get the whole E.coli genome from EMBL database. However, it turns out that I get only the first out of 400 segments of the DNA. I tried to search the web for advice but could find nothing useful. Any help or at least a reference to where I should look? Thank you all in advance, Gasper.

From matthew_pocock at yahoo.co.uk Thu Oct 9 14:44:59 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Oct 9 14:48:56 2003
Subject: [Biojava-dev] reasoning
Message-ID: <3F85ACAB.3060106@yahoo.co.uk>

Hi, I have checked in new ontology code. It doesn't crash, but doesn't entirely work yet. Please add sablecc 3.* to your ant/lib directory to enable the new ontology parser code to be generated.

Good points:
* much more sane core ontology with a workable type-system - strictly typed higher-order predicate logic!
* a fairly nice text format for writing things down - I have yet to find anything that can't be represented
* the reasoner can execute proofs that require variable substitution e.g.
axiom: instance_of(_x, ANY)
proposition: instance_of(dog, ANY)
proof: instance_of(_x, ANY)[_x = dog]
proposition: sub_type_of(_x, mammal)
proof: sub_type_of(_x, mammal)[_x = dog] ; axiom(dog, mammal)

Bad points:
* I haven't been able to get it to find proofs for things that require an inference step without manually pruning the search space e.g.
axiom: sub_type_of(dog, mammal)
axiom: sub_type_of(mammal, animal)
axiom: implies( and(sub_type_of(_x, _y), sub_type_of(_y, _z)), sub_type_of(_x, _z))
proposition: sub_type_of(dog, animal)
Well, it can do this, but it gets stuck in an infinite recursion before it gets around to using the right variable substitutions.
* It runs slowly - there is loads of scope for naive performance tuning, and once inference works properly, it should be generating its own interpreter in bytecode, but not today.

The code is in CVS, so if you are interested in any of this then please get involved. I am wondering which bits will survive - certainly, some of the code needs gutting for functional and aesthetic reasons. However, I'm not even sure if this is the correct approach.

Matthew

From itbuy at tom.com Fri Oct 10 07:32:47 2003
From: itbuy at tom.com (=?GB2312?B?zsLW3bary7PA8ca3udLA+g==?=)
Date: Fri Oct 10 07:30:05 2003
Subject: [Biojava-dev] (no subject)
Message-ID: <200310101130.h9ABTsdc030246@portal.open-bio.org>

From ahmed at arbornet.org Sat Oct 11 16:19:08 2003
From: ahmed at arbornet.org (Ahmed Moustafa)
Date: Sat Oct 11 20:09:28 2003
Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm
Message-ID: <3F8865BC.1020807@arbornet.org>

Hello, I am working on a Java package of implementations of sequence alignment algorithms. I have released an implementation of Smith-Waterman algorithm with Gotoh's improvement. The time complexity is O(n^2) and the space complexity is O(m * n + n).

The package name is JAligner and it is hosted at sourceforge.net. There is a front-end demo using Swing and Java Web Start.
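
As a rough illustration of the algorithm named above (and not JAligner's actual code), the following is a minimal sketch of Smith-Waterman local alignment scoring with Gotoh's affine-gap recurrences. The class name, the scoring constants and the plain character-equality scoring are assumptions made for this example only; a real implementation would normally score substitutions from a matrix such as BLOSUM or PAM and would also trace back the alignment itself.

```java
class LocalAlignSketch {
    // Hypothetical scoring parameters, chosen only for illustration.
    static final int MATCH = 2;
    static final int MISMATCH = -1;
    static final int GAP_OPEN = 3;   // cost of the first position of a gap
    static final int GAP_EXTEND = 1; // cost of each further gap position

    /** Returns the best local alignment score of a and b with affine gap costs. */
    static int score(String a, String b) {
        int m = a.length();
        int n = b.length();
        int neg = Integer.MIN_VALUE / 2; // "minus infinity" that survives a subtraction

        int[][] h = new int[m + 1][n + 1]; // best score of an alignment ending at (i, j)
        int[][] e = new int[m + 1][n + 1]; // ... ending with a character of b against a gap
        int[][] f = new int[m + 1][n + 1]; // ... ending with a character of a against a gap

        // A gap cannot already be open on the borders of the tables.
        for (int i = 0; i <= m; i++) {
            e[i][0] = neg;
        }
        for (int j = 0; j <= n; j++) {
            f[0][j] = neg;
        }

        int best = 0;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                // Either extend an existing gap or open a new one.
                e[i][j] = Math.max(e[i][j - 1] - GAP_EXTEND, h[i][j - 1] - GAP_OPEN);
                f[i][j] = Math.max(f[i - 1][j] - GAP_EXTEND, h[i - 1][j] - GAP_OPEN);

                int sub = (a.charAt(i - 1) == b.charAt(j - 1)) ? MATCH : MISMATCH;
                int diag = h[i - 1][j - 1] + sub;

                // Local alignment: a score never drops below zero.
                h[i][j] = Math.max(0, Math.max(diag, Math.max(e[i][j], f[i][j])));
                best = Math.max(best, h[i][j]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(score("ACGTACGT", "ACGACGT"));
    }
}
```

This sketch fills three (m + 1) x (n + 1) tables, so its time and memory use both grow with m * n. Keeping the gap tables separate from the main table is what lets a gap of any length be charged one opening cost plus a per-position extension cost without re-scanning earlier cells.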
Could JAligner be incorporated into the BioJava project? Best Regards, Ahmed From mark.schreiber at agresearch.co.nz Sun Oct 12 01:44:11 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Oct 12 01:41:47 2003 Subject: [Biojava-dev] reasoning Message-ID: Matthew - Would this kind of thing be useful for text (natural language) processing? If so would it be possible for you to make a toy example to serve as a starting point for lesser mortals? - Mark -----Original Message----- From: Matthew Pocock [mailto:matthew_pocock@yahoo.co.uk] Sent: Fri 10/10/2003 7:44 a.m. To: BioJava Dev List Cc: Subject: [Biojava-dev] reasoning Hi, I have checked in new ontology code. It doesn't crash, but doesn't entirly work yet. Please add sablecc 3.* to your ant/lib directory to enable the new ontology parser code to be generated. Good points: * much more sane core ontology with a workable type-system - strictly typed higher-order predicate logic! * a fairly nice text format for writing things down - I have yet to find anything that can't be represented * the reasoner can execute proofs that require variable substitution e.g. axiom: instance_of(_x, ANY) proposition: instance_of(dog, ANY) proof: instance_of(_x, ANY)[_x = dog] proposition: sub_type_of(_x, mammal) proof: sub_type_of(_x, mammal)[_x = dog] ; axiom(dog, mammal) Bad points: * I haven't been able to get it to find proofs for things that require an inference step without manually pruning the search space e.g. axiom: sub_type_of(dog, mammal) axiom: sub_type_of(mammal, animal) axiom: implies( and(sub_type_of(_x, _y), sub_type_of(_y, _z)), sub_type_of(_x, _z)) proposition: sub_type_of(dog, animal) Well, it can do this, but it gets stuck in an infinite recursion before it gets arround to using the right variable substitutions. * It runs slowly - there is loads of scope for naive performance tuning, and once inference works propperly, it should be generating it's own interpreter in bytecode, but not today. 
The code is in CVS, so if you are interested in any of this then please get involved. I am wondering which bits will survive - certainly, some of the code needs gutting for functional and aesthetic reasons. However, I'm not even sure if this is the correct approach. Matthew _______________________________________________ biojava-dev mailing list biojava-dev@biojava.org http://biojava.org/mailman/listinfo/biojava-dev ======================================================================= Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately. ======================================================================= From mark.schreiber at agresearch.co.nz Sun Oct 12 01:47:06 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Oct 12 01:44:44 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm Message-ID: Hi - We have traditionally done pairwise alignments using HMMs. However there have been numerous requests for an implementation of Smith-Waterman. If you want some help coverting your classes to biojava give us a yell on the list. - Mark -----Original Message----- From: Ahmed Moustafa [mailto:ahmed@arbornet.org] Sent: Sun 12/10/2003 9:19 a.m. To: biojava-dev@biojava.org Cc: Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm Hello, I am working on a Java package of implementations of sequence alignment algorithms. I have released an implementation of Smith-Waterman algorithm with Gotoh's improvement. 
The time complexity is O(n2) and the space complexity is O(m * n + n) . The package name is JAligner and it is hosted at sourceforge.net . There is a front-end demo using Swing and Java Web Start. Could JAligner be incorporated into the BioJava project? Best Regards, Ahmed From ahmed at arbornet.org Sun Oct 12 02:06:06 2003 From: ahmed at arbornet.org (Ahmed Moustafa) Date: Sun Oct 12 02:02:34 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm In-Reply-To: Message-ID: <20031012020452.S4687-100000@m-net.arbornet.org> Hi Mark, I believe the current API is reusable. Is it necessary to convert the already existing implementation? Anyway, how can I convert my classes to biojava? Thanks! Ahmed On Sun, 12 Oct 2003, Schreiber, Mark wrote: > Hi - > > We have traditionally done pairwise alignments using HMMs. However there have been numerous requests for an implementation of Smith-Waterman. If you want some help coverting your classes to biojava give us a yell on the list. > > - Mark > > > -----Original Message----- > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > Sent: Sun 12/10/2003 9:19 a.m.
> To: biojava-dev@biojava.org > Cc: > Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm > > > > Hello, > > I am working on a Java package of implementations of sequence alignment > algorithms. I have released an implementation of Smith-Waterman > algorithm with Gotoh's improvement. The time complexity is O(n2) and the > space complexity is O(m * n + n) . > > The package name is JAligner and it is hosted at sourceforge.net > . There is a front-end demo using Swing > and Java Web Start. > > Could JAligner be incorporated into the BioJava project? > > Best Regards, > > Ahmed From mark.schreiber at agresearch.co.nz Sun Oct 12 16:55:42 2003 From: mark.schreiber at agresearch.co.nz (Schreiber, Mark) Date: Sun Oct 12 16:53:33 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm Message-ID: Hi - Probably just convert to use BioJava Symbol and SymbolList objects. Probably everything else is OK. - Mark > -----Original Message----- > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > Sent: Sunday, 12 October 2003 7:06 p.m. > To: Schreiber, Mark > Cc: biojava-dev@biojava.org > Subject: RE: [Biojava-dev] Java implementation of > Smith-Waterman algorithm > > > Hi Mark, > > I believe the current API is reusable. Is it necessary to > convert the already existing implementation? > > Anyway, how can I convert my classes to biojava? > > Thanks! > > Ahmed > > > On Sun, 12 Oct 2003, Schreiber, Mark wrote: > > > Hi - > > > > We have traditionally done pairwise alignments using HMMs. However > > there have been numerous requests for an implementation of > > Smith-Waterman. If you want some help coverting your classes to > > biojava give us a yell on the list. > > > > - Mark > > > > > > -----Original Message----- > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > Sent: Sun 12/10/2003 9:19 a.m. 
> > To: biojava-dev@biojava.org > > Cc: > > Subject: [Biojava-dev] Java implementation of Smith-Waterman > > algorithm > > > > > > > > Hello, > > > > I am working on a Java package of implementations of > sequence alignment > > algorithms. I have released an implementation of Smith-Waterman > > algorithm with Gotoh's improvement. The time complexity > is O(n2) and the > > space complexity is O(m * n + n) . > > > > The package name is JAligner and it is hosted at sourceforge.net > > . There is a front-end > demo using Swing > > and Java Web Start. > > > > Could JAligner be incorporated into the BioJava project? > > > > Best Regards, > > > > Ahmed > > From gtkacik at Princeton.EDU Sun Oct 12 22:45:49 2003 From: gtkacik at Princeton.EDU (Gasper Tkacik) Date: Sun Oct 12 22:43:24 2003 Subject: [Biojava-dev] Registry Message-ID: <3F8A11DD.4090409@princeton.edu> Hi everybody!
I am using the Registry mechanism to access the flat file database on my system, using the following code snippet:

    Registry reg = SystemRegistry.instance();
    System.out.println(reg.getRegistryConfiguration().getConfigLocator());
    SequenceDBLite sdb = reg.getDatabase(dbName);
    seq = sdb.getSequence(sequenceID);

Having installed the following file in the proper location:

    VERSION=1.00

    [utumno]
    protocol=flat
    location=d:\Documents\Princeton\Biophysics\Genomes\index

I was getting exceptions. Going into the source with debugger I found out that the root cause is org.biojava.utils.Services, that tries to load a resource with all DBProviders from META-INF/services directory in the jar. However, in the distribution on www.biojava.org, version 1.3 for Java 1.4, the resource only lists THREE ServiceDB providers, and FlatDBProvider is not included. After I included it by hand, things work fine. This is probably a bug?

Best regards, Gasper.

From matthew_pocock at yahoo.co.uk Mon Oct 13 04:33:48 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Mon Oct 13 04:39:27 2003
Subject: [Biojava-dev] reasoning
In-Reply-To: References:
Message-ID: <3F8A636C.8080405@yahoo.co.uk>

Hi Mark, Yes - given a parser that can extract some data from text (e.g. parse it into np, vp trees) then this can be represented using the ontology APIs and then reasoned over, letting us do semi-smart mining of text. As soon as the reasoner is behaving more sanely (not getting stuck in infinite recursions), I will submit some examples for all of you to play with.

Matthew

Schreiber, Mark wrote: >Matthew - > >Would this kind of thing be useful for text (natural language) processing? If so would it be possible for you to make a toy example to serve as a starting point for lesser mortals?
> >- Mark > > From ahmed at arbornet.org Tue Oct 14 01:45:08 2003 From: ahmed at arbornet.org (Ahmed Moustafa) Date: Tue Oct 14 01:41:21 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm In-Reply-To: Message-ID: <20031014014357.C98765-100000@m-net.arbornet.org> Hi, What are the Biojava's objects for the scoring matrices e.g. BLOSUMs and PAMs? Thanks, Ahmed On Mon, 13 Oct 2003, Schreiber, Mark wrote: > Hi - > > Probably just convert to use BioJava Symbol and SymbolList objects. Probably everything else is OK. > > - Mark > > > > -----Original Message----- > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > Sent: Sunday, 12 October 2003 7:06 p.m. > > To: Schreiber, Mark > > Cc: biojava-dev@biojava.org > > Subject: RE: [Biojava-dev] Java implementation of > > Smith-Waterman algorithm > > > > > > Hi Mark, > > > > I believe the current API is reusable. Is it necessary to > > convert the already existing implementation? > > > > Anyway, how can I convert my classes to biojava? > > > > Thanks! > > > > Ahmed > > > > > > On Sun, 12 Oct 2003, Schreiber, Mark wrote: > > > > > Hi - > > > > > > We have traditionally done pairwise alignments using HMMs. However > > > there have been numerous requests for an implementation of > > > Smith-Waterman. If you want some help coverting your classes to > > > biojava give us a yell on the list. > > > > > > - Mark > > > > > > > > > -----Original Message----- > > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > > Sent: Sun 12/10/2003 9:19 a.m. > > > To: biojava-dev@biojava.org > > > Cc: > > > Subject: [Biojava-dev] Java implementation of Smith-Waterman > > > algorithm > > > > > > > > > > > > Hello, > > > > > > I am working on a Java package of implementations of > > sequence alignment > > > algorithms. I have released an implementation of Smith-Waterman > > > algorithm with Gotoh's improvement. The time complexity > > is O(n2) and the > > > space complexity is O(m * n + n) . 
> > > > > > The package name is JAligner and it is hosted at sourceforge.net > > > . There is a front-end > > demo using Swing > > > and Java Web Start. > > > > > > Could JAligner be incorporated into the BioJava project? > > > > > > Best Regards, > > > > > > Ahmed > > > > >

From mark.schreiber at agresearch.co.nz Tue Oct 14 05:18:12 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Tue Oct 14 05:15:51 2003
Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm
Message-ID:

Hi - Because we have traditionally used HMMs there were no scoring matrices per se, just transition and emission probabilities. A scoring matrix would be easy to achieve. I would suggest you start with an interface called ScoringMatrix that would have as a minimum methods like:

    String getName();
which would return Blosum60 or similar depending on the implementation.

    int getScore(Symbol s, Symbol substitute)
which returns the int score for the substitution.

You might want to also make a ScoringMatrixFactory that can generate appropriate instances of the ScoringMatrix from an XML file or files that could be included in the Distribution.

- Mark

-----Original Message----- From: Ahmed Moustafa [mailto:ahmed@arbornet.org] Sent: Tue 14/10/2003 6:45 p.m.
To: Schreiber, Mark Cc: biojava-dev@biojava.org Subject: RE: [Biojava-dev] Java implementation of Smith-Waterman algorithm Hi, What are the Biojava's objects for the scoring matrices e.g. BLOSUMs and PAMs? Thanks, Ahmed On Mon, 13 Oct 2003, Schreiber, Mark wrote: > Hi - > > Probably just convert to use BioJava Symbol and SymbolList objects. Probably everything else is OK. > > - Mark > > > > -----Original Message----- > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > Sent: Sunday, 12 October 2003 7:06 p.m. > > To: Schreiber, Mark > > Cc: biojava-dev@biojava.org > > Subject: RE: [Biojava-dev] Java implementation of > > Smith-Waterman algorithm > > > > > > Hi Mark, > > > > I believe the current API is reusable. Is it necessary to > > convert the already existing implementation? > > > > Anyway, how can I convert my classes to biojava? > > > > Thanks! > > > > Ahmed > > > > > > On Sun, 12 Oct 2003, Schreiber, Mark wrote: > > > > > Hi - > > > > > > We have traditionally done pairwise alignments using HMMs. However > > > there have been numerous requests for an implementation of > > > Smith-Waterman. If you want some help coverting your classes to > > > biojava give us a yell on the list. > > > > > > - Mark > > > > > > > > > -----Original Message----- > > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > > Sent: Sun 12/10/2003 9:19 a.m. > > > To: biojava-dev@biojava.org > > > Cc: > > > Subject: [Biojava-dev] Java implementation of Smith-Waterman > > > algorithm > > > > > > > > > > > > Hello, > > > > > > I am working on a Java package of implementations of > > sequence alignment > > > algorithms. I have released an implementation of Smith-Waterman > > > algorithm with Gotoh's improvement. The time complexity > > is O(n2) and the > > > space complexity is O(m * n + n) . > > > > > > The package name is JAligner and it is hosted at sourceforge.net > > > . There is a front-end > > demo using Swing > > > and Java Web Start. 
> > > > > > Could JAligner be incorporated into the BioJava project? > > > > > > Best Regards, > > > > > > Ahmed > > > > > From david.huen at ntlworld.com Tue Oct 14 17:20:41 2003 From: david.huen at ntlworld.com (David Huen) Date: Tue Oct 14 17:18:09 2003 Subject: [Biojava-dev] CVS is buggered... In-Reply-To: References: Message-ID: <200310142220.41610.david.huen@ntlworld.com> Dear Matthew, I can't compile CVS after doing an update - I end up with errors. From the looks of it, your sablecc is failing.
Regards, David H Here's the log:- Buildfile: build.xml init: [echo] JUnit present: true [echo] JUnit supported by Ant: true [echo] SableCC supported by Ant: true prepare: prepare-biojava: prepare-grammars: [mkdir] Created dir: /home/davidh/biocvs/biojava-live/ant-build/src/grammars [mkdir] Created dir: /home/davidh/biocvs/biojava-live/ant-build/src/grammars_java [mkdir] Created dir: /home/davidh/biocvs/biojava-live/ant-build/classes/grammars [mkdir] Created dir: /home/davidh/biocvs/biojava-live/ant-build/docs/grammars [copy] Copying 1 file to /home/davidh/biocvs/biojava-live/ant-build/src/grammars [copy] Copying 1 file to /home/davidh/biocvs/biojava-live/ant-build/src/grammars compile-grammars: Compiling with SableCC 1 source grammar files to /home/davidh/biocvs/biojava-live/ant-build/src/grammars_java [sablecc] -- Generating parser for ontology.grammar in /home/davidh/biocvs/biojava-live/ant-build/src/grammars_java [sablecc] Adding productions and alternative of section AST. [sablecc] Verifying identifiers. [sablecc] Adding productions and alternative transformation if necessary. [sablecc] Verifying ast identifiers. [sablecc] computing alternative symbol table identifiers. [sablecc] Verifying production transform identifiers. [sablecc] Verifying ast alternatives transform identifiers. [sablecc] Generating token classes. [sablecc] Generating production classes. [sablecc] Generating alternative classes. [sablecc] Generating analysis classes. [sablecc] Generating utility classes. [sablecc] Generating the lexer. [sablecc] State: INITIAL [sablecc] - Constructing NFA. [sablecc] ................................................................................. [sablecc] - Constructing DFA. [sablecc] ................................................................................................................................................. [sablecc] ...................................................... [sablecc] - resolving ACCEPT states. 
[sablecc] Generating the parser. [sablecc] ...................................................... [sablecc] ...................................................... [sablecc] ...................................................... [sablecc] .. [sablecc] .......... [sablecc] reduce/reduce conflict in state [stack: PNamespaceDecl TVariable *] on TComment in { [sablecc] [ PPredicate = TVariable * ] followed by TComment (reduce), [sablecc] [ PValue = TVariable * ] followed by TComment (reduce) [sablecc] } SableCC failed. reduce/reduce conflict in state [stack: PNamespaceDecl TVariable *] on TComment in { [ PPredicate = TVariable * ] followed by TComment (reduce), [ PValue = TVariable * ] followed by TComment (reduce) } [javac] Compiling 70 source files to /home/davidh/biocvs/biojava-live/ant-build/classes/grammars [copy] Copying 71 files to /home/davidh/biocvs/biojava-live/ant-build/classes/grammars [copy] Copied 1 empty directory to /home/davidh/biocvs/biojava-live/ant-build/classes/grammars package-grammars: [jar] Building jar: /home/davidh/biocvs/biojava-live/ant-build/grammars.jar compile-biojava: [javac] Compiling 1099 source files to /home/davidh/biocvs/biojava-live/ant-build/classes/biojava [javac] /home/davidh/biocvs/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:11: package org.biojava.ontology.format.triples.parser does not exist [javac] import org.biojava.ontology.format.triples.parser.Parser; [javac] ^ [javac] /home/davidh/biocvs/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:12: package org.biojava.ontology.format.triples.parser does not exist [javac] import org.biojava.ontology.format.triples.parser.ParserException; [javac] ^ [javac] /home/davidh/biocvs/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:29: cannot resolve symbol [javac] symbol : class ParserException [javac] location: class org.biojava.ontology.io.TriplesParser [javac] throws IOException, LexerException, 
ParserException {
[javac] ^
[javac] /home/davidh/biocvs/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:31: cannot resolve symbol
[javac] symbol : class Parser
[javac] location: class org.biojava.ontology.io.TriplesParser
[javac] Parser parser = new Parser(lexer);
[javac] ^
[javac] /home/davidh/biocvs/biojava-live/ant-build/src/biojava/org/biojava/ontology/io/TriplesParser.java:31: cannot resolve symbol
[javac] symbol : class Parser
[javac] location: class org.biojava.ontology.io.TriplesParser
[javac] Parser parser = new Parser(lexer);
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -deprecation for details.
[javac] 5 errors

BUILD FAILED
file:/home/davidh/biocvs/biojava-live/build.xml:431: Compile failed; see the compiler error output for details.

From msouthern at exsar.com Wed Oct 15 09:11:48 2003
From: msouthern at exsar.com (Mark Southern)
Date: Wed Oct 15 09:09:14 2003
Subject: [Biojava-dev] Feature at position 0
In-Reply-To:
Message-ID: <2C879FB52902524C85E8B616CF276F1A1CC45F@cartasrv.carta.local>

This is a real world example. Swissprot ID KAP3_RAT (AC P12369) is an example of a sequence with a feature entry at sequence location 0:

FT INIT_MET 0 0 BY SIMILARITY.

This raises an IllegalArgumentException when the sequence is read in via SeqIOTools.fileToBiojava (see below). I don't know how often this would come up but it's definitely a situation that isn't handled at the moment. Thoughts anyone?

Mark.
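
The range check that rejects this entry can be illustrated with a small stand-alone sketch. This is not BioJava code: the class and method names are hypothetical, and the only thing it assumes is what the exception message in the stack trace states, namely that a feature location must lie inside 1..415 for this sequence.

```java
class FeatureBoundsSketch {
    /**
     * Hypothetical helper: true if the 1-based, inclusive range start..end
     * lies entirely inside a sequence of the given length.
     */
    static boolean fits(int start, int end, int seqLength) {
        return start >= 1 && end >= start && end <= seqLength;
    }

    public static void main(String[] args) {
        // The INIT_MET 0 0 entry on a 415-residue sequence.
        System.out.println(fits(0, 0, 415));
        // A feature spanning the whole sequence.
        System.out.println(fits(1, 415, 415));
    }
}
```

A parser could use a check of this kind to skip or specially handle an entry such as INIT_MET 0 0 instead of failing on it; whether that is the right policy is the question raised above.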
java.lang.IllegalArgumentException: Location 0 is outside 1..415 at org.biojava.bio.seq.impl.SimpleFeature.(SimpleFeature.java:306) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces sorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc torAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:274) at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeature Realizer.java:138) rethrown as org.biojava.bio.BioException: Couldn't realize feature at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeature Realizer.java:144) at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealiz er.java:94) at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:1 98) at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:20 4) at org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase. java:168) at org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilde r.java:87) at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFil ter.java:98) at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:101) at com.exsar.test.SerializeTest.main(SerializeTest.java:24) From msouthern at exsar.com Wed Oct 15 09:15:21 2003 From: msouthern at exsar.com (Mark Southern) Date: Wed Oct 15 09:13:05 2003 Subject: [Biojava-dev] RE: AlphabetManager problem? In-Reply-To: Message-ID: <2C879FB52902524C85E8B616CF276F1A1CC460@cartasrv.carta.local> I have played around with AlphabetManager.xml but I'm really shooting in the dark. Can someone who understands AlphabetManager.xml please step up to the plate? :-) I have tried declaring the protein symbols outside of the tag. 
With the AlphabetManager.xml file unchanged, the exception upon reading in a serialized sequence object is:

java.io.InvalidObjectException: Couldn't resolve symbol:ALA
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:39)
Exception in thread "main"

With the tags for proteins outside of the tag (as the DNA/RNA symbols are), the exception is:

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet PROTEIN-TERM
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: Alphabet not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:42)
Exception in thread "main"

Why would the protein have the PROTEIN-TERM alphabet rather than the expected PROTEIN alphabet?

Cheers,
Mark.

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Monday, October 06, 2003 9:47 PM
To: msouthern@exsar.com; biojava-dev@biojava.org
Cc: matthew_pocock@yahoo.co.uk; td2@sanger.ac.uk
Subject: AlphabetManager problem?

OK - The serialization problem is caused by a readResolve method using a call to AlphabetManager.symbolForName(String name) which works fine for DNA and RNA but apparently barfs on Protein?

I think the problem is caused in the AlphabetManager.xml file, where the DNA/RNA Symbols are declared outside of the tag but they are declared inside the tag for the protein alphabets. I'm not brave enough to mess with that file myself (it may not even be the cause of the problem).

The bug can be replicated with the following code which I was going to add to AlphabetManager.test but CVS appears to be screwed up just now:

    public void testSymbolForName(){
        // DNA: every symbol should resolve back to itself by name
        FiniteAlphabet alpha = DNATools.getDNA();
        for( int i = 0; i < alpha.size(); i++){
            Symbol s = DNATools.forIndex(i);
            Symbol test = AlphabetManager.symbolForName(s.getName());
            assertTrue(s == test);
        }

        // RNA: same check
        alpha = RNATools.getRNA();
        for( int i = 0; i < alpha.size(); i++){
            Symbol s = RNATools.forIndex(i);
            Symbol test = AlphabetManager.symbolForName(s.getName());
            assertTrue(s == test);
        }

        // Protein: this is the case that fails
        alpha = ProteinTools.getAlphabet();
        AlphabetIndex ind = AlphabetManager.getAlphabetIndex(alpha);
        for( int i = 0; i < alpha.size(); i++){
            Symbol s = ind.symbolForIndex(i);
            Symbol test = AlphabetManager.symbolForName(s.getName());
            assertTrue(s == test);
        }
    }

Any ideas on how to fix this?

- Mark

-----Original Message-----
From: Schreiber, Mark
Sent: Tue 7/10/2003 1:01 p.m.
To: msouthern@exsar.com; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception -AlphabetManagerproblem?
Hi - I am not getting the exception you have below (which looks like one from before we fixed the EmblFileFormer). I am however getting an InvalidObjectException which is coming from something odd in the AlphabetManager. I'll have a look into it.

java.io.InvalidObjectException: Couldn't resolve symbol:MET
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 7/10/2003 3:50 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi Mark,

I have downloaded and tested the latest EmblFileFormer.java (1.24.2.1) and it can now successfully write out a swissprot format file after first having read it in (Thank you).

However, I am still seeing an exception when attempting to read in a serialized sequence. Test code and exception below.

Best regards,
Mark.

//------------------------------------------
    public static void main(String[] args) throws Exception{
        File file = new File("c:\\temp\\sequence.ser");
        String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";
        SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
            SeqIOConstants.SWISSPROT,
            new BufferedReader( new FileReader( seqFile ) ) );
        Sequence seq = iter.nextSequence();

        // this now works
        //SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq );

        ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) );
        out.writeObject( seq );
        out.flush();
        out.close();

        ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) );
        // still get an error deserializing
        seq = (Sequence) in.readObject();
        in.close();
    }
//------------------------------------------

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:36)

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Wednesday, October 01, 2003 1:54 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

OK - I tracked it down to a bug in the EmblFileFormer (which gets co-opted for SwissProt writing). It assumed a DNA alphabet and therefore couldn't write protein in SwissProt format. I have checked the fix into CVS; I will port it back to the 1.3 branch of CVS shortly.
- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Wed 1/10/2003 1:25 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi Mark,

I did also have an error with binary serialization. I was just trying to approach the problem from a different direction. Because that was also a problem with finding / determining a protein symbol, I wondered if it was coming from AlphabetManager rather than the Swissprot writing.

I include below the code fragment along with the serialization error.

Best regards,
Mark.

    public static void main(String[] args) throws Exception{
        File file = new File("c:\\temp\\sequence.ser");
        String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";
        SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
            SeqIOConstants.SWISSPROT,
            new BufferedReader( new FileReader( seqFile ) ) );
        Sequence seq = iter.nextSequence();

        System.out.println("\nWriting Sequence object");
        ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) );
        out.writeObject( seq );
        out.flush();
        out.close();

        System.out.println("\nReading Sequence object");
        ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) );
        seq = (Sequence) in.readObject();
        in.close();
    }

Writing Sequence object

Reading Sequence object
java.io.InvalidObjectException: Couldn't resolve symbol:ALA
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:38)
Exception in thread "main"

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Tuesday, September 30, 2003 6:28 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi - I was a bit thrown off at first because I thought you meant there was an error in binary serialization. There seems to be a problem with SwissProt writing.
I've committed an addition to SeqIOToolsTest in biojava live that replicates the error but I haven't got time to track it down just yet. If someone else doesn't get it I'll probably find it tomorrow.

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 30/09/2003 10:45 a.m.
To: biojava-l@biojava.org
Cc:
Subject: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Apologies for following up on my own post.

What follows is a simpler test than the serialization I attempted before. Consider the bit of code below and the corresponding error message. For some reason, the protein sequence is being treated as a DNA sequence. Is there something I am missing with respect to how AlphabetManager treats DNA and protein alphabets?

Any explanations would be most welcome.

Thanks again,
Mark.

//------------------------------------------------------------------------
    public static void main(String[] args) throws Exception{
        String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";
        SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
            SeqIOConstants.SWISSPROT,
            new BufferedReader( new FileReader( seqFile ) ) );
        Sequence seq = iter.nextSequence();
        SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq );
    }

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:24)

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Monday, September 29, 2003 2:01 PM
Cc: 'biojava-l@biojava.org'
Subject: Sequence serialization exception

I am getting the following exception when trying to serialize a protein sequence. I am using biojava 1.3. Can anyone please explain to me why?

Many thanks,
Mark.
java.io.InvalidObjectException: Couldn't resolve symbol:SER
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1441)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452)
    at org.biojava.bio.seq.ViewSequence.readObject(ViewSequence.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at java.util.HashMap.readObject(HashMap.java:985)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at java.util.HashMap.readObject(HashMap.java:986)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.hdex.model.calc.Test.main(Test.java:104)

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

=======================================================================
Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited.
If you have received this message in error, please notify the sender immediately.
=======================================================================

From smh1008 at cus.cam.ac.uk Wed Oct 15 11:35:11 2003
From: smh1008 at cus.cam.ac.uk (David Huen)
Date: Wed Oct 15 11:32:36 2003
Subject: [Biojava-dev] Feature at position 0
In-Reply-To: <2C879FB52902524C85E8B616CF276F1A1CC45F@cartasrv.carta.local>
References: <2C879FB52902524C85E8B616CF276F1A1CC45F@cartasrv.carta.local>
Message-ID: <200310151635.12714.smh1008@cus.cam.ac.uk>

On Wednesday 15 Oct 2003 2:11 pm, Mark Southern wrote:
> This is a real world example.
>
> Swissprot ID KAP3_RAT (AC P12369) is an example of a sequence with a
> feature entry at sequence location 0:
>
> FT INIT_MET 0 0 BY SIMILARITY.
>
> This raises an IllegalArgumentException when the sequence is read in via
> SeqIOTools.fileToBiojava (see below).
>
> I don't know how often this would come up but it's definitely a situation
> that isn't handled at the moment. Thoughts anyone?
>
Hmm, this one could be a problem - our coordinate system starts from one. What do they mean by position 0? A cleaved methionine that's gone already? Maybe our code ought to have the option to skip these silently?

Regards,
David Huen
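As a rough illustration of the "skip these silently" option David Huen raises in the "Feature at position 0" thread, the sketch below drops feature-table rows whose coordinates fall outside 1..length before any feature is built from them. It is only a sketch under stated assumptions: `FeatureRangeFilter`, `FeatureRow`, `isWithinSequence` and `keepValid` are hypothetical names and are not BioJava classes or methods, the `DOMAIN` row is made up, and the length 415 is simply the upper bound quoted in the reported exception message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these types are part of BioJava.
class FeatureRangeFilter {

    /** Stand-in for one parsed feature-table row (key plus 1-based, inclusive coordinates). */
    static final class FeatureRow {
        final String key;
        final int start;
        final int end;

        FeatureRow(String key, int start, int end) {
            this.key = key;
            this.start = start;
            this.end = end;
        }
    }

    /** True when the row lies entirely inside the 1..seqLength coordinate system. */
    static boolean isWithinSequence(FeatureRow row, int seqLength) {
        return row.start >= 1 && row.start <= row.end && row.end <= seqLength;
    }

    /** Returns only the rows that fit on the sequence; out-of-range rows are skipped silently. */
    static List<FeatureRow> keepValid(List<FeatureRow> rows, int seqLength) {
        List<FeatureRow> kept = new ArrayList<FeatureRow>();
        for (FeatureRow row : rows) {
            if (isWithinSequence(row, seqLength)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FeatureRow> rows = new ArrayList<FeatureRow>();
        rows.add(new FeatureRow("INIT_MET", 0, 0));   // the kind of row from the report
        rows.add(new FeatureRow("DOMAIN", 10, 120));  // made-up in-range row
        for (FeatureRow row : keepValid(rows, 415)) {
            System.out.println(row.key + " " + row.start + ".." + row.end);
        }
    }
}
```

Whether dropping such rows silently is the right default is a design question the thread leaves open; a variant could equally collect the skipped rows so the caller can report them.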
From mark.schreiber at agresearch.co.nz Wed Oct 15 16:53:12 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Wed Oct 15 16:52:20 2003
Subject: [Biojava-dev] RE: AlphabetManager problem?
Message-ID:

Hi Mark - This is an issue we have been discussing. The solution may be to use a thing called a lifescience id to give each symbol a name space so that both problems can be solved. The question is finding time to do it.
If you have the inclination Thomas can probably give you some guidance on how it could be done. Or you could wait for a few weeks till we have some time to resolve it. Sorry we haven't been able to solve this problem sooner. - Mark -----Original Message----- From: Mark Southern [mailto:msouthern@exsar.com] Sent: Thursday, 16 October 2003 2:15 a.m. To: Schreiber, Mark; biojava-dev@biojava.org Subject: RE: AlphabetManager problem? I have played around with AlphabetManager.xml but I'm really shooting in the dark. Can someone who understands AlphabetManager.xml please step up to the plate? :-) I have tried declaring the protein symbols outside of the tag. With the AlphabetManager.xml file unchanged, the exception upon reading in a serialized sequence object is; java.io.InvalidObjectException: Couldn't resolve symbol:ALA at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve (AlphabetManager.java:1480) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at 
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) at com.exsar.test.SerializeTest.main(SerializeTest.java:39) Exception in thread "main" With the tags for proteins outside of the tag (as the DNA/RNA) symbols are, then the exception is; org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet PROTEIN-TERM at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278) at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.val idate(AlphabetManager.java:1423) at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokeni zation.java:178) at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokeniz ation.java:191) at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenize Symbol(AlphabetManager.java:1276) at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGe nEmblFileFormer.java:337) at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211) rethrown as org.biojava.bio.symbol.IllegalAlphabetException: Alphabet not tokenizing at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224) at 
org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:42)
Exception in thread "main"

Why would the protein have the PROTEIN-TERM alphabet rather than the expected PROTEIN alphabet?

Cheers,

Mark.

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Monday, October 06, 2003 9:47 PM
To: msouthern@exsar.com; biojava-dev@biojava.org
Cc: matthew_pocock@yahoo.co.uk; td2@sanger.ac.uk
Subject: AlphabetManager problem?

OK - The serialization problem is caused by a readResolve method that calls AlphabetManager.symbolForName(String name), which works fine for DNA and RNA but apparently barfs on Protein. I think the problem lies in the AlphabetManager.xml file, where the DNA/RNA Symbols are declared outside of the tag but they are declared inside the tag for the protein alphabets. I'm not brave enough to mess with that file myself (it may not even be the cause of the problem).
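For readers following the thread, the failure mode described above can be reproduced in isolation. What follows is only a minimal sketch, assuming a simple name-keyed registry: NamedSymbol, its registry and ReadResolveDemo are hypothetical stand-ins and not BioJava's AlphabetManager code. It shows how a readResolve method swaps a deserialized copy for the canonical instance looked up by name, and how a name missing from the lookup surfaces as java.io.InvalidObjectException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a "well known" symbol; NOT BioJava's class.
final class NamedSymbol implements Serializable {
    private static final long serialVersionUID = 1L;

    // Hypothetical name -> canonical instance table. It plays the role that
    // AlphabetManager.symbolForName plays in the thread.
    private static final Map<String, NamedSymbol> REGISTRY =
            new HashMap<String, NamedSymbol>();

    private final String name;

    private NamedSymbol(String name) {
        this.name = name;
    }

    // Returns the canonical instance for a name, creating it on first use.
    static synchronized NamedSymbol register(String name) {
        NamedSymbol s = REGISTRY.get(name);
        if (s == null) {
            s = new NamedSymbol(name);
            REGISTRY.put(name, s);
        }
        return s;
    }

    // Returns the canonical instance, or null if the name is unknown.
    static synchronized NamedSymbol forName(String name) {
        return REGISTRY.get(name);
    }

    // Simulates a symbol that the lookup does not know about.
    static synchronized void unregister(String name) {
        REGISTRY.remove(name);
    }

    String getName() {
        return name;
    }

    // Called by ObjectInputStream after the fields have been read. The
    // freshly built copy is discarded in favour of the registered instance.
    private Object readResolve() throws ObjectStreamException {
        NamedSymbol canonical = forName(name);
        if (canonical == null) {
            throw new InvalidObjectException("Couldn't resolve symbol:" + name);
        }
        return canonical;
    }
}

final class ReadResolveDemo {
    // Serializes an object to memory and reads it straight back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(o);
        out.close();
        ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        NamedSymbol ala = NamedSymbol.register("ALA");
        // Identity is preserved because readResolve returns the registered instance.
        System.out.println(roundTrip(ala) == ala);

        // With the name gone from the lookup, reading the object back fails.
        NamedSymbol.unregister("ALA");
        try {
            roundTrip(ala);
        } catch (InvalidObjectException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the lookup table is filled from a configuration file, as the thread suspects for AlphabetManager.xml, a symbol missing from that table would fail in this way when the object is read back, not when it is written.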
The bug can be replicated with the following code, which I was going to add to AlphabetManager.test, but CVS appears to be screwed up just now:

public void testSymbolForName(){
    // DNA: every symbol should resolve by name to the same instance
    FiniteAlphabet alpha = DNATools.getDNA();
    for( int i = 0; i < alpha.size(); i++){
        Symbol s = DNATools.forIndex(i);
        Symbol test = AlphabetManager.symbolForName(s.getName());
        assertTrue(s == test);
    }

    // RNA: same check
    alpha = RNATools.getRNA();
    for( int i = 0; i < alpha.size(); i++){
        Symbol s = RNATools.forIndex(i);
        Symbol test = AlphabetManager.symbolForName(s.getName());
        assertTrue(s == test);
    }

    // Protein: the case reported to fail in this thread
    alpha = ProteinTools.getAlphabet();
    AlphabetIndex ind = AlphabetManager.getAlphabetIndex(alpha);
    for( int i = 0; i < alpha.size(); i++){
        Symbol s = ind.symbolForIndex(i);
        Symbol test = AlphabetManager.symbolForName(s.getName());
        assertTrue(s == test);
    }
}

Any ideas on how to fix this?

- Mark

-----Original Message-----
From: Schreiber, Mark
Sent: Tue 7/10/2003 1:01 p.m.
To: msouthern@exsar.com; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

Hi - I am not getting the exception you have below (which looks like one from before we fixed the EmblFileFormer). I am, however, getting an InvalidObjectException which is coming from something odd in the AlphabetManager. I'll have a look into it.

java.io.InvalidObjectException: Couldn't resolve symbol:MET
    at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480)

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 7/10/2003 3:50 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

Hi Mark,

I have downloaded and tested the latest EmblFileFormer.java (1.24.2.1) and it can now successfully write out a swissprot format file after first having read it in (Thank you).
However, i am still seeing an exception attempting to read in a serialized sequence. Test code and exception below. Best regards, Mark. //------------------------------------------ public static void main(String[] args) throws Exception{ File file = new File("c:\\temp\\sequence.ser"); String seqFile = "c:\\temp\\KAP0_BOVIN.swiss"; SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava( SeqIOConstants.SWISSPROT, new BufferedReader( new FileReader( seqFile ) ) ); Sequence seq = iter.nextSequence(); // this now works //SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq ); ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) ); out.writeObject( seq ); out.flush(); out.close(); ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) ); // still get an error deserializing seq = (Sequence) in.readObject(); in.close(); } //------------------------------------------ org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278) at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.val idate(AlphabetManager.java:1423) at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokeni zation.java:178) at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokeniz ation.java:191) at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenize Symbol(AlphabetManager.java:1276) at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGe nEmblFileFormer.java:337) at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211) rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224) at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.ja va:125) rethrown as org.biojava.bio.BioError: An internal 
error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316)
    at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078)
    at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:36)

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Wednesday, October 01, 2003 1:54 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

OK - I tracked it down to a bug in the EmblFileFormer (which gets co-opted for SwissProt writing). It assumed a DNA alphabet and therefore couldn't write protein in SwissProt format. I have checked the fix into CVS and will port it back to the 1.3 branch of CVS shortly.

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Wed 1/10/2003 1:25 a.m.
To: Schreiber, Mark; biojava-l@biojava.org
Cc:
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

Hi Mark,

I did also have an error with binary serialization. I was just trying to approach the problem from a different direction. Because that was also a problem with finding / determining a protein symbol, I wondered if it was coming from AlphabetManager rather than the Swissprot writing. I include below the code fragment along with the serialization error.

Best regards,

Mark.
public static void main(String[] args) throws Exception{ File file = new File("c:\\temp\\sequence.ser"); String seqFile = "c:\\temp\\KAP0_BOVIN.swiss"; SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava( SeqIOConstants.SWISSPROT , new BufferedReader( new FileReader( seqFile ) ) ); Sequence seq = iter.nextSequence(); System.out.println("\nWriting Sequence object"); ObjectOutputStream out = new ObjectOutputStream( new FileOutputStream(file) ); out.writeObject( seq ); out.flush(); out.close(); System.out.println("\nReading Sequence object"); ObjectInputStream in = new ObjectInputStream( new FileInputStream(file) ); seq = (Sequence) in.readObject(); in.close(); } Writing Sequence object Reading Sequence object java.io.InvalidObjectException: Couldn't resolve symbol:ALA at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve (AlphabetManager.java:1480) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) at 
org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
    at com.exsar.test.SerializeTest.main(SerializeTest.java:38)
Exception in thread "main"

-----Original Message-----
From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
Sent: Tuesday, September 30, 2003 6:28 AM
To: msouthern@exsar.com; biojava-l@biojava.org
Subject: RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

Hi - I was a bit thrown off at first because I thought you meant there was an error in binary serialization. There seems to be a problem with SwissProt writing. I've committed an addition to SeqIOToolsTest in biojava-live that replicates the error but I haven't got time to track it down just yet. If someone else doesn't get it I'll probably find it tomorrow.

- Mark

-----Original Message-----
From: Mark Southern [mailto:msouthern@exsar.com]
Sent: Tue 30/09/2003 10:45 a.m.
To: biojava-l@biojava.org
Cc:
Subject: [Biojava-l] RE: Sequence serialization exception - AlphabetManager problem?

Apologies for following up on my own post. What follows is a simpler test than the serialization I attempted before. Consider the bit of code below and the corresponding error message. For some reason, the protein sequence is being treated as a DNA sequence.
Is there something I am missing with respect to how AlphabetManager treats DNA and protein alphabets? Any explanations would be most welcome.

Thanks again,

Mark.

//------------------------------------------------------------------------
public static void main(String[] args) throws Exception{
    String seqFile = "c:\\temp\\KAP0_BOVIN.swiss";
    // read a SwissProt (protein) entry
    SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava(
        SeqIOConstants.SWISSPROT,
        new BufferedReader( new FileReader( seqFile ) ) );
    Sequence seq = iter.nextSequence();
    // writing the same entry back out is what throws
    SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, seq );
}

org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    at org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.validate(AlphabetManager.java:1423)
    at org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokenization.java:178)
    at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokenization.java:191)
    at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbol(AlphabetManager.java:1276)
    at org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGenEmblFileFormer.java:337)
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211)
rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not tokenizing
    at org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224)
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:125)
rethrown as org.biojava.bio.BioError: An internal error occurred processing symbols
    at org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.java:137)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289)
    at org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253)
    at
org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316) at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078) at org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870) at com.exsar.test.SerializeTest.main(SerializeTest.java:24) -----Original Message----- From: Mark Southern [mailto:msouthern@exsar.com] Sent: Monday, September 29, 2003 2:01 PM Cc: 'biojava-l@biojava.org' Subject: Sequence serialization exception I am getting the following exception when trying to serialize a protein sequence. I am using biojava 1.3. Can anyone please explain to me why? Many thanks, Mark. java.io.InvalidObjectException: Couldn't resolve symbol:SER at org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve (AlphabetManager.java:1441) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) at org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) at org.biojava.bio.seq.ViewSequence.readObject(ViewSequence.java:93) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 ) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at 
java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) at java.util.HashMap.readObject(HashMap.java:985) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) at java.util.HashMap.readObject(HashMap.java:986) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl .java:25) at java.lang.reflect.Method.invoke(Method.java:324) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) at com.exsar.hdex.model.calc.Test.main(Test.java:104) _______________________________________________ 
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l
=======================================================================
Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately.
=======================================================================

From matthew.pocock at ncl.ac.uk Thu Oct 16 10:04:06 2003
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Thu Oct 16 10:05:04 2003
Subject: [Biojava-dev] ant tasks
Message-ID: <3F8EA556.9070402@ncl.ac.uk>

Hi,

OK - we got into a mess this week (all right, I got us into a mess this week) by adding a new ant task - sablecc.
You all saw the messages from people who couldn't make things work. So, we now have a solution to the problem of non-standard ant tasks we will need.

I've added a directory called ant-lib to biojava-live. In here go the jars for all non-standard ant tasks that we use (so far, sablecc). Then, we put these jars into a classpath attribute on the taskdef element in build.xml that loads in the task. Hey presto! A more self-contained build process. It gives us great flexibility over what versions of tasks get loaded, and should put an end to us finding that junit v x is incompatible with tasks compiled against version y, or so we hope.

The other options were to bundle a custom ant, to bundle a batch script for launching ant, or to rely on people downloading all the right ant tasks, none of which seem appealing or have a good chance of working consistently.

If you think this is a bad idea, complain early and loud before I get too used to this.

Matthew

From simon.foote at nrc-cnrc.gc.ca Mon Oct 20 08:52:29 2003
From: simon.foote at nrc-cnrc.gc.ca (Simon Foote)
Date: Mon Oct 20 08:50:56 2003
Subject: [Biojava-dev] Re: [Biojava-l] ontology exception, addSequence & BioSQLSequenceDB
In-Reply-To: <3F93A026.50700@yahoo.co.uk>
References: <200310171305.JAA28636@nrcbsa.bio.nrc.ca> <3F93A026.50700@yahoo.co.uk>
Message-ID: <3F93DA8D.4040109@nrc-cnrc.gc.ca>

Changes have now been checked in.

Note to BioSQL users/developers: the CVS version now uses the Jakarta commons-dbcp & commons-pool packages to provide connection pooling.
The 2 packages must be in your classpath to use BioSQLSequenceDB.

Cheers,
Simon

--
Bioinformatics Programmer
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561
[F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

Matthew Pocock wrote:

> S. Foote wrote:
>
>> I've tested out my mods and everything appears to work as expected.
>>
> great
>
>> Unfortunately, I'm away until Monday, so I will do the checkin on
>> Monday morning.
>> Should I use the ant-lib dir for the 2 jar files and put an ant-task
>> in the build file for it, as it
>> needs those 2 in the classpath for it to compile.
>>
> No; ant-lib is for ant tasks (junit, sablecc etc.), not for
> dependencies. Put the jars in the root directory as usual. If they are
> needed at run-time, then you could modify the Class-Path entry of the
> biojava.jar manifest to pull them in.
>
> Best,
>
> Matthew
>
>> Maybe if you could give me a pointer on what has to be added. I
>> checked the 2 jar files in previously,
>> but they're in the top dir, so maybe we can get them moved to the
>> ant-lib dir, if necessary.
>>
>> Cheers,
>> Simon
>>
>> According to Matthew Pocock:
>>
>>> Simon Foote wrote:
>>>
>>>> Hi Matthew,
>>>>
>>>> That would be it.
>>>
>>> Cool.
>>>
>>>> Ran into that problem bringing in Genbank files where the feature
>>>> keys could be organism or ORGANISM
>>>> I've also solved it. In MySQL, the varchar field type is compared
>>>> in a case-insensitive fashion, but if you add the attribute BINARY
>>>> to the field, then it compares in a case-sensitive fashion and
>>>> everything works.
>>>
>>> So we are ready to go for updating biojava with your mods then?
>>>
>>>> I also cc'd Hilmar, for possibly changing this in the biosql schema
>>>> for MySQL. As far as I can tell it doesn't cause any problems with
>>>> existing databases as it only affects sorting and comparing of the
>>>> field.
>>>
>>> Wonderful. Fingers crossed.
>>>
>>> Matthew
>>>
>>>> Cheers,
>>>> Simon
>>>>
>>>> Matthew Pocock wrote:
>>>>
>>>>> I wonder if it is a case thing - we persist _c:81, and then it
>>>>> fails for _C:18. You are using MySQL? I tested on postgresql,
>>>>> which is more case sensitive. Any ideas where I should begin?
>>>>>
>>>>> Simon Foote wrote:
>>>>>
>>>>>> Attached is the complete error.log for this.
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> Matthew Pocock wrote:

From jwidman at mail.com Mon Oct 20 15:02:53 2003
From: jwidman at mail.com (Jack Widman)
Date: Mon Oct 20 15:00:05 2003
Subject: [Biojava-dev] question
Message-ID: <20031020190253.82489.qmail@mail.com>

Hello. Could you please tell me why BioLisp (biojava-dev@biojava.org) is not in your list of BioX's ? Thanks.

Jack Widman

From mark.schreiber at agresearch.co.nz Mon Oct 20 15:55:15 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Mon Oct 20 15:52:31 2003
Subject: [Biojava-dev] question
Message-ID:

I think it was because it was not formally associated with the OBF (Open Bioinformatics Foundation www.open-bio.org), although I cannot see any reason why it could not be. We could link to it anyway I suppose.

- Mark

-----Original Message-----
From: Jack Widman [mailto:jwidman@mail.com]
Sent: Tue 21/10/2003 8:02 a.m.
To: biojava-dev@biojava.org
Cc:
Subject: [Biojava-dev] question

Hello. Could you please tell me why BioLisp (biojava-dev@biojava.org) is not in your list of BioX's ? Thanks.

Jack Widman
_______________________________________________
biojava-dev mailing list
biojava-dev@biojava.org
http://biojava.org/mailman/listinfo/biojava-dev

From matthew_pocock at yahoo.co.uk Mon Oct 20 15:53:36 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Mon Oct 20 15:56:10 2003
Subject: [Biojava-dev] question
In-Reply-To: <20031020190253.82489.qmail@mail.com>
References: <20031020190253.82489.qmail@mail.com>
Message-ID: <3F943D40.8020101@yahoo.co.uk>

Hi,

I think that all of the BioXs listed on our sites are affiliated with the Open Bioinformatics Foundation (although I may be wrong). The BioLisp people have never shown much interest in getting involved with us. I'm not sure when the two communities last spoke to each other though.

Matthew

Jack Widman wrote:

>Hello. Could you please tell me why BioLisp (biojava-dev@biojava.org)
>is not in your list of BioX's ?
>Thanks.
>
>Jack Widman
>_______________________________________________
>biojava-dev mailing list
>biojava-dev@biojava.org
>http://biojava.org/mailman/listinfo/biojava-dev

From silvere at digitalbiosphere.com Tue Oct 7 16:24:21 2003
From: silvere at digitalbiosphere.com (Silvère Martin-Michiellot)
Date: Tue Oct 21 12:09:31 2003
Subject: [Biojava-dev] bioJava, jsci
Message-ID: <5.1.0.14.0.20031007221352.041ec148@pop.digitalbiosphere.com>

Hi,

We are a group of people building the JSci package at jsci.sourceforge.net. We are looking to extend/implement biology features. We need support from biologists to help us extend the features of the API as we are not experts ourselves.

Thus far we have implemented a basic package with amino acids, DNA, bases, mRNA, proteins. We would like to receive suggestions, ideas about possible extensions, API review, etc.
It seems to us that we should work in the same direction as you do to minimize coding effort. We are also thinking about implementing XML support but from http://www.pasteur.fr/cgi-bin/biology/bnb_s.pl?english=1&query=xml it appears we are a bit stuck on what is cool, working, standard and what is not (see for example the comparison from http://www.cse.ucsc.edu/~douglas/proximl/proximl.html). Finally, we are wondering how to organise our packages to help users who would like to use BioJava along with JSci.

Let us know.

PS: This is my second post as you didn't reply to the first one (or I messed up something with your email, which happens to me sometimes). As I am working with many emails, please excuse any inappropriate sending, but also note that I really want to ask you for some information and that any help (link, forwarding or whatever) will be much appreciated.

__________________________________________________________________________________________
Silvere Martin-Michiellot builds the Internet future on www.digitalbiosphere.com
__________________________________________________________________________________________

From smh1008 at cus.cam.ac.uk Tue Oct 21 15:50:05 2003
From: smh1008 at cus.cam.ac.uk (David Huen)
Date: Tue Oct 21 15:47:20 2003
Subject: [Biojava-dev] bioJava, jsci
In-Reply-To: <5.1.0.14.0.20031007221352.041ec148@pop.digitalbiosphere.com>
References: <5.1.0.14.0.20031007221352.041ec148@pop.digitalbiosphere.com>
Message-ID: <200310212050.07285.smh1008@cus.cam.ac.uk>

On Tuesday 07 Oct 2003 9:24 pm, Silvère Martin-Michiellot wrote:
> Hi, > > We are a group of people building the JSci package at > jsci.sourceforge.net. We are looking to extend/implement biology > features. We need support for biologists to help us extend the features > of the API as we are not experts by ourselves. > > Thus far we have implemented a basic package with aminoacids, DNA, bases, > mRNA, proteins.
We would like to receive suggestions, ideas about > possible extensions, API review, etc. > It seems for us we have to work in the same direction as you do to > minimalize coding effort. > We are also thinking about implementing XML support but from > http://www.pasteur.fr/cgi-bin/biology/bnb_s.pl?english=1&query=xml it > appears we are a bit struck to what is cool, working, standard and what > is not (see for example the comparison from > http://www.cse.ucsc.edu/~douglas/proximl/proximl.html). > We are finally wondering how to organise our packages to help users who > would like to use BioJava along with JSci. > > Let us know.

I am replying on my own behalf as a BJ developer and nothing I say should be read as representative of the BJ development community.

BioJava is still a relatively young project and the developer resources we have at hand come solely from volunteers and tend to be focused on the immediate needs of those people. There is a general model for much of the code and development is primarily based on extending and modifying that model. Because of the limited resources, we tend to remain fairly focused on developing methods of representing and computing on biology, particularly in the representation of DNA sequence. Our volunteers generally have day jobs that keep them busy and have enough difficulties finding enough time to code the needful before even considering possible collaborative ventures.

I have attempted to look at your site and if I surmise it correctly, you have within JSci various math algorithms and an implementation of MATHML. I do not have time to look into it much further than that. Perhaps you could expound on JSci's objective more extensively.

You mention that you have also implemented a molecular biology package: I don't know what your APIs are but if they are fairly different from ours (and ours can be initially surprising! Sequences are not a sequence of letters...)
it will be unlikely that we will want to break our interfaces to support good interoperability with the JSci APIs. Our code is very tightly bound to a specific model and changes in it will break functionality greatly. The main projects we attempt to maintain some degree of interoperability with are the other OBF projects (BioPerl, OBDA, etc). Perhaps you might like to point out areas that you think might be mutually explored; they don't seem immediately apparent to me at this stage.

Regards,
David Huen
--
David Huen, Ph.D.
Email: smh1008@cus.cam.ac.uk
Dept. of Genetics, University of Cambridge, Cambridge, CB2 3UH, U.K.
Fax : +44 1223 333992
Phone: +44 1223 333982/766748

From smh1008 at cus.cam.ac.uk Wed Oct 22 06:53:20 2003
From: smh1008 at cus.cam.ac.uk (David Huen)
Date: Wed Oct 22 06:50:29 2003
Subject: [Biojava-dev] org.biojava.bio.program.tagvalue Changes
Message-ID: <200310221153.21249.smh1008@cus.cam.ac.uk>

I have made a number of changes to the above package

a) TagValueWrapper is now an interface. The implementation is in a class called SimpleTagValueWrapper. All dependent classes have been changed to reflect this.

b) an AbstractWrapper implementing the above interface now exists that basically does nothing, not even forward events. It's for those wonderful non-communicative occasions that are so called for in our lives...

c) there is now a StateMachine class which can be used as the base for a TagValueListener. You can define states and transitions, and have those transitions driven by events like startRecord, startTag, etc. Each state can have its own listener. It should allow quite complex syntax parsing to be done by defining these. Also, the transition tables can have fallback tables defined: when a lookup fails on a State, it will do a lookup on the fallback. These can be chained to form hierarchies of state transition tables applicable to specific states, groups of states and globally.
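
[Editorial sketch] The chained fallback lookup described in (c) might look roughly like the following. This is only an illustration of the idea, not the StateMachine class in org.biojava.bio.program.tagvalue: the class name TransitionTable, the lookup method, and the use of plain strings for events and states are all assumptions invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from BioJava.
class FallbackTransitionDemo {

    /** Maps an event name to the next state name, deferring to a fallback table on a miss. */
    static final class TransitionTable {
        private final Map<String, String> transitions = new HashMap<>();
        private final TransitionTable fallback; // null for the last table in the chain

        TransitionTable(TransitionTable fallback) {
            this.fallback = fallback;
        }

        void put(String event, String nextState) {
            transitions.put(event, nextState);
        }

        /** Walks the chain of tables; returns null if no table knows the event. */
        String lookup(String event) {
            for (TransitionTable t = this; t != null; t = t.fallback) {
                String next = t.transitions.get(event);
                if (next != null) {
                    return next;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // Global table: applies to every state unless overridden.
        TransitionTable global = new TransitionTable(null);
        global.put("endRecord", "START");

        // Table shared by a group of states, falling back to the global one.
        TransitionTable group = new TransitionTable(global);
        group.put("startTag", "IN_TAG");

        // Table for one specific state, falling back to the group table.
        TransitionTable inFeature = new TransitionTable(group);
        inFeature.put("value", "IN_FEATURE");

        System.out.println(inFeature.lookup("value"));     // found in the state's own table
        System.out.println(inFeature.lookup("startTag"));  // found in the group table
        System.out.println(inFeature.lookup("endRecord")); // found in the global table
        System.out.println(inFeature.lookup("unknown"));   // not found anywhere
    }
}
```

Putting the most specific table first means a state only needs to list the transitions that differ from its group or from the global defaults.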
As I've touched more classes than I'd like in an update, please report to me any build failures for immediate fixes.

Regards,
David Huen

From billym at orionbiosolutions.com Wed Oct 22 05:30:42 2003
From: billym at orionbiosolutions.com (billym@orionbiosolutions.com)
Date: Wed Oct 22 08:07:25 2003
Subject: [Biojava-dev] Orion BioSolutions Kits, $25 to you with Purchase
Message-ID: <20031022093042.2790.qmail@crm.orionbiosolutions.com>

Introducing Orion BioSolutions Research Kits for Cell and Molecular Biology. Try our kits and we will give you a gift of $25, or you can donate your $25 to charity.
Cloning Kits: CloneFinders - High Efficiency Chemically Competent Ecoli
Protein Expression Kits: ProExpressors - High level inducible expression in BL21 cells.
FACS Kits: IntraCyte - Intracellular FACS Kits - Superior New Technology, Not just for lymphocytes, works on many different cell and tissue types.
Visit our website at: http://www.orionbiosolutions.com to place your order today.
_________________________________________________________________
You have been recognized as a customer of Life Science Research Kits. If we have reached you in error, and you are not interested in our new technologies and promotional offers, please opt out at: http://www.orionbiosolutions.com/bin/optout We do not want to invade your privacy. Please read our privacy policy at: http://www.orionbiosolutions.com

From mark.schreiber at agresearch.co.nz Thu Oct 23 16:45:24 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Thu Oct 23 16:43:13 2003
Subject: [Biojava-dev] RE: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?
Message-ID:

Hi Vasa -

The problem is that Biojava requires its Symbol instances to be Flyweight (basically Singletons). This means that the Met symbol == the Met Symbol, i.e. there is only one of them. In most cases the way this is done is that the Symbol's name is serialized, not the Symbol.
If the Symbol were serialized it would be duplicated on the receiving JVM and the == operation would no longer return true, which would be a disaster. It actually works for almost everything except protein Symbols as they are members of two Alphabets, PROTEIN and PROTEIN-TERM. The PROTEIN-TERM alphabet contains the * symbol which is used to represent the translation of a stop codon for representation of complete frame translations.

The problem is in the class WellKnownAtomicSymbol. This is a private inner class of AlphabetManager. It has a writeReplace method, called during serialization, which replaces the WellKnownAtomicSymbol on the ObjectOutputStream with a placeholder object called OPH (a private inner class of WellKnownAtomicSymbol) which holds the name of the Symbol. When the OPH is deserialized its readResolve() method is called, which calls the AlphabetManager method symbolForName(name). This goes and looks up a HashMap of names to Symbols. For some reason when the AlphabetManager initializes it is not putting the names of the protein Symbols into the HashMap. That is actually the root cause of the problem and not the serialization itself.

I will desperately try to look at this over the weekend and fix it. If someone else gets to it first that's cool too.

- Mark

-----Original Message-----
From: Vasa Curcin [mailto:vc100@doc.ic.ac.uk]
Sent: Thursday, 23 October 2003 9:48 p.m.
To: Schreiber, Mark
Subject: Re: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem?

Hi,

Could you possibly tell me how Biojava serializes protein sequences? I need this functionality pretty badly, so I will most likely try to hack it.

Regards,
Vasa

Schreiber, Mark wrote:
>Hi - > >Binary serialization of protein sequences is currently broken. I need >to move to a pattern of using LSIDs to identify them uniquely. This >should fix the problem.
Just need to find time to do it > >- Mark > > >-----Original Message----- >From: Vasa Curcin [mailto:vc100@doc.ic.ac.uk] >Sent: Wednesday, 22 October 2003 4:31 p.m. >To: Schreiber, Mark >Cc: msouthern@exsar.com; biojava-l@biojava.org >Subject: Re: [Biojava-l] RE: Sequence serialization exception - AlphabetManagerproblem? > > >Having exactly the same problem here with serializing SimpleSequence >objects containing proteins. It doesn't like MET... Was there a solution >found? > >Vasa > >Schreiber, Mark wrote: > > > >>Hi - >> >>I am not getting the exception you have below (which looks like one >> >> >>from before we fixed the EmblFileFormer). I am however getting an >>InvalidObjectException which is comming from something odd in the AlphabetManager. I'll have a look into it. > > >>java.io.InvalidObjectException: Couldn't resolve symbol:MET at >>org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve(AlphabetManager.java:1480) >> >>- Mark >> >> >> -----Original Message----- >> From: Mark Southern [mailto:msouthern@exsar.com] >> Sent: Tue 7/10/2003 3:50 a.m. >> To: Schreiber, Mark; biojava-l@biojava.org >> Cc: >> Subject: RE: [Biojava-l] RE: Sequence serialization exception - >>AlphabetManagerproblem? >> >> >> >> Hi Mark, >> >> I have downloaded and tested the latest EmblFileFormer.java (1.24.2.1) and >> it can now successfully write out a swissprot format file after first having >> written it in (Thank you). >> However, i am still seeing an exception attempting to read in a serialized >> sequence. Test code and exception below. >> >> Best regards, >> >> Mark. 
>> >> //------------------------------------------ >> >> public static void main(String[] args) throws Exception{ >> File file = new File("c:\\temp\\sequence.ser"); >> String seqFile = "c:\\temp\\KAP0_BOVIN.swiss"; >> >> SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava( >> SeqIOConstants.SWISSPROT, new BufferedReader( new FileReader( seqFile >>) ) ); >> >> Sequence seq = iter.nextSequence(); >> >> // this now works >> //SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, >> seq ); >> >> ObjectOutputStream out = new ObjectOutputStream( new >> FileOutputStream(file) ); >> out.writeObject( seq ); >> out.flush(); >> out.close(); >> >> ObjectInputStream in = new ObjectInputStream( new >> FileInputStream(file) ); >> // still get an error deserializing >> seq = (Sequence) in.readObject(); >> in.close(); >> } >> >> //------------------------------------------ >> org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in >> alphabet DNA >> at >> org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278) >> at >> org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.val >> idate(AlphabetManager.java:1423) >> at >> org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokeni >> zation.java:178) >> at >> org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokeniz >> ation.java:191) >> at >> org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenize >> Symbol(AlphabetManager.java:1276) >> at >> org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGe >> nEmblFileFormer.java:337) >> at >> org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211) >> rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not >> tokenizing >> at >> org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224) >> at >> org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.ja >> va:125) >> 
rethrown as org.biojava.bio.BioError: An internal error occurred processing >> symbols >> at >> org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.ja >> va:137) >> at >> org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289) >> at >> org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253) >> at >> org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316) >> at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078) >> at >> org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870) >> at com.exsar.test.SerializeTest.main(SerializeTest.java:36) >> >> >> -----Original Message----- >> From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz] >> Sent: Wednesday, October 01, 2003 1:54 AM >> To: msouthern@exsar.com; biojava-l@biojava.org >> Subject: RE: [Biojava-l] RE: Sequence serialization exception - >> AlphabetManagerproblem? >> >> >> OK - >> >> I tracked it down to a bug in the EMBLFileFormer (which gets coopted for >> SwissProt writing). It assumed a DNA alphabet and therefore couldn't write >> protein in SwissProt format. >> >> I have checked it into CVS, I will port it back to the 1.3 branch of CVS >> shortly. >> >> - Mark >> -----Original Message----- >> From: Mark Southern [mailto:msouthern@exsar.com] >> Sent: Wed 1/10/2003 1:25 a.m. >> To: Schreiber, Mark; biojava-l@biojava.org >> Cc: >> Subject: RE: [Biojava-l] RE: Sequence serialization exception - >> AlphabetManagerproblem? >> >> >> Hi Mark, >> >> I did also have an error with binary serialization. I was just trying to >> approach to problem from a different direction. B/c that also was a problem >> with finding / determining a protein symbol, i wondered if was coming from >> AlphabetManager rather than the Swissprot writing. I include below the code >> fragment along with the serialization error. >> >> Best regards, >> >> Mark. 
>> >> >> public static void main(String[] args) throws Exception{ >> File file = new File("c:\\temp\\sequence.ser"); >> String seqFile = "c:\\temp\\KAP0_BOVIN.swiss"; >> >> SequenceIterator iter = (SequenceIterator) SeqIOTools.fileToBiojava( >> SeqIOConstants.SWISSPROT >> , >> new BufferedReader( new FileReader( seqFile ) ) ); >> >> Sequence seq = iter.nextSequence(); >> >> System.out.println("\nWriting Sequence object"); >> ObjectOutputStream out = new ObjectOutputStream( new >> FileOutputStream(file) ); >> out.writeObject( seq ); >> out.flush(); >> out.close(); >> >> System.out.println("\nReading Sequence object"); >> ObjectInputStream in = new ObjectInputStream( new >> FileInputStream(file) ); >> seq = (Sequence) in.readObject(); >> in.close(); >> >> } >> >> Writing Sequence object >> Reading Sequence object >> java.io.InvalidObjectException: Couldn't resolve symbol:ALA >> at >> org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve >> (AlphabetManager.java:1480) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 >> ) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) >> at >> org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 >> ) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) >> at com.exsar.test.SerializeTest.main(SerializeTest.java:38) >> Exception in thread "main" >> >> >> >> -----Original Message----- >> From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz] >> Sent: Tuesday, September 30, 2003 6:28 AM >> To: msouthern@exsar.com; biojava-l@biojava.org >> Subject: RE: [Biojava-l] RE: Sequence serialization exception - >> AlphabetManagerproblem? >> >> >> Hi - >> >> I was a bit thrown off at first cause I thought you meant there was an error >> in binary serialization. There seems to be a problem with SwissProt writing. >> I've commited an addition to SeqIOToolsTest in biojava live that replicates >> the error but I haven't got time to track it down just yet. If some one else >> doesn't get it I'll probably find it tommorrow. >> >> - Mark >> >> -----Original Message----- >> From: Mark Southern [mailto:msouthern@exsar.com] >> Sent: Tue 30/09/2003 10:45 a.m. 
>> To: biojava-l@biojava.org >> Cc: >> Subject: [Biojava-l] RE: Sequence serialization exception - >> AlphabetManagerproblem? >> >> >> Appologies for following up on my own post. What follows is a simpler test >> than the serialization I attempted before. >> >> Consider the bit of code below and corresponding error message; >> >> For some reason, the protein sequence is being treated as a dna sequence. Is >> there something I am missing with respect to how AlphabetManager treats dna >> and protein alphabets? >> >> Any explainations would be most welcome. >> >> Thanks again, >> >> Mark. >> >> >> >>//-------------------------------------------------------------------- >>---- >> >> public static void main(String[] args) throws Exception{ >> String seqFile = "c:\\temp\\KAP0_BOVIN.swiss"; >> SequenceIterator iter = (SequenceIterator) >> SeqIOTools.fileToBiojava(SeqIOConstants.SWISSPROT >> >> ,new BufferedReader( new FileReader( seqFile ) ) ); >> Sequence seq = iter.nextSequence(); >> SeqIOTools.biojavaToFile( SeqIOConstants.SWISSPROT, System.out, >> seq ); >> } >> >> >> org.biojava.bio.symbol.IllegalSymbolException: Symbol ALA not found in >> alphabet DNA >> at >> org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278) >> at >> org.biojava.bio.symbol.AlphabetManager$ImmutableWellKnownAlphabetWrapper.val >> idate(AlphabetManager.java:1423) >> at >> org.biojava.bio.seq.io.CharacterTokenization._tokenizeSymbol(CharacterTokeni >> zation.java:178) >> at >> org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbol(CharacterTokeniz >> ation.java:191) >> at >> org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenize >> Symbol(AlphabetManager.java:1276) >> at >> org.biojava.bio.seq.io.AbstractGenEmblFileFormer.formatTokenBlock(AbstractGe >> nEmblFileFormer.java:337) >> at >> org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:211) >> rethrown as org.biojava.bio.symbol.IllegalAlphabetException: DNA not >> tokenizing >> at 
>> org.biojava.bio.seq.io.EmblFileFormer.addSymbols(EmblFileFormer.java:224) >> at >> org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.ja >> va:125) >> rethrown as org.biojava.bio.BioError: An internal error occurred processing >> symbols >> at >> org.biojava.bio.seq.io.SeqIOEventEmitter.getSeqIOEvents(SeqIOEventEmitter.ja >> va:137) >> at >> org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:289) >> at >> org.biojava.bio.seq.io.EmblLikeFormat.writeSequence(EmblLikeFormat.java:253) >> at >> org.biojava.bio.seq.io.SeqIOTools.writeSwissprot(SeqIOTools.java:316) >> at org.biojava.bio.seq.io.SeqIOTools.seqToFile(SeqIOTools.java:1078) >> at >> org.biojava.bio.seq.io.SeqIOTools.biojavaToFile(SeqIOTools.java:870) >> at com.exsar.test.SerializeTest.main(SerializeTest.java:24) >> >> >> >> -----Original Message----- >> From: Mark Southern [mailto:msouthern@exsar.com] >> Sent: Monday, September 29, 2003 2:01 PM >> Cc: 'biojava-l@biojava.org' >> Subject: Sequence serialization exception >> >> >> I am getting the following exception when trying to serialize a protein >> sequence. I am using biojava 1.3. Can anyone please explain to me >>why? >> >> Many thanks, >> >> Mark. 
>> >> >> java.io.InvalidObjectException: Couldn't resolve symbol:SER >> at >> org.biojava.bio.symbol.AlphabetManager$WellKnownAtomicSymbol$OPH.readResolve >> (AlphabetManager.java:1441) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 >> ) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:911) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1655) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1603) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1271) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) >> at >> org.biojava.bio.seq.impl.SimpleSequence.readObject(SimpleSequence.java:119) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 >> ) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) >> at >> 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:452) >> at org.biojava.bio.seq.ViewSequence.readObject(ViewSequence.java:93) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39 >> ) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) >> at java.util.HashMap.readObject(HashMap.java:985) >> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) 
>> at >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) >> at java.util.HashMap.readObject(HashMap.java:986) >> at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl >> .java:25) >> at java.lang.reflect.Method.invoke(Method.java:324) >> at >> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:824) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845) >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769) >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646) >> at >> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274) >> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324) >> at com.exsar.hdex.model.calc.Test.main(Test.java:104) >> >> >> >> >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l@biojava.org >> http://biojava.org/mailman/listinfo/biojava-l >> >> >> >>====================================================================== >>= 
From ahmed at foobox.com Sun Oct 26 20:21:37 2003 From: ahmed at foobox.com (Ahmed Moustafa) Date: Sun Oct 26 20:18:28 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm In-Reply-To: References: Message-ID: <3F9C7321.2050705@foobox.com> Hi Mark, Which object does hold the output of the alignment?

Thanks,

Ahmed

Schreiber, Mark wrote:

>Hi -
>
>Because we have traditionally used HMMs there were no scoring matrices per se, just transition and emission probabilities.
>
>A scoring matrix would be easy to achieve. I would suggest you start with an interface called ScoringMatrix that would have as a minimum methods like:
>
>String getName(); which would return Blosum60 or similar depending on the implementation.
>int getScore(Symbol s, Symbol substitute) which returns the int score for the substitution.
>
>You might want to also make a ScoringMatrixFactory that can generate appropriate instances of the ScoringMatrix from an XML file or files that could be included in the Distribution.
>
>- Mark
>
>
> -----Original Message-----
> From: Ahmed Moustafa [mailto:ahmed@arbornet.org]
> Sent: Tue 14/10/2003 6:45 p.m.
> To: Schreiber, Mark
> Cc: biojava-dev@biojava.org
> Subject: RE: [Biojava-dev] Java implementation of Smith-Waterman algorithm
>
> Hi,
>
> What are the Biojava's objects for the scoring matrices e.g. BLOSUMs and
> PAMs?
>
> Thanks,
>
> Ahmed
>
> On Mon, 13 Oct 2003, Schreiber, Mark wrote:
>
> > Hi -
> >
> > Probably just convert to use BioJava Symbol and SymbolList objects. Probably everything else is OK.
> >
> > - Mark
> >
> > > -----Original Message-----
> > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org]
> > > Sent: Sunday, 12 October 2003 7:06 p.m.
> > > To: Schreiber, Mark
> > > Cc: biojava-dev@biojava.org
> > > Subject: RE: [Biojava-dev] Java implementation of
> > > Smith-Waterman algorithm
> > >
> > > Hi Mark,
> > >
> > > I believe the current API is reusable. Is it necessary to
> > > convert the already existing implementation?
> > >
> > > Anyway, how can I convert my classes to biojava?
> > >
> > > Thanks!
> > >
> > > Ahmed
> > >
> > > On Sun, 12 Oct 2003, Schreiber, Mark wrote:
> > >
> > > > Hi -
> > > >
> > > > We have traditionally done pairwise alignments using HMMs. However
> > > > there have been numerous requests for an implementation of
> > > > Smith-Waterman. If you want some help converting your classes to
> > > > biojava give us a yell on the list.
> > > >
> > > > - Mark
> > > >
> > > > -----Original Message-----
> > > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org]
> > > > Sent: Sun 12/10/2003 9:19 a.m.
> > > > To: biojava-dev@biojava.org
> > > > Cc:
> > > > Subject: [Biojava-dev] Java implementation of Smith-Waterman
> > > > algorithm
> > > >
> > > > Hello,
> > > >
> > > > I am working on a Java package of implementations of sequence alignment
> > > > algorithms. I have released an implementation of Smith-Waterman
> > > > algorithm with Gotoh's improvement. The time complexity is O(n^2) and the
> > > > space complexity is O(m * n + n).
> > > >
> > > > The package name is JAligner and it is hosted at sourceforge.net.
> > > > There is a front-end demo using Swing
> > > > and Java Web Start.
> > > >
> > > > Could JAligner be incorporated into the BioJava project?
> > > >
> > > > Best Regards,
> > > >
> > > > Ahmed

From mark.schreiber at agresearch.co.nz Mon Oct 27 17:59:47 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Mon Oct 27 17:56:57 2003
Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm
Message-ID:

Hi -

This all depends on how you want to do the output. You could produce an instance of the Alignment interface (or more likely one of its subinterfaces) that gives the alignment between the two objects and use the Annotation of the alignment to store things like the score and the matrix used.

Another possibility would be to return some sort of result object that would basically be a data structure holding the info you might have put in the Annotation as well as the Alignment.

If you are searching a whole database you may need to return several Alignments.
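
The "result object" option can be sketched in plain Java. This is only an illustration under stated assumptions: AlignmentResult, bestFirst and every field name are hypothetical, the aligned rows are plain Strings standing in for a BioJava Alignment, and the scores are placeholder values rather than real alignment scores.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class AlignmentResultSketch {

    /**
     * Hypothetical result object: the aligned rows plus the metadata that
     * could otherwise be stored in the alignment's Annotation.
     */
    public static final class AlignmentResult {
        public final String alignedQuery;    // gapped query row
        public final String alignedSubject;  // gapped subject row
        public final int score;              // alignment score
        public final String matrixName;      // e.g. "BLOSUM62"

        public AlignmentResult(String alignedQuery, String alignedSubject,
                               int score, String matrixName) {
            if (alignedQuery.length() != alignedSubject.length()) {
                throw new IllegalArgumentException("aligned rows must have equal length");
            }
            this.alignedQuery = alignedQuery;
            this.alignedSubject = alignedSubject;
            this.score = score;
            this.matrixName = matrixName;
        }

        @Override
        public String toString() {
            return matrixName + " score=" + score + "\n" + alignedQuery + "\n" + alignedSubject;
        }
    }

    /** Returns a new list with the highest-scoring hit first; the input list is not modified. */
    public static List<AlignmentResult> bestFirst(List<AlignmentResult> hits) {
        List<AlignmentResult> sorted = new ArrayList<AlignmentResult>(hits);
        Collections.sort(sorted, new Comparator<AlignmentResult>() {
            public int compare(AlignmentResult x, AlignmentResult y) {
                return Integer.compare(y.score, x.score);
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        // A database search yields one result per hit (placeholder data).
        List<AlignmentResult> hits = new ArrayList<AlignmentResult>();
        hits.add(new AlignmentResult("AW-HEAE", "AWGHE-E", 9, "PAM250"));
        hits.add(new AlignmentResult("AWGHE-E", "AWGHE-E", 16, "BLOSUM62"));
        for (AlignmentResult hit : bestFirst(hits)) {
            System.out.println(hit);
        }
    }
}
```

Returning a list of such objects covers the whole-database case; whether the metadata belongs here or in the Annotation is the design choice left open above.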

- Mark

-----Original Message-----
From: Ahmed Moustafa [mailto:ahmed@foobox.com]
Sent: Mon 27/10/2003 2:21 p.m.
To: Schreiber, Mark
Cc: biojava-dev@biojava.org
Subject: Re: [Biojava-dev] Java implementation of Smith-Waterman algorithm

Hi Mark,

Which object does hold the output of the alignment?

Thanks,

Ahmed

From mark.schreiber at agresearch.co.nz Mon Oct 27 21:18:02 2003
From: mark.schreiber at agresearch.co.nz (Schreiber, Mark)
Date: Mon Oct 27 21:15:43 2003
Subject: [Biojava-dev] AlphabetManager and Sequence serialization
Message-ID:

Hi all -

There have been problems for a while with the AlphabetManager and serialization of protein Sequences. This now appears to be fixed. The solution involved using LifeScienceIdentifiers to uniquely identify Symbols.

There should be no noticeable changes for most people. Developers will notice that the AlphabetManager.xml file now uses LSIDs to name symbols. No change is made to Symbol.getName() return values. Users who supply custom alphabets as XML will need to use an LSID.

The use of LSIDs now avoids the possibility of name collisions. Maybe we should use them for other things we register (e.g. Alphabets, which have no protection against name collisions).

Other changes: AlphabetManager.symbolForName(String name) is now deprecated; hey, it never worked properly anyway. Actually it works better now than it ever did, but it is only guaranteed to work for symbols with LSID domains of biojava.org, which should only be the core ones; don't name custom symbols with that domain. The new method of choice is symbolForLifeScienceID(LifeScienceIdentifier lsid). Only serious developers will ever really need these methods though.

All the relevant tests seem to pass; most importantly, serialization works with all the core alphabets and they all behave as expected. I will be committing some more unit tests over the next few days to ensure this continues.

One last note. DNATools.a() == RNATools.a(). I'm not sure if this was ever the desired behaviour but that is how it works in biojava1.3. If it isn't, it shouldn't be too hard to fix, but we should put in a unit test one way or the other to define the behaviour.
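
The kind of test meant here can be sketched without the BioJava classes. The following is a self-contained stand-in, not the real API: Sym, Dna, Rna, symbolFor and the "urn:lsid:example.org:symbol:" identifiers are all hypothetical. It only shows how one registry keyed by a unique identifier makes the DNA and RNA adenine the same instance while keeping thymine and uracil out of each other's alphabet.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SharedSymbolSketch {

    /** Hypothetical stand-in for a symbol; only object identity matters here. */
    public static final class Sym {
        public final String id;
        public final String name;
        Sym(String id, String name) { this.id = id; this.name = name; }
    }

    // Assumed identifier prefix, for illustration only.
    private static final String PREFIX = "urn:lsid:example.org:symbol:";

    // One canonical instance per identifier, so equal names cannot collide.
    private static final Map<String, Sym> REGISTRY = new HashMap<String, Sym>();

    public static synchronized Sym symbolFor(String id, String name) {
        Sym s = REGISTRY.get(id);
        if (s == null) {
            s = new Sym(id, name);
            REGISTRY.put(id, s);
        }
        return s;
    }

    private static Sym core(String name) { return symbolFor(PREFIX + name, name); }

    /** Stand-in for a DNA helper class. */
    public static final class Dna {
        public static Sym a() { return core("adenine"); }
        public static Sym t() { return core("thymine"); }
        public static Set<Sym> alphabet() {
            return new HashSet<Sym>(Arrays.asList(a(), core("cytosine"), core("guanine"), t()));
        }
    }

    /** Stand-in for an RNA helper class. */
    public static final class Rna {
        public static Sym a() { return core("adenine"); }
        public static Sym u() { return core("uracil"); }
        public static Set<Sym> alphabet() {
            return new HashSet<Sym>(Arrays.asList(a(), core("cytosine"), core("guanine"), u()));
        }
    }

    public static void main(String[] args) {
        System.out.println("Dna.a() == Rna.a(): " + (Dna.a() == Rna.a()));             // true
        System.out.println("DNA contains u:     " + Dna.alphabet().contains(Rna.u())); // false
        System.out.println("RNA contains t:     " + Rna.alphabet().contains(Dna.t())); // false
    }
}
```

Against the real library the same three checks would be written as assertions in a unit test, so that whichever behaviour is chosen is at least pinned down.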

DNA still doesn't contain RNATools.u() and RNA doesn't contain DNATools.t(), which is desirable.

The changes are now in CVS for biojava-live and will be on the biojava1.3 branch shortly. Please yell if you notice strangeness.

- Mark

From tamir at imp.univie.ac.at Wed Oct 29 16:54:18 2003
From: tamir at imp.univie.ac.at (Ido M. Tamir)
Date: Wed Oct 29 16:50:39 2003
Subject: [Biojava-dev] Problem with SymbolListCharSequence and Regex
Message-ID: <200310292254.18149.tamir@imp.univie.ac.at>

Hi,
could this be a bug?

The regex captured group returned from a
SymbolListCharSequence is 1 char more extended
to the right than expected.

Thank you very much for
your time and effort.

Ido M. Tamir


Output for the testcase below:

string: C
symbol: ca gcat

---testcase:


package mf;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.io.SymbolListCharSequence;
import org.biojava.bio.symbol.SymbolList;


public class TestRegex {
  public static void main(String[] args) {
    try {
      Pattern p = Pattern.compile("C", Pattern.CASE_INSENSITIVE);
      String strSeq = "GCAT";
      SymbolList symSeq = DNATools.createDNA(strSeq);
      Matcher m = p.matcher( strSeq );
      if( m.find() ){
        System.out.println( "string: " + m.group() );
      }
      m = p.matcher( new SymbolListCharSequence(symSeq ));
      if( m.find() ){
        System.out.println( "symbol: " + m.group() + " " + new SymbolListCharSequence(symSeq ));
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

From matthew_pocock at yahoo.co.uk Thu Oct 30 06:02:15 2003
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Oct 30 06:05:05 2003
Subject: [Biojava-dev] Problem with SymbolListCharSequence and Regex
In-Reply-To: <200310292254.18149.tamir@imp.univie.ac.at>
References: <200310292254.18149.tamir@imp.univie.ac.at>
Message-ID: <3FA0EFB7.6080006@yahoo.co.uk>

Hi,

This is a coordinate systems problem. Strings index from 0 to length-1, and ranges are inclusive of the min index and exclusive of the max index. Sequences index from 1 to length and ranges are inclusive of min index and inclusive of max index. There was a bug in SymbolListCharSequence where code wasn't taking this into account. Now fixed in CVS.

Matthew

  public CharSequence subSequence(int start, int end) {
    return new SymbolListCharSequence(syms.subList(start + 1, end), // was end + 1
                                      alphaTokens);
  }

Ido M. Tamir wrote:

>Hi,
>could this be a bug ?
>
>The regex captured group returned from a
>SymbolListCharSequence is 1 char more extended
>to the right than expected.
>
>Thank you very much for
>your time and effort.
>
>Ido M.
Tamir
>
>
>Output for the testcase below:
>
>string: C
>symbol: ca gcat
>
>---testcase:
>
>
>package mf;
>
>import java.util.regex.Matcher;
>import java.util.regex.Pattern;
>
>import org.biojava.bio.seq.DNATools;
>import org.biojava.bio.seq.io.SymbolListCharSequence;
>import org.biojava.bio.symbol.SymbolList;
>
>
>public class TestRegex {
>  public static void main(String[] args) {
>    try {
>      Pattern p = Pattern.compile("C", Pattern.CASE_INSENSITIVE);
>      String strSeq = "GCAT";
>      SymbolList symSeq = DNATools.createDNA(strSeq);
>      Matcher m = p.matcher( strSeq );
>      if( m.find() ){
>        System.out.println( "string: " + m.group() );
>      }
>      m = p.matcher( new SymbolListCharSequence(symSeq ));
>      if( m.find() ){
>        System.out.println( "symbol: " + m.group() + " " + new
>SymbolListCharSequence(symSeq ));
>      }
>    } catch (Exception e) {
>      e.printStackTrace();
>    }
>  }
>}
>
>
>_______________________________________________
>biojava-dev mailing list
>biojava-dev@biojava.org
>http://biojava.org/mailman/listinfo/biojava-dev
>

From kdj at sanger.ac.uk Thu Oct 30 07:00:09 2003
From: kdj at sanger.ac.uk (Keith James)
Date: Thu Oct 30 07:00:10 2003
Subject: [Biojava-dev] Problem with SymbolListCharSequence and Regex
In-Reply-To: <3FA0EFB7.6080006@yahoo.co.uk>
References: <200310292254.18149.tamir@imp.univie.ac.at> <3FA0EFB7.6080006@yahoo.co.uk>
Message-ID:

>>>>> "Matthew" == Matthew Pocock writes:

[...]
    Matthew> There was a bug in SymbolListCharSequence where code
    Matthew> wasn't taking this into account. Now fixed in CVS.

Thanks. You got there just before me. I've added unit tests.

--

- Keith James Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -
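
The coordinate conversion behind the SymbolListCharSequence fix reported earlier in this archive can be shown with a self-contained sketch. OneBasedSeq and SeqCharSequence are hypothetical stand-ins, not the BioJava classes: the first models a sequence indexed from 1 with inclusive ranges, the second wraps it as a java.lang.CharSequence, which is indexed from 0 with an exclusive end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoordinateSketch {

    /** Stand-in for a sequence indexed 1..length with inclusive ranges. */
    public static final class OneBasedSeq {
        private final String tokens;
        public OneBasedSeq(String tokens) { this.tokens = tokens; }
        public int length() { return tokens.length(); }
        public char tokenAt(int index) { return tokens.charAt(index - 1); }
        /** Both min and max are inclusive; max == min - 1 yields an empty sequence. */
        public OneBasedSeq subList(int min, int max) {
            return new OneBasedSeq(tokens.substring(min - 1, max));
        }
        public String tokenString() { return tokens; }
    }

    /** CharSequence view: 0-based, start inclusive, end exclusive. */
    public static final class SeqCharSequence implements CharSequence {
        private final OneBasedSeq seq;
        public SeqCharSequence(OneBasedSeq seq) { this.seq = seq; }
        public int length() { return seq.length(); }
        public char charAt(int index) { return seq.tokenAt(index + 1); }
        public CharSequence subSequence(int start, int end) {
            // 0-based [start, end) covers the same positions as 1-based [start + 1, end].
            // Using end + 1 here would return one token too many on the right.
            return new SeqCharSequence(seq.subList(start + 1, end));
        }
        @Override
        public String toString() { return seq.tokenString(); }
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("C", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(new SeqCharSequence(new OneBasedSeq("gcat")));
        if (m.find()) {
            System.out.println("symbol: " + m.group()); // prints "symbol: c"
        }
    }
}
```

Only the start index shifts by one; the exclusive 0-based end and the inclusive 1-based max happen to be the same number, which is why the extra + 1 on end produced the "ca" match in the report.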
From ahmed at arbornet.org Tue Oct 14 22:05:33 2003 From: ahmed at arbornet.org (Ahmed Moustafa) Date: Mon Nov 3 08:31:16 2003 Subject: [Biojava-dev] Java implementation of Smith-Waterman algorithm In-Reply-To: Message-ID: <20031014220237.C7332-200000@m-net.arbornet.org> In my implementation, I store the scoring matrices in a hash table of two-dimensional arrays; the keys of the hash table are the names of the scoring matrices, e.g. BLOSUM62 and PAM250. I avoided using objects for the scoring matrices to save the overhead of method calls. The Jar file contains about 70 scoring matrices from the NCBI site. What do you think of keeping the Jar file as it is and writing a wrapper class to interface between Biojava and JAligner? On Tue, 14 Oct 2003, Schreiber, Mark wrote: > Hi - > > Because we have traditionally used HMMs there were no scoring matrices per se, just transition and emission probabilities. > > A scoring matrix would be easy to achieve. I would suggest you start with an interface called ScoringMatrix that would have as a minimum methods like: > > String getName(); which would return Blosum60 or similar depending on the implementation. > int getScore(Symbol s, Symbol substitute) which returns the int score for the substitution. > > You might want to also make a ScoringMatrixFactory that can generate appropriate instances of the ScoringMatrix from an XML file or files that could be included in the Distribution. > > - Mark > > > -----Original Message----- > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > Sent: Tue 14/10/2003 6:45 p.m. > To: Schreiber, Mark > Cc: biojava-dev@biojava.org > Subject: RE: [Biojava-dev] Java implementation of Smith-Waterman algorithm > > > > Hi, > > What are Biojava's objects for the scoring matrices, e.g. BLOSUMs and > PAMs? > > Thanks, > > Ahmed > > On Mon, 13 Oct 2003, Schreiber, Mark wrote: > > > Hi - > > > > Probably just convert to use BioJava Symbol and SymbolList objects. Probably everything else is OK. 
> > > > - Mark > > > > > > > -----Original Message----- > > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > > Sent: Sunday, 12 October 2003 7:06 p.m. > > > To: Schreiber, Mark > > > Cc: biojava-dev@biojava.org > > > Subject: RE: [Biojava-dev] Java implementation of > > > Smith-Waterman algorithm > > > > > > > > > Hi Mark, > > > > > > I believe the current API is reusable. Is it necessary to > > > convert the already existing implementation? > > > > > > Anyway, how can I convert my classes to biojava? > > > > > > Thanks! > > > > > > Ahmed > > > > > > > > > On Sun, 12 Oct 2003, Schreiber, Mark wrote: > > > > > > > Hi - > > > > > > > > We have traditionally done pairwise alignments using HMMs. However > > > > there have been numerous requests for an implementation of > > > > Smith-Waterman. If you want some help converting your classes to > > > > biojava give us a yell on the list. > > > > > > > > - Mark > > > > > > > > > > > > -----Original Message----- > > > > From: Ahmed Moustafa [mailto:ahmed@arbornet.org] > > > > Sent: Sun 12/10/2003 9:19 a.m. > > > > To: biojava-dev@biojava.org > > > > Cc: > > > > Subject: [Biojava-dev] Java implementation of Smith-Waterman > > > > algorithm > > > > > > > > > > > > > > > > Hello, > > > > > > > > I am working on a Java package of implementations of > > > sequence alignment > > > > algorithms. I have released an implementation of Smith-Waterman > > > > algorithm with Gotoh's improvement. The time complexity > > > is O(n^2) and the > > > > space complexity is O(m * n + n). > > > > > > > > The package name is JAligner and it is hosted at sourceforge.net > > > > . There is a front-end > > > demo using Swing > > > > and Java Web Start. > > > > > > > > Could JAligner be incorporated into the BioJava project? 
> > > > > > > > Best Regards, > > > > > > > > Ahmed > > > > > > > > ======================================================================= > > Attention: The information contained in this message and/or attachments > > from AgResearch Limited is intended only for the persons or entities > > to which it is addressed and may contain confidential and/or privileged > > material. Any review, retransmission, dissemination or other use of, or > > taking of any action in reliance upon, this information by persons or > > entities other than the intended recipients is prohibited by AgResearch > > Limited. If you have received this message in error, please notify the > > sender immediately. > > ======================================================================= > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://portal.open-bio.org/pipermail/biojava-dev/attachments/20031015/5faa20a4/Matrices.java.html
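The ScoringMatrix interface suggested in the quoted thread and the hash table of two-dimensional arrays described at the top of this message could meet in a thin wrapper. What follows is only a minimal sketch under stated assumptions, not BioJava or JAligner code: symbols are plain char values rather than BioJava Symbol objects, and ArrayScoringMatrix, register, forName, toyFactory, and the TOY_DNA matrix are hypothetical names with toy values invented for illustration. Only getName and getScore come from the thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Interface with the two methods proposed in the thread (char stands in for Symbol). */
interface ScoringMatrix {
    String getName();
    int getScore(char symbol, char substitute);
}

/** Hypothetical adapter: exposes an existing 2-D int array through ScoringMatrix. */
final class ArrayScoringMatrix implements ScoringMatrix {
    private final String name;
    private final String alphabet;  // row/column order of the array
    private final int[][] scores;

    ArrayScoringMatrix(String name, String alphabet, int[][] scores) {
        if (scores.length != alphabet.length()) {
            throw new IllegalArgumentException("Matrix size does not match alphabet");
        }
        this.name = name;
        this.alphabet = alphabet;
        this.scores = scores;
    }

    public String getName() {
        return name;
    }

    public int getScore(char symbol, char substitute) {
        int row = alphabet.indexOf(symbol);
        int col = alphabet.indexOf(substitute);
        if (row < 0 || col < 0) {
            throw new IllegalArgumentException("Symbol not in alphabet " + alphabet);
        }
        return scores[row][col];
    }
}

/** Hypothetical factory: a name-keyed hash table, as in the approach described above. */
final class ScoringMatrixFactory {
    private final Map<String, ScoringMatrix> byName = new HashMap<String, ScoringMatrix>();

    void register(ScoringMatrix matrix) {
        byName.put(matrix.getName(), matrix);
    }

    ScoringMatrix forName(String name) {
        ScoringMatrix matrix = byName.get(name);
        if (matrix == null) {
            throw new IllegalArgumentException("Unknown scoring matrix: " + name);
        }
        return matrix;
    }
}

class ScoringMatrixDemo {
    /** Builds a factory holding one toy matrix; the values are NOT BLOSUM or PAM data. */
    static ScoringMatrixFactory toyFactory() {
        int[][] toy = {
            { 2, -1, -1, -1},
            {-1,  2, -1, -1},
            {-1, -1,  2, -1},
            {-1, -1, -1,  2}
        };
        ScoringMatrixFactory factory = new ScoringMatrixFactory();
        factory.register(new ArrayScoringMatrix("TOY_DNA", "ACGT", toy));
        return factory;
    }

    public static void main(String[] args) {
        ScoringMatrix matrix = toyFactory().forName("TOY_DNA");
        System.out.println(matrix.getName() + " A/A = " + matrix.getScore('A', 'A'));
        System.out.println(matrix.getName() + " A/G = " + matrix.getScore('A', 'G'));
    }
}
```

In a sketch like this the arrays stay untouched inside the adapter, so an alignment loop that cares about method-call overhead could still read the raw array directly, while code written against the interface would go through getScore.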