From ola.spjuth at farmbio.uu.se Tue May 2 09:15:18 2006
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Tue, 02 May 2006 15:15:18 +0200
Subject: [Biojava-l] BioJava-X parsing of RichSequences
Message-ID: <1146575718.5603.23.camel@localhost.localdomain>

Hi,

Implementing a Biojava reader/parser for sequences in Bioclipse [1,2] I
have come up with a few questions:

1) I'd like to use Biojava-X with Bioclipse. Are there any problems
running it with Java 1.5 (as is required by Bioclipse)?

2) I would propose the addition of a readStream(...) method in
RichSequence.IOTools in addition to readFile(...). For the Bioclipse
project it would be most useful to be able to guess the format of a
Stream. As IOTools is marked final it cannot be subclassed.

3) Is HashBioEntryDB a suitable base object for storing 1-N
RichSequences in memory or should I use RichSequence[]? Which solution
has the simplest toByte() method for writing to e.g. a File?

So, basically I am looking for the most convenient way of doing:

i) Read byte[] (from a File containing 1-N sequences) into a base
object in memory (HashBioEntryDB or RichSequence[])
ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then
later to File using Bioclipse-methods)

Cheers,

.../Ola

[1] http://www.bioclipse.net
[2] http://wiki.bioclipse.net

From mark.schreiber at novartis.com Tue May 2 21:19:02 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 3 May 2006 09:19:02 +0800
Subject: [Biojava-l] BioJava-X parsing of RichSequences
Message-ID:

Ola Spjuth
Sent by: biojava-l-bounces at lists.open-bio.org
05/02/2006 09:15 PM

To: biojava-l
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-l] BioJava-X parsing of RichSequences

> 1) I'd like to use Biojava-X with Bioclipse. Are there any problems
> running it with Java 1.5 (as is required by Bioclipse)?

Shouldn't be a problem. Biojava-X doesn't use any Java 1.5 features, but
JDK 1.5 (JRE 5.0) can compile and run Biojava.

> 2) I would propose the addition of a readStream(...) method in
> RichSequence.IOTools in addition to readFile(...). For the Bioclipse
> project it would be most useful to be able to guess the format of a
> Stream. As IOTools is marked final it cannot be subclassed.

The reason you cannot do this is that format guessing involves reading
some data from the source and then either pushing it back or re-opening
the source once the format has been guessed. You cannot guarantee a
pushback to a Stream and you cannot guarantee that you could re-open it
again. As a hack you could read the stream into a temp file and pass
that to IOTools. You may also be able to read it into a byte array
buffer and read that as a Stream.

> 3) Is HashBioEntryDB a suitable base object for storing 1-N
> RichSequences in memory or should I use RichSequence[]? Which solution
> has the simplest toByte() method for writing to e.g. a File?
>
> So, basically I am looking for the most convenient way of doing:
>
> i) Read byte[] (from a File containing 1-N sequences) into a base
> object in memory (HashBioEntryDB or RichSequence[])
> ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then
> later to File using Bioclipse-methods)

The simplest way to read in and write out directly is to take the
RichSequenceIterator you get from the IOTools read method and pass it
directly to the IOTools output method of choice. If you want to
manipulate data in between, a RichSequence[] is probably smaller in
memory but not as user friendly as a DB object.

You should also be aware that RichSequenceIterators are lazy, i.e. they
only read data from a file for each request to nextRichSequence(), so
you can manipulate each sequence as it comes in and not have to worry
about running out of memory.
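
The in-memory workaround mentioned above (read the whole stream into a
byte array, then hand out fresh streams over it) could look roughly like
the sketch below. It uses only java.io; the class and method names
(StreamBuffering, readFully, reopen) are made up for illustration and are
not part of BioJava. It assumes the data is small enough to fit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not a BioJava class.
class StreamBuffering {

    /** Copies everything from the source stream into a byte array. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns -1 once the source is exhausted
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /**
     * Returns a new stream positioned at the start of the buffered data.
     * It can be called once for format guessing and again for parsing.
     */
    static InputStream reopen(byte[] data) {
        return new ByteArrayInputStream(data);
    }
}
```

A caller would buffer once with readFully(), inspect the first bytes via
one reopen() stream, and then pass a second reopen() stream to whichever
reader matches the detected format.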
Hope this helps,

- Mark

From richard.holland at ebi.ac.uk Wed May 3 04:38:38 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Wed, 03 May 2006 09:38:38 +0100
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To:
References:
Message-ID: <1146645518.3950.22.camel@texas.ebi.ac.uk>

Ah yes, I hadn't thought about that aspect. In which case, a
Stream-capable format-guesser is not going to be possible. But there's
nothing stopping Ola from reading/writing to Streams directly, as long
as he knows what format they're in.

It's also worth pointing out that the format guesser is not to be relied
on. It'll sometimes get it wrong and some formats it won't recognise at
all. I wouldn't rely on it - it's there for simple applications only.

cheers,
Richard

On Wed, 2006-05-03 at 09:19 +0800, mark.schreiber at novartis.com wrote:
> The reason you cannot do this is that format guessing involves reading
> some data from the source and then either pushing it back or re-opening
> the source once the format has been guessed. You cannot guarantee a
> pushback to a Stream and you cannot guarantee that you could re-open it
> again. As a hack you could read the stream into a temp file and pass
> that to IOTools. You may also be able to read it into a byte array
> buffer and read that as a Stream.
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

-- 
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From fpepin at cs.mcgill.ca Wed May 3 12:50:56 2006
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Wed, 03 May 2006 12:50:56 -0400
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <1146645518.3950.22.camel@texas.ebi.ac.uk>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk>
Message-ID: <1146675056.24875.71.camel@elm.mcb.mcgill.ca>

On Wed, 2006-05-03 at 09:38 +0100, Richard Holland wrote:
> Ah yes, I hadn't thought about that aspect.
> In which case, a Stream-capable format-guesser is not going to be
> possible. But there's nothing stopping Ola from reading/writing to
> Streams directly, as long as he knows what format they're in.

I would tend to disagree about the impossibility of using streams.

How much of the file is generally being read before the guess is made?
I'm thinking very little is needed, especially compared to how much
memory Java usually takes.

It would not be very difficult to save that first part of the stream and
then play it back once the guess is made.

I kind of like the idea of using streams, in cases where you are not
reading from a file. Having to write everything to a temporary file to
satisfy the API isn't a very appealing solution, I think.

I could code something up if people are interested.

Francois

From rhett at detailedbalance.net Wed May 3 13:44:14 2006
From: rhett at detailedbalance.net (Rhett Sutphin)
Date: Wed, 3 May 2006 12:44:14 -0500
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <1146675056.24875.71.camel@elm.mcb.mcgill.ca>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca>
Message-ID: <171BAAB6-D815-4EFC-8ACD-E0CEFAC407A1@detailedbalance.net>

On May 3, 2006, at 11:50 AM, Francois Pepin wrote:
> It would not be very difficult to save that first part of the stream
> and then play it back once the guess is made.

I encountered the same issue when writing the chromatogram reading code.
I wrote org.biojava.utils.io.CachingInputStream as a solution. It may be
useful as a starting point.

Rhett

From e.willighagen at science.ru.nl Wed May 3 13:54:23 2006
From: e.willighagen at science.ru.nl (Egon Willighagen)
Date: Wed, 3 May 2006 19:54:23 +0200
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <1146675056.24875.71.camel@elm.mcb.mcgill.ca>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca>
Message-ID: <200605031954.23470.e.willighagen@science.ru.nl>

On Wednesday 03 May 2006 18:50, Francois Pepin wrote:
> How much of the file is generally being read before the guess is made?
> I'm thinking very little is needed, especially compared to how much
> memory Java usually takes.

Generally not much. Jmol uses 16384 bytes.

> It would not be very difficult to save that first part of the stream and
> then play it back once the guess is made.

See how Jmol does it:

http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/adapter/smarter/Resolver.java?view=markup

> I kind of like the idea of using streams, in cases where you are not
> reading from a file. Having to write everything to a temporary file to
> satisfy the API isn't a very appealing solution, I think.
>
> I could code something up if people are interested.

An additional advantage is that you get .gz support in one go:

    // buffer for the first bytes of the stream (declaration added here;
    // the snippet as posted assumed abMagic was defined elsewhere)
    byte[] abMagic = new byte[4];
    BufferedInputStream bis = new BufferedInputStream((InputStream)t, 8192);
    InputStream is = bis;
    bis.mark(5);
    int countRead = 0;
    countRead = bis.read(abMagic, 0, 4);
    bis.reset();
    // 0x1F 0x8B is the gzip magic number
    if (countRead == 4 &&
        abMagic[0] == (byte)0x1F && abMagic[1] == (byte)0x8B)
      is = new GZIPInputStream(bis);

where t is your InputStream, and "is" is the stream to use after the
gzip check/unzip.
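
A self-contained variant of the gzip check above, as a sketch only: the
class and method names (GzipSniffer, maybeGunzip) are made up for
illustration and are not taken from Jmol or BioJava. It relies on
BufferedInputStream supporting mark()/reset(), which is what lets the
peeked bytes be "played back" to whatever reads the stream next.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of Jmol or BioJava.
class GzipSniffer {

    /**
     * Peeks at the first two bytes of the stream. If they are the gzip
     * magic number (0x1F 0x8B) the stream is wrapped in a GZIPInputStream;
     * otherwise the buffered stream is returned with no bytes consumed.
     */
    static InputStream maybeGunzip(InputStream raw) throws IOException {
        BufferedInputStream bis = new BufferedInputStream(raw, 8192);
        byte[] magic = new byte[2];
        bis.mark(2);                       // remember the current position
        int count = bis.read(magic, 0, 2); // may read fewer than 2 bytes
        bis.reset();                       // rewind to the marked position
        if (count == 2 && magic[0] == (byte) 0x1F && magic[1] == (byte) 0x8B) {
            return new GZIPInputStream(bis);
        }
        return bis;
    }
}
```

The same mark/peek/reset pattern would also work for sniffing a sequence
format, provided the mark limit is at least as large as the number of
bytes the guesser needs to look at.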
For the full working code, see again Jmol CVS:

http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/viewer/FileManager.java?view=markup

Egon

-- 
e.willighagen at science.ru.nl
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6

From richard.holland at ebi.ac.uk Thu May 4 05:02:10 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 04 May 2006 10:02:10 +0100
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <200605031954.23470.e.willighagen@science.ru.nl>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> <200605031954.23470.e.willighagen@science.ru.nl>
Message-ID: <1146733331.3955.0.camel@texas.ebi.ac.uk>

I have added the capability to guess the format of streams, and read
directly from them. See RichSequence.IOTools.readStream() for details.

In CVS biojava-live now.

cheers,
Richard

From ola.spjuth at farmbio.uu.se Thu May 4 05:25:16 2006
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Thu, 04 May 2006 11:25:16 +0200
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <1146733331.3955.0.camel@texas.ebi.ac.uk>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> <200605031954.23470.e.willighagen@science.ru.nl> <1146733331.3955.0.camel@texas.ebi.ac.uk>
Message-ID: <1146734717.5587.46.camel@localhost.localdomain>

Thank you very much! I shall update Bioclipse to use this for the next
release (>0.9.0).

Cheers,

.../Ola

From richard.holland at ebi.ac.uk Thu May 4 11:45:38 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 04 May 2006 16:45:38 +0100
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <1146736092.5587.64.camel@localhost.localdomain>
References: <1146575718.5603.23.camel@localhost.localdomain> <1146583993.3950.19.camel@texas.ebi.ac.uk> <1146608664.5603.37.camel@localhost.localdomain> <1146651509.3950.24.camel@texas.ebi.ac.uk> <1146736092.5587.64.camel@localhost.localdomain>
Message-ID: <1146757539.3955.6.camel@texas.ebi.ac.uk>

The UniProt file format has apparently changed since I wrote the parser,
and the date lines now take a different format:

DT   01-OCT-1994, integrated into UniProtKB/Swiss-Prot.
DT   27-APR-2001, sequence version 3.
DT   18-APR-2006, entry version 85.

These were not recognised by the parser and caused it to throw an
exception. UniProt also changed their Feature Table format. I've updated
the parser in CVS to (hopefully) cope with both changes, although it no
longer recognises the old format (which was the same as the EMBL
format). Can someone test it thoroughly please?

cheers,
Richard

On Thu, 2006-05-04 at 11:48 +0200, Ola Spjuth wrote:
> Richard,
>
> This is what I tried:
>
> Class.forName("org.biojavax.bio.seq.io.EMBLFormat");
> Class.forName("org.biojavax.bio.seq.io.EMBLxmlFormat");
> Class.forName("org.biojavax.bio.seq.io.FastaFormat");
> Class.forName("org.biojavax.bio.seq.io.GenbankFormat");
> Class.forName("org.biojavax.bio.seq.io.INSDseqFormat");
> Class.forName("org.biojavax.bio.seq.io.RichSequenceFormat");
> Class.forName("org.biojavax.bio.seq.io.UniProtFormat");
> Class.forName("org.biojavax.bio.seq.io.UniProtXMLFormat");
>
> Namespace ns = RichObjectFactory.getDefaultNamespace();
> RichSequenceIterator seqit;
> seqit = RichSequence.IOTools.readFile(new File(MyFilename),ns);
>
> ArrayList seqs=new ArrayList();
> while (seqit.hasNext()){
>     RichSequence rseq=null;
>     Sequence seq=null;
>     rseq = seqit.nextRichSequence();
>     if (rseq!=null)
>         seqs.add(rseq);
> }
>
> --
>
> Seems that seqit.hasNext() returns true, but seqit.nextRichSequence()
> throws an exception.
>
> It works with my Fasta-sequences but not with the attached UniProt
> sequence (or else I'm doing something wrong). The test-file was attached
> by Mark Southern (thanks Mark!) and works with biojavas SeqIOTools.
>
> Glad if you could have a look at it!
>
> Cheers,
>
> .../Ola
>
>
> On Wed, 2006-05-03 at 11:18 +0100, Richard Holland wrote:
> > Interesting - the code and file would be useful in trying to work out
> > what is happening.
> >
> > cheers,
> > Richard
> >
> > On Wed, 2006-05-03 at 00:24 +0200, Ola Spjuth wrote:
> > > Hi Richard,
> > >
> > > Thanks a lot, I really appreciate that! I think Bioclipse will serve as
> > > an excellent showcase for what can easily be achieved with Biojava.
> > >
> > > Another problem I found was that parsing of a UniprotFormat file
> > > resulted in no RichSequences while it worked with the old Biojava
> > > SeqIOtools. If you like I can provide the file and code used for my
> > > reading of it.
> > >
> > > Cheers,
> > >
> > > .../Ola
> > >
> > >
> > > On Tue, 2006-05-02 at 16:33 +0100, Richard Holland wrote:
> > > > Hi Ola. I'll look into implementing something that'll help you. Give me
> > > > a day or two and see what happens... :)
> > > >
> > > > cheers,
> > > > Richard

From n.haigh at sheffield.ac.uk Tue May 9 07:19:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 9 May 2006 12:19:29 +0100
Subject: [Biojava-l] Access to variables
Message-ID: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>

Apologies if this comes through more than once - I forgot to send in
plain text without attachments!

In case you don't know - I'm new to Java...

I'm working out an interface/class structure for part of an app I want
to convert from Perl to Java and I have a question about the best way to
provide access to variables to the client programmer:

Is it best to have variables you want the client programmer to access
just made public or is it best to provide access to them via a get/set
method? From my limited reading of "Thinking in Java" I would think it
best to hide the implementation from the user and provide methods to
access these variables e.g. setThreshold and getThreshold modify the
private variable threshold - is that correct or am I way off the mark!?

Thanks for any clarification.

Nath

----------------------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                                Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences    Mob: +44 (0)7742 533 569
University of Sheffield                    Fax: +44 (0)114 22 20002
Western Bank                               Web: www.bioinf.shef.ac.uk
Sheffield                                       www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------------------

---
avast! Antivirus: Outbound message clean.
Virus Database (VPS): 0615-2, 12/04/2006
Tested on: 09/05/2006 12:18:14
avast! - copyright (c) 1988-2006 ALWIL Software.
http://www.avast.com

From richard.holland at ebi.ac.uk Tue May 9 08:56:30 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Tue, 09 May 2006 13:56:30 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
References: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
Message-ID: <1147179391.3951.25.camel@texas.ebi.ac.uk>

Hi there.

Get/Set methods with private fields are by far the preferred way of
doing things. This ensures that the object gets to know whenever one of
its variables has changed.

For example, assume you had a class that represented a sequence, and one
of the methods in that class computed some expensive statistic on that
sequence and stored that statistic in another variable. If the sequence
itself changed then you'd need to recompute the statistic too. Without
get/set, there'd be no way of knowing the sequence had changed, and no
way of knowing when to recompute the statistic.

cheers,
Richard

From n.haigh at sheffield.ac.uk Tue May 9 09:09:58 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 9 May 2006 14:09:58 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <1147179391.3951.25.camel@texas.ebi.ac.uk>
Message-ID: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>

Well, I've jumped straight in and am already planning to use get/set
methods for most of my variables :o)

In my app I plan to have a multiple alignment displayed and the user
opts to calculate a consensus sequence as part of a larger process. The
user will also be able to make changes to the alignment. Therefore, if a
consensus sequence has already been calculated I'd like this to be
automatically updated to reflect the changes in the alignment. Do you
know of a small coded example of how this is done, i.e. in your example:
detecting if the sequence changed and processing a block of code if it
has?

Cheers
Nath

From smh1008 at cam.ac.uk Tue May 9 09:12:24 2006
From: smh1008 at cam.ac.uk (David Huen)
Date: 09 May 2006 14:12:24 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
References: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
Message-ID:

On May 9 2006, Nathan S. Haigh wrote:
> Is it best to have variables you want the client programmer to access
> just made public or is it best to provide access to them via a get/set
> method? From my limited reading of "Thinking in Java" I would think it
> best to hide the implementation from the user and provide methods to
> access these variables e.g. setThreshold and getThreshold modify the
> private variable threshold - is that correct or am I way off the mark!?

Breaking object encapsulation is generally a bad thing in OO programming
so, yes, avoid it when you can. We try to make it difficult to do so in
BioJava anyway :-).
Regards,
David

From richard.holland at ebi.ac.uk Tue May 9 09:28:28 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Tue, 09 May 2006 14:28:28 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>
References: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>
Message-ID: <1147181309.3951.41.camel@texas.ebi.ac.uk>

There is no easy way, but BioJava handles things like this using a
listener model. This means that some objects are EventListeners, whilst
others fire events to a central EventManager. The EventManager sends these
events to all EventListeners registered as being interested in that kind
of event.

The simplest form is:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;

    // Each public type below belongs in its own .java file.

    // An event: the object it happened to, plus the kind of event.
    public class Event {
        public final Object object;
        public final Object eventType;

        public Event(Object object, Object eventType) {
            this.object = object;
            this.eventType = eventType;
        }
    }

    public interface EventListener {
        public void eventOccurred(Event e);
    }

    public class EventManager {
        // Maps each event type to the List of listeners registered for it.
        private static final Map eventListeners = new HashMap();

        public static void registerEventListener(EventListener eventListener,
                                                 Object eventType) {
            if (!eventListeners.containsKey(eventType))
                eventListeners.put(eventType, new ArrayList());
            ((List)eventListeners.get(eventType)).add(eventListener);
        }

        public static void fireEvent(Event e) {
            List listeners = (List)eventListeners.get(e.eventType);
            if (listeners == null) return; // nobody registered for this type
            for (Iterator i = listeners.iterator(); i.hasNext(); )
                ((EventListener)i.next()).eventOccurred(e);
        }
    }

In your example, the class representing the alignment would fire an event
whenever the alignment changed, by calling EventManager.fireEvent() from
the method which made the change. For instance, assuming the method which
made the change was insertGap():

    public void insertGap(int gapPosition) {
        // do the work of inserting the gap here.
        ...
        ...
        // Fire an event.
        EventManager.fireEvent(new Event(this, "gapInserted"));
    }

In the class representing the consensus, which may or may not be the same
class as the alignment, you would do this:

    public class Consensus implements EventListener {
        private Alignment alignment;

        public Consensus(Alignment alignment) {
            this.alignment = alignment;
            this.updateConsensus();
            EventManager.registerEventListener(this, "gapInserted");
        }

        public void eventOccurred(Event e) {
            if (e.eventType.equals("gapInserted")) {
                this.updateConsensus();
            }
        }

        private void updateConsensus() {
            // do the updating here
            ...
            ...
        }
    }

This is a very simplistic example, but I hope you get the idea. There is
much more out there on the web - Wikipedia is a good starting point for
programming concepts such as these.

cheers,
Richard

On Tue, 2006-05-09 at 14:09 +0100, Nathan S. Haigh wrote:
> Well, I've jumped straight in and am already planning to use get/set
> methods for most of my variables :o)
>
> In my app I plan to have a multiple alignment displayed and the user opts
> to calculate a consensus sequence as part of a larger process. The user
> will also be able to make changes to the alignment. Therefore, if a
> consensus sequence has already been calculated I'd like this to be
> automatically updated to reflect the changes in the alignment. Do you
> know of a small coded example of how this is done i.e. in your example:
> detecting if the sequence changed and processing a block of code if it
> has.
>
> Cheers
> Nath

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From wendy.wong at gmail.com Tue May 9 11:46:03 2006
From: wendy.wong at gmail.com (wendy wong)
Date: Tue, 9 May 2006 16:46:03 +0100
Subject: [Biojava-l] ScoreType.Odds
Message-ID:

Hi,

I was wondering: if I use ScoreType.Odds for my HMM, is there a default
cutoff value?
Or does it just pick whichever state has the highest odds ratio? If it
uses a cutoff value, is there a way to set it?

thanks,
wendy

From mark.schreiber at novartis.com Tue May 9 23:54:07 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 10 May 2006 11:54:07 +0800
Subject: [Biojava-l] Access to variables
Message-ID:

Further to this I would add that sometimes get / set methods should not be
public. This is usually the case for set methods where you don't want the
possibility of something external to the class or package calling the set
method and messing things up for you.

For a set method to be accessible only inside the class itself you would
make it private. If you make it protected you have more options: it can
then also be called by subclasses and by other classes in the same
package. If you make it public you expose it to the world.

Basically, if you think your set method is not safe for general developers
to use under normal circumstances, or if it is only relevant to other
classes in your API, you should make it protected or private.

Hope that was not too confusing. Bloch's Effective Java is probably much
clearer.

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com
phone +65 6722 2973
fax +65 6722 2910

From n.haigh at sheffield.ac.uk Thu May 11 09:27:28 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 14:27:28 +0100
Subject: [Biojava-l] Creating an alignment object
Message-ID: <004001c674fe$a637d570$9f5ea78f@bmbpc196>

I'm new to Java and BioJava, but I've been having a play with writing an
interface and some classes for an app I'd like to write in Java.

The part I'm playing around with at the moment deals with alignments and
groups of alignment positions. What is the easiest/best way to create an
alignment that I can then play around with and generate Locations from? A
self-contained working example would be great because, as I said, I'm
really new to Java!

Cheers
Nath

----------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                               Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences   Mob: +44 (0)7742 533 569
University of Sheffield                   Fax: +44 (0)114 22 20002
Western Bank                              Web: www.bioinf.shef.ac.uk
Sheffield                                      www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------

From richard.holland at ebi.ac.uk Thu May 11 09:56:20 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 11 May 2006 14:56:20 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004001c674fe$a637d570$9f5ea78f@bmbpc196>
References: <004001c674fe$a637d570$9f5ea78f@bmbpc196>
Message-ID: <1147355780.3951.59.camel@texas.ebi.ac.uk>

BioJava itself cannot align sequences. It can only create objects that are
representations of alignments generated by third-party software.

However, there is a third-party addon to BioJava called Strap, which can
actually do the alignment work itself from within your Java program and
return a BioJava alignment object that represents the results. It is
available for download, along with an example of how to use it, from here:

http://www.charite.de/bioinf/strap/biojavaInAnger_SequenceAligner.html

cheers,
Richard

On Thu, 2006-05-11 at 14:27 +0100, Nathan S. Haigh wrote:
> I'm new to Java and Biojava, but I've been having a play with writing an
> interface and some classes for an app I'd like to write in Java.
>
> The part I'm playing around with at the moment deals with alignments and
> groups of alignment positions. What is the easiest/best way to create an
> alignment that I can then play around with and generate Locations from? A
> self contained working example would be great because as I said, I'm really
> new to java!
>
> Cheers
> Nath

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Thu May 11 10:26:59 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 15:26:59 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <1147355780.3951.59.camel@texas.ebi.ac.uk>
Message-ID: <004901c67506$f6e03460$9f5ea78f@bmbpc196>

Sorry, I think I may have been unclear.

For example I have an alignment file in FASTA format which looks like:

>seq1
ACGTTGCA
>seq2
ATGTTGCG
>seq3
AGGTTGCT
>seq4
AGGTTGCC

How do I get this into an alignment object? Or, better still, can I create
an alignment object without specifying an alignment file, but somehow
creating the alignment by hand? Maybe create a sequence object for each of
the above sequences and add them to an alignment object?

Something like that! :o)

Nath

From richard.holland at ebi.ac.uk Thu May 11 10:43:23 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 11 May 2006 15:43:23 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
References: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
Message-ID: <1147358603.3951.73.camel@texas.ebi.ac.uk>

Andreas Prlic just pointed out to me that... "Andreas Draeger provided the
org.biojava.bio.alignment classes, where one can do e.g. Smith-Waterman
and Needleman-Wunsch...".

Having just had a look at this, it's very powerful and you should be able
to implement SequenceAlignment with your own algorithm to construct a
FlexibleAlignment object, if that's what you're ultimately intending to
do.

Basically you add sequences to (or remove them from) a FlexibleAlignment,
then insert gaps and deletions as necessary, all from the
SequenceAlignment implementation, which is passed as input a set of
Sequence objects to align.

cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Thu May 11 10:52:27 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 15:52:27 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <1147358603.3951.73.camel@texas.ebi.ac.uk>
Message-ID: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196>

Nope, I don't need to generate an alignment, I already have an alignment
in a file created by third party software (clustalw). In fact, the app I'd
eventually like to have written in Java would include some sort of wrapper
for clustalw in order to construct the alignments from a set of unaligned
sequences, but algorithms implemented in Biojava would also be a welcome
addition to the app.

But first things first. If I didn't have any sequences or an alignment in
any files, what is the easiest way to get an alignment object in Java to
have a play around with? Is there a way to just "magically" create a
default alignment of say 5 sequences with 20 positions?

Nath

From md5 at sanger.ac.uk Thu May 11 10:51:08 2006
From: md5 at sanger.ac.uk (Mutlu Dogruel)
Date: Thu, 11 May 2006 15:51:08 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
References: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
Message-ID: <44634F5C.8000705@sanger.ac.uk>

Hi Nathan,

You need something like this:

    // Needs java.io.BufferedReader and java.io.FileReader, plus BioJava's
    // FastaAlignmentFormat and Alignment (check the javadoc of your
    // BioJava version for their packages).
    BufferedReader br = new BufferedReader(new FileReader("file.txt"));
    FastaAlignmentFormat faf = new FastaAlignmentFormat();
    Alignment aligned = faf.read( br );
    br.close();

Cheers
M.
>http://www.avast.com > > > > > >_______________________________________________ >Biojava-l mailing list - Biojava-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/biojava-l > > > From richard.holland at ebi.ac.uk Fri May 12 04:34:41 2006 From: richard.holland at ebi.ac.uk (Richard Holland) Date: Fri, 12 May 2006 09:34:41 +0100 Subject: [Biojava-l] Creating an alignment object In-Reply-To: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196> References: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196> Message-ID: <1147422881.19855.11.camel@texas.ebi.ac.uk> Sorry for the delay in replying - I had to leave work a bit early yesterday. > Nope, I don't need to generate an alignment, I already have an alignment in > a file created by third party software (clustalw). There is nothing that I know of in BioJava that reads ClustalW files directly into Alignment objects. (If someone else knows different, please correct me). There are certainly methods in BioJava which read the alignments from ClustalW into a set of String objects, each one representing a member sequence (see SequenceAlignmentSAXParser), but I don't know of anything more detailed than that. The third-party package called Strap which I mentioned yesterday happily reads/writes many of the major alignment formats, and has wrappers for running ClustalW and other aligners programatically and reading back in the results, so it is definitely worth a look. You can use a lot of its functions without having to run the GUI, including reading/writing various alignment formats. > > In fact, the app I'd > eventually like to have written in Java would include some sort of wrapper > for clustalw in order to construct the alignments from a set of unaligned > sequences, but algorithms implemented in Biojava would also be a welcome > addition to the app. 
If you want to wrap clustalw, the simplest way would be to create
Sequence objects in BioJava, write them out to Fasta using the BioJava
sequence IO tools, then use Runtime.exec() (Java's counterpart to a
'system' command) or one of the alternatives to it to run ClustalW.
However you still then have the problem of reading the output back in
again.

The classes in org.biojava.bio.alignment that I mentioned yesterday
implement several useful alignment algorithms which you can use as an
alternative to ClustalW.

> But first things first.
> If I didn't have any sequences or an alignment in any files. What is
> the easiest way to get an alignment object in Java to have a play
> around with?

Make an instance of FlexibleAlignment from org.biojava.bio.alignment,
and use its methods to add sequences to it. It doesn't do any aligning
itself - it is just a placeholder to contain sequences and information
about how they align. You have to use its methods to add and remove
sequences from the alignment, to add/remove gaps and deletions, and to
get things like consensus sequences etc.

Technically I suppose you could use FlexibleAlignment in conjunction
with SequenceAlignmentSAXParser to read alignment members as strings,
construct sequences based on them, and add them to the alignment object,
but I haven't tried this myself. It'd probably require some extra
processing to convert the dashes (gaps) in the input strings into
proper gaps in the alignment.

> Is there a way to just "magically" create a default alignment of say 5
> sequences with 20 positions?

You'd have to manually create 5 sequences yourself and add them to a
FlexibleAlignment as described above.
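As a rough illustration of the "extra processing" mentioned above
(turning the dashes of an aligned string into gap information), here is
a minimal sketch in plain Java. It deliberately does not use BioJava:
the GappedRow class and its fields are made-up names for this example,
and how the resulting gap blocks would be applied to a FlexibleAlignment
is left open.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative helper (hypothetical, not part of BioJava): splits one
 * dashed alignment row into its ungapped residues plus its gap blocks.
 */
class GappedRow {
    /** The row with every '-' removed. */
    final String residues;
    /** One {start, length} pair per gap block; start is a 0-based alignment column. */
    final List<int[]> gapBlocks;

    GappedRow(String residues, List<int[]> gapBlocks) {
        this.residues = residues;
        this.gapBlocks = gapBlocks;
    }

    static GappedRow parse(String alignedRow) {
        StringBuilder residues = new StringBuilder();
        List<int[]> gaps = new ArrayList<int[]>();
        int col = 0;
        while (col < alignedRow.length()) {
            if (alignedRow.charAt(col) == '-') {
                // Consume the whole run of dashes as a single gap block.
                int start = col;
                while (col < alignedRow.length() && alignedRow.charAt(col) == '-') {
                    col++;
                }
                gaps.add(new int[] {start, col - start});
            } else {
                residues.append(alignedRow.charAt(col));
                col++;
            }
        }
        return new GappedRow(residues.toString(), gaps);
    }
}
```

The ungapped residues could then be used to build a sequence, and the
gap blocks replayed against whatever alignment object holds it.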
cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From mark.schreiber at novartis.com Mon May 15 05:15:50 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 15 May 2006 17:15:50 +0800
Subject: [Biojava-l] Creating an alignment object
Message-ID:

I think ClustalW can output alignments as fasta alignment format which
biojava definitely can read.

- Mark

From n.haigh at sheffield.ac.uk Mon May 15 05:24:27 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 15 May 2006 10:24:27 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To:
Message-ID: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>

That's right, clustalw can output in several formats including fasta. It
would be nice to have Biojava able to read and write the clustalw format
as it is a widely used format. How easy is it to write something like
this? Maybe when I start to learn more about Java I could have a go at
doing this.

Nath

From richard.holland at ebi.ac.uk Mon May 15 05:49:47 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Mon, 15 May 2006 10:49:47 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>
References: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>
Message-ID: <1147686587.3950.10.camel@texas.ebi.ac.uk>

One way to write a file parser (which I used in all the BioJavaX
parsers) is to write an event-based one, which requires two parts: a
parser, and an event listener.

Basically, the parser reads a chunk from the file, recognises what kind
of chunk it is and does some pre-parsing on it, for example stripping
whitespace etc.
or concatenating lines of sequence data. It then sends a signal to an
event listener saying it has received a chunk of data of a certain kind,
and asks the event listener to process that data. The event listener
could receive this data in any order (and hence one listener can be
adapted to listen for events from many file formats), so it needs to be
aware of its state at any given point during the parsing process.

The code tends to get quite long and convoluted, but the concept is
quite simple. Hopefully this gives you an idea of how to do it - you
don't necessarily need to know any particular programming language in
order to design this kind of parser/listener, just a good knowledge of
the file format and the ability to describe the various interesting
sections of a file and how to spot them. You can then convert these
descriptions into Java or any other language once you've learnt the
skills to do so. Regular expressions can be extremely useful, as are the
Java String methods toUpperCase(), toLowerCase(), contains(), equals(),
equalsIgnoreCase(), startsWith() and endsWith().

It gets a little more complicated once you start allowing for
non-standard files, such as those containing irregular whitespace or
extra blank lines, but if you write a strict parser first (which all the
BioJavaX parsers are), this type of flexibility can be left till later.

Good luck!

cheers,
Richard

On Mon, 2006-05-15 at 10:24 +0100, Nathan S. Haigh wrote:
> That's right, clustalw can output in several formats including fasta.
> It would be nice to have Biojava able to read and write the clustalw
> format as it is a widely used format. How easy is it to write
> something like this? Maybe when I start to learn more about Java I
> could have a go at doing this.
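The parser/listener split described in this message can be sketched in
a few lines of plain Java. This is only an illustration under stated
assumptions: it uses FASTA rather than the ClustalW format because FASTA
is simpler, and the FastaListener, FastaEventParser and
CollectingListener names are invented for this example; they are not
the BioJavaX classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener interface: one callback per kind of chunk. */
interface FastaListener {
    void startSequence(String name);
    void addResidues(String residues);
    void endSequence();
}

/** The parser only recognises chunks and fires events; it stores nothing itself. */
class FastaEventParser {
    static void parse(BufferedReader in, FastaListener listener) throws IOException {
        boolean inSequence = false;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();                 // pre-parsing: strip whitespace
            if (line.length() == 0) {
                continue;                       // tolerate blank lines
            }
            if (line.startsWith(">")) {         // header chunk
                if (inSequence) {
                    listener.endSequence();
                }
                listener.startSequence(line.substring(1).trim());
                inSequence = true;
            } else if (inSequence) {            // sequence-data chunk
                listener.addResidues(line);
            } else {
                throw new IOException("Sequence data before first header: " + line);
            }
        }
        if (inSequence) {
            listener.endSequence();
        }
    }
}

/** A listener that keeps track of its own state and collects "name=residues" records. */
class CollectingListener implements FastaListener {
    final List<String> records = new ArrayList<String>();
    private String name;
    private StringBuilder residues;

    public void startSequence(String name) {
        this.name = name;
        this.residues = new StringBuilder();
    }

    public void addResidues(String chunk) {
        residues.append(chunk);                 // concatenate lines of sequence data
    }

    public void endSequence() {
        records.add(name + "=" + residues);
        name = null;
        residues = null;
    }
}
```

A different listener (for example one that builds sequence objects
instead of strings) could be plugged in without touching the parser,
which is the main attraction of the design.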

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From mark.schreiber at novartis.com Wed May 17 01:44:18 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 17 May 2006 13:44:18 +0800
Subject: [Biojava-l] external processes
Message-ID:

Hi all -

I noticed that someone has posted a tutorial to the wiki page
(http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW) showing
how to launch ClustalW from biojava which is very much appreciated. The
tutorial makes use of the standard Java Runtime and Process classes.
Developers may also be interested in the ExecRunner class that is in the
utils package of biojava1.4.

There is also an entire API for handling external processes in the CVS
version of biojava (org.biojava.utils.process) which makes handling of
external processes much simpler.

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From mark.schreiber at novartis.com Wed May 17 05:38:34 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 17 May 2006 17:38:34 +0800
Subject: [Biojava-l] external processes
Message-ID:

Sounds reasonable,

Do you have CVS access? If so please submit this addition. Can you also
put javadoc comments giving the example of why you might need to use a
String[] as a parameter.

Can you also add a @since 1.5 javadoc tag to the method and add yourself
as an author to the class (javadoc doesn't allow for @author comments at
the method level).

Thanks,

- Mark

From andreas.draeger at uni-tuebingen.de Wed May 17 05:27:00 2006
From: andreas.draeger at uni-tuebingen.de (Andreas Dräger)
Date: Wed, 17 May 2006 11:27:00 +0200
Subject: [Biojava-l] external processes
In-Reply-To:
References:
Message-ID: <446AEC64.8070604@uni-tuebingen.de>

Hello,

I just tried the ExecRunner class with a compilation of a Matlab script.
My parameters are Matlab matrices and vectors like
[1 2 3; 4 5 6]
and so on.
In the ExecRunner class this 2x3 matrix will be destroyed by the
StringTokenizer and my Matlab script will be started with the arguments
[1, 2, 3;, 4, 5, 6]
which doesn't make any sense. I would like to suggest adding a method
where one can pass the arguments as a String[] array. In my case I
could pass every single Matlab matrix and every Matlab vector as a
single String. I just tried this out with the following code:

    // needs java.io.BufferedReader, java.io.IOException and
    // java.io.InputStreamReader
    public static String execute(String[] command) throws IOException {
        // each array element reaches the program as one argument
        Process exe = Runtime.getRuntime().exec(command);
        BufferedReader in = new BufferedReader(
            new InputStreamReader(exe.getInputStream()));
        StringBuilder out = new StringBuilder();
        String temp;
        // collect the program's standard output line by line
        while ((temp = in.readLine()) != null) {
            out.append(temp).append("\n");
        }
        in.close();
        return out.toString();
    }

It works fine. I would add something similar to the class ExecRunner
mentioned above, but adapted so that the other features of this class
will also be maintained.

Andreas Dräger

mark.schreiber at novartis.com wrote:
>Hi all -
>
>I noticed that someone has posted a tutorial to the wiki page
>(http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW) showing
>how to launch ClustalW from biojava which is very much appreciated. The
>tutorial makes use of the standard Java Runtime and Process classes.
>Developers may also be interested in the ExecRunner class that is in
>the utils package of biojava1.4.
>
>There is also an entire API for handling external processes in the CVS
>version of biojava (org.biojava.utils.process) which makes handling of
>external processes much simpler.
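The splitting problem Andreas describes is easy to reproduce without
ExecRunner or Matlab. The following is a small stand-alone sketch, not
ExecRunner's actual code: "myscript" is a placeholder program name, and
the tokenize() helper simply mimics what any StringTokenizer-based
command runner does with a single command string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ArgSplitDemo {
    /** Splits a command line on whitespace, the way a StringTokenizer does by default. */
    static List<String> tokenize(String commandLine) {
        List<String> tokens = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(commandLine);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One command string: the matrix literal is broken into six pieces.
        List<String> split = tokenize("myscript [1 2 3; 4 5 6]");
        System.out.println(split.size() + " tokens: " + split);

        // A String[] keeps the matrix literal together as a single argument.
        String[] asArray = {"myscript", "[1 2 3; 4 5 6]"};
        System.out.println(asArray.length + " elements, second is: " + asArray[1]);
    }
}
```

Passing the array form to Runtime.exec(String[]) (or to ProcessBuilder)
hands each element to the program as one argument, so no quoting tricks
are needed.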

--
==================================
Andreas Dräger
PhD student
Eberhard Karls University Tübingen
Center for Bioinformatics (ZBIT)
Phone: +49-7071-29-70436
==================================

From guedes at unisul.br Wed May 17 08:18:26 2006
From: guedes at unisul.br (Dickson S. Guedes)
Date: Wed, 17 May 2006 09:18:26 -0300
Subject: [Biojava-l] RES: external processes
In-Reply-To:
Message-ID: <200605171218.k4HCIWO5029410@relay.unisul.br>

Hi All,

Sorry, I don't have much time this week, but I have seen many questions
on the list about multiple alignment using Biojava, and Biojava does NOT
have a class to do multiple alignment.

For my thesis I wanted to use a set of pre-aligned sequences, so I
created a class to do it by calling ClustalW as an external executable.

A simple example can be found at:
http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW

Sorry, I haven't added any comments or javadoc to that class yet, but I
will; for now I have other things to do this week and don't have much
time :(

I accept suggestions too, and thanks to all.

[]s
--
Dickson S. Guedes
/*
 * UNISUL - Universidade do Sul de Santa Catarina
 * ATI - Assessoria de Tecnologia da Informação
 * Tubarão - Santa Catarina - Brasil
 * (0xx48) 621-3200 - http://www.unisul.br
 *
 * "Quis custodiet ipsos custodes?"
 */

From russ at kepler-eng.com Wed May 17 18:23:31 2006
From: russ at kepler-eng.com (Russ Kepler)
Date: Wed, 17 May 2006 16:23:31 -0600
Subject: [Biojava-l] external processes
In-Reply-To: <446AEC64.8070604@uni-tuebingen.de>
References: <446AEC64.8070604@uni-tuebingen.de>
Message-ID: <200605171623.31936.russ@kepler-eng.com>

On Wednesday 17 May 2006 03:27 am, Andreas Dräger wrote:
> I just tried the ExecRunner class with a compilation of a Matlab script.
> My parameters are Matlab matrices and vectors like
> [1 2 3; 4 5 6]
> and so on.
> In the ExecRunner class this 2x3 matrix will be destroyed by
> the StringTokenizer and my Matlab script will be started with the
> arguments
> [1, 2, 3;, 4, 5, 6]
> which doesn't make any sense.

As a workaround I simply put the command I wanted into a file in the
local directory and execute that indirectly with "sh ./cmdfile". It's
harder to make it portable, but then executing something is seldom very
portable.

From Martin.Szugat at GMX.net Wed May 17 19:41:26 2006
From: Martin.Szugat at GMX.net (Martin Szugat)
Date: Thu, 18 May 2006 01:41:26 +0200
Subject: [Biojava-l] external processes
In-Reply-To:
Message-ID: <200605172347.k4HNlAMB000449@newportal.open-bio.org>

Have you tried the ExternalProcess class?
(http://cvs.biojava.org/cgi-bin/viewcvs/viewcvs.cgi/biojava-live/src/org/biojava/utils/process/?cvsroot=biojava)

As far as I understand the problem, the ExternalProcess class isn't
affected by it. In addition, because it uses multiple threads from a
thread pool, it's more robust against deadlocks, which can happen e.g.
if the called program writes out data faster than the calling program
reads it. This is a common problem, but one that only happens
sporadically, so it's very hard to locate and debug.

Best regards

Martin

> -----Original Message-----
> From: biojava-l-bounces at lists.open-bio.org
> [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of
> mark.schreiber at novartis.com
> Sent: Wednesday, May 17, 2006 11:39 AM
> To: Andreas Dräger
> Cc: mark.schreiber at novartis.com; biojava-l at biojava.org
> Subject: Re: [Biojava-l] external processes
>
> Sounds reasonable,
>
> Do you have CVS access? If so please submit this addition.
> Can you also put javadoc comments giving the example of why
> you might need to use a String[] as a parameter.
>
> Can you also add a @since 1.5 javadoc tag to the method and
> add yourself as an author to the class (javadoc doesn't allow
> for @author comments at the method level).
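The blocking problem Martin mentions (a child process stalling because
nobody is reading its output) is usually avoided by reading standard
output and standard error on separate threads. The sketch below shows
that idea with the standard library only. It is not the ExternalProcess
implementation: the DrainedProcess class and its methods are
hypothetical names, and it uses plain threads rather than a thread pool
to stay short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.List;

class DrainedProcess {
    /** Starts a thread that reads one stream to the end and appends it to sink. */
    static Thread drain(final InputStream in, final StringBuilder sink) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                BufferedReader reader = new BufferedReader(new InputStreamReader(in));
                try {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        sink.append(line).append('\n');
                    }
                } catch (IOException e) {
                    // Stream closed early; keep whatever was read so far.
                } finally {
                    try {
                        reader.close();
                    } catch (IOException ignored) {
                        // nothing useful to do here
                    }
                }
            }
        });
        t.start();
        return t;
    }

    /** Runs the command and returns {exit code as text, stdout, stderr}. */
    static String[] run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).start();
        p.getOutputStream().close();          // we send no input to the child

        StringBuilder out = new StringBuilder();
        StringBuilder err = new StringBuilder();
        // Both streams are drained concurrently, so the child can never
        // stall on a full output buffer while we wait for it to exit.
        Thread outReader = drain(p.getInputStream(), out);
        Thread errReader = drain(p.getErrorStream(), err);

        int exit = p.waitFor();
        outReader.join();                     // make sure all output has arrived
        errReader.join();
        return new String[] {String.valueOf(exit), out.toString(), err.toString()};
    }
}
```

Each StringBuilder is written by exactly one thread and only read after
join(), so no extra synchronisation is needed in this small version.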
> > Thanks, > > - Mark > > > > > > Andreas Dr?ger > 05/17/2006 05:27 PM > > > To: mark.schreiber at novartis.com > cc: biojava-l at biojava.org > Subject: Re: [Biojava-l] external processes > > > Hello, > > I just tried the ExecRunner class with a compliation of a > Matlab skript. > My parameters are Matlab matrices and vectors like > [1 2 3; 4 5 6] > and so on. In the ExecRunner class this 2x3 matrix will be > destroyed by the StringTokenizer and my Matlab skript will be > started with the arguments [1, 2, 3;, 4, 5, 6] which doesn't > make any sense. I would like to suggest to add a method where > one can pass the aruments as a String[]-vector. In my case I > could pass every single Matlab matrix and every Matlab vector > as a single String. I just tried this out with the following code: > > public static String execute(String command[]) throws IOException { > String out = null, temp; > Process exe = Runtime.getRuntime().exec(command); > BufferedReader in = new BufferedReader( > new InputStreamReader(exe.getInputStream())); > for (out = ""; (temp = in.readLine()) != null; out += > temp + "\n"); > return out; > } > > It works fine. I would add something similar to the class > ExecRunner mentioned above, but adapted so that the other > features of this class will also be maintained. > > Andreas Dr?ger > > mark.schreiber at novartis.com wrote: > > >Hi all - > > > >I noticed that someone has posted a tutorial to the wiki page > >(http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW) > showing > >how to launch ClustalW from biojava which is very much > appreciated. The > >tutorial makes use of the standard Java Runtime and Process classes. > >Developers may also be interested in the ExecRunner class that is in > >the utils package of biojava1.4. > > > >There is also an entire API for handelling external processes in the > >CVS version of biojava (org.biojava.utils.process) which > makes handling > >of external processes much simpler. 
> > > >- Mark > > > >Mark Schreiber > >Research Investigator (Bioinformatics) > > > >Novartis Institute for Tropical Diseases (NITD) 10 Biopolis Road > >#05-01 Chromos > >Singapore 138670 > >www.nitd.novartis.com > > > >phone +65 6722 2973 > >fax +65 6722 2910 > > > >_______________________________________________ > >Biojava-l mailing list - Biojava-l at lists.open-bio.org > >http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > > > > > > > > > -- > ================================== > Andreas Dr?ger > PhD student > Eberhard Karls University T?bingen > Center for Bioinformatics (ZBIT) > Phone: +49-7071-29-70436 > ================================== > > > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From guedes at unisul.br Wed May 17 07:25:42 2006 From: guedes at unisul.br (Dickson S. Guedes) Date: Wed, 17 May 2006 08:25:42 -0300 Subject: [Biojava-l] RES: external processes In-Reply-To: Message-ID: <200605171126.k4HBQ2O5073901@relay.unisul.br> Hi All, Sorry, I don't have many time in this week but, I did see many question in list about MultAlign using Biojava, but Biojava DON'T have a Class to make MultAlign. So In my teses I?d want to use a set of pre-aligned sequences, then I create a Class to do it calling ClustalW as a external executable. A Simple Example founds at: http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW Sorry, I don't make any comments and any comments or javadoc in that class, but I'll do, for now I have other things to do at this week and don't have much time :( I accept sugestions too, and thanks for all. []s -- Dickson S. Guedes /* * UNISUL - Universidade do Sul de Santa Catarina * ATI - Assessoria de Tecnologia da Informa??o * Tubar?o - Santa Catarina - Brasil * (0xx48) 621-3200 - http://www.unisul.br * * "Quis custodiet ipsos custodes?" 
 */

From n.haigh at sheffield.ac.uk Thu May 18 11:44:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 18 May 2006 16:44:00 +0100
Subject: [Biojava-l] Alignment consensus calculation
Message-ID: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>

I was wondering if there were any methods for generating a consensus
sequence for alignments? Or any suggestions for calculating the frequency of
symbols at each position in an alignment.

I had a look at the DistributionTools after seeing a past e-mail to the list
but couldn't figure out if this would do the job, as I'm new to Java.

Thanks
Nath

----------------------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                               Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences   Mob: +44 (0)7742 533 569
University of Sheffield                   Fax: +44 (0)114 22 20002
Western Bank                              Web: www.bioinf.shef.ac.uk
Sheffield                                      www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------------------

From sanges at biogem.it Thu May 18 12:00:32 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 18 May 2006 18:00:32 +0200
Subject: [Biojava-l] Alignment consensus calculation
In-Reply-To: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>
References: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>
Message-ID: <446C9A20.5010803@biogem.it>

Nathan S. Haigh wrote:

>I was wondering if there were any methods for generating a consensus
>sequence for alignments? Or any suggestions for calculating the frequency of
>symbols at each position in an alignment.
>
>I had a look at the DistributionTools after seeing a past e-mail to the list
>but couldn't figure out if this would do the job, as I'm new to Java.

I'm also new to Java and Biojava. BTW, in the past I have found the
Bio::SimpleAlign module in Bioperl very useful for this kind of thing.

HTH

Remo

From chen_li3 at yahoo.com Thu May 18 17:35:10 2006
From: chen_li3 at yahoo.com (chen li)
Date: Thu, 18 May 2006 14:35:10 -0700 (PDT)
Subject: [Biojava-l] Problems for testing demos
Message-ID: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>

Dear all,

I am new to Biojava. I installed
1) the JDK on my Windows XP machine under c:\Program Files\java\.....,
2) biojava.jar, bytecode-0.92.jar, commons-cli.jar,
commons-collections-2.1.jar, commons-dbcp-1.1.jar and
commons-pool-1.1.jar, all under the c:\biojava folder.

Then I go to >demos>seq and type javac TestEmbl.java and I get some
information on the screen.
After searching Google I put all the *.jar files into this folder:

C:\Program Files\Java\jdk1.5.0_06\jre\lib\ext

I compile the code again: I go to >demos>seq and type

javac TestEmbl.java

and I get a new file, so it looks like it works:

TestEmbl.class

Then I type

java TestEmbl

and I get this information on the screen:

Exception in thread "main" java.lang.NoClassDefFoundError: TestEmbl (wrong name: seq/TestEmbl)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(Unknown Source)
        at java.security.SecureClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.access$100(Unknown Source)
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClassInternal(Unknown Source)

I am not sure how to fix it. I searched the Biojava archives but found no
answers. Any ideas will be appreciated.

Li

From mark.schreiber at novartis.com Thu May 18 22:16:00 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 19 May 2006 10:16:00 +0800
Subject: [Biojava-l] Alignment consensus calculation
Message-ID:

Hi -

To get a Distribution[] over an alignment you could use
DistributionTools.distOverAlignment(a) or one of the other overloaded
methods.

To get a consensus you could simply find the most frequent Symbol in each
Distribution. To make a more sophisticated consensus you could have
thresholds below which you would report an ambiguity.
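
[Editorial sketch] The idea described just above (take the most frequent symbol in each column and report an ambiguity when its weight falls below a threshold) might look roughly like this in plain Java. This is only an illustration under stated assumptions: it works on plain Strings rather than on the BioJava Alignment and Distribution classes, the class and method names (ConsensusSketch, columnFrequencies, consensus) are hypothetical, and 'n' is used as a catch-all ambiguity code instead of choosing a specific IUPAC symbol.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava: rows are equal-length aligned strings.
class ConsensusSketch {

    /** Relative frequency of each non-gap character in one alignment column. */
    static Map<Character, Double> columnFrequencies(String[] rows, int col) {
        Map<Character, Double> freq = new LinkedHashMap<>();
        int counted = 0;
        for (String row : rows) {
            char ch = Character.toLowerCase(row.charAt(col));
            if (ch == '-') {
                continue; // gaps do not count towards the frequencies
            }
            freq.merge(ch, 1.0, Double::sum);
            counted++;
        }
        // Turn raw counts into fractions of the non-gap characters.
        for (Map.Entry<Character, Double> e : freq.entrySet()) {
            e.setValue(e.getValue() / counted);
        }
        return freq;
    }

    /** Most frequent symbol per column, or 'n' when it is below the threshold. */
    static String consensus(String[] rows, double threshold) {
        StringBuilder sb = new StringBuilder();
        for (int col = 0; col < rows[0].length(); col++) {
            char best = '-';
            double bestFreq = 0.0;
            for (Map.Entry<Character, Double> e : columnFrequencies(rows, col).entrySet()) {
                if (e.getValue() > bestFreq) {
                    best = e.getKey();
                    bestFreq = e.getValue();
                }
            }
            boolean allGaps = (best == '-'); // column contained only gaps
            sb.append(allGaps || bestFreq >= threshold ? best : 'n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] rows = {"acgt", "acga", "aagt", "tcgt"};
        System.out.println(consensus(rows, 0.6)); // acgt
        System.out.println(consensus(rows, 0.8)); // nngn
    }
}
```

With BioJava itself, the per-column frequencies would presumably come from the Distribution objects returned by DistributionTools.distOverAlignment rather than from counting characters by hand; the thresholding step would stay the same.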
For example, if the weights at one position were:

a = 0.50
t = 0.40
c = 0.0
g = 0.10

Your routine would need to decide if the consensus should be 'a' or 'w' or
the IUPAC symbol for [atg], which is 'd'. You would probably use some sort
of cutoff value. It might be a routine like this:

public SymbolList consensus(Alignment a, double threshold){
  ....
}

It might be a method that others find useful so please post it back to the
list.

Hope this helps,

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From mark.schreiber at novartis.com Thu May 18 22:25:01 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 19 May 2006 10:25:01 +0800
Subject: [Biojava-l] Problems for testing demos
Message-ID:

The most likely answer is that the folder containing TestEmbl.class is not
on your class path. Either that or you could put TestEmbl.class in your
C:\Program Files\Java\jdk1.5.0_06\jre\lib\ext folder.

By the way, using the ext folder can cause problems if you have different
versions of JAR files or files with conflicting names; however, for testing
it should be fine.

- Mark


chen li
Sent by: biojava-l-bounces at lists.open-bio.org
05/19/2006 05:35 AM

To: biojava-l at lists.open-bio.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-l] Problems for testing demos

Dear all,

I am new to Biojava. I installed
1) the JDK on my Windows XP machine under c:\Program Files\java\.....,
2) biojava.jar, bytecode-0.92.jar, commons-cli.jar,
commons-collections-2.1.jar, commons-dbcp-1.1.jar and
commons-pool-1.1.jar, all under the c:\biojava folder.

Then I go to >demos>seq and type javac TestEmbl.java and I get some
information on the screen.

From richard.holland at ebi.ac.uk Fri May 19 03:59:55 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Fri, 19 May 2006 08:59:55 +0100
Subject: [Biojava-l] Problems for testing demos
In-Reply-To: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>
References: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>
Message-ID: <1148025595.4407.38.camel@texas.ebi.ac.uk>

This is a basic Java problem, not a BioJava one...
The class lives in the 'seq' folder, and is therefore part of the 'seq'
package. To run it, you must change to the 'demos' folder which contains
the 'seq' folder and type:

java seq.TestEmbl

cheers,
Richard

On Thu, 2006-05-18 at 14:35 -0700, chen li wrote:
> Then I type
>
> java TestEmbl
>
> and I get this information on the screen:
>
> Exception in thread "main"
> java.lang.NoClassDefFoundError: TestEmbl (wrong name:
> seq/TestEmbl)
--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Fri May 19 06:23:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 19 May 2006 11:23:34 +0100
Subject: [Biojava-l] Alignment consensus calculation
In-Reply-To:
Message-ID: <003001c67b2e$49060080$9f5ea78f@bmbpc196>

Sorry for being really thick :o) BUT, how do you get the frequencies of the
symbols at each position in the alignment? I have:

Distribution[] dist = DistributionTools.distOverAlignment(alignment, true);

But I can't figure out how to access the frequencies I need.

Cheers! :o)
Nath

> -----Original Message-----
> From: mark.schreiber at novartis.com [mailto:mark.schreiber at novartis.com]
> Sent: 19 May 2006 03:16
> To: n.haigh at sheffield.ac.uk
> Cc: biojava-l at lists.open-bio.org
> Subject: Re: [Biojava-l] Alignment consensus calculation
>
> Hi -
>
> To get a Distribution[] over an alignment you could use
> DistributionTools.distOverAlignment(a) or one of the other overloaded
> methods.
>
> To get a consensus you could simply find the most frequent Symbol in each
> Distribution.

From mark.schreiber at novartis.com Sun May 21 20:53:54 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 22 May 2006 08:53:54 +0800
Subject: [Biojava-l] Alignment consensus calculation
Message-ID:

Hi -

Take a look at the Distribution examples in
http://biojava.org/wiki/BioJava:Cookbook

From xiaoqing at cgcmail.cpmc.columbia.edu Wed May 17 10:46:45 2006
From: xiaoqing at cgcmail.cpmc.columbia.edu (Xiaoqing Zhang)
Date: Wed, 17 May 2006 10:46:45 -0400
Subject: [Biojava-l] Colorful sequence logo for the transcription factor binding sites?
Message-ID: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>

Hi,

I am trying to draw some TFBS logos in different colors, but the classes I
found under org.biojava.bio.gui can only draw black and grey. Does anyone
have suggestions about where I can find a Java package for colorful
sequence logos?

Thanks a lot.

Xiaoqing

From td2 at sanger.ac.uk Mon May 22 09:07:19 2006
From: td2 at sanger.ac.uk (Thomas Down)
Date: Mon, 22 May 2006 14:07:19 +0100
Subject: [Biojava-l] Colorful sequence logo for the transcription factor binding sites?
In-Reply-To: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>
References: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <4F7A571D-007B-4373-9C69-9B7D80D8AAE9@sanger.ac.uk>

On 17 May 2006, at 15:46, Xiaoqing Zhang wrote:

> Hi,
>
> I am trying to draw some TFBS logos in different colors, but the
> classes I found under org.biojava.bio.gui can only draw black and
> grey. Does anyone have suggestions about where I can find a Java
> package for colorful sequence logos?

BioJava can produce coloured logos, but you need to specify a class
which defines the "palette" of colors to use.
Try something like:

DistributionLogo dl = new DistributionLogo();
dl.setStyle(new DNAStyle(false)); // define a palette
// ...

Hope this helps,

Thomas.

From richard.holland at ebi.ac.uk Wed May 24 05:44:58 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Wed, 24 May 2006 10:44:58 +0100
Subject: [Biojava-l] EMBL 87 format
Message-ID: <1148463898.3963.12.camel@texas.ebi.ac.uk>

Hi all.

I've updated the EMBLFormat in BioJavaX to be capable of reading/writing
files in both Pre-87 and 87+ versions of the EMBL format. By default
it'll read either, and write the new version. If you want it to write
the older version, you have to call the writeSequence() methods directly
and specify the format as EMBL_PRE87_FORMAT.

cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From shameer at ncbs.res.in Mon May 29 06:07:52 2006
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 29 May 2006 15:37:52 +0530 (IST)
Subject: [Biojava-l] Reg. Integrated Server / CGI to pass PDB to multiple Servers
Message-ID: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>

Dear All,

My query may not be directly related to BioPerl, but I am sure I will get
some ideas on how to proceed. Some possibilities may be available from
Pise or related modules.

Query :
---------
We have several public servers (say a, b, c). Each of them takes a PDB
file as input, processes it and displays the result. Now I need to create
a web page (a meta-server / integrated web server) with three radio
buttons (a, b, c) and a single input form (to accept a PDB file from the
user ... :( passing a file as an argument seems to be somewhat
impossible). I need the output as 3 links on the next page.

Is there any BioPerl module / CGI / Perl trick to do this?

Thanks in advance,
--
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25)
The Computational Biology Group
National Centre for Biological Sciences (TIFR)
UAS - GKVK Campus - Bellary Road
Bangalore - 65 - Karnataka - India
T - 91-080-23636420-32 EXT 4241
F - 91-080-23636662/23636675
W - http://caps.ncbs.res.in
--------------------------------------------------
"Refrain from illusions, insist on work and not words,
patiently seek divine and scientific truth."

From fpepin at cs.mcgill.ca Mon May 29 12:17:24 2006
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Mon, 29 May 2006 12:17:24 -0400
Subject: [Biojava-l] Reg. Integrated Server / CGI to pass PDB to multiple Servers
In-Reply-To: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>
References: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>
Message-ID: <1148919444.28198.33.camel@elm.mcb.mcgill.ca>

Hi Shameer,

this is the bio-java list :). The bioperl list is at
bioperl-l at lists.open-bio.org.

Cheers,

Francois

From wendy.wong at gmail.com Tue May 30 05:49:01 2006
From: wendy.wong at gmail.com (wendy wong)
Date: Tue, 30 May 2006 10:49:01 +0100
Subject: [Biojava-l] viterbi training in biojava
Message-ID:

Hi,

I was wondering if viterbi training is implemented in biojava, or if
there's any open source version implemented using biojava?

thanks,
Wendy

From smh1008 at cam.ac.uk Tue May 30 07:19:15 2006
From: smh1008 at cam.ac.uk (David Huen)
Date: 30 May 2006 12:19:15 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID:

On May 30 2006, wendy wong wrote:

>Hi,
>
>I was wondering if viterbi training is implemented in biojava, or if
>there's any open source version implemented using biojava?
>
There is one-head viterbi training already, I think. The training framework
doesn't work for two-head. I wrote a viterbi training API that works for
two-head, but it is not fully compatible with the existing API so I never
put it into CVS, plus it doesn't have Baum-Welch implemented either.

If it is any use to you, you can have it.

Regards,
David

From wendy.wong at gmail.com Tue May 30 11:43:01 2006
From: wendy.wong at gmail.com (wendy wong)
Date: Tue, 30 May 2006 16:43:01 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID:

Thanks! I only need one head, so BaumWelchSampler works fine for me.
The default SCORETYPE is probability, and when I tried it the score
went back and forth, e.g. + one time and - the next. I then changed
it to LOGODDS and recompiled biojava, and now the score is steadily
increasing. I was wondering if the SCORETYPE could be passed in as an
argument in the next version of biojava?

thanks,
wendy


On 30 May 2006 12:19:15 +0100, David Huen wrote:
> On May 30 2006, wendy wong wrote:
>
> >Hi,
> >
> >I was wondering if viterbi training is implemented in biojava, or if
> >there's any open source version implemented using biojava?
> >
> There is one-head viterbi training already, I think. The training framework
> doesn't work for two-head. I wrote a viterbi training API that works for
> two-head, but it is not fully compatible with the existing API so I never
> put it into CVS, plus it doesn't have Baum-Welch implemented either.
>
> If it is any use to you, you can have it.
>
> Regards,
> David
>

From christoph.gille at charite.de Tue May 30 16:37:29 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue, 30 May 2006 22:37:29 +0200 (CEST)
Subject: [Biojava-l] sequence alignments with Amap
Message-ID: <60513.141.42.56.114.1149021449.squirrel@webmail.charite.de>

Amap (Multiple Alignment by Sequence Annealing) was developed by Ariel
Schwartz at UC Berkeley. It implements novel algorithms to improve the
computation of multiple sequence alignments.

Java and BioJava programmers can now take advantage of Amap using the
STRAP-toolbox API. Please visit the section SequenceAligner in
http://3d-alignment.eu/Scripting.html.

Amap is also available for users of the STRAP-workbench
http://3d-alignment.eu/.

From richard.holland at ebi.ac.uk Wed May 31 04:49:40 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Wed, 31 May 2006 09:49:40 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID: <1149065381.3948.2.camel@texas.ebi.ac.uk>

I've modified BaumWelchSampler in CVS so that it accepts alternative
score types as an additional parameter to singleSequenceIterator().

cheers,
Richard.

On Tue, 2006-05-30 at 16:43 +0100, wendy wong wrote:
> Thanks! I only need one head, so BaumWelchSampler works fine for me.
> The default SCORETYPE is probability, and when I tried it the score
> went back and forth, e.g. + one time and - the next. I then changed
> it to LOGODDS and recompiled biojava, and now the score is steadily
> increasing. I was wondering if the SCORETYPE could be passed in as an
> argument in the next version of biojava?
> > thanks, > wendy > > > On 30 May 2006 12:19:15 +0100, David Huen wrote: > > On May 30 2006, wendy wong wrote: > > > > >Hi, > > > > > >I was wondering if viterbi training is implemented in biojava, or if > > >there's any open source version implemented using biojava? > > > > > There is one-head viterbi training already I think. The training framework > > doesn't work for two-head - I wrote a viterbi training API that works for > > two head but it is not fully compatible with the existing API so I never > > put it into CVS, plus it doesn't have Baum-Welch implemented either. > > > > If it is any use to you you can have it. > > > > Regards, > > David > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l -- Richard Holland (BioMart Team) EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UNITED KINGDOM Tel: +44-(0)1223-494416 From ola.spjuth at farmbio.uu.se Tue May 2 13:15:18 2006 From: ola.spjuth at farmbio.uu.se (Ola Spjuth) Date: Tue, 02 May 2006 15:15:18 +0200 Subject: [Biojava-l] BioJava-X parsing of RichSequences Message-ID: <1146575718.5603.23.camel@localhost.localdomain> Hi, Implementing a Biojava reader/parser for sequences in Bioclipse [1,2] I have come up with a few questions: 1) I'd like to use Biojava-X with Bioclipse. Are there any problems running it with Java 1.5 (as is required by Bioclipse)? 2) I would propose the addition of a readStream(...) method in RichSequence.IOTools in addition to readFile(...). For the Bioclipse project it would be most useful to be able to guess the format of a Stream. As IOTools is marked final it cannot be subclassed. 3) Is HashBioEntryDB a suitable base object for storing 1-N RichSequences in memory or should I use RichSequence[]? Which solution has the simplest toByte() method for writing to e.g. a File? 
So, basically I am looking for the most convenient way of doing: i) Read byte[] (from a File containing 1-N sequences) into a base object in memory (HashBioEntryDB or RichSequence[]) ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then later to File using Bioclipse-methods) Cheers, .../Ola [1] http://www.bioclipse.net [2] http://wiki.bioclipse.net From mark.schreiber at novartis.com Wed May 3 01:19:02 2006 From: mark.schreiber at novartis.com (mark.schreiber at novartis.com) Date: Wed, 3 May 2006 09:19:02 +0800 Subject: [Biojava-l] BioJava-X parsing of RichSequences Message-ID: Ola Spjuth Sent by: biojava-l-bounces at lists.open-bio.org 05/02/2006 09:15 PM To: biojava-l cc: (bcc: Mark Schreiber/GP/Novartis) Subject: [Biojava-l] BioJava-X parsing of RichSequences > 1) I'd like to use Biojava-X with Bioclipse. Are there any problems > running it with Java 1.5 (as is required by Bioclipse)? Shouldn't be a problem. Biojava-X doesn't use Java 1.5 features, but JDK 1.5 (JRE 5.0) can compile and run biojava. >2) I would propose the addition of a readStream(...) method in >RichSequence.IOTools in addition to readFile(...). For the Bioclipse >project it would be most useful to be able to guess the format of a >Stream. As IOTools is marked final it cannot be subclassed. The reason you cannot do this is that format guessing involves reading some data from the source and then either pushing it back or re-opening the source once the format has been guessed. You cannot guarantee a pushback to a Stream and you cannot guarantee you could re-open it again. As a hack you could read the stream into a temp file and pass that to IOTools. You may also be able to read it to a ByteArrayBuffer and read that as a Stream. >3) Is HashBioEntryDB a suitable base object for storing 1-N >RichSequences in memory or should I use RichSequence[]? Which solution >has the simplest toByte() method for writing to e.g. a File?
> >So, basically I am looking for the most convenient way of doing: > >i) Read byte[] (from a File containing 1-N sequences) into a base >object in memory (HashBioEntryDB or RichSequence[]) >ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then >later to File using Bioclipse-methods) > The simplest way to read in and write out directly is to take the RichSequenceIterator you get from the IOTools read method and pass it directly to the IOTools out method of choice. If you want to manipulate data in between, a RichSequence[] is probably smaller in memory but not as user friendly as a DB object. You should also be aware that RichSequenceIterators are lazy, i.e. they only read data from a file for each request to nextRichSequence(), thus you can manipulate each sequence as it comes in and not have to worry about running out of memory. Hope this helps, - Mark From richard.holland at ebi.ac.uk Wed May 3 08:38:38 2006 From: richard.holland at ebi.ac.uk (Richard Holland) Date: Wed, 03 May 2006 09:38:38 +0100 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: References: Message-ID: <1146645518.3950.22.camel@texas.ebi.ac.uk> Ah yes, I hadn't thought about that aspect. In which case, a Stream-capable format-guesser is not going to be possible. But there's nothing stopping Ola from reading/writing to Streams directly, as long as he knows what format they're in. It's also worth pointing out that the format guesser is not to be relied on. It'll sometimes get it wrong and some formats it won't recognise at all. I wouldn't rely on it - it's there for simple applications only. cheers, Richard On Wed, 2006-05-03 at 09:19 +0800, mark.schreiber at novartis.com wrote: > Ola Spjuth > Sent by: biojava-l-bounces at lists.open-bio.org > 05/02/2006 09:15 PM > > > To: biojava-l > cc: (bcc: Mark Schreiber/GP/Novartis) > Subject: [Biojava-l] BioJava-X parsing of RichSequences > > > > 1) I'd like to use Biojava-X with Bioclipse.
Are there any problems > > running it with Java 1.5 (as is required by Bioclipse)? > > Shouldn't be a problem. Biojava-X doesn't use Java1.5 but JDK1.5 (JRE5.0) > can run and compile biojava. > > >2) I would propose the addition of a readStream(...) method in > >RichSequence.IOTools in addition to readFile(...). For the Bioclipse > >project it would be most useful to be able to guess the format of a > >Stream. As IOTools is marked final it cannot be subclassed. > > The reason you cannot do this is because format guessing involves reading > some data from the source and then either pushing it back or re-opening > when it has guessed the format. You cannot guarentee a pushback to a > Stream and you cannot guarentee you could re-open it again. As a hack you > could read the stream into a temp file and pass that to IOTools. You may > also be able to read it to a ByteArrayBuffer and read that as a Stream. > > >3) Is HashBioEntryDB a suitable base object for storing 1-N > >RichSequences in memory or should I use RichSequence[]? Which solution > >has the simplest toByte() method for writing to e.g. a File? > > > >So, basically I am looking for the most convenient way of doing: > > > >i) Read byte[] (from a File containing 1-N sequences) into a base > >object in memory (HashBioEntryDB or RichSequence[]) > >ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then > >later to File using Bioclipse-methods) > > > > The simplist way to read in and write out directly is to take the > RichSequenceIterator you get from the IOTools read method and pass it > direct to the IOTools out method of choice. If you want to manipulate data > in between a RichSequence[] is probably smaller in memory but not as user > freindly as a DB object. 
> > You should also be aware that RichSequenceIterators are lazy, eg they only > read data from a file for each request to nextRichSequence(), thus you can > manipulate each sequence as it comes in and not have to worry about > running out of memory. > > Hope this helps, > > - Mark > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l -- Richard Holland (BioMart Team) EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UNITED KINGDOM Tel: +44-(0)1223-494416 From fpepin at cs.mcgill.ca Wed May 3 16:50:56 2006 From: fpepin at cs.mcgill.ca (Francois Pepin) Date: Wed, 03 May 2006 12:50:56 -0400 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: <1146645518.3950.22.camel@texas.ebi.ac.uk> References: <1146645518.3950.22.camel@texas.ebi.ac.uk> Message-ID: <1146675056.24875.71.camel@elm.mcb.mcgill.ca> On Wed, 2006-05-03 at 09:38 +0100, Richard Holland wrote: > Ah yes, I hadn't thought about that aspect. In which case, a Stream- > capable format-guesser is not going to be possible. But there's nothing > stopping Ola from reading/writing to Streams directly, as long as he > knows what format they're in. I would tend to disagree about the impossibility of using streams. How much of the file is generally being read before the guess is made? I'm thinking very little is needed, especially compared to how much memory Java usually takes. It would not be very difficult to save that first part of the stream and then play it back once the guess is made. I kind of like the idea of using streams, in cases where you are not reading from a file. Having to write everything to a temporary file to satisfy the API isn't a very appealing solution, I think. I could code something up if people are interested. 
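A minimal sketch of the "save the first part and play it back" idea, in plain Java rather than BioJava code. It marks a BufferedInputStream, reads a small header, then resets so a parser would see the stream from the start. The class name StreamPeek and the toy rules in guessFormat are hypothetical and only illustrate the mechanism; it also assumes the header fits within the mark limit.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class StreamPeek {

    /** Reads up to 'limit' bytes, then rewinds the stream so they can be read again. */
    static byte[] peek(BufferedInputStream in, int limit) throws IOException {
        in.mark(limit);                       // remember the current position
        byte[] buf = new byte[limit];
        int total = 0;
        while (total < limit) {
            int n = in.read(buf, total, limit - total);
            if (n < 0) break;                 // end of stream before 'limit' bytes
            total += n;
        }
        in.reset();                           // play the header back to the next reader
        byte[] head = new byte[total];
        System.arraycopy(buf, 0, head, 0, total);
        return head;
    }

    /** Toy format guesser (hypothetical rules, for illustration only). */
    static String guessFormat(byte[] head) {
        String s = new String(head, StandardCharsets.US_ASCII);
        if (s.startsWith(">")) return "fasta";
        if (s.startsWith("LOCUS")) return "genbank";
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        byte[] data = ">seq1\nACGT\n".getBytes(StandardCharsets.US_ASCII);
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        String format = guessFormat(peek(in, 16));
        // After the guess the stream is back at its first byte.
        System.out.println(format + ", next byte after reset: " + (char) in.read());
    }
}
```

The same wrapper works for any InputStream, so no temporary file is needed as long as the guess only looks at a bounded prefix.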
Francois From rhett at detailedbalance.net Wed May 3 17:44:14 2006 From: rhett at detailedbalance.net (Rhett Sutphin) Date: Wed, 3 May 2006 12:44:14 -0500 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: <1146675056.24875.71.camel@elm.mcb.mcgill.ca> References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> Message-ID: <171BAAB6-D815-4EFC-8ACD-E0CEFAC407A1@detailedbalance.net> On May 3, 2006, at 11:50 AM, Francois Pepin wrote: > On Wed, 2006-05-03 at 09:38 +0100, Richard Holland wrote: >> Ah yes, I hadn't thought about that aspect. In which case, a Stream- >> capable format-guesser is not going to be possible. But there's >> nothing >> stopping Ola from reading/writing to Streams directly, as long as he >> knows what format they're in. > > I would tend to disagree about the impossibility of using streams. > > How much of the file is generally being read before the guess is made? > I'm thinking very little is needed, especially compared to how much > memory Java usually takes. > > It would not be very difficult to save that first part of the > stream and > then play it back once the guess is made. I encountered the same issue when writing the chromatogram reading code. I wrote org.biojava.utils.io.CachingInputStream as a solution. It may be useful as a starting point. Rhett From e.willighagen at science.ru.nl Wed May 3 17:54:23 2006 From: e.willighagen at science.ru.nl (Egon Willighagen) Date: Wed, 3 May 2006 19:54:23 +0200 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: <1146675056.24875.71.camel@elm.mcb.mcgill.ca> References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> Message-ID: <200605031954.23470.e.willighagen@science.ru.nl> On Wednesday 03 May 2006 18:50, Francois Pepin wrote: > How much of the file is generally being read before the guess is made? 
> I'm thinking very little is needed, especially compared to how much
> memory Java usually takes.

Generally not much. Jmol uses 16384 bytes.

> It would not be very difficult to save that first part of the stream and
> then play it back once the guess is made.

See how Jmol does it:

http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/adapter/smarter/Resolver.java?view=markup

> I kind of like the idea of using streams, in cases where you are not
> reading from a file. Having to write everything to a temporary file to
> satisfy the API isn't a very appealing solution, I think.
>
> I could code something up if people are interested.

An additional advantage is that you get .gz support in one go:

    import java.io.BufferedInputStream;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;

    // t is the incoming InputStream
    byte[] abMagic = new byte[4];
    BufferedInputStream bis = new BufferedInputStream((InputStream)t, 8192);
    InputStream is = bis;
    bis.mark(5);
    int countRead = bis.read(abMagic, 0, 4);
    bis.reset();
    // 0x1F 0x8B is the gzip magic number
    if (countRead == 4 &&
        abMagic[0] == (byte)0x1F && abMagic[1] == (byte)0x8B)
      is = new GZIPInputStream(bis);

where t is your InputStream, and "is" is the stream to use after the gzip
check/unzip. For the full working code, see again Jmol CVS:

http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/viewer/FileManager.java?view=markup

Egon

--
e.willighagen at science.ru.nl
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6

From richard.holland at ebi.ac.uk Thu May 4 09:02:10 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 04 May 2006 10:02:10 +0100
Subject: [Biojava-l] BioJava-X parsing of RichSequences
In-Reply-To: <200605031954.23470.e.willighagen@science.ru.nl>
References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> <200605031954.23470.e.willighagen@science.ru.nl>
Message-ID: <1146733331.3955.0.camel@texas.ebi.ac.uk>

I have added the capability to guess the format of streams, and read
directly from them. See RichSequence.IOTools.readStream() for details.
In CVS biojava-live now. cheers, Richard On Wed, 2006-05-03 at 19:54 +0200, Egon Willighagen wrote: > On Wednesday 03 May 2006 18:50, Francois Pepin wrote: > > How much of the file is generally being read before the guess is made? > > I'm thinking very little is needed, especially compared to how much > > memory Java usually takes. > > Generally not much. Jmol uses 16384 bytes. > > > It would not be very difficult to save that first part of the stream and > > then play it back once the guess is made. > > See how Jmol does it: > > http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/adapter/smarter/Resolver.java?view=markup > > > I kind of like the idea of using streams, in cases where you are not > > reading from a file. Having to write everything to a temporary file to > > satisfy the API isn't a very appealing solution, I think. > > > > I could code something up if people are interested. > > An additional advantage is that you get .gz support in one go: > > BufferedInputStream bis = new BufferedInputStream((InputStream)t, 8192); > InputStream is = bis; > bis.mark(5); > int countRead = 0; > countRead = bis.read(abMagic, 0, 4); > bis.reset(); > if (countRead == 4 && > abMagic[0] == (byte)0x1F && abMagic[1] == (byte)0x8B) > is = new GZIPInputStream(bis); > > where t is your InputStream, and is the stream to use after the gzip > check/unzip. 
For the full working code, see again Jmol CVS: > > http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/viewer/FileManager.java?view=markup > > Egon > -- Richard Holland (BioMart Team) EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UNITED KINGDOM Tel: +44-(0)1223-494416 From ola.spjuth at farmbio.uu.se Thu May 4 09:25:16 2006 From: ola.spjuth at farmbio.uu.se (Ola Spjuth) Date: Thu, 04 May 2006 11:25:16 +0200 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: <1146733331.3955.0.camel@texas.ebi.ac.uk> References: <1146645518.3950.22.camel@texas.ebi.ac.uk> <1146675056.24875.71.camel@elm.mcb.mcgill.ca> <200605031954.23470.e.willighagen@science.ru.nl> <1146733331.3955.0.camel@texas.ebi.ac.uk> Message-ID: <1146734717.5587.46.camel@localhost.localdomain> Thank you very much! I shall update Bioclipse to use this for the next release (>0.9.0). Cheers, .../Ola On Thu, 2006-05-04 at 10:02 +0100, Richard Holland wrote: > I have added the capability to guess the format of streams, and read > directly from them. See RichSequence.IOTools.readStream() for details. > > In CVS biojava-live now. > > cheers, > Richard > > On Wed, 2006-05-03 at 19:54 +0200, Egon Willighagen wrote: > > On Wednesday 03 May 2006 18:50, Francois Pepin wrote: > > > How much of the file is generally being read before the guess is made? > > > I'm thinking very little is needed, especially compared to how much > > > memory Java usually takes. > > > > Generally not much. Jmol uses 16384 bytes. > > > > > It would not be very difficult to save that first part of the stream and > > > then play it back once the guess is made. > > > > See how Jmol does it: > > > > http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/adapter/smarter/Resolver.java?view=markup > > > > > I kind of like the idea of using streams, in cases where you are not > > > reading from a file. 
Having to write everything to a temporary file to > > > satisfy the API isn't a very appealing solution, I think. > > > > > > I could code something up if people are interested. > > > > An additional advantage is that you get .gz support in one go: > > > > BufferedInputStream bis = new BufferedInputStream((InputStream)t, 8192); > > InputStream is = bis; > > bis.mark(5); > > int countRead = 0; > > countRead = bis.read(abMagic, 0, 4); > > bis.reset(); > > if (countRead == 4 && > > abMagic[0] == (byte)0x1F && abMagic[1] == (byte)0x8B) > > is = new GZIPInputStream(bis); > > > > where t is your InputStream, and is the stream to use after the gzip > > check/unzip. For the full working code, see again Jmol CVS: > > > > http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/viewer/FileManager.java?view=markup > > > > Egon > > From richard.holland at ebi.ac.uk Thu May 4 15:45:38 2006 From: richard.holland at ebi.ac.uk (Richard Holland) Date: Thu, 04 May 2006 16:45:38 +0100 Subject: [Biojava-l] BioJava-X parsing of RichSequences In-Reply-To: <1146736092.5587.64.camel@localhost.localdomain> References: <1146575718.5603.23.camel@localhost.localdomain> <1146583993.3950.19.camel@texas.ebi.ac.uk> <1146608664.5603.37.camel@localhost.localdomain> <1146651509.3950.24.camel@texas.ebi.ac.uk> <1146736092.5587.64.camel@localhost.localdomain> Message-ID: <1146757539.3955.6.camel@texas.ebi.ac.uk> The UniProt file format has apparently changed since I wrote the parser, and the date lines now take a different format: DT 01-OCT-1994, integrated into UniProtKB/Swiss-Prot. DT 27-APR-2001, sequence version 3. DT 18-APR-2006, entry version 85. These are not recognised by the parser and are throwing an exception. Also, UniProt changed their Feature Table format. I've also fixed this. I've updated the parser in CVS to (hopefully) cope with this, although it now no longer recognises the old format (which was the same as the EMBL format). Can someone test it thoroughly please? 
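For anyone testing the change, the three DT lines quoted above can be matched with a small regular expression. This is only an illustrative sketch, not the parser that is in CVS: the class UniProtDateLine and its fields are hypothetical names, and it assumes DT lines look exactly like the three examples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UniProtDateLine {
    // Group 1: date; group 2: "integrated into ..." text;
    // groups 3 and 4: "sequence" or "entry", and the version number.
    private static final Pattern DT = Pattern.compile(
        "DT\\s+(\\d{2}-[A-Z]{3}-\\d{4}), (?:(integrated into .+)|(sequence|entry) version (\\d+))\\.");

    final String date;    // e.g. "27-APR-2001", kept as text
    final String event;   // e.g. "sequence version" or "integrated into ..."
    final int version;    // -1 when the line carries no version number

    private UniProtDateLine(String date, String event, int version) {
        this.date = date;
        this.event = event;
        this.version = version;
    }

    static UniProtDateLine parse(String line) {
        Matcher m = DT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised DT line: " + line);
        }
        if (m.group(2) != null) {
            return new UniProtDateLine(m.group(1), m.group(2), -1);
        }
        return new UniProtDateLine(m.group(1), m.group(3) + " version",
                                   Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        String[] lines = {
            "DT   01-OCT-1994, integrated into UniProtKB/Swiss-Prot.",
            "DT   27-APR-2001, sequence version 3.",
            "DT   18-APR-2006, entry version 85."
        };
        for (String line : lines) {
            UniProtDateLine dt = parse(line);
            System.out.println(dt.date + " | " + dt.event + " | " + dt.version);
        }
    }
}
```

A line in the old, EMBL-style date format would fail the match and raise IllegalArgumentException, which is one simple way to spot files written before the format change.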
cheers, Richard On Thu, 2006-05-04 at 11:48 +0200, Ola Spjuth wrote: > Richard, > > This is what I tried: > Class.forName("org.biojavax.bio.seq.io.EMBLFormat"); > Class.forName("org.biojavax.bio.seq.io.EMBLxmlFormat"); > Class.forName("org.biojavax.bio.seq.io.FastaFormat"); > Class.forName("org.biojavax.bio.seq.io.GenbankFormat"); > Class.forName("org.biojavax.bio.seq.io.INSDseqFormat"); > Class.forName("org.biojavax.bio.seq.io.RichSequenceFormat"); > Class.forName("org.biojavax.bio.seq.io.UniProtFormat"); > Class.forName("org.biojavax.bio.seq.io.UniProtXMLFormat"); > > Namespace ns = RichObjectFactory.getDefaultNamespace(); > RichSequenceIterator seqit; > seqit = RichSequence.IOTools.readFile(new File(MyFilename),ns); > > ArrayList seqs=new ArrayList(); > while (seqit.hasNext()){ > RichSequence rseq=null; > Sequence seq=null; > rseq = seqit.nextRichSequence(); > if (rseq!=null) > seqs.add(rseq); > } > > -- > > Seems that seqit.hasNext() returns true, but seqit.nextRichSequence() > throws an exception. > > It works with my Fasta-sequences but not with the attached UniProt > sequence (or else I'm doing something wrong). The test-file was attached > by Mark Southern (thanks Mark!) and works with biojavas SeqIOTools. > > Glad if you could have a look at it! > > Cheers, > > .../Ola > > > On Wed, 2006-05-03 at 11:18 +0100, Richard Holland wrote: > > Interesting - the code and file would be useful in trying to work out > > what is happening. > > > > cheers, > > Richard > > > > On Wed, 2006-05-03 at 00:24 +0200, Ola Spjuth wrote: > > > Hi Richard, > > > > > > Thanks a lot, I really appreciate that! I think Bioclipse will serve as > > > an excellent showcase for what can easily be achieved with Biojava. > > > > > > Another problem I found was that parsing of a UniprotFormat file > > > resulted in no RichSequences while it worked with the old Biojava > > > SeqIOtools. If you like I can provide the file and code used for my > > > reading of it. 
> > > > > > Cheers, > > > > > > .../Ola > > > > > > > > > On Tue, 2006-05-02 at 16:33 +0100, Richard Holland wrote: > > > > Hi Ola. I'll look into implementing something that'll help you. Give me > > > > a day or two and see what happens... :) > > > > > > > > cheers, > > > > Richard > > > > > > > > > > > > On Tue, 2006-05-02 at 15:15 +0200, Ola Spjuth wrote: > > > > > Hi, > > > > > > > > > > Implementing a Biojava reader/parser for sequences in Bioclipse [1,2] I > > > > > have come up with a few questions: > > > > > > > > > > 1) I'd like to use Biojava-X with Bioclipse. Are there any problems > > > > > running it with Java 1.5 (as is required by Bioclipse)? > > > > > > > > > > 2) I would propose the addition of a readStream(...) method in > > > > > RichSequence.IOTools in addition to readFile(...). For the Bioclipse > > > > > project it would be most useful to be able to guess the format of a > > > > > Stream. As IOTools is marked final it cannot be subclassed. > > > > > > > > > > 3) Is HashBioEntryDB a suitable base object for storing 1-N > > > > > RichSequences in memory or should I use RichSequence[]? Which solution > > > > > has the simplest toByte() method for writing to e.g. a File? 
> > > > >
> > > > > So, basically I am looking for the most convenient way of doing:
> > > > >
> > > > > i) Read byte[] (from a File containing 1-N sequences) into a base
> > > > > object in memory (HashBioEntryDB or RichSequence[])
> > > > > ii) Write the (HashBioEntryDB or RichSequence[]) to byte[] (and then
> > > > > later to File using Bioclipse-methods)
> > > > >
> > > > > Cheers,
> > > > >
> > > > > .../Ola
> > > > >
> > > > > [1] http://www.bioclipse.net
> > > > > [2] http://wiki.bioclipse.net
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > > > > http://lists.open-bio.org/mailman/listinfo/biojava-l
> > >
--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Tue May 9 11:19:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 9 May 2006 12:19:29 +0100
Subject: [Biojava-l] Access to variables
Message-ID: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>

Apologies if this comes through more than once - I forgot to send in plain
text without attachments!

In case you don't know - I'm new to Java.

I'm working out an interface/class structure for part of an app I want to
convert from Perl to Java and I have a question about the best way to
provide access to variables to the client programmer:

Is it best to have variables you want the client programmer to access just
made public, or is it best to provide access to them via a get/set method?
From my limited reading of "Thinking in Java" I would think it best to hide
the implementation from the user and provide methods to access these
variables, e.g. setThreshold and getThreshold modify the private variable
threshold - is that correct or am I way off the mark!?

Thanks for any clarification.

Nath

----------------------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                               Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences   Mob: +44 (0)7742 533 569
University of Sheffield                   Fax: +44 (0)114 22 20002
Western Bank                              Web: www.bioinf.shef.ac.uk
Sheffield                                      www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------------------

From richard.holland at ebi.ac.uk Tue May 9 12:56:30 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Tue, 09 May 2006 13:56:30 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
References: <007701c6735a$703d3c30$9f5ea78f@bmbpc196>
Message-ID: <1147179391.3951.25.camel@texas.ebi.ac.uk>

hi there.

Get/Set methods with private fields are by far the preferred way of
doing things. This ensures that the object gets to know whenever one of
its variables has changed.

For example, assume you had a class that represented a sequence, and one
of the methods in that class computed some expensive statistic on that
sequence and stored that statistic in another variable. If the sequence
itself changed then you'd need to recompute the statistic too. Without
get/set, there'd be no way of knowing the sequence had changed, and no
way of knowing when to recompute the statistic.

cheers,
Richard

On Tue, 2006-05-09 at 12:19 +0100, Nathan S.
Haigh wrote:
> Apologies if this comes through more than once - I forgot to send in plain
> text without attachments!
>
> In case you don't know - I'm new to Java.
>
> I'm working out an interface/class structure for part of an app I want to
> convert from Perl to Java and I have a question about the best way to
> provide access to variables to the client programmer:
>
> Is it best to have variables you want the client programmer to access just
> made public, or is it best to provide access to them via a get/set method?
> From my limited reading of "Thinking in Java" I would think it best to hide
> the implementation from the user and provide methods to access these
> variables, e.g. setThreshold and getThreshold modify the private variable
> threshold - is that correct or am I way off the mark!?
>
> Thanks for any clarification.
>
> Nath
>
> ----------------------------------------------------------------------------------
> Dr. Nathan S. Haigh
> Bioinformatics PostDoctoral Research Associate
>
> Room B2 211                               Tel: +44 (0)114 22 20112
> Department of Animal and Plant Sciences   Mob: +44 (0)7742 533 569
> University of Sheffield                   Fax: +44 (0)114 22 20002
> Western Bank                              Web: www.bioinf.shef.ac.uk
> Sheffield                                      www.petraea.shef.ac.uk
> S10 2TN
> ----------------------------------------------------------------------------------
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Tue May 9 13:09:58 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 9 May 2006 14:09:58 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <1147179391.3951.25.camel@texas.ebi.ac.uk>
Message-ID: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>

Well, I've jumped straight in and am already planning to use get/set methods
for most of my variables :o)

In my app I plan to have a multiple alignment displayed and the user opts to
calculate a consensus sequence as part of a larger process. The user will
also be able to make changes to the alignment. Therefore, if a consensus
sequence has already been calculated I'd like this to be automatically
updated to reflect the changes in the alignment. Do you know of a small
coded example of how this is done, i.e. in your example: detecting if the
sequence changed and processing a block of code if it has?

Cheers
Nath


> -----Original Message-----
> From: Richard Holland [mailto:richard.holland at ebi.ac.uk]
> Sent: 09 May 2006 13:57
> To: n.haigh at sheffield.ac.uk
> Cc: biojava-l at lists.open-bio.org
> Subject: Re: [Biojava-l] Access to variables
>
> hi there.
>
> Get/Set methods with private fields are by far the preferred way of
> doing things. This ensures that the object gets to know whenever one of
> its variables has changed.
>
> For example, assume you had a class that represented a sequence, and one
> of the methods in that class computed some expensive statistic on that
> sequence and stored that statistic in another variable. If the sequence
> itself changed then you'd need to recompute the statistic too.
Without > get/set, there'd be no way of knowing the sequence had changed, and no > way of knowing when to recompute the statistic. > > cheers, > Richard > > On Tue, 2006-05-09 at 12:19 +0100, Nathan S. Haigh wrote: > > Apologies if this comes through more than once - I forgot to send in > plain > > text without attachments! > > > > In case you don't know - I'm new to Java.. > > > > I'm working out an interface/class structure for part of an app I want > to > > convert from Perl to Java and I have a question about the best way to > > provide access to variables to the client programmer: > > > > Is it best to have variables you want the client programmer to access > just > > made public or is it best to provide access to them via a get/set > method? > > >From my limited reading of "Thinking in Java" I would think it best to > hide > > the implementation from the user and provide methods to access these > > variables e.g. setThreshold and getThreshold modify the private variable > > threshold - is that correct or am I way off the mark!? > > > > Thanks for any clarification. > > > > Nath > > > > ------------------------------------------------------------------------ > ---- > > ------ > > Dr. Nathan S. Haigh > > Bioinformatics PostDoctoral Research Associate > > > > Room B2 211 Tel: +44 (0)114 > 22 > > 20112 > > Department of Animal and Plant Sciences Mob: +44 (0)7742 > 533 > > 569 > > University of Sheffield Fax: +44 (0)114 > 22 > > 20002 > > Western Bank Web: > > www.bioinf.shef.ac.uk > > Sheffield > > www.petraea.shef.ac.uk > > S10 2TN > > ------------------------------------------------------------------------ > ---- > > ------ > > > > --- > > avast! Antivirus: Outbound message clean. > > Virus Database (VPS): 0615-2, 12/04/2006 > > Tested on: 09/05/2006 12:18:14 > > avast! - copyright (c) 1988-2006 ALWIL Software. > > http://www.avast.com > > > > > > > > > > --- > > avast! Antivirus: Outbound message clean. 
> > Virus Database (VPS): 0615-2, 12/04/2006 > > Tested on: 09/05/2006 12:19:29 > > avast! - copyright (c) 1988-2006 ALWIL Software. > > http://www.avast.com > > > > > > > > > > > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > -- > Richard Holland (BioMart Team) > EMBL-EBI > Wellcome Trust Genome Campus > Hinxton > Cambridge CB10 1SD > UNITED KINGDOM > Tel: +44-(0)1223-494416 --- avast! Antivirus: Outbound message clean. Virus Database (VPS): 0615-2, 12/04/2006 Tested on: 09/05/2006 14:09:48 avast! - copyright (c) 1988-2006 ALWIL Software. http://www.avast.com From smh1008 at cam.ac.uk Tue May 9 13:12:24 2006 From: smh1008 at cam.ac.uk (David Huen) Date: 09 May 2006 14:12:24 +0100 Subject: [Biojava-l] Access to variables In-Reply-To: <007701c6735a$703d3c30$9f5ea78f@bmbpc196> References: <007701c6735a$703d3c30$9f5ea78f@bmbpc196> Message-ID: On May 9 2006, Nathan S. Haigh wrote: >Apologies if this comes through more than once - I forgot to send in plain >text without attachments! > >In case you don't know - I'm new to Java . > >I'm working out an interface/class structure for part of an app I want to >convert from Perl to Java and I have a question about the best way to >provide access to variables to the client programmer: > >Is it best to have variables you want the client programmer to access just >made public or is it best to provide access to them via a get/set method? >> From my limited reading of "Thinking in Java" I would think it best to >> hide >the implementation from the user and provide methods to access these >variables e.g. setThreshold and getThreshold modify the private variable >threshold - is that correct or am I way off the mark!? > Breaking object encapsulation is generally a bad thing in OO programming so, yes, avoid it when you can. We try to make it difficult to do so in BioJava anyway :-). 
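The point made in this thread, that a setter lets an object notice when its data changes, can be sketched in a few lines of Java. This is a minimal illustration and not BioJava code: the class ScoredSequence, its GC-content statistic, and the computations counter are all hypothetical.

```java
final class ScoredSequence {
    private String residues;
    private Double gcContent;      // cached statistic; null means "needs recomputing"
    private int computations = 0;  // how often the statistic was actually computed

    ScoredSequence(String residues) {
        this.residues = residues;
    }

    String getResidues() {
        return residues;
    }

    void setResidues(String residues) {
        this.residues = residues;
        this.gcContent = null;     // the setter is the one place that knows the cache is stale
    }

    double getGcContent() {
        if (gcContent == null) {
            gcContent = computeGcContent();
        }
        return gcContent;
    }

    int getComputations() {
        return computations;
    }

    // Stand-in for an expensive calculation.
    private double computeGcContent() {
        computations++;
        if (residues.isEmpty()) return 0.0;
        int gc = 0;
        for (int i = 0; i < residues.length(); i++) {
            char c = Character.toUpperCase(residues.charAt(i));
            if (c == 'G' || c == 'C') gc++;
        }
        return (double) gc / residues.length();
    }

    public static void main(String[] args) {
        ScoredSequence s = new ScoredSequence("GGCC");
        System.out.println(s.getGcContent());   // computed once
        System.out.println(s.getGcContent());   // served from the cache
        s.setResidues("ATGC");                  // invalidates the cache
        System.out.println(s.getGcContent());   // recomputed
        System.out.println(s.getComputations() + " computations");
    }
}
```

With a public residues field the object would have no way to know that its cached value had gone stale.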
Regards,
David

From richard.holland at ebi.ac.uk Tue May 9 13:28:28 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Tue, 09 May 2006 14:28:28 +0100
Subject: [Biojava-l] Access to variables
In-Reply-To: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>
References: <007c01c67369$dfa12c80$9f5ea78f@bmbpc196>
Message-ID: <1147181309.3951.41.camel@texas.ebi.ac.uk>

There is no easy way, but BioJava handles things like this using a
listener model. What this means is that some objects are EventListeners,
whilst others fire events to a central EventManager. The EventManager
sends these events to all EventListeners registered as being interested
in that kind of event.

The simplest form is:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class Event {
        public final Object object;
        public final Object eventType;

        public Event(Object object, Object eventType) {
            this.object = object;
            this.eventType = eventType;
        }
    }

    public interface EventListener {
        public void eventOccurred(Event e);
    }

    public class EventManager {
        private static final Map<Object, List<EventListener>> eventListeners =
            new HashMap<Object, List<EventListener>>();

        public static void registerEventListener(EventListener eventListener,
                                                 Object eventType) {
            if (!eventListeners.containsKey(eventType))
                eventListeners.put(eventType, new ArrayList<EventListener>());
            eventListeners.get(eventType).add(eventListener);
        }

        public static void fireEvent(Event e) {
            List<EventListener> listeners = eventListeners.get(e.eventType);
            if (listeners == null) return; // nobody registered for this event type
            for (EventListener listener : listeners)
                listener.eventOccurred(e);
        }
    }

In your example, the class representing the alignment would fire an
event whenever the alignment changed, by calling
EventManager.fireEvent() from the method which made the change. For
instance, assuming the method which made the change was insertGap():

    public void insertGap(int gapPosition) {
        // do the work of inserting the gap here.
        ...
        ...
        // Fire an event.
        EventManager.fireEvent(new Event(this, "gapInserted"));
    }

In the class representing the consensus, which may or may not be the
same class as the alignment, you would do this:

    public class Consensus implements EventListener {
        private Alignment alignment;

        public Consensus(Alignment alignment) {
            this.alignment = alignment;
            this.updateConsensus();
            EventManager.registerEventListener(this, "gapInserted");
        }

        public void eventOccurred(Event e) {
            if (e.eventType.equals("gapInserted")) {
                this.updateConsensus();
            }
        }

        private void updateConsensus() {
            // do the updating here
            ...
            ...
        }
    }

This is a very simplistic example, but I hope you get the idea. There
is much more out there on the web - Wikipedia is a good starting point
for programming concepts such as these.

cheers,
Richard

On Tue, 2006-05-09 at 14:09 +0100, Nathan S. Haigh wrote:
> Well, I've jumped straight in and am already planning to use get/set methods
> for most of my variables :o)
>
> In my app I plan to have a multiple alignment displayed and the user opts to
> calculate a consensus sequence as part of a larger process. The user will
> also be able to make changes to the alignment. Therefore, if a consensus
> sequence has already been calculated I'd like this to be automatically
> updated to reflect the changes in the alignment. Do you know of a small
> coded example of how this is done i.e. in your example: detecting if the
> sequence changed and processing a block of code if it has.
>
> Cheers
> Nath
>
>
> > -----Original Message-----
> > From: Richard Holland [mailto:richard.holland at ebi.ac.uk]
> > Sent: 09 May 2006 13:57
> > To: n.haigh at sheffield.ac.uk
> > Cc: biojava-l at lists.open-bio.org
> > Subject: Re: [Biojava-l] Access to variables
> >
> > hi there.
> >
> > Get/Set methods with private fields are by far the preferred way of
> > doing things. This ensures that the object gets to know whenever one of
> > its variables has changed.
> > > > For example, assume you had a class that represented a sequence, and one > > of the methods in that class computed some expensive statistic on that > > sequence and stored that statistic in another variable. If the sequence > > itself changed then you'd need to recompute the statistic too. Without > > get/set, there'd be no way of knowing the sequence had changed, and no > > way of knowing when to recompute the statistic. > > > > cheers, > > Richard > > > > On Tue, 2006-05-09 at 12:19 +0100, Nathan S. Haigh wrote: > > > Apologies if this comes through more than once - I forgot to send in > > plain > > > text without attachments! > > > > > > In case you don't know - I'm new to Java.. > > > > > > I'm working out an interface/class structure for part of an app I want > > to > > > convert from Perl to Java and I have a question about the best way to > > > provide access to variables to the client programmer: > > > > > > Is it best to have variables you want the client programmer to access > > just > > > made public or is it best to provide access to them via a get/set > > method? > > > >From my limited reading of "Thinking in Java" I would think it best to > > hide > > > the implementation from the user and provide methods to access these > > > variables e.g. setThreshold and getThreshold modify the private variable > > > threshold - is that correct or am I way off the mark!? > > > > > > Thanks for any clarification. > > > > > > Nath > > > > > > ------------------------------------------------------------------------ > > ---- > > > ------ > > > Dr. Nathan S. 
Haigh > > > Bioinformatics PostDoctoral Research Associate > > > > > > Room B2 211 Tel: +44 (0)114 > > 22 > > > 20112 > > > Department of Animal and Plant Sciences Mob: +44 (0)7742 > > 533 > > > 569 > > > University of Sheffield Fax: +44 (0)114 > > 22 > > > 20002 > > > Western Bank Web: > > > www.bioinf.shef.ac.uk > > > Sheffield > > > www.petraea.shef.ac.uk > > > S10 2TN > > > ------------------------------------------------------------------------ > > ---- > > > ------ > > > > > > --- > > > avast! Antivirus: Outbound message clean. > > > Virus Database (VPS): 0615-2, 12/04/2006 > > > Tested on: 09/05/2006 12:18:14 > > > avast! - copyright (c) 1988-2006 ALWIL Software. > > > http://www.avast.com > > > > > > > > > > > > > > > --- > > > avast! Antivirus: Outbound message clean. > > > Virus Database (VPS): 0615-2, 12/04/2006 > > > Tested on: 09/05/2006 12:19:29 > > > avast! - copyright (c) 1988-2006 ALWIL Software. > > > http://www.avast.com > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > -- > > Richard Holland (BioMart Team) > > EMBL-EBI > > Wellcome Trust Genome Campus > > Hinxton > > Cambridge CB10 1SD > > UNITED KINGDOM > > Tel: +44-(0)1223-494416 > > --- > avast! Antivirus: Outbound message clean. > Virus Database (VPS): 0615-2, 12/04/2006 > Tested on: 09/05/2006 14:09:48 > avast! - copyright (c) 1988-2006 ALWIL Software. > http://www.avast.com > > > > > -- Richard Holland (BioMart Team) EMBL-EBI Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UNITED KINGDOM Tel: +44-(0)1223-494416 From wendy.wong at gmail.com Tue May 9 15:46:03 2006 From: wendy.wong at gmail.com (wendy wong) Date: Tue, 9 May 2006 16:46:03 +0100 Subject: [Biojava-l] ScoreType.Odds Message-ID: Hi, I was wondering if I use ScoreType.Odds for my HMM is there a default cutoff value? 
Or does it just pick whichever state has the highest odds ratio? If it uses
a cutoff value, is there a way to set it?

thanks,
wendy

From mark.schreiber at novartis.com Wed May 10 03:54:07 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 10 May 2006 11:54:07 +0800
Subject: [Biojava-l] Access to variables
Message-ID:

Further to this I would add that sometimes get / set methods should not be
public. This is usually the case for set methods where you don't want the
possibility of something external to the class or package calling the set
method and messing things up for you.

For a set method to only be accessible internally you would make it private.
If you make it protected you have more options: subclasses and other classes
in the same package can call it. If you make it public you expose it to the
world.

Basically, if you think your set method is not safe for general developers
to use under normal circumstances, or if it is only relevant to other
classes in your API, you should make it protected or private.

Hope that was not too confusing. Bloch's Effective Java is probably much
clearer.

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From n.haigh at sheffield.ac.uk Thu May 11 13:27:28 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 14:27:28 +0100
Subject: [Biojava-l] Creating an alignment object
Message-ID: <004001c674fe$a637d570$9f5ea78f@bmbpc196>

I'm new to Java and BioJava, but I've been having a play with writing an
interface and some classes for an app I'd like to write in Java.

The part I'm playing around with at the moment deals with alignments and
groups of alignment positions. What is the easiest/best way to create an
alignment that I can then play around with and generate Locations from? A
self-contained working example would be great because, as I said, I'm really
new to Java!

Cheers
Nath

----------------------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                                Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences    Mob: +44 (0)7742 533 569
University of Sheffield                    Fax: +44 (0)114 22 20002
Western Bank                               Web: www.bioinf.shef.ac.uk
Sheffield                                       www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------------------

From richard.holland at ebi.ac.uk Thu May 11 13:56:20 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 11 May 2006 14:56:20 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004001c674fe$a637d570$9f5ea78f@bmbpc196>
References: <004001c674fe$a637d570$9f5ea78f@bmbpc196>
Message-ID: <1147355780.3951.59.camel@texas.ebi.ac.uk>

BioJava itself cannot align sequences. It can only create objects that are
representations of alignments generated by third-party software.

However, there is a third-party add-on to BioJava called Strap, which can
actually do the alignment work itself from within your Java program and
return a BioJava alignment object that represents the results. It is
available for download, along with an example of how to use it, from here:

http://www.charite.de/bioinf/strap/biojavaInAnger_SequenceAligner.html

cheers,
Richard

From n.haigh at sheffield.ac.uk Thu May 11 14:26:59 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 15:26:59 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <1147355780.3951.59.camel@texas.ebi.ac.uk>
Message-ID: <004901c67506$f6e03460$9f5ea78f@bmbpc196>

Sorry, I think I may have been unclear.

For example I have an alignment file in FASTA format which looks like:

>seq1
ACGTTGCA
>seq2
ATGTTGCG
>seq3
AGGTTGCT
>seq4
AGGTTGCC

How do I get this into an alignment object? Or, better still, can I create
an alignment object without specifying an alignment file, but somehow
creating the alignment by hand? Maybe create a sequence object for each of
the above sequences and add them to an alignment object?

Something like that!
:o)

Nath

From richard.holland at ebi.ac.uk Thu May 11 14:43:23 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Thu, 11 May 2006 15:43:23 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
References: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
Message-ID: <1147358603.3951.73.camel@texas.ebi.ac.uk>

Andreas Prlic just pointed out to me that... "Andreas Draeger provided the
org.biojava.bio.alignment classes, where one can do e.g. Smith-Waterman and
Needleman-Wunsch...".

Having just had a look at this, it's very powerful and you should be able to
implement SequenceAlignment with your own algorithm to construct a
FlexibleAlignment object, if that's what you're ultimately intending to do.

Basically you add sequences to (or remove them from) a FlexibleAlignment,
then insert gaps and deletions as necessary, all from the SequenceAlignment
implementation, which is passed a set of Sequence objects to align as input.

cheers,
Richard

From n.haigh at sheffield.ac.uk Thu May 11 14:52:27 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 11 May 2006 15:52:27 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <1147358603.3951.73.camel@texas.ebi.ac.uk>
Message-ID: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196>

Nope, I don't need to generate an alignment, I already have an alignment in
a file created by third party software (clustalw).
In fact, the app I'd eventually like to have written in Java would include
some sort of wrapper for clustalw in order to construct the alignments from
a set of unaligned sequences, but algorithms implemented in BioJava would
also be a welcome addition to the app.

But first things first. If I didn't have any sequences or an alignment in
any files, what is the easiest way to get an alignment object in Java to
have a play around with? Is there a way to just "magically" create a default
alignment of say 5 sequences with 20 positions?

Nath

From md5 at sanger.ac.uk Thu May 11 14:51:08 2006
From: md5 at sanger.ac.uk (Mutlu Dogruel)
Date: Thu, 11 May 2006 15:51:08 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
References: <004901c67506$f6e03460$9f5ea78f@bmbpc196>
Message-ID: <44634F5C.8000705@sanger.ac.uk>

Hi Nathan,

You need something like this:

    // Needs java.io.BufferedReader and java.io.FileReader, plus the
    // BioJava Alignment and FastaAlignmentFormat classes.
    BufferedReader br = new BufferedReader(new FileReader("file.txt"));
    FastaAlignmentFormat faf = new FastaAlignmentFormat();
    Alignment aligned = faf.read(br);
    br.close();

Cheers
M.

From richard.holland at ebi.ac.uk Fri May 12 08:34:41 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Fri, 12 May 2006 09:34:41 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196>
References: <004b01c6750a$853d2ee0$9f5ea78f@bmbpc196>
Message-ID: <1147422881.19855.11.camel@texas.ebi.ac.uk>

Sorry for the delay in replying - I had to leave work a bit early yesterday.

> Nope, I don't need to generate an alignment, I already have an alignment in
> a file created by third party software (clustalw).

There is nothing that I know of in BioJava that reads ClustalW files
directly into Alignment objects. (If someone else knows different, please
correct me.) There are certainly methods in BioJava which read the
alignments from ClustalW into a set of String objects, each one representing
a member sequence (see SequenceAlignmentSAXParser), but I don't know of
anything more detailed than that.

The third-party package called Strap which I mentioned yesterday happily
reads/writes many of the major alignment formats, and has wrappers for
running ClustalW and other aligners programmatically and reading back in the
results, so it is definitely worth a look. You can use a lot of its
functions without having to run the GUI, including reading/writing various
alignment formats.

> In fact, the app I'd
> eventually like to have written in Java would include some sort of wrapper
> for clustalw in order to construct the alignments from a set of unaligned
> sequences, but algorithms implemented in Biojava would also be a welcome
> addition to the app.
If you want to wrap clustalw, the simplest way would be to create Sequence
objects in BioJava, write them out to FASTA using the BioJava sequence IO
tools, then use Java's Runtime.exec() (or one of the alternatives to it,
such as ProcessBuilder) to run ClustalW. However, you still then have the
problem of reading the output back in again.

The classes in org.biojava.bio.alignment that I mentioned yesterday
implement several useful alignment algorithms which you can use as an
alternative to ClustalW.

> But first things first.
> If I didn't have any sequences or an alignment in any files. What is the
> easiest way to get an alignment object in Java to have a play around with?

Make an instance of FlexibleAlignment from org.biojava.bio.alignment, and
use its methods to add sequences to it. It doesn't do any aligning itself -
it is just a placeholder to contain sequences and information about how they
align. You have to use its methods to add and remove sequences from the
alignment, to add/remove gaps and deletions, and to get things like
consensus sequences etc.

Technically I suppose you could use FlexibleAlignment in conjunction with
SequenceAlignmentSAXParser to read alignment members as strings, construct
sequences based on them, and add them to the alignment object, but I haven't
tried this myself. It'd probably require some extra processing to convert
the dashes (gaps) in the input strings into proper gaps in the alignment.

> Is there a way to just "magically" create a default alignment of say 5
> sequences with 20 positions?

You'd have to manually create 5 sequences yourself and add them to a
FlexibleAlignment as described above.
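
As a rough illustration of the "extra processing" step for dashes, the
sketch below uses plain Java only and makes no BioJava calls. The class
name GapRuns and its methods are hypothetical helpers invented for this
example. It assumes gaps are written as '-' and reports each run of dashes
as a 1-based start position plus a length, which is the kind of information
you would then feed to whatever gap-insertion methods your alignment object
offers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of BioJava: inspects one aligned row. */
class GapRuns {

    /**
     * Returns one {start, length} pair per run of '-' characters.
     * The start is 1-based and given in alignment (gapped) coordinates.
     */
    static List<int[]> find(String alignedRow) {
        List<int[]> runs = new ArrayList<int[]>();
        int i = 0;
        while (i < alignedRow.length()) {
            if (alignedRow.charAt(i) == '-') {
                int start = i;
                // Walk to the end of this run of dashes.
                while (i < alignedRow.length() && alignedRow.charAt(i) == '-') {
                    i++;
                }
                runs.add(new int[] { start + 1, i - start });
            } else {
                i++;
            }
        }
        return runs;
    }

    /** Returns the row with every dash removed, i.e. the ungapped sequence. */
    static String ungapped(String alignedRow) {
        return alignedRow.replace("-", "");
    }

    public static void main(String[] args) {
        String row = "AC--GTT-CA"; // made-up example row
        for (int[] run : find(row)) {
            System.out.println("gap at " + run[0] + ", length " + run[1]);
        }
        System.out.println(ungapped(row));
    }
}
```

The ungapped string is what you would build the sequence from; the gap runs
tell you where to re-insert gaps afterwards.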

cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From mark.schreiber at novartis.com Mon May 15 09:15:50 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 15 May 2006 17:15:50 +0800
Subject: [Biojava-l] Creating an alignment object
Message-ID:

I think ClustalW can output alignments as fasta alignment format which
biojava definitely can read.

- Mark

From n.haigh at sheffield.ac.uk Mon May 15 09:24:27 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 15 May 2006 10:24:27 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To:
Message-ID: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>

That's right, clustalw can output in several formats including fasta. It
would be nice to have Biojava able to read and write the clustalw format as
it is a widely used format. How easy is it to write something like this?
Maybe when I start to learn more about Java I could have a go at doing this.

Nath

From richard.holland at ebi.ac.uk Mon May 15 09:49:47 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Mon, 15 May 2006 10:49:47 +0100
Subject: [Biojava-l] Creating an alignment object
In-Reply-To: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>
References: <003501c67801$5cea09a0$9f5ea78f@bmbpc196>
Message-ID: <1147686587.3950.10.camel@texas.ebi.ac.uk>

One way to write a file parser (which I used in all the BioJavaX
parsers) is to write an event-based one, which requires two parts: a
parser, and an event listener. Basically, the parser reads a chunk from
the file, recognises what kind of chunk it is and does some pre-parsing
on it, for example stripping whitespace etc.
or concatenating lines of sequence data. It then sends a signal to an
event listener saying it has received a chunk of data of a certain kind,
and asks the event listener to process that data. The event listener
could receive this data in any order (and hence one listener can be
adapted to listen for events from many file formats), so it needs to be
aware of its state at any given point during the parsing process. The
code tends to get quite long and convoluted, but the concept is quite
simple.

Hopefully this gives you an idea of how to do it - you don't necessarily
need to know any particular programming language in order to design this
kind of parser/listener, just a good knowledge of the file format and
the ability to describe the various interesting sections of a file and
how to spot them. You can then convert these descriptions into Java or
any other language once you've learnt the skills to do so. Regular
expressions can be extremely useful, as are the Java String methods
toUpperCase(), toLowerCase(), contains(), equals(), equalsIgnoreCase(),
startsWith() and endsWith().

It gets a little more complicated once you start allowing for
non-standard files, such as those containing irregular whitespace or
extra blank lines, but if you write a strict parser first (which all the
BioJavaX parsers are), this type of flexibility can be left till later.

Good luck!

cheers,
Richard

On Mon, 2006-05-15 at 10:24 +0100, Nathan S. Haigh wrote:
> That's right, clustalw can output in several formats including fasta. It
> would be nice to have Biojava able to read and write the clustalw format as
> it is a widely used format. How easy is it to write something like this?
> Maybe when I start to learn more about Java I could have a go at doing this.
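
The parser/listener split described in this message could be sketched
roughly as follows for ClustalW-style interleaved rows ("name
residues"). This is a toy, strict reader in plain Java and not the
BioJavaX parser framework: the AlignmentListener interface and every
class name here are hypothetical, and the handling of the format is an
assumption (a first line starting with "CLUSTAL", blank lines between
blocks, and conservation lines that start with a space).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical listener: receives pre-parsed chunks from the parser. */
interface AlignmentListener {
    void startAlignment();
    void sequenceChunk(String name, String residues);
    void endAlignment();
}

/** Parser half: recognises each line, tidies it up and fires an event. */
class ToyClustalParser {
    static void parse(BufferedReader in, AlignmentListener listener) throws IOException {
        String line = in.readLine();
        if (line == null || !line.startsWith("CLUSTAL")) {
            throw new IOException("Not a CLUSTAL file: missing header line");
        }
        listener.startAlignment();
        while ((line = in.readLine()) != null) {
            // Blank lines separate blocks; lines starting with a space
            // hold the conservation marks. Both are skipped.
            if (line.trim().length() == 0 || line.charAt(0) == ' ') {
                continue;
            }
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                throw new IOException("Malformed sequence line: " + line);
            }
            // fields[0] is the sequence name, fields[1] this block's residues
            listener.sequenceChunk(fields[0], fields[1]);
        }
        listener.endAlignment();
    }
}

/** Listener half: keeps state and joins the chunks for each sequence. */
class CollectingListener implements AlignmentListener {
    final Map<String, StringBuilder> rows = new LinkedHashMap<String, StringBuilder>();
    boolean finished = false;

    public void startAlignment() {
        rows.clear();
        finished = false;
    }

    public void sequenceChunk(String name, String residues) {
        StringBuilder sb = rows.get(name);
        if (sb == null) {
            sb = new StringBuilder();
            rows.put(name, sb);
        }
        sb.append(residues);
    }

    public void endAlignment() {
        finished = true;
    }
}

class ToyClustalDemo {
    public static void main(String[] args) throws IOException {
        String text = "CLUSTAL W (1.83) multiple sequence alignment\n\n"
                + "seq1    ACGT-A\n"
                + "seq2    AC-TTA\n"
                + "        ** * *\n\n"
                + "seq1    GG\n"
                + "seq2    GC\n";
        CollectingListener listener = new CollectingListener();
        ToyClustalParser.parse(new BufferedReader(new StringReader(text)), listener);
        System.out.println(listener.rows);
    }
}
```

Because the listener only sees "a chunk of residues for sequence X",
the same CollectingListener could in principle sit behind a parser for
a different alignment format.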

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From mark.schreiber at novartis.com Wed May 17 05:44:18 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 17 May 2006 13:44:18 +0800
Subject: [Biojava-l] external processes
Message-ID:

Hi all -

I noticed that someone has posted a tutorial to the wiki page
(http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW) showing how
to launch ClustalW from biojava which is very much appreciated. The
tutorial makes use of the standard Java Runtime and Process classes.
Developers may also be interested in the ExecRunner class that is in the
utils package of biojava1.4.

There is also an entire API for handling external processes in the CVS
version of biojava (org.biojava.utils.process) which makes handling of
external processes much simpler.
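
For readers without the BioJava sources to hand, the general shape of
such a wrapper using only the standard java.lang.ProcessBuilder (Java
1.5 and later) might be sketched as below. This is not the ExecRunner or
org.biojava.utils.process API; SimpleRunner and Result are hypothetical
names. The command is passed as one list element per argument, and
stderr is merged into stdout so that a single reader drains all of the
child's output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

/** Hypothetical helper: runs an external command and captures its output. */
class SimpleRunner {

    /** Holds the exit code and the combined stdout/stderr text. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /**
     * Runs a command given as one list element per argument, so an
     * argument that contains spaces is never split.
     */
    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Merge stderr into stdout so one reader drains everything and
        // the child cannot block on a full stderr pipe.
        pb.redirectErrorStream(true);
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                out.append(line).append('\n');
            }
        } finally {
            in.close();
        }
        // All output has been read, so waiting for exit cannot deadlock here.
        return new Result(p.waitFor(), out.toString());
    }
}
```

A call could look like
SimpleRunner.run(Arrays.asList("clustalw", "-INFILE=in.fasta", "-OUTFILE=out.aln")),
assuming clustalw is on the PATH and accepts those options; the file
names are placeholders.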

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From mark.schreiber at novartis.com Wed May 17 09:38:34 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Wed, 17 May 2006 17:38:34 +0800
Subject: [Biojava-l] external processes
Message-ID:

Sounds reasonable,

Do you have CVS access? If so please submit this addition. Can you also
put javadoc comments giving the example of why you might need to use a
String[] as a parameter.

Can you also add a @since 1.5 javadoc tag to the method and add yourself
as an author to the class (javadoc doesn't allow for @author comments at
the method level).

Thanks,

- Mark

From andreas.draeger at uni-tuebingen.de Wed May 17 09:27:00 2006
From: andreas.draeger at uni-tuebingen.de (=?ISO-8859-1?Q?Andreas_Dr=E4ger?=)
Date: Wed, 17 May 2006 11:27:00 +0200
Subject: [Biojava-l] external processes
In-Reply-To:
References:
Message-ID: <446AEC64.8070604@uni-tuebingen.de>

Hello,

I just tried the ExecRunner class with a compilation of a Matlab script.
My parameters are Matlab matrices and vectors like
[1 2 3; 4 5 6]
and so on. In the ExecRunner class this 2x3 matrix will be destroyed by
the StringTokenizer and my Matlab script will be started with the arguments
[1, 2, 3;, 4, 5, 6]
which doesn't make any sense. I would like to suggest adding a method
where one can pass the arguments as a String[] array. In my case I could
pass every single Matlab matrix and every Matlab vector as a single
String. I just tried this out with the following code:

    public static String execute(String[] command) throws IOException {
        String out = "", temp;
        // exec(String[]) passes each array element as one argument,
        // so "[1 2 3; 4 5 6]" is not split at the spaces
        Process exe = Runtime.getRuntime().exec(command);
        BufferedReader in = new BufferedReader(
            new InputStreamReader(exe.getInputStream()));
        // collect the standard output of the process line by line
        while ((temp = in.readLine()) != null) {
            out += temp + "\n";
        }
        return out;
    }

It works fine. I would add something similar to the class ExecRunner
mentioned above, but adapted so that the other features of this class
will also be maintained.

Andreas Dräger

--
==================================
Andreas Dräger
PhD student
Eberhard Karls University Tübingen
Center for Bioinformatics (ZBIT)
Phone: +49-7071-29-70436
==================================

From guedes at unisul.br Wed May 17 12:18:26 2006
From: guedes at unisul.br (Dickson S. Guedes)
Date: Wed, 17 May 2006 09:18:26 -0300
Subject: [Biojava-l] RES: external processes
In-Reply-To:
Message-ID: <200605171218.k4HCIWO5029410@relay.unisul.br>

Hi All,

Sorry, I don't have much time this week, but I have seen many questions
on the list about multiple alignment using Biojava, and Biojava DOESN'T
have a class to do multiple alignment. For my thesis I wanted to use a
set of pre-aligned sequences, so I created a class to do it by calling
ClustalW as an external executable.

A simple example can be found at:
http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW

Sorry, I haven't put any comments or javadoc in that class yet, but I
will; for now I have other things to do this week and don't have much
time :(

I accept suggestions too, and thanks for all.

[]s
--
Dickson S. Guedes
/*
 * UNISUL - Universidade do Sul de Santa Catarina
 * ATI - Assessoria de Tecnologia da Informação
 * Tubarão - Santa Catarina - Brasil
 * (0xx48) 621-3200 - http://www.unisul.br
 *
 * "Quis custodiet ipsos custodes?"
 */

From russ at kepler-eng.com Wed May 17 22:23:31 2006
From: russ at kepler-eng.com (Russ Kepler)
Date: Wed, 17 May 2006 16:23:31 -0600
Subject: [Biojava-l] external processes
In-Reply-To: <446AEC64.8070604@uni-tuebingen.de>
References: <446AEC64.8070604@uni-tuebingen.de>
Message-ID: <200605171623.31936.russ@kepler-eng.com>

On Wednesday 17 May 2006 03:27 am, Andreas Dräger wrote:
> I just tried the ExecRunner class with a compilation of a Matlab script.
> My parameters are Matlab matrices and vectors like
> [1 2 3; 4 5 6]
> and so on.
In the ExecRunner class this 2x3 matrix will be destroyed by > the StringTokenizer and my Matlab skript will be started with the arguments > [1, 2, 3;, 4, 5, 6] > which doesn't make any sense. As a workaround I simply put the command I wanted into a file in the local directory and execute that indirectly with "sh ./cmdfile". It's harder to make it portable, but then executing something seldom is very portable. From Martin.Szugat at GMX.net Wed May 17 23:41:26 2006 From: Martin.Szugat at GMX.net (Martin Szugat) Date: Thu, 18 May 2006 01:41:26 +0200 Subject: [Biojava-l] external processes In-Reply-To: Message-ID: <200605172347.k4HNlAMB000449@newportal.open-bio.org> Do you have tried the ExternalProcess class? (http://cvs.biojava.org/cgi-bin/viewcvs/viewcvs.cgi/biojava-live/src/org/bio java/utils/process/?cvsroot=biojava) As far as I understand the problem the ExternalProcess class isn't affected by it. In addition, because it uses multiple threads from a thread pool, it's more robust against locks, which e.g. can happen, if the called program writes out data faster than the data is read in the calling program. This is a common problem but which does only happen sporadically. So it's very hard to locate and debug. Best regards Martin > -----Original Message----- > From: biojava-l-bounces at lists.open-bio.org > [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of > mark.schreiber at novartis.com > Sent: Wednesday, May 17, 2006 11:39 AM > To: Andreas Dr?ger > Cc: mark.schreiber at novartis.com; biojava-l at biojava.org > Subject: Re: [Biojava-l] external processes > > Sounds reasonable, > > Do you have CVS access? If so please submit this addition. > Can you also put javadoc comments giving the example of why > you might need to use a string[] as a parameter. > > Can you also add a @since 1.5 javadoc tag to the method and > add yourself as an author to the class (javadoc doesn't allow > for @author comments at the method level). 
> > Thanks, > > - Mark > > > > > > Andreas Dr?ger > 05/17/2006 05:27 PM > > > To: mark.schreiber at novartis.com > cc: biojava-l at biojava.org > Subject: Re: [Biojava-l] external processes > > > Hello, > > I just tried the ExecRunner class with a compliation of a > Matlab skript. > My parameters are Matlab matrices and vectors like > [1 2 3; 4 5 6] > and so on. In the ExecRunner class this 2x3 matrix will be > destroyed by the StringTokenizer and my Matlab skript will be > started with the arguments [1, 2, 3;, 4, 5, 6] which doesn't > make any sense. I would like to suggest to add a method where > one can pass the aruments as a String[]-vector. In my case I > could pass every single Matlab matrix and every Matlab vector > as a single String. I just tried this out with the following code: > > public static String execute(String command[]) throws IOException { > String out = null, temp; > Process exe = Runtime.getRuntime().exec(command); > BufferedReader in = new BufferedReader( > new InputStreamReader(exe.getInputStream())); > for (out = ""; (temp = in.readLine()) != null; out += > temp + "\n"); > return out; > } > > It works fine. I would add something similar to the class > ExecRunner mentioned above, but adapted so that the other > features of this class will also be maintained. > > Andreas Dr?ger > > mark.schreiber at novartis.com wrote: > > >Hi all - > > > >I noticed that someone has posted a tutorial to the wiki page > >(http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW) > showing > >how to launch ClustalW from biojava which is very much > appreciated. The > >tutorial makes use of the standard Java Runtime and Process classes. > >Developers may also be interested in the ExecRunner class that is in > >the utils package of biojava1.4. > > > >There is also an entire API for handelling external processes in the > >CVS version of biojava (org.biojava.utils.process) which > makes handling > >of external processes much simpler. 
> > > >- Mark > > > >Mark Schreiber > >Research Investigator (Bioinformatics) > > > >Novartis Institute for Tropical Diseases (NITD) 10 Biopolis Road > >#05-01 Chromos > >Singapore 138670 > >www.nitd.novartis.com > > > >phone +65 6722 2973 > >fax +65 6722 2910 > > > >_______________________________________________ > >Biojava-l mailing list - Biojava-l at lists.open-bio.org > >http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > > > > > > > > > -- > ================================== > Andreas Dr?ger > PhD student > Eberhard Karls University T?bingen > Center for Bioinformatics (ZBIT) > Phone: +49-7071-29-70436 > ================================== > > > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From guedes at unisul.br Wed May 17 11:25:42 2006 From: guedes at unisul.br (Dickson S. Guedes) Date: Wed, 17 May 2006 08:25:42 -0300 Subject: [Biojava-l] RES: external processes In-Reply-To: Message-ID: <200605171126.k4HBQ2O5073901@relay.unisul.br> Hi All, Sorry, I don't have many time in this week but, I did see many question in list about MultAlign using Biojava, but Biojava DON'T have a Class to make MultAlign. So In my teses I?d want to use a set of pre-aligned sequences, then I create a Class to do it calling ClustalW as a external executable. A Simple Example founds at: http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW Sorry, I don't make any comments and any comments or javadoc in that class, but I'll do, for now I have other things to do at this week and don't have much time :( I accept sugestions too, and thanks for all. []s -- Dickson S. Guedes /* * UNISUL - Universidade do Sul de Santa Catarina * ATI - Assessoria de Tecnologia da Informa??o * Tubar?o - Santa Catarina - Brasil * (0xx48) 621-3200 - http://www.unisul.br * * "Quis custodiet ipsos custodes?" 
 */

> -----Original Message-----
> From: biojava-l-bounces at lists.open-bio.org
> [mailto:biojava-l-bounces at lists.open-bio.org] On behalf of
> mark.schreiber at novartis.com
> Sent: Wednesday, 17 May 2006 02:44
> To: biojava-l at biojava.org
> Subject: [Biojava-l] external processes
>
> Hi all -
>
> I noticed that someone has posted a tutorial to the wiki page
> (http://biojava.org/wiki/BioJava:Tutorial:MultiAlignClustalW)
> showing how to launch ClustalW from biojava which is very
> much appreciated. The tutorial makes use of the standard Java
> Runtime and Process classes.
> Developers may also be interested in the ExecRunner class
> that is in the utils package of biojava1.4.
>
> There is also an entire API for handling external processes
> in the CVS version of biojava (org.biojava.utils.process)
> which makes handling of external processes much simpler.
>
> - Mark
>
> Mark Schreiber
> Research Investigator (Bioinformatics)
>
> Novartis Institute for Tropical Diseases (NITD) 10 Biopolis Road
> #05-01 Chromos
> Singapore 138670
> www.nitd.novartis.com
>
> phone +65 6722 2973
> fax +65 6722 2910

From n.haigh at sheffield.ac.uk Thu May 18 15:44:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 18 May 2006 16:44:00 +0100
Subject: [Biojava-l] Alignment consensus calculation
Message-ID: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>

I was wondering if there were any methods for generating a consensus sequence for alignments? Or any suggestions for calculating the frequency of symbols at each position in an alignment.

I had a look at the DistributionTools after seeing a past e-mail to the list but couldn't figure out if this would do the job as I'm new to Java.

Thanks
Nath

----------------------------------------------------------------------------
Dr. Nathan S. Haigh
Bioinformatics PostDoctoral Research Associate

Room B2 211                                Tel: +44 (0)114 22 20112
Department of Animal and Plant Sciences    Mob: +44 (0)7742 533 569
University of Sheffield                    Fax: +44 (0)114 22 20002
Western Bank                               Web: www.bioinf.shef.ac.uk
Sheffield                                       www.petraea.shef.ac.uk
S10 2TN
----------------------------------------------------------------------------

---
avast! Antivirus: Outbound message clean.
Virus Database (VPS): 0620-2, 18/05/2006
Tested on: 18/05/2006 16:44:01
avast! - copyright (c) 1988-2006 ALWIL Software.
http://www.avast.com

From sanges at biogem.it Thu May 18 16:00:32 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 18 May 2006 18:00:32 +0200
Subject: [Biojava-l] Alignment consensus calculation
In-Reply-To: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>
References: <000401c67a91$e1fbf3f0$9f5ea78f@bmbpc196>
Message-ID: <446C9A20.5010803@biogem.it>

Nathan S. Haigh wrote:

>I was wondering if there were any methods for generating a consensus
>sequence for alignments? Or any suggestions for calculating the frequency of
>symbols at each position in an alignment.
>
>I had a look at the DistributionTools after seeing a past e-mail to the list
>but couldn't figure if this would do the job as I'm new to Java.

I'm also new to Java and BioJava. By the way, in the past I have found it very useful to do this kind of thing with the Bio::SimpleAlign module in BioPerl.

HTH

Remo

From chen_li3 at yahoo.com Thu May 18 21:35:10 2006
From: chen_li3 at yahoo.com (chen li)
Date: Thu, 18 May 2006 14:35:10 -0700 (PDT)
Subject: [Biojava-l] Problems for testing demos
Message-ID: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>

Dear all,

I am new to BioJava. I installed
1) the JDK on my Windows XP under c:\Program Files\java\.....,
2) biojava.jar, bytecode-0.92.jar, commons-cli.jar, commons-collections-2.1.jar, commons-dbcp-1.1.jar, commons-pool-1.1.jar, all under the c:\biojava folder.

Then I go to >demos>seq and type

javac TestEmbl.java

and I get some information on the screen.
After searching Google I put all the *.jar files into this folder:

C:\Program Files\Java\jdk1.5.0_06\jre\lib\ext

I compile the code again: go to >demos>seq and type

javac TestEmbl.java

and I get a new file, so it looks like it works:

TestEmbl.class

Then I type

java TestEmbl

and I get this information on the screen:

Exception in thread "main" java.lang.NoClassDefFoundError: TestEmbl (wrong name: seq/TestEmbl)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(Unknown Source)
        at java.security.SecureClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.defineClass(Unknown Source)
        at java.net.URLClassLoader.access$100(Unknown Source)
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClassInternal(Unknown Source)

I am not sure how to fix it. I searched the BioJava archives but found no answers. Any ideas will be appreciated.

Li

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

From mark.schreiber at novartis.com Fri May 19 02:16:00 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 19 May 2006 10:16:00 +0800
Subject: [Biojava-l] Alignment consensus calculation
Message-ID:

Hi -

To get a Distribution[] over an alignment you could use DistributionTools.distOverAlignment(a) or one of the other overloaded methods.

To get a consensus you could simply find the most frequent Symbol in each Distribution. To make a more sophisticated consensus you could have thresholds below which you would report an ambiguity.
e.g. if:

a = 0.50
t = 0.40
c = 0.0
g = 0.10

Your routine would need to decide if the consensus should be 'a' or 'w' or the IUPAC symbol for [atg], which I cannot remember. You would probably use some sort of cutoff value. It might be a routine like this:

public SymbolList consensus(Alignment a, double threshold){
  ....
}

It might be a method that others find useful so please post it back to the list.

Hope this helps,

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From mark.schreiber at novartis.com Fri May 19 02:25:01 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 19 May 2006 10:25:01 +0800
Subject: [Biojava-l] Problems for testing demos
Message-ID:

The most likely answer is that the folder containing TestEmbl.class is not on your class path. Either that or you could put TestEmbl.class in your C:\Program Files\Java\jdk1.5.0_06\jre\lib\ext folder.

By the way, using the ext folder can cause problems if you have different versions of JAR files or files with conflicting names; however, for testing it should be fine.

- Mark

From richard.holland at ebi.ac.uk Fri May 19 07:59:55 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Fri, 19 May 2006 08:59:55 +0100
Subject: [Biojava-l] Problems for testing demos
In-Reply-To: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>
References: <20060518213510.22256.qmail@web36815.mail.mud.yahoo.com>
Message-ID: <1148025595.4407.38.camel@texas.ebi.ac.uk>

This is a basic Java problem, not a BioJava one...
The class lives in the 'seq' folder, and is therefore part of the 'seq' package. To run it, you must change to the 'demos' folder which contains the 'seq' folder and type:

java seq.TestEmbl

cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From n.haigh at sheffield.ac.uk Fri May 19 10:23:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 19 May 2006 11:23:34 +0100
Subject: [Biojava-l] Alignment consensus calculation
In-Reply-To:
Message-ID: <003001c67b2e$49060080$9f5ea78f@bmbpc196>

Sorry for being really thick :o) BUT, how do you get the frequencies of the symbols at each position in the alignment? I have:

Distribution[] dist = DistributionTools.distOverAlignment(alignment, true);

But I can't figure out how to access the frequencies I need.

Cheers! :o)
Nath

---
avast! Antivirus: Outbound message clean.
Virus Database (VPS): 0620-2, 18/05/2006
Tested on: 19/05/2006 11:23:32
avast! - copyright (c) 1988-2006 ALWIL Software.
http://www.avast.com

From mark.schreiber at novartis.com Mon May 22 00:53:54 2006
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 22 May 2006 08:53:54 +0800
Subject: [Biojava-l] Alignment consensus calculation
Message-ID:

Hi -

Take a look at the Distribution examples in
http://biojava.org/wiki/BioJava:Cookbook

From xiaoqing at cgcmail.cpmc.columbia.edu Wed May 17 14:46:45 2006
From: xiaoqing at cgcmail.cpmc.columbia.edu (Xiaoqing Zhang)
Date: Wed, 17 May 2006 10:46:45 -0400
Subject: [Biojava-l] Colorful sequence logo for the transcription factor binding sites?
Message-ID: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>

Hi,

I am trying to draw some TFBS logos with different colors, but the classes I found under org.biojava.bio.gui can only draw black and grey. Does anyone have suggestions about where I can find a Java package for colorful sequence logos?

Thanks a lot.

Xiaoqing

From td2 at sanger.ac.uk Mon May 22 13:07:19 2006
From: td2 at sanger.ac.uk (Thomas Down)
Date: Mon, 22 May 2006 14:07:19 +0100
Subject: [Biojava-l] Colorful sequence logo for the transcription factor binding sites?
In-Reply-To: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>
References: <9D9AA9C4E0C18545B248A75E9FB1E35312B515@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <4F7A571D-007B-4373-9C69-9B7D80D8AAE9@sanger.ac.uk>

On 17 May 2006, at 15:46, Xiaoqing Zhang wrote:

> Hi,
>
> I am trying to draw some TFBS logo with different colors, the
> classes I found under org.biojava.bio.gui can only draw black
> and grey. Anyone has some suggestions about where I can find
> a java package for colorful sequence logo?

BioJava can produce coloured logos, but you need to specify a class which defines the "palette" of colors to use.
Try something like:

    DistributionLogo dl = new DistributionLogo();
    dl.setStyle(new DNAStyle(false)); // define a palette
    // ...

Hope this helps,

Thomas.

From richard.holland at ebi.ac.uk Wed May 24 09:44:58 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Wed, 24 May 2006 10:44:58 +0100
Subject: [Biojava-l] EMBL 87 format
Message-ID: <1148463898.3963.12.camel@texas.ebi.ac.uk>

Hi all.

I've updated the EMBLFormat in BioJavaX to be capable of reading/writing files in both Pre-87 and 87+ versions of the EMBL format. By default it'll read either, and write the new version. If you want it to write the older version, you have to call the writeSequence() methods directly and specify the format as EMBL_PRE87_FORMAT.

cheers,
Richard

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416

From shameer at ncbs.res.in Mon May 29 10:07:52 2006
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 29 May 2006 15:37:52 +0530 (IST)
Subject: [Biojava-l] Reg. Integrated Server / CGI to pass PDB to multiple Servers
Message-ID: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>

Dear All,

My query may not be directly related to BioPerl, but I am sure I will get some ideas to move on. Some possibilities may be available from Pise or related modules.

Query :
---------
We have several public servers (say a, b, c). Each of them takes a PDB file as input, processes it and displays the result. Now I need to create a web page (a meta-server/integrated web server) with three radio buttons (a, b, c) and a single input form (to accept a PDB file from the users ... :( - passing a file as an argument seems to be somewhat impossible). I need the output as 3 links on the next page.

Is there any BioPerl module / CGI / Perl trick to do it?

Thanks in advance,
--
Shameer Khadar
Prof. R.
Sowdhamini's Lab (# 25)
The Computational Biology Group
National Centre for Biological Sciences (TIFR)
UAS - GKVK Campus - Bellary Road
Bangalore - 65 - Karnataka - India

T - 91-080-23636420-32 EXT 4241
F - 91-080-23636662/23636675
W - http://caps.ncbs.res.in
--------------------------------------------------
"Refrain from illusions, insist on work and not words,
patiently seek divine and scientific truth."

From fpepin at cs.mcgill.ca Mon May 29 16:17:24 2006
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Mon, 29 May 2006 12:17:24 -0400
Subject: [Biojava-l] Reg. Integrated Server / CGI to pass PDB to multiple Servers
In-Reply-To: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>
References: <49344.192.168.1.1.1148897272.squirrel@192.168.1.1>
Message-ID: <1148919444.28198.33.camel@elm.mcb.mcgill.ca>

Hi Shameer,

this is the bio-java list :). The bioperl list is at bioperl-l at lists.open-bio.org.

Cheers,

Francois

From wendy.wong at gmail.com Tue May 30 09:49:01 2006
From: wendy.wong at gmail.com (wendy wong)
Date: Tue, 30 May 2006 10:49:01 +0100
Subject: [Biojava-l] viterbi training in biojava
Message-ID:

Hi,

I was wondering if Viterbi training is implemented in BioJava, or if there's any open source version implemented using BioJava?

thanks,
Wendy

From smh1008 at cam.ac.uk Tue May 30 11:19:15 2006
From: smh1008 at cam.ac.uk (David Huen)
Date: 30 May 2006 12:19:15 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID:

On May 30 2006, wendy wong wrote:

>Hi,
>
>I was wondering if viterbi training is implemented in biojava, or if
>there's any open source version implemented using biojava?

There is one-head Viterbi training already, I think. The training framework doesn't work for two-head. I wrote a Viterbi training API that works for two-head, but it is not fully compatible with the existing API so I never put it into CVS; it doesn't have Baum-Welch implemented either.

If it is any use to you, you can have it.

Regards,
David

From wendy.wong at gmail.com Tue May 30 15:43:01 2006
From: wendy.wong at gmail.com (wendy wong)
Date: Tue, 30 May 2006 16:43:01 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID:

Thanks! I only need one head, so BaumWelchSampler works fine for me. The default SCORETYPE is probability, and when I tried it the score went back and forth, e.g. + one time and - the next. I then changed it to LOGODDS and recompiled BioJava, and now the score is steadily increasing. I was wondering if the SCORETYPE could be passed in as an argument in the next version of BioJava?

thanks,
wendy

From christoph.gille at charite.de Tue May 30 20:37:29 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue, 30 May 2006 22:37:29 +0200 (CEST)
Subject: [Biojava-l] sequence alignments with Amap
Message-ID: <60513.141.42.56.114.1149021449.squirrel@webmail.charite.de>

Amap, Multiple Alignment by Sequence Annealing, was developed by Ariel Schwartz at UC Berkeley. It implements novel algorithms to improve the computation of multiple sequence alignments.

Java and BioJava programmers can now take advantage of Amap using the STRAP-toolbox API. Please visit the section SequenceAligner in http://3d-alignment.eu/Scripting.html.

Amap is also available for users of the STRAP-workbench http://3d-alignment.eu/.

From richard.holland at ebi.ac.uk Wed May 31 08:49:40 2006
From: richard.holland at ebi.ac.uk (Richard Holland)
Date: Wed, 31 May 2006 09:49:40 +0100
Subject: [Biojava-l] viterbi training in biojava
In-Reply-To:
References:
Message-ID: <1149065381.3948.2.camel@texas.ebi.ac.uk>

I've modified BaumWelchSampler in CVS so that it accepts alternative score types as an additional parameter to singleSequenceIterator().

cheers,
Richard.

--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416
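
For readers following the score-type discussion in this thread: the sketch below is not BioJava code and does not reproduce what BaumWelchSampler does internally. Assuming a simple position-independent emission model, it only illustrates how a log-probability score differs from a log-odds score taken against a background (null) model. The class ScoreTypeSketch, its methods, and both example distributions are hypothetical names and values invented for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names come from BioJava. */
public class ScoreTypeSketch {

    /** Uniform background (null model) over the four DNA symbols. */
    public static Map<Character, Double> uniformDna() {
        Map<Character, Double> m = new LinkedHashMap<Character, Double>();
        for (char c : "acgt".toCharArray()) {
            m.put(c, 0.25);
        }
        return m;
    }

    /** An AT-rich emission distribution, made up for this example. */
    public static Map<Character, Double> atRichDna() {
        Map<Character, Double> m = new LinkedHashMap<Character, Double>();
        m.put('a', 0.4);
        m.put('c', 0.1);
        m.put('g', 0.1);
        m.put('t', 0.4);
        return m;
    }

    /**
     * Sum of log P(symbol | model) over the sequence.
     * Assumes every symbol in seq is a key of the model map.
     */
    public static double logProbability(String seq, Map<Character, Double> model) {
        double total = 0.0;
        for (char c : seq.toCharArray()) {
            total += Math.log(model.get(c));
        }
        return total;
    }

    /**
     * Sum of log(P(symbol | model) / P(symbol | null model)) over the sequence.
     * Positive when the model explains the sequence better than the background.
     */
    public static double logOdds(String seq,
                                 Map<Character, Double> model,
                                 Map<Character, Double> nullModel) {
        double total = 0.0;
        for (char c : seq.toCharArray()) {
            total += Math.log(model.get(c) / nullModel.get(c));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Character, Double> model = atRichDna();
        Map<Character, Double> background = uniformDna();
        for (String seq : new String[] {"aatt", "ccgg"}) {
            System.out.println(seq
                    + " logP=" + logProbability(seq, model)
                    + " logOdds=" + logOdds(seq, model, background));
        }
    }
}
```

In this toy setting the log-probability is negative for any non-empty sequence, while the log-odds score is positive for a sequence the model fits better than the background ("aatt") and negative for one it fits worse ("ccgg").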