From mhassel at bcgsc.bc.ca Tue Feb 3 18:27:26 2004
From: mhassel at bcgsc.bc.ca (Maik Hassel)
Date: Tue Feb 3 18:36:09 2004
Subject: [Biojava-l] Uppercase/lowercase sequences
Message-ID: <40202E5E.8050909@bcgsc.bc.ca>

Hello everybody!

I saw some postings about this topic already on the list, but no "real" solution so far. I take it that sequences are always handled in lowercase letters, so that basically biojava is unsuitable for working together with any algorithms that use soft masked sequences. Is there any plan to actually change that in the near future, or is there already a solution that I missed?

Thanks for any comments

Maik

P.S: When replying, please please please cc directly to me, too!

From mark.schreiber at group.novartis.com Tue Feb 3 19:47:22 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Tue Feb 3 19:50:26 2004
Subject: [Biojava-l] Uppercase/lowercase sequences
Message-ID:

Hello,

There has been some talk of changing that in future versions of biojava. However, you may be able to get the desired behaviour by making your own SymbolTokenizer and making it case sensitive. In this way you can map lower or upper case characters to any Symbol you want. You could even go as far as making your own Alphabet. It really depends on what you wish to do with these soft masked sequences.

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road #04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910

Maik Hassel
Sent by: biojava-l-bounces@portal.open-bio.org
02/04/2004 07:27 AM
To: biojava-l@biojava.org
cc:
Subject: [Biojava-l] Uppercase/lowercase sequences

Hello everybody!

I saw some postings about this topic already on the list, but no "real" solution so far. I take it that sequences are always handled in lowercase letters, so that basically biojava is unsuitable for working together with any algorithms that use soft masked sequences.
Is there any plan to actually change that in the near future, or is there already a solution that I missed?

Thanks for any comments

Maik

P.S: When replying, please please please cc directly to me, too!

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 08:18:31 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 08:22:08 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
Message-ID:

Hello, I am a new user to biojava (and almost new to java).

The following code works fine reading a 'FASTA' format file, but causes an error reading 'MSF' format...

-----
String file = args[0];
String format = args[1];
String alphabet = args[2];

BufferedReader br = new BufferedReader(new FileReader(file));

SequenceIterator seqi = null;
Alignment align = null;

if ( format != "MSF" && format != "msf" ){
    seqi = (SequenceIterator)SeqIOTools.fileToBiojava( format, alphabet, br );
}
else{
    align = (Alignment)SeqIOTools.fileToBiojava( format, alphabet, br );
}
----

Error...

---
Exception in thread "main" java.lang.IllegalArgumentException: No alphabet was set in the identifier
    at org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
    at org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
    at ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)
---

Line 60 corresponds to the "align = ..." line above.

Like I said, works fine as...

java prog.java fa.fasta fasta PROTEIN

but

java prog.java msf.msf msf PROTEIN

Gives above error... Just as I thought I was beginning to understand :.(

I will look at details for Alignment objects...

From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 08:25:43 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 08:28:57 2004
Subject: [Biojava-l] tipical dumb user problem (BUILD FAILED)
Message-ID:

Test.java uses or overrides a deprecated API.
[javac] Note: Recompile with -deprecation for details.
[javac] 100 errors

BUILD FAILED
file:/home/dmb/BioJava/biojava-1.30/build.xml:291: Compile failed; see the compiler error output for details.

I don't know what this means or where to go next!

Any help is appreciated - I have binaries working, but I wanted to build tests to make sure things were generally OK on my system.

ant package
ant javadocs

Worked fine, but

ant runtests

Gave final message above (and many more). Is this a $CLASSPATH problem?

Cheers,
Dan.

From thomas at derkholm.net Wed Feb 4 08:58:07 2004
From: thomas at derkholm.net (Thomas Down)
Date: Wed Feb 4 09:06:33 2004
Subject: [Biojava-l] tipical dumb user problem (BUILD FAILED)
In-Reply-To:
References:
Message-ID: <20040204135807.GA1618@firechild>

Once upon a time, Dan Bolser wrote:
>
> Test.java uses or overrides a deprecated API.
> [javac] Note: Recompile with -deprecation for details.
> [javac] 100 errors
>
> BUILD FAILED
> file:/home/dmb/BioJava/biojava-1.30/build.xml:291: Compile failed; see
> the compiler error output for details.
>
> I don't know what this means or where to go next!
>
> Any help is appreciated - I have binaries working, but I wanted to build
> tests to make sure things were generally OK on my system.
>
> ant package
> ant javadocs
>
> Worked fine, but
>
> ant runtests
>
> Gave final message above (and many more).

You don't give the actual error messages -- my guess is that they're complaining about not finding the TestCase class. To run the test suite, you need to download JUnit (http://www.junit.org/) and add junit.jar to your CLASSPATH.

The test suite is aimed mainly at people who are actively developing the library, rather than as something to be run whenever you install it. The tests are also run by the nightly build system, and you can see results at:

http://www.derkholm.net/autobuild/testreports/

Thomas.
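
One quick way to test the guess above before re-running "ant runtests" is to ask the JVM whether the JUnit base class is visible on the current classpath. The following is only a sketch: the class name ClasspathCheck and the method isOnClasspath are made up for illustration, and junit.framework.TestCase is assumed to be the TestCase class the guess refers to.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // 'false' = locate the class without running its static initialisers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the JUnit 3 base class; any other class name can be passed instead
        String name = args.length > 0 ? args[0] : "junit.framework.TestCase";
        String verdict = isOnClasspath(name) ? "found" : "NOT found";
        System.out.println(name + " " + verdict + " on the classpath");
    }
}
```

Run it with the same CLASSPATH that ant sees; if it reports "NOT found", adding junit.jar to the CLASSPATH as described above is the likely fix.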
From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 09:41:46 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 09:45:22 2004
Subject: [Biojava-l] tipical dumb user problem (BUILD FAILED)
In-Reply-To: <20040204135807.GA1618@firechild>
Message-ID:

ug, java is weird.

Thanks for the details,
Dan.

On Wed, 4 Feb 2004, Thomas Down wrote:

> Once upon a time, Dan Bolser wrote:
> >
> > Test.java uses or overrides a deprecated API.
> > [javac] Note: Recompile with -deprecation for details.
> > [javac] 100 errors
> >
> > BUILD FAILED
> > file:/home/dmb/BioJava/biojava-1.30/build.xml:291: Compile failed; see
> > the compiler error output for details.
> >
> > I don't know what this means or where to go next!
> >
> > Any help is appreciated - I have binaries working, but I wanted to build
> > tests to make sure things were generally OK on my system.
> >
> > ant package
> > ant javadocs
> >
> > Worked fine, but
> >
> > ant runtests
> >
> > Gave final message above (and many more).
>
>
> You don't give the actual error messages -- my guess is that
> they're complaining about not finding the TestCase class. To
> run the test suite, you need to download JUnit (http://www.junit.org/)
> and add junit.jar to your CLASSPATH.
>
> The test suite is aimed mainly at people who are actively developing
> the library, rather than as something to be run whenever you install
> it. The tests are also run by the nightly build system, and you can
> see results at:
>
> http://www.derkholm.net/autobuild/testreports/
>
> Thomas.
>

From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 09:43:02 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 09:46:36 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
In-Reply-To:
Message-ID:

I forgot the (probably) important

import java.io.*;
import java.util.*;
import org.biojava.bio.*;
import org.biojava.bio.dist.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.seq.io.*;
import org.biojava.bio.symbol.*;

On Wed, 4 Feb 2004, Dan Bolser wrote:

>
> Hello, I am a new user to biojava (and almost new to java).
>
> The following code works fine reading a 'FASTA' format file,
> but causes an error reading 'MSF' format...
>
> -----
> String file = args[0];
> String format = args[1];
> String alphabet = args[2];
>
> BufferedReader br = new BufferedReader(new FileReader(file));
>
> SequenceIterator seqi = null;
> Alignment align = null;
>
> if ( format != "MSF" && format != "msf" ){
> seqi =
> (SequenceIterator)SeqIOTools.fileToBiojava( format, alphabet, br );
> }
> else{
> align =
> (Alignment)SeqIOTools.fileToBiojava( format, alphabet, br );
> }
> ----
>
> Error...
>
> ---
> Exception in thread "main" java.lang.IllegalArgumentException: No alphabet
> was set in the identifier
> at
> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
> at
> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
> at
> ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)
> ---
>
> Line 60 corresponds to the "align = ..." line above.
>
> Like I said, works fine as...
>
> java prog.java fa.fasta fasta PROTEIN
>
> but
>
> java prog.java msf.msf msf PROTEIN
>
> Gives above error... Just as I thought I was beginning to understand :.(
>
> I will look at details for Alignment objects...
>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>

From kdj at sanger.ac.uk Wed Feb 4 10:14:07 2004
From: kdj at sanger.ac.uk (Keith James)
Date: Wed Feb 4 10:20:17 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
In-Reply-To:
References:
Message-ID:

>>>>> "Dan" == Dan Bolser writes:

    Dan> Hello, I am a new user to biojava (and almost new to java).

    Dan> The following code works fine reading a 'FASTA' format file,
    Dan> but causes an error reading 'MSF' format...

[...]

    Dan> --- Exception in thread "main"
    Dan> java.lang.IllegalArgumentException: No alphabet was set in
    Dan> the identifier at
    Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
    Dan> at
    Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
    Dan> at
    Dan> ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)

The exception message here is referring to the integer identifier which biojava has for every known combination of file-format (fasta, genbank, embl) and alphabet-type (dna, rna, protein). The way these are created/interpreted is documented in SeqIOConstants (for the sequence formats) and AlignIOConstants (for the alignment formats). All the common ones exist as static int fields so that you can compare using == or use them in switches.

The format guessing code (in SeqIOTools.identifyFormat) appears to be missing "msf" and "clustal". This is a bug - I'll fix it today. The result is that it guesses SeqIOConstants.UNKNOWN as the format identifier (which has no alphabet set - hence the message).

The public method fileToBiojava(int fileType, BufferedReader br) should work if you pass it the value AlignIOConstants.MSF_AA

Keith

-- 
- Keith James  Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 10:37:36 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 10:40:46 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
In-Reply-To:
Message-ID:

On 4 Feb 2004, Keith James wrote:

> >>>>> "Dan" == Dan Bolser writes:
>
> Dan> Hello, I am a new user to biojava (and almost new to java).
>
> Dan> The following code works fine reading a 'FASTA' format file,
> Dan> but causes an error reading 'MSF' format...
>
> [...]
>
> Dan> --- Exception in thread "main"
> Dan> java.lang.IllegalArgumentException: No alphabet was set in
> Dan> the identifier at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
> Dan> at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
> Dan> at
> Dan> ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)
>
> The exception message here is referring to the integer identifier
> which biojava has for every known combination of file-format (fasta,
> genbank, embl) and alphabet-type (dna, rna, protein). The way these
> are created/interpreted is documented in SeqIOConstants (for the
> sequence formats) and AlignIOConstants (for the alignment
> formats). All the common ones exist as static int fields so that you
> can compare using == or use them in switches.
>
> The format guessing code (in SeqIOTools.identifyFormat) appears to be
> missing "msf" and "clustal". This is a bug - I'll fix it today. The
> result is that it guesses SeqIOConstants.UNKNOWN as the format
> identifier (which has no alphabet set - hence the message).
>
> The public method fileToBiojava(int fileType, BufferedReader br)
> should work if you pass it the value AlignIOConstants.MSF_AA

Just for the record, I see two forms of the fileToBiojava function...

fileToBiojava( int fileType, java.io.BufferedReader br );

fileToBiojava( java.lang.String formatName, java.lang.String alphabetName, java.io.BufferedReader br );

I was using the second form, which should not have to guess the format (perhaps I misunderstand what you said above). Additionally, I explicitly pass an alphabet name... Why are formats linked to alphabets?

"... SeqIOConstants.UNKNOWN as the format identifier (which has no alphabet set...".

Please let me know if I am terminally confused

Cheers,
Dan.

>
> Keith
>
>

From lhummel at pasteur.fr Tue Feb 3 13:09:33 2004
From: lhummel at pasteur.fr (Laurence Hummel)
Date: Wed Feb 4 10:56:25 2004
Subject: [Biojava-l] Run blast on multiple databases
Message-ID: <1E3EAA92-5674-11D8-870D-000393C635CC@pasteur.fr>

Hi,

I want to run blastall on several databases, so I run it with -d "dbName1 dbName2"... This works well if I launch the command line in a terminal, but not if I launch it from my java program... It seems like the "" are not recognized.

Here is what I've done :

Process blastRun;
String pOption, dOption, oOption, iOption, eOption, tOption, blastCommandLine, line;
BufferedReader errorBuff;
File outFile;

try {
    pOption = " -p " + program;
    tOption = " -T"; // out file in html format
    dOption = " -d \"" + db + "\"";
    iOption = " -i " + query;
    eOption = " -e " + eValue;
    oOption = " -o " + outFilePath;

    blastCommandLine = exePath + pOption + dOption + tOption + iOption + eOption + oOption;
    System.out.println("\nblastcommandLine = " + blastCommandLine);

    // Call external blastall program
    blastRun = Runtime.getRuntime().exec(blastCommandLine);

    // Catch blast error
    System.out.println("\nErrorStream = " + blastRun.getErrorStream().read());
    errorBuff = new BufferedReader(new InputStreamReader(blastRun.getErrorStream()));
    System.out.println("\nerrorBuff = ");
    while ((line = errorBuff.readLine()) != null) {
        System.out.println(line);
    }
    System.out.println("\nOutputStream = " + blastRun.getOutputStream());
} catch (IOException ex) {
    System.out.println("BLAST NOT RUN");
    System.out.println("IOException : " + ex.toString());
}

And here is the error :

blastcommandLine = /Applications/blast/blastall -p blastp -d "db1 db2" -T -i query.faa -e 10.0 -o /out.html

ErrorStream = 91

errorBuff =
blastall] ERROR: Arguments must start with '-' (the offending argument #5 was: db2"')

OutputStream = java.io.BufferedOutputStream@629355

The exact same command line copy/pasted in a terminal works very well...

Does anyone have an idea?

Laurence

-- 
Laurence HUMMEL
Génopole - Plate-Forme 4 "Intégration et Analyse Génomiques"
Institut Pasteur
28 rue du docteur Roux - 75015 PARIS
Tel : 01.44.38.95.36

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 1735 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20040204/47614390/attachment.bin

From dmb at mrc-dunn.cam.ac.uk Wed Feb 4 10:55:44 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed Feb 4 10:59:07 2004
Subject: [Biojava-l] TMTOWTDI in biojava?
In-Reply-To:
Message-ID:

Hello, I found an alternate solution...

---
BufferedReader br = new BufferedReader(new FileReader(file));

MSFAlignmentFormat x = new MSFAlignmentFormat();

Alignment align = x.read( br );
---

(MSFAlignmentFormat.read( br ) didn't work)

Is this just a matter of taste? Are there preferred ways to do things, or should we just do things any which way? Does functional overlap exist for specific reasons, or for exactly this kind of flexibility?

As a new java programmer, I am naturally insecure about my code; do I just need confidence?

Ta,
Dan.

On 4 Feb 2004, Keith James wrote:

> >>>>> "Dan" == Dan Bolser writes:
>
> Dan> Hello, I am a new user to biojava (and almost new to java).
>
> Dan> The following code works fine reading a 'FASTA' format file,
> Dan> but causes an error reading 'MSF' format...
>
> [...]
>
> Dan> --- Exception in thread "main"
> Dan> java.lang.IllegalArgumentException: No alphabet was set in
> Dan> the identifier at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
> Dan> at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
> Dan> at
> Dan> ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)
>
> The exception message here is referring to the integer identifier
> which biojava has for every known combination of file-format (fasta,
> genbank, embl) and alphabet-type (dna, rna, protein). The way these
> are created/interpreted is documented in SeqIOConstants (for the
> sequence formats) and AlignIOConstants (for the alignment
> formats).
All the common ones exist as static int fields so that you
> can compare using == or use them in switches.
>
> The format guessing code (in SeqIOTools.identifyFormat) appears to be
> missing "msf" and "clustal". This is a bug - I'll fix it today. The
> result is that it guesses SeqIOConstants.UNKNOWN as the format
> identifier (which has no alphabet set - hence the message).
>
> The public method fileToBiojava(int fileType, BufferedReader br)
> should work if you pass it the value AlignIOConstants.MSF_AA
>
> Keith
>
>

From MCCon012 at mc.duke.edu Wed Feb 4 11:06:35 2004
From: MCCon012 at mc.duke.edu (Patrick McConnell)
Date: Wed Feb 4 11:12:49 2004
Subject: [Biojava-l] Run blast on multiple databases
Message-ID:

Pass a String array to the exec function instead of one string. The first element of the array is the executable, the rest are the parameters. For example:

Runtime.getRuntime().exec(new String[] {
    "/usr/bin/blast/blastall",
    "-p", "blastn",
    "-d", "nt human mouse",
    "-i", "/home/me/in.fasta",
    "-o", "/home/me/out.txt"
});

This way, java handles sending the parameter with spaces correctly.

-Patrick McConnell
Duke Bioinformatics Shared Resource
Duke University
mccon012@mc.duke.edu

Laurence Hummel
Sent by: biojava-l-bounces@portal.open-bio.org
02/03/2004 01:09 PM
To: biojava-l@biojava.org
cc:
Subject: [Biojava-l] Run blast on multiple databases

Hi,

I want to run blastall on several databases, so I run it with -d "dbName1 dbName2"... This works well if I launch the command line in a terminal, but not if I launch it from my java program... It seems like the "" are not recognized.

Here is what I've done :

Process blastRun;
String pOption, dOption, oOption, iOption, eOption, tOption, blastCommandLine, line;
BufferedReader errorBuff;
File outFile;

try {
    pOption = " -p " + program;
    tOption = " -T"; // out file in html format
    dOption = " -d \"" + db + "\"";
    iOption = " -i " + query;
    eOption = " -e " + eValue;
    oOption = " -o " + outFilePath;

    blastCommandLine = exePath + pOption + dOption + tOption + iOption + eOption + oOption;
    System.out.println("\nblastcommandLine = " + blastCommandLine);

    // Call external blastall program
    blastRun = Runtime.getRuntime().exec(blastCommandLine);

    // Catch blast error
    System.out.println("\nErrorStream = " + blastRun.getErrorStream().read());
    errorBuff = new BufferedReader(new InputStreamReader(blastRun.getErrorStream()));
    System.out.println("\nerrorBuff = ");
    while ((line = errorBuff.readLine()) != null) {
        System.out.println(line);
    }
    System.out.println("\nOutputStream = " + blastRun.getOutputStream());
} catch (IOException ex) {
    System.out.println("BLAST NOT RUN");
    System.out.println("IOException : " + ex.toString());
}

And here is the error :

blastcommandLine = /Applications/blast/blastall -p blastp -d "db1 db2" -T -i query.faa -e 10.0 -o /out.html

ErrorStream = 91

errorBuff =
blastall] ERROR: Arguments must start with '-' (the offending argument #5 was: db2"')

OutputStream = java.io.BufferedOutputStream@629355

The exact same command line copy/pasted in a terminal works very well...

Does anyone have an idea?
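
The error quoted above is consistent with how Runtime.exec(String) is documented to behave: the command string is broken into tokens with a plain StringTokenizer, so quote characters get no special treatment and "db1 db2" arrives as two separate arguments. The sketch below reproduces that split without running blastall, then builds the argument array by hand as suggested in the reply. The class and method names (ExecArgsDemo, naiveSplit, blastArgs) and the paths are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class ExecArgsDemo {

    /** Splits a command on whitespace only, ignoring quotes, as Runtime.exec(String) does. */
    static List<String> naiveSplit(String commandLine) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(commandLine);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    /** Builds an explicit argument array; the database list stays a single element. */
    static String[] blastArgs(String exe, String program, String dbs, String query, String out) {
        return new String[] { exe, "-p", program, "-d", dbs, "-i", query, "-o", out };
    }

    public static void main(String[] args) {
        String cmd = "/Applications/blast/blastall -p blastp -d \"db1 db2\" -T -i query.faa";
        // The quoted database list is torn apart into two tokens: "db1 and db2"
        System.out.println(naiveSplit(cmd));

        String[] argv = blastArgs("/Applications/blast/blastall", "blastp",
                                  "db1 db2", "query.faa", "/out.html");
        // Here "db1 db2" is one element, with no extra quote characters needed
        System.out.println(Arrays.toString(argv));

        // To actually launch the program (not done here):
        // Process p = Runtime.getRuntime().exec(argv);
    }
}
```

Note that with the array form the embedded quotes should be dropped; they were only there for the shell's benefit.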
Laurence

-- 
Laurence HUMMEL
Génopole - Plate-Forme 4 "Intégration et Analyse Génomiques"
Institut Pasteur
28 rue du docteur Roux - 75015 PARIS
Tel : 01.44.38.95.36

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From kdj at sanger.ac.uk Wed Feb 4 11:10:16 2004
From: kdj at sanger.ac.uk (Keith James)
Date: Wed Feb 4 11:16:26 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
In-Reply-To:
References:
Message-ID:

>>>>> "Dan" == Dan Bolser writes:

[...]

    Dan> Just for the record, I see two forms of the fileToBiojava
    Dan> function...

    Dan> fileToBiojava( int fileType, java.io.BufferedReader br );

    Dan> fileToBiojava( java.lang.String formatName, java.lang.String
    Dan> alphabetName, java.io.BufferedReader br );

Yep. One tries to guess the format from the strings passed as arguments (e.g. you can use either 'protein' or 'aa' for the protein alphabet) and then reads the file. Internally all it does is to call the other method with the correct integer argument. The methods in any class called FooTools or BarTools in biojava are convenience methods, often built from other lower level API in biojava.

    Dan> I was using the second form which should not have to guess the
    Dan> format (perhaps I misunderstand what you said
    Dan> above). Additionally, I explicitly pass an alphabet name... Why
    Dan> are formats linked to alphabets? "... SeqIOConstants.UNKNOWN
    Dan> as the format identifier (which has no alphabet set...".

It's fine to use either. Formats are not linked to alphabets per se. It is just convenient to have both pieces of information stored in a single value (an int) for reading files. There are lots of possible alphabets and file formats so it's handy to be able to calculate identifiers for all possible current (and future) combinations.
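
As a rough illustration of storing a format and an alphabet in one int, here is a minimal bit-flag sketch. The constants and their values are invented for this example and are not claimed to match the real fields in SeqIOConstants or AlignIOConstants; the class name FormatIdSketch and its methods are likewise hypothetical. It does show why an identifier with no alphabet bits set (UNKNOWN here) could lead to a "No alphabet was set in the identifier" complaint.

```java
public class FormatIdSketch {

    // Hypothetical layout: low 16 bits carry the format, higher bits flag the alphabet
    static final int UNKNOWN = 0;
    static final int FASTA = 1;
    static final int MSF = 2;

    static final int DNA = 1 << 16;
    static final int RNA = 1 << 17;
    static final int AA = 1 << 18;

    static final int FORMAT_MASK = 0xFFFF;
    static final int ALPHABET_MASK = DNA | RNA | AA;

    /** Packs a format and an alphabet into a single identifier. */
    static int combine(int format, int alphabet) {
        return format | alphabet;
    }

    /** Extracts the format part of an identifier. */
    static int formatOf(int id) {
        return id & FORMAT_MASK;
    }

    /** Extracts the alphabet part, rejecting identifiers that carry none. */
    static int alphabetOf(int id) {
        int alphabet = id & ALPHABET_MASK;
        if (alphabet == 0) {
            throw new IllegalArgumentException("No alphabet was set in the identifier");
        }
        return alphabet;
    }

    public static void main(String[] args) {
        int msfProtein = combine(MSF, AA);
        System.out.println(formatOf(msfProtein) == MSF);  // format survives the packing
        System.out.println(alphabetOf(msfProtein) == AA); // so does the alphabet
        try {
            alphabetOf(UNKNOWN);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because each alphabet has its own bit, any new format number can be combined with any alphabet without defining the pair in advance.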
Keith

-- 
- Keith James  Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From kdj at sanger.ac.uk Wed Feb 4 11:11:13 2004
From: kdj at sanger.ac.uk (Keith James)
Date: Wed Feb 4 11:17:22 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
References:
Message-ID:

I've checked in the changes.

Looking back at my notes I can see why I left out MSF and Clustal from the auto-guessing method. By adding them we end up treating some alignment formats differently to others i.e. MSF/Clustal vs. fasta/raw. As fasta and raw can be either alignments or single sequences, depending on the intention of the user, I thought it better never to allow guessing and always make the application programmer specify explicitly themselves (using the method I mentioned in my other post).

However, I think I was wrong as it's obviously given a nasty surprise. So I've applied the fix and added the relevant tests.

Keith

-- 
- Keith James  Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From iris.pallida at free.fr Wed Feb 4 12:17:51 2004
From: iris.pallida at free.fr (Iris Pallida)
Date: Wed Feb 4 12:22:41 2004
Subject: [Biojava-l] BLASTLikeSAXParser and HMMER
Message-ID: <200402041817.51319.iris.pallida@free.fr>

Hello,

I'm a (French) newbie... I would like to use the script from Biojava In Anger ("How do I set up a BLAST parser?") in order to parse my file "resut" from hmmpfam. The API says it is OK for the 2.1.1 version of HMMER and I have the latest one... But I've built my result with the hmmer --compat option, in order to have a 2.1.1 compatible file. The parser doesn't work anyway ("org.xml.sax.SAXException: Program hmmer Version 2.3.2 is not supported by the biojava blast-like parsing framework").

I don't know if the hmmer option '--compat' doesn't work or if there is another way to do that. Is there someone here who did it and could help me ?

Thank you in advance
Iris

From mark.schreiber at group.novartis.com Wed Feb 4 20:57:45 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Wed Feb 4 21:00:47 2004
Subject: [Biojava-l] TMTOWTDI in biojava?
Message-ID:

Hi Dan -

It pretty much is a matter of taste. Assuming we have all the tests in place (which we may not), either method should be equal (at least in terms of results if not performance).

The method you were using from SeqIOTools allows for a dynamic choice of format. E.g. the user could supply "fasta" and "dna" as command line parameters or "genbank" "DNA" to the same program and it would figure out which to use. If you use a specific hardcoded format your user may not be able to select the format at runtime. If this is a problem the SeqIOTools method is more flexible and therefore better. If it's not a problem use whichever one you want.

From a code documentation point of view it might be more obvious what you are doing if you use the hardcoded version but it shouldn't matter.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road #04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910

Dan Bolser
Sent by: biojava-l-bounces@portal.open-bio.org
02/04/2004 11:55 PM
To: Keith James
cc: biojava-l@biojava.org
Subject: [Biojava-l] TMTOWTDI in biojava?

Hello, I found an alternate solution...

---
BufferedReader br = new BufferedReader(new FileReader(file));

MSFAlignmentFormat x = new MSFAlignmentFormat();

Alignment align = x.read( br );
---

(MSFAlignmentFormat.read( br ) didn't work)

Is this just a matter of taste? Are there preferred ways to do things, or should we just do things any which way? Does functional overlap exist for specific reasons, or for exactly this kind of flexibility?

As a new java programmer, I am naturally insecure about my code; do I just need confidence?

Ta,
Dan.
On 4 Feb 2004, Keith James wrote:

> >>>>> "Dan" == Dan Bolser writes:
>
> Dan> Hello, I am a new user to biojava (and almost new to java).
>
> Dan> The following code works fine reading a 'FASTA' format file,
> Dan> but causes an error reading 'MSF' format...
>
> [...]
>
> Dan> --- Exception in thread "main"
> Dan> java.lang.IllegalArgumentException: No alphabet was set in
> Dan> the identifier at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801)
> Dan> at
> Dan> org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787)
> Dan> at
> Dan> ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60)
>
> The exception message here is referring to the integer identifier
> which biojava has for every known combination of file-format (fasta,
> genbank, embl) and alphabet-type (dna, rna, protein). The way these
> are created/interpreted is documented in SeqIOConstants (for the
> sequence formats) and AlignIOConstants (for the alignment
> formats). All the common ones exist as static int fields so that you
> can compare using == or use them in switches.
>
> The format guessing code (in SeqIOTools.identifyFormat) appears to be
> missing "msf" and "clustal". This is a bug - I'll fix it today. The
> result is that it guesses SeqIOConstants.UNKNOWN as the format
> identifier (which has no alphabet set - hence the message).
>
> The public method fileToBiojava(int fileType, BufferedReader br)
> should work if you pass it the value AlignIOConstants.MSF_AA
>
> Keith
>
>

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From mark.schreiber at group.novartis.com Wed Feb 4 21:01:28 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Wed Feb 4 21:04:28 2004
Subject: [Biojava-l] BLASTLikeSAXParser and HMMER
Message-ID:

Hi -

You may need to call the setModeLazy() method on the BLASTLikeSAXParser, which will tell it not to care so much about the version number and to try parsing the file anyway. If you do this you should check the output carefully to make sure it hasn't made any mistakes.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road #04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910

Iris Pallida
Sent by: biojava-l-bounces@portal.open-bio.org
02/05/2004 01:17 AM
To: biojava-l@biojava.org
cc:
Subject: [Biojava-l] BLASTLikeSAXParser and HMMER

Hello,

I'm a (French) newbie... I would like to use the script from Biojava In Anger ("How do I set up a BLAST parser?") in order to parse my file "resut" from hmmpfam. The API says it is OK for the 2.1.1 version of HMMER and I have the latest one... But I've built my result with the hmmer --compat option, in order to have a 2.1.1 compatible file. The parser doesn't work anyway ("org.xml.sax.SAXException: Program hmmer Version 2.3.2 is not supported by the biojava blast-like parsing framework").

I don't know if the hmmer option '--compat' doesn't work or if there is another way to do that. Is there someone here who did it and could help me ?
Thank you in advance
Iris

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From verhoeff2 at gis.a-star.edu.sg Thu Feb 5 02:34:18 2004
From: verhoeff2 at gis.a-star.edu.sg (VERHOEF Frans)
Date: Thu Feb 5 02:43:16 2004
Subject: [Biojava-l] beginners bug fileToBiojava problem with msf format
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D560B0448@BIONIC.biopolis.one-north.com>

> Hello, I am a new user to biojava (and almost new to java).
>
> The following code works fine reading a 'FASTA' format file,
> but causes an error reading 'MSF' format...
>
> -----
> String file = args[0];
> String format = args[1];
> String alphabet = args[2];
>
> BufferedReader br = new BufferedReader(new FileReader(file));
>
> SequenceIterator seqi = null;
> Alignment align = null;
>
> if ( format != "MSF" && format != "msf" ){

Change this line to:

if ( format.toLowerCase().equals("msf") == false ){

Because I wonder whether you ever get in here. For your information, if you use == or != to compare 2 strings (or other objects) it will not compare the content; it will only compare whether the 2 objects have the same address. If you had assigned format directly with "msf" (i.e. String format = "msf"), then it would have worked, because then both format and the string "msf" in the if statement would be pointing to the same address. In this case format is assigned an external argument which happens to have the same value, but is most probably pointing to a different address.

I hope it is clear what I am trying to explain ;-)

> seqi =
> (SequenceIterator)SeqIOTools.fileToBiojava( format, alphabet, br );
> }
> else{
> align =
> (Alignment)SeqIOTools.fileToBiojava( format, alphabet, br );
> }
> ----
>
> Error...
> > --- > Exception in thread "main" java.lang.IllegalArgumentException: No alphabet > was set in the identifier > at > org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:801) > at > org.biojava.bio.seq.io.SeqIOTools.fileToBiojava(SeqIOTools.java:787) > at > ReadAlignMakeDistribution.main(ReadAlignMakeDistribution.java:60) > --- > > Line 60 corresponds to the "align = ..." line above. > > Like I said, works fine as... > > java prog.java fa.fasta fasta PROTEIN > > but > > java prog.java msf.msf msf PROTEIN > > Gives above error... Just as I thought I was begining to understand :.( > > I will look at details for Alignment objects... > > > _______________________________________________ > Biojava-l mailing list - Biojava-l@biojava.org > http://biojava.org/mailman/listinfo/biojava-l From len at reeltwo.com Sun Feb 8 20:09:19 2004 From: len at reeltwo.com (Len Trigg) Date: Sun Feb 8 20:15:31 2004 Subject: [Biojava-l] 2 problems found in org.biojava.bio.seq.db.biosql code In-Reply-To: <001501c3e507$286deb90$7f7ba8c0@SISKA> References: <40168D81.7000506@izbi.uni-leipzig.de> <001501c3e507$286deb90$7f7ba8c0@SISKA> Message-ID: "Frederik Decouttere" wrote: > After doing some tests with the biojava - biosql code I think there are > 2 (little) bugs in there: > > - when persisting a Sequence which contains a Feature with > BetweenLocation this Location gets converted to a RangeLocation upon > retrieval > > - when persisting a Sequence which contains (a) Feature(s) in 2 > different biodatabases an exception is thrown in the ontology code part > of biojava I have got as far as inserting your example code into the BioSQL JUnit test and it replicates the problem perfectly. I think I know what's going on with the ontology bug, but haven't looked into the Location one. I probably won't be able to look at it in more detail for a couple more days though (if anyone wants to jump the gun, I can commit the test so the tests start failing :-)). Cheers, Len. 
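
On the string comparison point raised in the MSF thread above, a minimal stand-alone sketch may help. It assumes nothing from BioJava: the class name StringCompareDemo and the helper isMsf are made up for illustration, and new String("msf") merely stands in for a value that arrives at run time, such as args[1].

```java
class StringCompareDemo {

    /** Returns true when the format name denotes MSF, ignoring case. */
    static boolean isMsf(String format) {
        // Calling the method on the literal means a null argument yields false
        // instead of a NullPointerException.
        return "msf".equalsIgnoreCase(format);
    }

    public static void main(String[] args) {
        // Stand-in for a run-time value: a distinct String object with the same content.
        String fromCommandLine = new String("msf");

        // Reference comparison: false, because these are two different objects.
        System.out.println(fromCommandLine == "msf");

        // Content comparison: true, because the characters are the same.
        System.out.println(fromCommandLine.equals("msf"));

        // Case-insensitive content comparison covers "MSF" and "msf" in one test.
        System.out.println(isMsf("MSF"));
    }
}
```

With a helper like this the original test could read if ( !isMsf(format) ) { ... } else { ... }, which replaces both != comparisons.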
From mark.schreiber at group.novartis.com Sun Feb 8 20:26:17 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Sun Feb 8 20:29:10 2004
Subject: [Biojava-l] 2 problems found in org.biojava.bio.seq.db.biosql code
Message-ID:

Hi Len -

I think it's probably worth committing the tests. It will serve as a warning that we know something doesn't work as advertised until someone gets around to fixing it.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road
#04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910
_______________________________________________ Biojava-l mailing list - Biojava-l@biojava.org http://biojava.org/mailman/listinfo/biojava-l From len at reeltwo.com Sun Feb 8 21:24:15 2004 From: len at reeltwo.com (Len Trigg) Date: Sun Feb 8 21:30:22 2004 Subject: [Biojava-l] 2 problems found in org.biojava.bio.seq.db.biosql code In-Reply-To: References: Message-ID: Mark Schreiber wrote: > I think it's probably worth committing the tests. It will serve as a > warning that we know something doesn't work as advertised until someone > gets around to fixing it. As much as I hate committing tests that fail, 'tis done. Thomas, could you ensure that the autobuilder has hsqldb.jar in it's CLASSPATH, so that the biosql tests get run? (I use 1.7.2 alpha T, but I see there's an RC1 available.) Cheers, Len. From atlan_d at web.de Wed Feb 11 05:52:42 2004 From: atlan_d at web.de (david atlan (PHENO)) Date: Wed Feb 11 05:58:44 2004 Subject: [Biojava-l] Experience with Phred base calling on Beckman CEQ 2000? Message-ID: Hi, not sure this fits in the biojava-l, but I don't know where else to ask. I am writing a java GUI that uses phred files (.phd.1 and .poly) to analyse sequences and find mutations. On .ab1 files Phred works nicely, but files from a CEQ2000 (saved as .scf) show very often inserted pseudo N or doubling of bases (e.g. GG i.o. G). Has anyone out there experience with Phred and Beckman .scf files? The main differences I have noticed between the CEQ2000 and ABI3100 files are less variations between areas in the ABI as well as peak spacing of around 12 pix in the ABI vs 20 in the CEQ2000... Thanks a lot, david here is an extract of inserted N in the middle of a sequence. 
G 2222 35258.9974 0.941488 N -1 -1 -1 472.356505 98.422154 4532.105402 23.156642
A 2246 90044.20537 2.331656 N -1 -1 -1 10705.5386 42.180923 32.016443 271.750002
T 2264 55872.20905 1.243214 N -1 -1 -1 418.865077 57.245538 0 6615.988769
N 2276 -1 -1 N -1 -1 -1 38.078643 121.521231 0 633.40226
T 2286 15458.42054 0.33629 N -1 -1 -1 225.751957 131.564308 165.41829 1944.476831
N 2306 -1 -1 N -1 -1 -1 342.70779 177.762462 4757.109849 249.955516
N 2319 -1 -1 N -1 -1 -1 733.467202 300.288 471.353191 213.858398
G 2306 40290.91505 0.892624 N -1 -1 -1 342.70779 177.762462 4757.109849 249.955516
A 2328 62598.56983 1.390287 N -1 -1 -1 7399.949698 283.214769 124.50839 79.686091
A 2349 28561.70244 0.594684 N -1 -1 -1 3681.842162 99.426462 41.799245 34.053885
A 2369 27606.10982 0.61948 C 2385 3717.947077 0.076516 3552.193447 155.667692 100.496058 21.113409
A 2389 28438.40017 0.667774 N -1 -1 -1 3690.908505 361.550769 56.028776 90.583334
C 2406 57411.24923 1.369483 N -1 -1 -1 310.068953 6816.236308 118.282971 79.686091
A 2424 22989.52763 0.520666 T 2421 4193.395393 0.090929 3229.431613 241.033846 31.127098 493.781332

From daviddebeule at pandora.be Wed Feb 11 14:57:10 2004
From: daviddebeule at pandora.be (david de beule)
Date: Wed Feb 11 15:03:08 2004
Subject: [Biojava-l] removeGap problem with SimpleGappedSequence
References: <20040106113355.80521.qmail@web60501.mail.yahoo.com> <200401061356.46203.david.huen@ntlworld.com>
Message-ID: <008101c3f0d9$3c7fe510$f416a451@davidpc>

Hi,

I have got another small problem with SimpleGappedSequence.

This code:

Sequence sequence = DNATools.createDNASequence("ACT--GGACCTAAGG", "test");
SimpleGappedSequence s = new SimpleGappedSequence(sequence);
s.removeGap(4);

results in:

org.biojava.bio.symbol.IllegalSymbolException: Attempted to remove a gap at a non-gap index: 4 -> []
    at org.biojava.bio.symbol.SimpleGappedSymbolList.removeGap(SimpleGappedSymbolList.java:426)

Is this intended or a bug?
Thanks in advance, David De Beule ----- Original Message ----- From: "David Huen" To: Sent: Tuesday, January 06, 2004 2:56 PM Subject: Re: [Biojava-l] biojava doubts and problems > On Tuesday 06 Jan 2004 11:33 am, nandakumar sridharan wrote: > > any reference books available for the biojava docs and tutorials. > > Please look at www.biojava.org for some material. Follow the link there to > "Biojava In Anger" for further useful cookbook style materials. > > > GCContent .java gives exception "usage: java GCContent filename.fa" how > > to solve it > > > The above is not an exception but a Unix-style way of telling you that the > command line format for invoking the code. > > In this case, it appears to be saying you need to provide it a file in FASTA > format:- > java GCContent > > Regards, > David Huen > > _______________________________________________ > Biojava-l mailing list - Biojava-l@biojava.org > http://biojava.org/mailman/listinfo/biojava-l > > From mark.schreiber at group.novartis.com Wed Feb 11 20:07:18 2004 From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com) Date: Wed Feb 11 20:09:59 2004 Subject: [Biojava-l] removeGap problem with SimpleGappedSequence Message-ID: Hi - The problem seems to be that DNATools.createDNASequence("ACT--GGACCTAAGG", "test"); Creates a SimpleSequence and not a gapped sequence. Then when you call SimpleGappedSequence s = new SimpleGappedSequence(sequence); You get back a view onto sequence. The view can only remove gaps that are introduced in that view. I guess that DNATools.createDNASequence and createDNA methods may need modification. There is a method in DNATools called createGappedDNASequence which will do what you want but it would be nice if the other two could call it as appropriate. Probably need to add a createGappedDNA as well. If no one gets to this in the next few days I'll have a hack at it. 
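
The behaviour described above, where the view can only remove gaps that were introduced through the view itself, can be pictured with a small stand-alone model. This is only a sketch under stated assumptions, not BioJava's implementation: GappedView, addGap and removeGap are hypothetical names, the source is a plain String whose '-' characters are treated as ordinary symbols, and positions are 1-based.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a gapped view over an unmodified source string (hypothetical, not BioJava code). */
class GappedView {
    private static final int GAP = -1;   // marker for a gap owned by the view
    private final String source;
    // One entry per view position: either an index into source, or GAP.
    private final List<Integer> slots = new ArrayList<>();

    GappedView(String source) {
        this.source = source;
        // Every source character, including any '-', starts out as a non-gap slot.
        for (int i = 0; i < source.length(); i++) {
            slots.add(i);
        }
    }

    /** Inserts a view-owned gap so that it occupies the given 1-based position. */
    void addGap(int pos) {
        slots.add(pos - 1, GAP);
    }

    /** Removes a view-owned gap; fails if the position holds a source symbol. */
    void removeGap(int pos) {
        if (slots.get(pos - 1) != GAP) {
            throw new IllegalArgumentException(
                "Attempted to remove a gap at a non-gap index: " + pos);
        }
        slots.remove(pos - 1);   // int argument, so this removes by index
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int slot : slots) {
            sb.append(slot == GAP ? '-' : source.charAt(slot));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        GappedView view = new GappedView("ACT--GGACCTAAGG");
        view.addGap(1);                  // gap introduced through the view
        System.out.println(view);
        view.removeGap(1);               // fine: the view owns this gap
        try {
            view.removeGap(4);           // position 4 is a '-' that came from the source
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the '-' at position 4 looks like a gap when printed, but the view has no record of it, which mirrors the exception reported in the thread.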
- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road
#04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910
_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From daviddebeule at pandora.be Thu Feb 12 04:04:27 2004
From: daviddebeule at pandora.be (daviddebeule@pandora.be)
Date: Thu Feb 12 04:11:43 2004
Subject: [Biojava-l] removeGap problem with SimpleGappedSequence
Message-ID:

Hi,

'createGappedDNASequence' would solve the problem in this example but in fact it was just an example, in the real application the sequences are not always created with DNATools and a lot of time we create a SimpleGappedSequence from a Sequence.

I was wondering if it would be possible to let SimpleGappedSequence(sequence) create a view that immediately contains the gaps available in the original sequence. With that view it would be possible to remove the gaps in the original sequence and add/remove new gaps.

David

From mark.schreiber at group.novartis.com Thu Feb 12 04:20:49 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Thu Feb 12 04:23:29 2004
Subject: [Biojava-l] removeGap problem with SimpleGappedSequence
Message-ID:

Sounds like a pretty sensible suggestion. Can anyone think of why this might not be a 'good idea'?

If not, i'll add it to the list of things to fix :)

- Mark
_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From matthew_pocock at yahoo.co.uk Thu Feb 12 06:02:39 2004
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Feb 12 06:08:45 2004
Subject: [Biojava-l] removeGap problem with SimpleGappedSequence
In-Reply-To:
References:
Message-ID: <402B5D4F.2000908@yahoo.co.uk>

Hi,

Seems like we have a bit of an 'expected behavior' and 'implemented behavior' gap. If we decide to modify the GappedSymbolList constructor to find all gaps in the original sequence, I think we should add it as an option:

new GappedSymbolList(origSyms, mergeOriginalGaps)

and make the current constructor equivalent to this(syms, false).
Finding all these gaps, making an ungapped underlying symbol list, and building the gap insertion data structures is a potentially expensive operation (imagine gapping a genome! You would pull the whole thing into memory and do a linear scan), so we should be careful not to force it upon the world.

This would also change the contract of getSourceSymbolList(), and it raises the question of what happens if that source is modified, i.e. whether changes to it are tracked.

This could be worked around by implementing an "UnGappedView" class that does the opposite mapping of GappedSymbolList - removes all gaps in the source - then we could gap this, putting them all back and making it editable. I don't want to be the one to write it though - writing GappedSymbolList made my brain hurt.

Matthew

mark.schreiber@group.novartis.com wrote:

>Sounds like a pretty sensible suggestion. Can anyone think of why this
>might not be a 'good idea'?
>
>If not, i'll add it to the list of things to fix :)
>
>- Mark

From daviddebeule at pandora.be Thu Feb 12 15:03:26 2004
From: daviddebeule at pandora.be (david de beule)
Date: Thu Feb 12 15:09:24 2004
Subject: [Biojava-l] removeGap problem with SimpleGappedSequence
References: <402B5D4F.2000908@yahoo.co.uk>
Message-ID: <000701c3f1a3$470fd8d0$f416a451@davidpc>

Hi,

Your first solution (a new constructor: new GappedSymbolList(origSyms, mergeOriginalGaps)) sounds good to me.

These new constructors for GappedSymbolList and GappedSequence would be very helpful.

Thanks in advance.
David

From fangl at genomics.org.cn Sun Feb 15 22:13:43 2004
From: fangl at genomics.org.cn (Magic Fang)
Date: Sun Feb 15 22:20:49 2004
Subject: [Biojava-l] can i download the demos code, the tutorial is somewhat limited
Message-ID: <40303567.4090803@genomics.org.cn>

Hi,

I am a beginner in Java and BioJava. I am currently reading the tutorial, but it has too few examples. Can I download the demo code from the BioJava site, or are there better and more detailed examples?

Thank you.
magic fang

From mark.schreiber at group.novartis.com Mon Feb 16 00:25:01 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Mon Feb 16 00:27:38 2004
Subject: [Biojava-l] can i download the demos code, the tutorial is somewhat limited
Message-ID:

Hi -

You might also want to look at http://www.biojava.org/docs/bj_in_anger/index.htm

There is also a simplified Chinese translation of this at http://www.cbi.pku.edu.cn/chinese/documents/PUMA/biojava/index-cn.html

Hope this helps,

Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road
#04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910
_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From jan.wuerthner at uni-duesseldorf.de Mon Feb 16 12:08:27 2004
From: jan.wuerthner at uni-duesseldorf.de (Jan Würthner)
Date: Mon Feb 16 12:12:38 2004
Subject: [Biojava-l] DNA 'weak' comparison
Message-ID: <200402161808.27346.jan.wuerthner@uni-duesseldorf.de>

Hi folks,

is there a BioJava tool to compare two DNAs by category, like

'R'=='A': true
'Y'=='C': true
'N'=='G': true
'B'=='A': false

etc ?

kind regards
Jan

--
Jan Würthner
Institute for Medical Microbiology
Building 22.21
Heinrich-Heine-University
Universitätsstraße 1
40225 Duesseldorf
Tel. +49 (0) 211 81 12461
URL: www.medmikro.uni-duesseldorf.de

From mark.schreiber at group.novartis.com Mon Feb 16 20:27:10 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Mon Feb 16 20:29:41 2004
Subject: [Biojava-l] DNA 'weak' comparison
Message-ID:

Hi Jan -

Biojava deals with ambiguities using BasisSymbols. A BasisSymbol is like a Symbol that is the set of all the Symbols that make it up. For example, W contains the Symbols A and T. For some details see: http://www.biojava.org/docs/bj_in_anger/ambig.htm

You cannot say W == A, as that would be testing canonical memory locations. You can however say W == W, as even ambiguities are singletons.

You can call the getMatches() method on one of these Symbols, which will give you an Alphabet that contains only those Symbols that match the ambiguity. The contains() method of the resulting Alphabet will tell you whether any given Symbol is contained by the ambiguity.

Some pseudo code, for example:

Symbol w; //see biojava in anger site above for how to initialize this symbol.
Symbol a = DNATools.a();
Symbol g = DNATools.g();

Alphabet ambig = w.getMatches();
ambig.contains(a); //true
ambig.contains(g); //false

Jan Würthner
Sent by: biojava-l-bounces@portal.open-bio.org
02/17/2004 01:08 AM
To: biojava-l@biojava.org
cc:
Subject: [Biojava-l] DNA 'weak' comparison

Hi folks,

is there a BioJava tool to compare two DNAs by category, like

'R'=='A': true
'Y'=='C': true
'N'=='G': true
'B'=='A': false

etc ?

kind regards
Jan

--
Jan Würthner
Institute for Medical Microbiology
Building 22.21
Heinrich-Heine-University
Universitätsstraße 1
40225 Duesseldorf
Tel.
+49 (0) 211 81 12461 URL: www.medmikro.uni-duesseldorf.de _______________________________________________ Biojava-l mailing list - Biojava-l@biojava.org http://biojava.org/mailman/listinfo/biojava-l From vc100 at doc.ic.ac.uk Wed Feb 18 02:38:11 2004 From: vc100 at doc.ic.ac.uk (Vasa Curcin) Date: Wed Feb 18 02:43:47 2004 Subject: [Biojava-l] Serialization of RemoteFeature.Region Message-ID: <40331663.2000605@doc.ic.ac.uk> Hello, Serialization buff that I am, it's a wonder this one slipped past me earlier. RemoteFeature.Region won't serialize. Could this inner class be made serializable - some sequences obtained from Genbank won't serialize because of this. Cheers, Vasa From ambesi at tigem.it Thu Feb 26 10:32:49 2004 From: ambesi at tigem.it (Alberto Ambesi) Date: Thu Feb 26 10:38:54 2004 Subject: [Biojava-l] how to make a subsequence serializable? Message-ID: <08D3891E-6871-11D8-B5E7-000A958EE60A@tigem.it> I tried to wrap it around a VeiwSequence, but I still get a java.io.NotSerializableException: org.biojava.bio.seq.SubSequence. Can anyone help me out please? From mark.schreiber at group.novartis.com Thu Feb 26 19:58:26 2004 From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com) Date: Thu Feb 26 20:01:04 2004 Subject: [Biojava-l] how to make a subsequence serializable? Message-ID: Hi - SubSequence was not marked as implementing java.io.Serializable. I will fix this in CVS. If you can't wait for the update just changing one line in your local code base should solve the problem. - Mark Mark Schreiber Principal Scientist (Bioinformatics) Novartis Institute for Tropical Diseases (NITD) 1 Science Park Road #04-14 The Capricorn Singapore 117528 phone +65 6722 2973 fax +65 6722 2910 Alberto Ambesi Sent by: biojava-l-bounces@portal.open-bio.org 02/26/2004 11:32 PM To: biojava-l@biojava.org cc: Subject: [Biojava-l] how to make a subsequence serializable? 
I tried to wrap it around a VeiwSequence, but I still get a java.io.NotSerializableException: org.biojava.bio.seq.SubSequence. Can anyone help me out please?
_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From wux at mail.cbi.pku.edu.cn Fri Feb 27 01:48:44 2004
From: wux at mail.cbi.pku.edu.cn (wux@mail.cbi.pku.edu.cn)
Date: Fri Feb 27 01:58:56 2004
Subject: [Biojava-l] ask help in biojava GUI
Message-ID: <200402270652.i1R6qU4q010289@mail.cbi.pku.edu.cn>

Dear all:

I have a question about the biojava sequence GUI. I want to draw a lot of sequences in one panel with their features, e.g.:

         m1          m2          m1           ( feature label on each feature)
       ----->       ---->      <-----         ( features : there are two motif1s and one motif2)
seq1 ----------------------------------       ( this is the ruler)
     acgtaaaaacccggggtttttttttttttttttt       ( this is the real sequence)

         m1          m3          m1           ( feature label on each feature)
       ----->       ---->      <-----         ( features : there are two motif1s and one motif2)
seq2 ----------------------------------       ( this is the ruler)
     acgtaaaaacccggggtaggaggttttttttttt       ( this is the real sequence)
.............

I would like to show each sequence name in front of its ruler, a feature label on top of each feature body, features coloured according to their label, and many sequences in one panel. How can I achieve that? I have looked through the biojava 1.3 docs carefully, but it seems that SequencePanel can only display one sequence, and MultiLineRenderer does not solve this either. Can anyone give me a hand?

Thanks in advance.
Yours faithfully,
wux
wux@mail.cbi.pku.edu.cn
2004-02-27

*****************************************************
WuXin
Ph.D student of CBI (Center of Bioinformatics)
Peking University 100871 P.R.China
Email: wux@mail.cbi.pku.edu.cn
Tel: 010-62762409 (dorm) 010-62755206 (office)
Address: Building 47#2026 Peking University
*****************************************************

From kdj at sanger.ac.uk Fri Feb 27 05:05:05 2004
From: kdj at sanger.ac.uk (Keith James)
Date: Fri Feb 27 05:11:13 2004
Subject: [Biojava-l] ask help in biojava GUI
In-Reply-To: <200402270652.i1R6qU4q010289@mail.cbi.pku.edu.cn>
References: <200402270652.i1R6qU4q010289@mail.cbi.pku.edu.cn>
Message-ID:

>>>>> "wux" == wux@mail cbi pku edu cn writes:

wux> Dear all: I have a question about biojava sequence GUI. I
wux> want to draw a lot of sequences in one panel with their
wux> features. e.g:

[...]

wux> I hope I can add each seq name in front of ruler , feature
wux> lable on top of the feature body , feature in different color
wux> according to its label and many sequences in one panel. How
wux> can I achieve that? I have looked througth the biojava doc
wux> 1.3 carefully, but It seems that SequencePanel can only
wux> display one sequence. The MultiLineRenderer can not resolve
wux> this either. Who can give me a hand ? Thanks in advance.

Hi,

You are right - a SequencePanel is designed to display one Sequence. You can create a multi-Sequence display by packing several SequencePanels into another container, using the standard Java LayoutManagers to organise them.
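
The packing approach described above might look roughly like the following sketch. It uses only standard Swing classes: MultiSequenceLayout and stack are hypothetical names, and the views are plain Components standing in for configured SequencePanel instances, so nothing here is BioJava API.

```java
import java.awt.BorderLayout;
import java.awt.Component;
import javax.swing.BoxLayout;
import javax.swing.JLabel;
import javax.swing.JPanel;

/** Hypothetical helper that stacks one labelled row per sequence view. */
class MultiSequenceLayout {

    /**
     * Builds a container with one row per view, top to bottom.
     * Each row shows the sequence name on the left and its view filling the rest.
     */
    static JPanel stack(String[] names, Component[] views) {
        if (names.length != views.length) {
            throw new IllegalArgumentException("need exactly one name per view");
        }
        JPanel container = new JPanel();
        // BoxLayout with Y_AXIS places the rows vertically in insertion order.
        container.setLayout(new BoxLayout(container, BoxLayout.Y_AXIS));
        for (int i = 0; i < names.length; i++) {
            JPanel row = new JPanel(new BorderLayout());
            row.add(new JLabel(names[i]), BorderLayout.WEST);   // sequence name in front
            row.add(views[i], BorderLayout.CENTER);             // the per-sequence panel
            container.add(row);
        }
        return container;
    }
}
```

In a real application each element of views would be one sequence panel with its own renderers, and the returned container would typically go inside a JScrollPane so that many sequences fit in one window.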

hth,

Keith

--
- Keith James Microarray Facility, Team 65 -
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK -

From matthew_pocock at yahoo.co.uk Fri Feb 27 07:17:11 2004
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Fri Feb 27 07:23:45 2004
Subject: [Biojava-l] ask help in biojava GUI
In-Reply-To:
References: <200402270652.i1R6qU4q010289@mail.cbi.pku.edu.cn>
Message-ID: <403F3547.6030401@yahoo.co.uk>

Alternatively, if you have a self-hate complex, you can pack all your sequences into an alignment (with gaps in the right places) and use the alignment renderer to render all of them in a single display. It really depends on whether you want each sequence to be lined up perfectly with every other one.

Matthew

From orion2480 at hotmail.com Fri Feb 27 15:39:21 2004
From: orion2480 at hotmail.com (Orion Hunter)
Date: Fri Feb 27 15:45:23 2004
Subject: [Biojava-l] Intsall Problems
Message-ID:

I am trying to install biojava. I d/l biojava1.3.1, unzipped and untarred it. I installed ANT, but I am confused. The instructions say to execute ANT from the "biojava-live" directory, but I cannot find this directory, nor any build.xml files anywhere.

So, then I tried downloading the jar files. I downloaded all three required files (biojava.jar, bytecode.jar, and xerces.jar) and added them to my classpath (and have confirmed that the classpath is correct and is remembered by the machine in my env).

So, with a set of jars, I decided to try out the demo.
Following the instructions on the website (in the "getting started" section), I cd to demos/ and type the following:

/home/user/biojava/>javac seq/TestEmbl.java

I get a whole slew of errors, all of which look like they come from javac not being able to find the packages under org.biojava.bio, etc. Now, I can see this package structure in

/home/user/biojava/main/org/biojava/bio, etc.

And I was trying to execute from /home/user/biojava/demos

So, why can't it find the packages? I'm not new to java, but it's been a long time since I've had to deal with packages, and I can't recall how to make biojava see where the packages are located. Any suggestions?

From orion2480 at hotmail.com Fri Feb 27 16:11:47 2004
From: orion2480 at hotmail.com (Orion Hunter)
Date: Fri Feb 27 16:17:46 2004
Subject: [Biojava-l] Intsall Problems
Message-ID:

In case it makes a difference, I forgot to mention that I am using j2sdk1.5.0beta on an Intel based RedHat 9.1 machine.

Matt

p.s. Sorry for the "intsall" misspelling

From orion2480 at hotmail.com Fri Feb 27 16:36:25 2004
From: orion2480 at hotmail.com (Orion Hunter)
Date: Fri Feb 27 16:42:24 2004
Subject: [Biojava-l] Intsall Problems Update
Message-ID:

So, I added /home/user/biojava/main to my classpath, since org.biojava.* sits in this directory (maybe this should be mentioned in the getting started page).
I got rid of all my compile errors, but am encountering runtime errors now:

/home/user/biojava/demos>$ java seq.TestEmbl seq/AL121903.embl
java.lang.NoClassDefFoundError: org/biojava/bio/symbol/SimpleCrossProductAlphabet
        at org.biojava.bio.seq.DNATools.<clinit>(DNATools.java:54)
rethrown as org.biojava.bio.BioError: Unable to initialize DNATools
        at org.biojava.bio.seq.DNATools.<clinit>(DNATools.java:85)
        at seq.TestEmbl.main(TestEmbl.java:22)

If I go into the directory of this package, there is no .class file for this class (which is of course why I got this error). However, I cannot figure out why there is no .class file. I didn't get any compile errors, so I would have thought this class would have been compiled. Any suggestions?

From smh1008 at cus.cam.ac.uk Sat Feb 28 07:50:25 2004
From: smh1008 at cus.cam.ac.uk (David Huen)
Date: Sat Feb 28 07:56:23 2004
Subject: [Biojava-l] Intsall Problems Update
In-Reply-To:
References:
Message-ID: <200402281250.25457.smh1008@cus.cam.ac.uk>

On Friday 27 Feb 2004 9:36 pm, Orion Hunter wrote:
> So, I added
>
> /home/user/biojava/main to my classpath, since org.biojava.* sits in this
> directory (maybe this should be mentioned in the getting started page).
> I got rid of all my compile errors, but am encountering runtime errors
> now:

[...]

OK, I will go through the steps necessary to compile the source and install the jars that result from the compile.

1) install ant
You mentioned you have done this. Do ensure that the ant bin/ directory is in PATH so that typing "ant" will execute ant.

2) compile the BJ source tree.
cd
ant

3) setting up the CLASSPATH
Java relies on the CLASSPATH to find the classes it needs. You have to set the CLASSPATH to include the jars that biojava depends on as well as the jar in which biojava resides.

Running ant creates the biojava.jar in the biojava-live/ant-build directory. If you wish to use that jar, you must put it into your CLASSPATH. I usually set up my CLASSPATH in .bash_profile at login.
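
One way to check what the CLASSPATH advice above amounts to at runtime is a small diagnostic like the following. It is a sketch rather than anything shipped with biojava: the class name ClasspathCheck and its isLoadable helper are invented for the example, and the default class name is simply the one from the stack trace earlier in the thread. It uses only standard JDK calls (System.getProperty and Class.forName).

```java
import java.io.File;

// Hypothetical diagnostic: lists classpath entries and tests whether a class can be found.
class ClasspathCheck {

    // True if the named class can be located from the current classpath.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static initialisers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError and similar problems while loading the class.
            return false;
        }
    }

    public static void main(String[] args) {
        // The effective classpath of this JVM, one entry per jar or directory.
        String[] entries = System.getProperty("java.class.path").split(File.pathSeparator);
        for (int i = 0; i < entries.length; i++) {
            File f = new File(entries[i]);
            System.out.println((f.exists() ? "found    " : "MISSING  ") + entries[i]);
        }
        // Class to probe; defaults to the one named in the error above.
        String name = args.length > 0
                ? args[0]
                : "org.biojava.bio.symbol.SimpleCrossProductAlphabet";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Run it with the same CLASSPATH as the failing program. An entry reported as MISSING, or a biojava class reported as not loadable, suggests the jar or directory that should provide it is not where the CLASSPATH says it is.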

If you are just d/ling the precompiled biojava-1.3.jar, you still need to put it into your CLASSPATH or java will not find the classes it wants and will give you the error messages you are reading.

At this stage, things should just work.

Regards,
David Huen

From sacoca at MCB.McGill.CA Sat Feb 28 12:08:22 2004
From: sacoca at MCB.McGill.CA (sacoca@MCB.McGill.CA)
Date: Sat Feb 28 12:14:19 2004
Subject: [Biojava-l] Hard Times using File Inputs for HMM Package
Message-ID: <3414.24.203.206.173.1077988102.squirrel@mail.MCB.McGill.CA>

Hey all,

I built a Markov model using the Biojava package and am having an incredibly hard time using it on sequences that I have stored in a FASTA-format file. The problem is that I specified my own SimpleAlphabet for protein sequences, using the one-letter amino acid code, much like the dishonest casino example on the tutorial page for dynamic programming, and each time I try reading a sequence all I get is:

org.biojava.bio.symbol.IllegalSymbolException: Symbol G not found in alphabet ProtAlphabet
        at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
        at org.biojava.bio.symbol.LinearAlphabetIndex.indexForSymbol(LinearAlphabetIndex.java:117)
        at org.biojava.bio.dist.SimpleDistribution.getWeightImpl(SimpleDistribution.java:131)
        at org.biojava.bio.dist.AbstractDistribution.getWeight(AbstractDistribution.java:197)
        at org.biojava.bio.dp.ScoreType$Probability.calculateScore(ScoreType.java:48)
        at org.biojava.bio.dp.onehead.SingleDP.getEmission(SingleDP.java:100)
        at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:553)
        at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:488)

I've tried building a parser with CharacterTokenization, such as

Parser = new CharacterTokenization(ProtAlphabet,false)

and then binding each symbol to the proper character

for(int i=0; i

Thanks for the response. I got everything working, but it took a little tweaking.

First off, I never found the biojava-live directory. I know I had ANT installed correctly, because it worked from the command line... but I never found the biojava-live directory nor the build.xml that ant looks for. So, I d/l the jars themselves. However, in order to get everything to work, I had to include the location of the packages in my classpath (basically, I had to include /home/user/biojava/main, because this is the root of the package structure), in addition to the normal classpath entries for the JARs.

The second thing I had to do was go in and individually compile many of the .java files in the packages. Javac'ing the demo files for whatever reason did not do this (and I tried several times, recompiling the demo files to see if that would produce the necessary class files from the packages).

Anyway, in the end I got it to work. Just took a little extra work.

Matt

From wux at mail.cbi.pku.edu.cn Sun Feb 29 03:30:19 2004
From: wux at mail.cbi.pku.edu.cn (wux@mail.cbi.pku.edu.cn)
Date: Sun Feb 29 03:39:44 2004
Subject: [Biojava-l] Bug in biojava GUI?
Message-ID: <200402290833.i1T8XF4q003989@mail.cbi.pku.edu.cn>

Dear all:

When I add a TitledBorder to a SequencePanel, the bottom line in the MultiLineRenderer disappears. If I change the border to a LineBorder, all the lines are drawn inside the frame. I am not sure whether this is a biojava problem or a Java one.

For example:

---- Title Border --------------------------
| ----->       ----->                      |
|        ----->                            |
|                <----                     |
| acgtttttttttaaatttttttttttttttttttttttt  |
--------------------------------------------
  ----5------10------15--------------------    (this line disappears!)

After changing the border of the SequencePanel:

--------------------------------------------
| ----->       ----->                      |
|        ----->                            |
|                <----                     |
| acgtttttttttaaatttttttttttttttttttttttt  |
| ----5------10------15------------------- |   (this line is drawn correctly inside the frame!)
--------------------------------------------

Has anyone else met the same problem?

Yours faithfully,
wux
wux@mail.cbi.pku.edu.cn
2004-02-29

*****************************************************
WuXin
Ph.D student of CBI (Center of Bioinformatics)
Peking University 100871 P.R.China
Email: wux@mail.cbi.pku.edu.cn
Tel: 010-62762409 (dorm) 010-62755206 (office)
Address: Building 47#2026 Peking University
*****************************************************

From mark.schreiber at group.novartis.com Sun Feb 29 20:20:21 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Sun Feb 29 20:23:57 2004
Subject: [Biojava-l] Hard Times using File Inputs for HMM Package
Message-ID:

Hi -

Possible guesses about what might be wrong:

1) You haven't created Symbols for your Alphabet
2) You haven't added said Symbols to your Alphabet

This page http://www.biojava.org/docs/bj_in_anger/customAlpha.htm shows how to make a custom Alphabet. It may be useful.

Hope this helps,

- Mark

ps Just wondering, why do you need a custom Alphabet for Protein??? There is a perfectly good one in ProteinTools.getAlphabet().
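
The second guess above (symbols created but never added to the alphabet) can be illustrated with a toy model. To be clear about the assumptions: this is not biojava code, ToyAlphabet and Sym are invented names, and the only thing borrowed from the thread is the wording of the error message. The sketch shows how a character can be bound to a symbol for parsing while validation still fails, because membership in the alphabet is a separate step.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only (hypothetical names): an alphabet is a set of symbol objects
// plus a separate character-to-symbol binding used when parsing text.
class ToyAlphabet {

    // Sym does not override equals/hashCode, so two symbols are equal only if
    // they are the same object, whatever their names are.
    static final class Sym {
        final String name;
        Sym(String name) { this.name = name; }
    }

    private final String name;
    private final Set<Sym> symbols = new HashSet<Sym>();
    private final Map<Character, Sym> tokens = new HashMap<Character, Sym>();

    ToyAlphabet(String name) { this.name = name; }

    // Make the symbol a member of the alphabet.
    void add(Sym s) { symbols.add(s); }

    // Bind a character to a symbol for parsing; this does NOT add the symbol.
    void bind(char c, Sym s) { tokens.put(Character.valueOf(c), s); }

    // Look up the symbol bound to a character.
    Sym parse(char c) {
        Sym s = tokens.get(Character.valueOf(c));
        if (s == null) {
            throw new IllegalArgumentException("No symbol bound to '" + c + "'");
        }
        return s;
    }

    // Reject symbols that are not members of this alphabet.
    void validate(Sym s) {
        if (!symbols.contains(s)) {
            throw new IllegalArgumentException(
                    "Symbol " + s.name + " not found in alphabet " + name);
        }
    }

    public static void main(String[] args) {
        ToyAlphabet prot = new ToyAlphabet("ProtAlphabet");
        Sym gly = new Sym("G");
        prot.bind('G', gly);            // bound for parsing, but never added
        try {
            prot.validate(prot.parse('G'));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        prot.add(gly);                  // the missing step
        prot.validate(prot.parse('G')); // passes now
        System.out.println("G accepted after add()");
    }
}
```

In the toy model a second Sym created with the name "G" would also be rejected, since it is a different object from the one that was added. Whether the same reasoning explains the IllegalSymbolException in the thread depends on how the custom alphabet and the sequences were actually built, which the posted message does not show.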

sacoca@mcb.mcgill.ca
Sent by: biojava-l-bounces@portal.open-bio.org
02/29/2004 01:08 AM
To: biojava-l@biojava.org
cc:
Subject: [Biojava-l] Hard Times using File Inputs for HMM Package

Hey all,

I built a markov model using the Biojava package and am having an incredibly hard time using it on sequences that I have stored in fasta format on a file. The problem is that I specified my own SimpleAlphabet, for protein sequences using the one letter amino acid code much like the dishonest casino example that you have on the tutorial page for dynamic programming, and each time I try reading the sequence all I get is :

org.biojava.bio.symbol.IllegalSymbolException: Symbol G not found in alphabet ProtAlphabet
        at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
        at org.biojava.bio.symbol.LinearAlphabetIndex.indexForSymbol(LinearAlphabetIndex.java:117)
        at org.biojava.bio.dist.SimpleDistribution.getWeightImpl(SimpleDistribution.java:131)
        at org.biojava.bio.dist.AbstractDistribution.getWeight(AbstractDistribution.java:197)
        at org.biojava.bio.dp.ScoreType$Probability.calculateScore(ScoreType.java:48)
        at org.biojava.bio.dp.onehead.SingleDP.getEmission(SingleDP.java:100)
        at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:553)
        at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:488)

I've tried building a parser with CharacterTokenization such as

Parser = new CharacterTokenization(ProtAlphabet,false)

and then binding each symbol to the proper character

for(int i=0; i

Hi -

The biojava-live directory and the build.xml file are specific to the CVS distribution, which is the development version. If you are happy to use biojava 1.3.1 (or earlier) you just need the precompiled jars on your classpath. You don't really need the source files, but it is nice to have them so you can look at the code.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
1 Science Park Road #04-14 The Capricorn
Singapore 117528
phone +65 6722 2973
fax +65 6722 2910