From felipe.albrecht at gmail.com Fri Aug 5 17:05:14 2005
From: felipe.albrecht at gmail.com (Felipe Albrecht)
Date: Fri Aug 5 16:55:14 2005
Subject: [Biojava-l] Sequence and String
Message-ID:

Hello,

When I call "sequence.seqString()" [sequence being an instance of SimpleSequence], do I get a _new_ String representing the sequence, or a _reference_ to a String held in my Sequence object?

This is important for me because, if I have a large sequence, I will waste memory copying it every time I need its String representation.

Thanks.

Felipe Albrecht

From hollandr at gis.a-star.edu.sg Sat Aug 6 00:49:09 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sat Aug 6 00:40:40 2005
Subject: [Biojava-l] Sequence and String
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601E871A7@BIONIC.biopolis.one-north.com>

Internally the Sequence is stored not as a String but as an array of Symbol objects. Hence when you call seqString(), a new String is generated every time by appending the results of calling toString() on every Symbol object in the internal array.

cheers,
Richard

_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From mark.schreiber at novartis.com Tue Aug 9 22:49:17 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Aug 9 22:39:24 2005
Subject: [Biojava-l] Sequence and String
Message-ID:

Actually a class called a SymbolTokenizer governs the transition from Symbols to Strings. The String is assembled in a StringBuffer but ultimately a new String is created with each call to seqString().

The method seqString() is really only meant for displaying the sequence in a readable way to STDOUT or similar. It shouldn't be used in algorithms or general programming.

- Mark

From mark.schreiber at novartis.com Thu Aug 11 01:19:07 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Thu Aug 11 01:09:20 2005
Subject: [Biojava-l] biojava and matlab
Message-ID:

Hello all -

I stumbled across this today; it might be of interest to people who use matlab and want to call biojava functions from it.

http://www.mathworks.com/company/newsletters/digest/2005/july/integrate_matlab.html

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com
phone +65 6722 2973
fax +65 6722 2910

From felipe.albrecht at gmail.com Thu Aug 11 21:07:12 2005
From: felipe.albrecht at gmail.com (Felipe Albrecht)
Date: Thu Aug 11 20:57:01 2005
Subject: [Biojava-l] Compress Sequences.
Message-ID:

Is there a class in biojava that compresses sequences, for example by packing four nucleotides into a single byte?

If not, does anyone know a good algorithm for compressing, reading and comparing such sequences?

Thanks.

Felipe Albrecht

From mark.schreiber at novartis.com Fri Aug 12 02:45:51 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Fri Aug 12 02:36:18 2005
Subject: [Biojava-l] Compress Sequences.
Message-ID:

Check out PackedSymbolList and the associated classes and interfaces PackedSymbolListFactory, Packing, and PackingFactory. These do bit packing of sequences. The nice part with these is that they behave exactly like normal SymbolLists, so you don't even know you're dealing with a compressed sequence.

From the Javadocs:
Example Usage

    SymbolList symL = ...;
    SymbolList packed = new PackedSymbolList(
        PackingFactory.getPacking(symL.getAlphabet(), true),
        symL
    );

It is also relatively trivial to write a Huffman tree generator that can compress SymbolLists as a binary string. You could use this as the basis for full LZ compression. There are also much more complicated published algorithms that look for long-range repeats; these are also very slow.

- Mark

From rohdester at gmail.com Sat Aug 13 09:17:47 2005
From: rohdester at gmail.com (Jacob Rohde)
Date: Sat Aug 13 09:08:31 2005
Subject: [Biojava-l] Reverse transcription
Message-ID:

Hi list,

I've perused the docs, but I can't seem to find a reverseTranscribe method in there. What would be the easiest way to reverse transcribe some RNA to DNA in BioJava?

Thanks,
Jacob

--
You can be my wingman any time.

From mark.schreiber at novartis.com Sun Aug 14 20:57:15 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Aug 14 20:47:20 2005
Subject: [Biojava-l] Reverse transcription
Message-ID:

Take a look at http://www.biojava.org/docs/bj_in_anger/ReverseComplement.htm

From mark.schreiber at novartis.com Sun Aug 14 21:12:25 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Aug 14 21:02:19 2005
Subject: [Biojava-l] Reverse transcription
Message-ID:

See also DNATools.transcribeToRNA(); especially read the javadocs and contrast that with DNATools.toRNA().

From rohdester at gmail.com Mon Aug 15 04:26:37 2005
From: rohdester at gmail.com (Jacob Rohde)
Date: Mon Aug 15 04:19:09 2005
Subject: [Biojava-l] Reverse transcription
In-Reply-To:
References:
Message-ID:

Hi,

On 8/15/05, mark.schreiber@novartis.com wrote:
> Take a look at http://www.biojava.org/docs/bj_in_anger/ReverseComplement.htm

Thanks for your replies, Mark.

Yes, I've read about reverseComplement. It may be that I'm dense here, but I can't see how it can help me go from RNA to DNA. The way I understand reverseComplement, it gives the complementary strand of the argument in the reverse order (5'-3'), but in the same alphabet. I need to go from RNA to DNA. Again, I might be overlooking something?!
On 8/15/05, mark.schreiber@novartis.com wrote:
> See also DNATools.transcribeToRNA(), especially read the javadocs and
> contrast that with DNATools.toRNA().

Yes, I can see the difference. transcribeToRNA() assumes the argument is the template strand in the 5'-3' direction, whereas toRNA() takes the coding strand as argument (it simply translates the alphabet without any consideration of an actual transcription event).

/Jacob

From mark.schreiber at novartis.com Mon Aug 15 05:27:10 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Aug 15 05:18:01 2005
Subject: [Biojava-l] Reverse transcription
Message-ID:

Surprisingly there is no method for reverse transcription in RNATools; I might add one. It does however give a nice opportunity to see how biojava translates symbols from one Alphabet to another. You would do it like this:

    // needs: import org.biojava.bio.seq.RNATools;
    //        import org.biojava.bio.symbol.*;
    public static SymbolList rt(SymbolList rna)
            throws IllegalSymbolException, IllegalAlphabetException {
        // table that maps DNA -> RNA, and RNA -> DNA via untranslate()
        ReversibleTranslationTable rtt = RNATools.transcriptionTable();
        Symbol[] syms = new Symbol[rna.length()];

        // reverse RNA
        rna = SymbolListViews.reverse(rna);

        // SymbolLists are indexed from 1, the array from 0
        for (int i = 1; i <= rna.length(); i++) {
            syms[i - 1] = rtt.untranslate(rna.symbolAt(i));
        }

        SymbolListFactory fact = new SimpleSymbolListFactory();
        SymbolList dna = fact.makeSymbolList(syms, syms.length, rtt.getSourceAlphabet());
        System.out.println(dna.toString());
        return dna;
    }

The key interface is the ReversibleTranslationTable. TranslationTable (the parent interface) has a method to translate a Symbol from one Alphabet to another. A ReversibleTranslationTable extends this by providing a second method to untranslate. The RNATools.transcriptionTable maps DNA to RNA. The untranslate method does the opposite.

Anyone want to write that up for Biojava in Anger?
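The three things that change during reverse transcription (order, strand, and alphabet) are easier to see on plain strings. The following is only a minimal sketch and deliberately does not use any BioJava classes; the class and method names are made up for illustration. It assumes that a reverse transcription which mirrors transcription from a 5'-3' template strand has to reverse the RNA, complement each base, and write the result in the DNA alphabet.

```java
// Plain-Java illustration only; it does not use BioJava classes.
// All names here are hypothetical and chosen for this sketch.
final class ReverseTranscribeSketch {

    // Complement of one RNA base, written in the DNA alphabet.
    static char rnaToComplementaryDna(char base) {
        switch (Character.toLowerCase(base)) {
            case 'a': return 't';
            case 'u': return 'a';
            case 'g': return 'c';
            case 'c': return 'g';
            default:
                throw new IllegalArgumentException("Not an RNA base: " + base);
        }
    }

    // Complement of one DNA base, written in the RNA alphabet.
    static char dnaToComplementaryRna(char base) {
        switch (Character.toLowerCase(base)) {
            case 'a': return 'u';
            case 't': return 'a';
            case 'g': return 'c';
            case 'c': return 'g';
            default:
                throw new IllegalArgumentException("Not a DNA base: " + base);
        }
    }

    // RNA (5'-3') -> template DNA (5'-3'): reverse, complement, change alphabet.
    static String reverseTranscribe(String rna) {
        StringBuilder dna = new StringBuilder(rna.length());
        for (int i = rna.length() - 1; i >= 0; i--) {
            dna.append(rnaToComplementaryDna(rna.charAt(i)));
        }
        return dna.toString();
    }

    // Template DNA (5'-3') -> RNA (5'-3'): the inverse of reverseTranscribe.
    static String transcribe(String templateDna) {
        StringBuilder rna = new StringBuilder(templateDna.length());
        for (int i = templateDna.length() - 1; i >= 0; i--) {
            rna.append(dnaToComplementaryRna(templateDna.charAt(i)));
        }
        return rna.toString();
    }

    public static void main(String[] args) {
        String rna = "augc";
        String dna = reverseTranscribe(rna);
        System.out.println(dna);             // gcat
        System.out.println(transcribe(dna)); // augc
    }
}
```

Leaving out the complement step gives a sequence that is reversed and in the DNA alphabet but still on the same strand, so passing it back through a template-strand transcription would not return the original RNA.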
- Mark

From rohdester at gmail.com Mon Aug 15 09:56:06 2005
From: rohdester at gmail.com (Jacob Rohde)
Date: Mon Aug 15 09:46:15 2005
Subject: [Biojava-l] Reverse transcription
In-Reply-To:
References:
Message-ID:

Hi again,

On 8/15/05, mark.schreiber@novartis.com wrote:
> Surprisingly there is no method for reverse transcription in RNATools; I
> might add one. It does however give a nice opportunity to see how biojava
> translates symbols from one Alphabet to another.
>
> [SNIP code]

Hey, thanks for that code snippet. It does the trick. And it's nice to see some low-level alphabet translation.

> Anyone want to write that up for Biojava in Anger?

I just want to note that if somebody writes it up for the cookbook, the code snippet by Mark isn't symmetric with transcribeToRNA(). I put in a DNATools.complement() to make it so.

Thanks a lot for all the help :)

From mark.schreiber at novartis.com Mon Aug 15 21:01:24 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Aug 16 00:24:11 2005
Subject: [Biojava-l] Reverse transcription
Message-ID:

> I just want to note that if somebody writes it up for the cookbook, the
> code snippet by Mark isn't symmetric with transcribeToRNA().
> I put in a DNATools.complement() to make it so.

Changing strand, alphabet and polarity all in one go wrecks my head.

From mark.schreiber at novartis.com Tue Aug 16 03:43:05 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Aug 16 04:08:01 2005
Subject: [Biojava-l] Gibbs sampling demo
Message-ID:

Hello -

I have put an example of how you can build a Gibbs sampler using biojava on the web at http://www.biojava.org/docs/bj_in_anger/gibbs.html

It's pretty simple to do and makes a lot of funky use of the Distribution class.

Enjoy!

Mark Schreiber
Principal Scientist (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com
phone +65 6722 2973
fax +65 6722 2910

From great_fred at yahoo.com Wed Aug 17 05:13:48 2005
From: great_fred at yahoo.com (Sébastien PETIT)
Date: Wed Aug 17 05:12:42 2005
Subject: [Biojava-l] Construct a phylogenetic tree
Message-ID: <20050817091349.10857.qmail@web32202.mail.mud.yahoo.com>

Hello,

My question is simple: is there a tool which can draw a phylogenetic tree?
The phylogenetic tree is known and it is a tree with brackets, like this:

(
(
Q6S6K2_PIG_Insulin-like_growth:0.04624,
Q9GJV5_BOVIN_Insulin-like_grow:0.04655)
:0.03391,
Q6P1M6_HUMAN_Insulin-like_grow:0.08006,
(
Q6PE62_MOUSE_Igfbp3_protein__I:0.01882,
IBP3_RAT_Insulin-like_growth_f:0.02570)
:0.07527);

Thanks for any suggestion...

Sebastien

From jan.wuerthner at uni-duesseldorf.de Wed Aug 17 07:18:03 2005
From: jan.wuerthner at uni-duesseldorf.de (Jan Würthner)
Date: Wed Aug 17 07:18:08 2005
Subject: [Biojava-l] Construct a phylogenetic tree
In-Reply-To: <20050817091349.10857.qmail@web32202.mail.mud.yahoo.com>
References: <20050817091349.10857.qmail@web32202.mail.mud.yahoo.com>
Message-ID: <200508171318.03302.jan.wuerthner@uni-duesseldorf.de>

Hi Sebastien,

I am using the phylip package:
http://evolution.genetics.washington.edu/phylip.html
It has nothing to do with BioJava though.

kind regards
Jan

--
Jan Würthner
Institute for Medical Microbiology
Building 22.21
Heinrich-Heine-University
Universitätsstraße 1
40225 Duesseldorf
Tel. +49 (0) 211 81 12461
URL: www.medmikro.uni-duesseldorf.de

**************************************************
The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies.
** eSafe scanned this email for viruses, vandals and malicious content. **
**************************************************

From koeberle at mpiib-berlin.mpg.de Wed Aug 17 12:04:17 2005
From: koeberle at mpiib-berlin.mpg.de (Christian Köberle)
Date: Wed Aug 17 11:53:57 2005
Subject: [Biojava-l] RNA alignment
Message-ID: <43036001.9060105@mpiib-berlin.mpg.de>

Hi,

1. Is there an implementation of local sequence alignment in BioJava?
2. If yes, how can I use it?
3. And how can I take wobble base pairs into account between the first RNA sequence and the reverse complement of the second RNA sequence?

thanks
Christian

----------------------------------------
Christian Köberle
Max Planck Institut for Infection Biology
Department: Immunology
Schumannstr. 21/22
10117 Berlin
Tel: +49 30 28 460 562
e-mail: koeberle@mpiib-berlin.mpg.de

From mark.schreiber at novartis.com Wed Aug 17 21:07:07 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Aug 17 20:57:05 2005
Subject: [Biojava-l] Construct a phylogenetic tree
Message-ID:

Hello -

If you're on Windows you can use a program called TreeView.
There is also a Java package called PAL that can draw and manipulate trees and do lots of phylogenetic calculations. Not sure of the web site. Try Google.

- Mark

From mark.schreiber at novartis.com Wed Aug 17 21:42:46 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Aug 17 21:35:19 2005
Subject: [Biojava-l] RNA alignment
Message-ID:

Hello -

This example explains how you can do a pairwise alignment with a hidden Markov model.

There is no specific implementation of a local alignment (e.g. Smith-Waterman). I've been thinking about it but I haven't gotten around to it.

- Mark

From felipe.albrecht at gmail.com Thu Aug 18 12:48:11 2005
From: felipe.albrecht at gmail.com (Felipe Albrecht)
Date: Thu Aug 18 12:38:06 2005
Subject: [Biojava-l] Add Symbols in a Sequence
Message-ID:

Hello.

How do I create a Sequence to which I can add new Symbols and Gaps?

For example, I have a matrix of values; I take the values from it and then add the symbol or gap corresponding to each value (as in dynamic programming).
Thanks once more.

Felipe Albrecht

From YZhao at tigr.ORG Thu Aug 18 15:37:15 2005
From: YZhao at tigr.ORG (Zhao, Yongmei)
Date: Thu Aug 18 15:27:30 2005
Subject: [Biojava-l] ZTR trace file parser?
Message-ID: <7F0AA25C9B1706448E8B73DFDD109B31EADF6C@EXCHANGE.TIGR.ORG>

Hi,

I am wondering if there is an API in biojava for reading ZTR format trace files? Is there any plan to add one in the near future?

Thanks,
Yong

From reneehalbrook74 at yahoo.com Thu Aug 18 17:51:27 2005
From: reneehalbrook74 at yahoo.com (Renee Halbrook)
Date: Thu Aug 18 17:41:12 2005
Subject: [Biojava-l] Blast xml parser question -- new to Biojava
Message-ID: <20050818215127.48991.qmail@web40511.mail.yahoo.com>

Hi,

I am pretty new to Biojava. I am trying to use the blast xml parser. (Specifically, I am using org.biojava.bio.program.sax.blastxml.BlastParserFacade in biojava v1.4.)

Is it possible to get the accession number from blast output? (equivalent to in results.xml from blast results)

Is there another class that I should be using instead, one that provides a more complete mapping of the xml file?

Is there a resource with advice on how to extend this class to give more specific results?

Thanks in advance for any help on this.

Regards,
Renee Halbrook

From hollandr at gis.a-star.edu.sg Thu Aug 18 21:44:25 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Thu Aug 18 21:35:44 2005
Subject: [Biojava-l] Add Symbols in a Sequence
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601FE503C@BIONIC.biopolis.one-north.com>

SymbolLists are generally immutable. To edit one you have to apply Edits to it. See http://biojava.org/docs/bj_in_anger/edit.htm for details.

cheers,
Richard.

Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its content to any other person. Thank you.
---------------------------------------------

From hollandr at gis.a-star.edu.sg Thu Aug 18 21:46:54 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Thu Aug 18 21:37:50 2005
Subject: [Biojava-l] Blast xml parser question -- new to Biojava
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601FE503E@BIONIC.biopolis.one-north.com>

See http://biojava.org/docs/bj_in_anger/BlastParser.htm

Currently only a partial mapping is returned by the default parser. To get full results you have to write your own SearchContentHandler to give to the blast parser, an outline of which is given at http://biojava.org/docs/bj_in_anger/blastecho.htm

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From hollandr at gis.a-star.edu.sg Thu Aug 18 21:52:16 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Thu Aug 18 21:43:10 2005
Subject: [Biojava-l] ZTR trace file parser?
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601FE503F@BIONIC.biopolis.one-north.com>

BioJava doesn't currently have a parser for ZTR. If you'd like to write one and contribute it, please do so! You might like to copy the code used by the ABI trace file parser and work from there, assuming the formats are relatively similar (I know nothing about ZTR so I could be wildly wrong). The ABI parser can be found at org.biojava.bio.program.abi.ABIFParser.
cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From mark.schreiber at novartis.com Thu Aug 18 20:16:28 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Fri Aug 19 00:05:55 2005
Subject: [Biojava-l] ZTR trace file parser?
Message-ID:

AFAIK only the SCF and ABI formats are supported. If someone wants to add ZTR that would be great.

- Mark

From carl.manaster at gmail.com Sun Aug 21 16:30:32 2005
From: carl.manaster at gmail.com (Carl Manaster)
Date: Sun Aug 21 16:20:10 2005
Subject: [Biojava-l] Newbie Installation Question
Message-ID:

Hi,

I'm new to Java, new to biojava.
I tried to install biojava, but instead of .jar files winding up in my Windows directory, I get .zip files. When I expand them, I get folders META-INF and org, which I suspect are really supposed to be in the .jar file. I tried renaming .zip to .jar, but Eclipse wasn't happy with it. Could someone please pass along really simple instructions? Sorry for the bother.

Peace,
--Carl Manaster
manaster@pobox.com

--
http://undisclosed-recipients.blogspot.com
http://www.flickr.com/photos/carlmanaster/sets/228603/

From hollandr at gis.a-star.edu.sg Sun Aug 21 21:38:15 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sun Aug 21 21:29:09 2005
Subject: [Biojava-l] Newbie Installation Question
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601FE512B@BIONIC.biopolis.one-north.com>

You just have to download the jar files from http://www.biojava.org/download14.html . There's no such thing as 'installing' it really - just download the jar files, then make sure they're on your classpath (or set them up as a library in your IDE).

If Internet Explorer is renaming or doing stuff behind the scenes to the jar files when you download them, try using Firefox (I haven't used IE for so long I can't remember how to fix that particular problem).

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From mark.schreiber at novartis.com Sun Aug 21 22:01:48 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Aug 21 21:51:25 2005
Subject: [Biojava-l] Newbie Installation Question
Message-ID:

If you are using WinZip on Windows, the jar icon looks the same as a zip file's. This is because WinZip can read JAR files, not because they are zip files. You never need to unzip a jar file (even if it is actually zipped).

The source and docs files are zipped (usually .tar.gz); many modern IDEs don't require these to be unzipped either. If you are not using an IDE you will need to unzip them so you can view them.

- Mark

From carl.manaster at gmail.com Sun Aug 21 23:02:47 2005
From: carl.manaster at gmail.com (Carl Manaster)
Date: Sun Aug 21 22:52:16 2005
Subject: [Biojava-l] Newbie Installation Question
In-Reply-To:
References:
Message-ID:

Thank you, Mark and Richard -

Yes, it's IE, and I think it's renaming them as I download. I can force the new name to be .jar, and Eclipse is now happy.
I think another problem was involved when I earlier tried renaming to .jar; too many things too new all at once. It's up and running now; thanks!

Peace,
--Carl

--
http://undisclosed-recipients.blogspot.com
http://www.flickr.com/photos/carlmanaster/sets/228603/

From dominique.vlieghe at dmbr.ugent.be Mon Aug 22 02:51:19 2005
From: dominique.vlieghe at dmbr.ugent.be (Dominique Vlieghe)
Date: Mon Aug 22 02:40:58 2005
Subject: [Biojava-l] parse (recent) blast output
Message-ID: <430975E7.2010904@dmbr.ugent.be>

Hello fellow biojava'ers

I need, as does every bioinformatician who uses Java, a BLAST parser. I tried biojava's blast parser (from the cookbook), but recent BLAST outputs (2.2.10-11) are not supported. When I use the lazy method, I get a NullPointerException.

So my questions are:
1) Has any of you already succeeded in parsing these recent outputs?
2) Does anyone know why these outputs give me the exception? Has the output changed that much going from 2.2.3 to 2.2.10?
3) Would the software tweaks be difficult to implement?

I have seen in the mailing list archives that some time ago there was a call to centralise the biojava parseblast enhancement efforts. What is the status on that?

I would like to contribute to (re)writing a blast parser, but only if it would serve the general community, so the biojava route would be preferred. But since I have only just outgrown the Java newbie status and I'm totally new to biojava, any advice would be appreciated.

Cheers,
Dominique

==========
Exception in thread "main" java.lang.NullPointerException
        at org.biojava.bio.program.sax.BlastSAXParser.interpret(BlastSAXParser.java:215)
        at org.biojava.bio.program.sax.BlastSAXParser.parse(BlastSAXParser.java:164)
        at org.biojava.bio.program.sax.BlastLikeSAXParser.onNewDataSet(BlastLikeSAXParser.java:311)
        at org.biojava.bio.program.sax.BlastLikeSAXParser.interpret(BlastLikeSAXParser.java:274)
        at org.biojava.bio.program.sax.BlastLikeSAXParser.parse(BlastLikeSAXParser.java:160)
        at BlastParser.main(BlastParser.java:44)

--
------------------------------------------------------------------------
Dominique Vlieghe, Ph.D., Bioinformatics Core,
Department for Molecular Biomedical Research (DMBR)
VIB - Ghent University
Technologiepark 927
B-9052 Ghent (Zwijnaarde), Belgium
Tel : +32-(0)9-33-13.693    email: dominique.vlieghe@dmbr.ugent.be
Fax : +32-(0)9-33-13.609    www: http://bioit.dmbr.ugent.be/
------------------------------------------------------------------------

From mark.schreiber at novartis.com Mon Aug 22 04:32:09 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Aug 22 04:22:45 2005
Subject: [Biojava-l] parse (recent) blast output
Message-ID:

Hello Dominique

It shouldn't be too hard to track down the problem. You might find the following program helpful in checking what the parser is up to when the error occurs (http://www.biojava.org/docs/bj_in_anger/blastecho.htm). Using this along with stack traces and possibly a debugger you should be able to find out what is going wrong.
- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From hollandr at gis.a-star.edu.sg Wed Aug 24 04:45:02 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Aug 24 04:39:01 2005
Subject: [Biojava-l] RE: [BioSQL-l] Special cases of protein data
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56021E40B6@BIONIC.biopolis.one-north.com>

I've come across this same problem. The source features only relate to the location they specify. The sequence itself is always defined as coming from a single organism, further up in the headers of the file under the SOURCE/ORGANISM pairing. That organism is the one that should be referenced from bioentry.

However, it does not help us much in BioSQL. The SOURCE/ORGANISM field only describes in text the organism. It doesn't provide an NCBI Taxon ID.
So, we can't auto-generate missing organisms in the NCBI taxon table, and so we can't use this field to determine the species of the organism (unless we can guarantee the whole of the NCBI taxonomy tree has been preloaded into the database).

The new BioJava Genbank parser we are working on (to be announced soon) uses the taxon ID from the first /db_xref="taxon:..." tag of the first feature as the source organism, assigns the organism name from the SOURCE/ORGANISM headings to this taxon ID, and emits warnings if it finds other taxon IDs further down.

It would be simple enough to change this to depend on a preloaded taxonomy database, but I hate introducing dependencies like that. Would such a required dependency be justified for the sake of correct parsing of multiple sources?

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

> -----Original Message-----
> From: biosql-l-bounces@portal.open-bio.org
> [mailto:biosql-l-bounces@portal.open-bio.org] On Behalf Of
> "Andreas Dräger"
> Sent: Wednesday, August 24, 2005 4:11 PM
> To: biosql-l@open-bio.org
> Subject: [BioSQL-l] Special cases of protein data
>
> Dear BioSQL-developers,
>
> I am currently working with BioSQL using MySQL. I tried to insert a lot of
> protein data downloaded from the NCBI web page in GenPept format. During
> the insertion process (performed by BioJava) I got some error messages.
> Looking at the sequences in detail showed that more than 1000 protein
> sequences had at least two "source" entries in their "FEATURE" table. One
> of these bad examples is given at NCBI by the accession number P76519.
> This one even has four "source" tags. In my opinion this means that each
> of the four given species contains exactly this protein. This would mean
> that at least these one thousand proteins that I found at NCBI belong to
> more than one species. This case cannot be handled by the current BioSQL
> schema because there is a one-to-many relationship between the tables
> bioentry and taxon. To allow the same protein to belong to n taxa we would
> need to create another table to reflect a many-to-many relationship
> between the tables taxon and bioentry. The foreign key constraint of
> bioentry to taxon would have to be removed. The result would be something
> like:
>
> bioentry <--> taxon_bioentry <--> taxon
>
> where taxon_bioentry is the extra table. This is just what I was thinking
> about. However, at the moment I cannot insert files like P76519 into the
> BioSQL database. Or am I wrong and the meaning of more than one "source"
> tag is somehow different?
> I am looking forward to any suggestions.
>
> Yours Andreas Dräger
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l@open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>

From YZhao at tigr.ORG Thu Aug 25 10:29:14 2005
From: YZhao at tigr.ORG (Zhao, Yongmei)
Date: Thu Aug 25 10:18:42 2005
Subject: [Biojava-l] SCF parser problem
Message-ID: <7F0AA25C9B1706448E8B73DFDD109B31EADF7B@EXCHANGE.TIGR.ORG>

Hello,

I parsed a trace file using the SCF parser in the biojava-1.4 release, and the getTrace(AtomicSymbol nuc) method returned erroneous data; evidently the values overflow.
I tested with a few SCF trace files and compared the results (int[]) with the output from the Staden package io_lib: only the first few samples in the array are the same, and the rest of the data from the SCF parser does not make sense. I searched the Biojava-dev archive and saw a discussion of a bug in the SCF parser, which was reported as fixed ahead of the 1.4 release. I am wondering whether the fix was included in the 1.4 release or not?

Thanks,
Yongmei

From ady at sanger.ac.uk Thu Aug 25 10:55:04 2005
From: ady at sanger.ac.uk (Andy Yates)
Date: Thu Aug 25 10:44:36 2005
Subject: [Biojava-l] SCF parser problem
In-Reply-To: <7F0AA25C9B1706448E8B73DFDD109B31EADF7B@EXCHANGE.TIGR.ORG>
References: <7F0AA25C9B1706448E8B73DFDD109B31EADF7B@EXCHANGE.TIGR.ORG>
Message-ID: <430DDBC8.9040406@sanger.ac.uk>

I have also encountered this problem before, and this bug has not been fixed yet. I haven't had time to construct the relevant test packs for the distro. If you need the fix now then I can send you any source code you need.

Andy Y

From douglas.hoen at mail.mcgill.ca Sun Aug 28 12:46:24 2005
From: douglas.hoen at mail.mcgill.ca (Douglas Hoen)
Date: Sun Aug 28 12:35:47 2005
Subject: [Biojava-l] sequence masking?
Message-ID:

Hi,

I want to mask out DNA subsequences, such as repetitive DNA. I have been unable to find any APIs for this. I did find SequenceTools.maskSequence(), but this method masks the region outside an indicated location rather than inside it, and it also uses gaps as the mask symbol, whereas I would like to use N or lowercase. Another related API is the SoftMaskedAlphabet class, which seems useful, but I can't find any utilities that take advantage of it. Any help would be appreciated.

Thanks,
Doug

From mark.schreiber at novartis.com Sun Aug 28 21:56:08 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Aug 28 21:45:59 2005
Subject: [Biojava-l] sequence masking?
Message-ID:

Hello -

There are no specific utilities for dealing with this, but SoftMaskedAlphabet is just a standard biojava alphabet (with some reworking of the internals) and can be used as normal. Hence, this should work (I've not tested this so let me know if it doesn't).

    import org.biojava.bio.seq.DNATools;
    import org.biojava.bio.symbol.FiniteAlphabet;
    import org.biojava.bio.symbol.SimpleSymbolList;
    import org.biojava.bio.symbol.SoftMaskedAlphabet;

    //get a softmasked version of 'DNA'
    FiniteAlphabet alpha = SoftMaskedAlphabet.getInstance(DNATools.getDNA());

    //make a symbol list over that alphabet; getTokenization() takes a
    //tokenization name, and "token" is the one-character-per-symbol form
    //(these calls throw checked BioJava exceptions)
    SimpleSymbolList syms = new SimpleSymbolList(
        alpha.getTokenization("token"),
        "ACCTCGCccccggggccccggggccccggggTTCGA");

    //do stuff
    ...

- Mark

From mark.schreiber at novartis.com Sun Aug 28 21:58:30 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Aug 28 21:47:57 2005
Subject: [Biojava-l] SCF parser problem
Message-ID:

Hello -

I may have missed that patch. Can you send me the source for the corrected SCF parser? I will commit it to CVS so it makes the next version.

Thanks,
- Mark

From andreas.draeger at clever-telefonieren.de Sun Aug 28 10:44:45 2005
From: andreas.draeger at clever-telefonieren.de (Andreas Dräger)
Date: Fri Sep 2 08:03:06 2005
Subject: [Biojava-l] Global Alignment
Message-ID: <4311CDDD.7030802@clever-telefonieren.de>

Hello,

I just implemented the Needleman-Wunsch algorithm and an object for handling substitution matrices like BLOSUM, PAM, GONNET and so on. It parses a matrix file and provides a method to get the cost of changing Symbol A to Symbol B. This is realized by two hashes and a private int[][] matrix. If NeedlemanWunsch gets equal weights for gap opening and gap extension it won't consider affine gap penalties; otherwise it will, which needs three times as much memory.

Andreas Dräger
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SubstitutionMatrix.java
Type: text/x-java
Size: 12573 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20050828/69399ac2/SubstitutionMatrix-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NeedlemanWunsch.java
Type: text/x-java
Size: 14813 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20050828/69399ac2/NeedlemanWunsch-0001.bin

From duze at gmx.de Mon Aug 29 07:00:38 2005
From: duze at gmx.de ("Andreas Dräger")
Date: Fri Sep 2 08:03:11 2005
Subject: [Biojava-l] Global Sequence Alignment
Message-ID: <22283.1125313238@www26.gmx.net>

Hello,

I just implemented the Needleman-Wunsch algorithm and an object for handling substitution matrices like BLOSUM, PAM, GONNET and so on. It parses a matrix file and provides a method to get the cost of changing Symbol A to Symbol B. This is realized by two hashes and a private int[][] matrix. If NeedlemanWunsch gets equal weights for gap opening and gap extension it won't consider affine gap penalties; otherwise it will, which needs three times as much memory.

It would also be good to have an interface so that multiple different implementations of pairwise sequence alignment algorithms can be created and then used easily. I have also attached an interface for that purpose. NeedlemanWunsch already implements it. The HMM algorithm from the cookbook could also implement this interface with only a few changes.

Andreas Dräger
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NeedlemanWunsch.java
Type: text/x-java
Size: 14813 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20050829/614d2755/NeedlemanWunsch-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SubstitutionMatrix.java
Type: text/x-java
Size: 12573 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20050829/614d2755/SubstitutionMatrix-0001.bin

From duze at gmx.de Mon Aug 29 07:03:05 2005
From: duze at gmx.de ("Andreas Dräger")
Date: Fri Sep 2 08:03:12 2005
Subject: [Biojava-l] Global Sequence Alignment
Message-ID: <18453.1125313385@www23.gmx.net>

Here is the interface I forgot to send.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SequenceAlignment.java
Type: text/x-java
Size: 2289 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-l/attachments/20050829/b91dc2df/SequenceAlignment-0001.bin
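
The attached sources were scrubbed from the archive, so what follows is only a minimal sketch of the equal-weights case described in the thread above; it is not the attached NeedlemanWunsch.java. It assumes a linear gap penalty (gap opening and gap extension cost the same), which needs a single score matrix; the affine case would typically keep three such matrices, consistent with the threefold memory figure mentioned above. The class name GlobalAlignmentSketch, the Result holder, and the match/mismatch stand-in for a real substitution matrix are all hypothetical, and the sketch maximises a score rather than minimising a cost.

```java
/** Hypothetical illustration of global alignment with a linear gap penalty. */
final class GlobalAlignmentSketch {

    /** Holds the two gapped sequences and the optimal alignment score. */
    static final class Result {
        final String alignedA;
        final String alignedB;
        final int score;

        Result(String alignedA, String alignedB, int score) {
            this.alignedA = alignedA;
            this.alignedB = alignedB;
            this.score = score;
        }
    }

    /** Stand-in for a substitution matrix lookup (assumption: match/mismatch only). */
    static int substitution(char x, char y, int match, int mismatch) {
        return x == y ? match : mismatch;
    }

    /** Aligns a and b end to end; gap is the (usually negative) score per gap symbol. */
    static Result align(String a, String b, int match, int mismatch, int gap) {
        int n = a.length();
        int m = b.length();
        int[][] score = new int[n + 1][m + 1];

        // First column and row: a prefix aligned entirely against gaps.
        for (int i = 1; i <= n; i++) score[i][0] = i * gap;
        for (int j = 1; j <= m; j++) score[0][j] = j * gap;

        // Fill: each cell is the best of substitution, gap in b, gap in a.
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = score[i - 1][j - 1]
                        + substitution(a.charAt(i - 1), b.charAt(j - 1), match, mismatch);
                int up = score[i - 1][j] + gap;
                int left = score[i][j - 1] + gap;
                score[i][j] = Math.max(diag, Math.max(up, left));
            }
        }

        // Traceback from the bottom-right corner, preferring the diagonal on ties.
        StringBuilder outA = new StringBuilder();
        StringBuilder outB = new StringBuilder();
        int i = n;
        int j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && score[i][j] == score[i - 1][j - 1]
                    + substitution(a.charAt(i - 1), b.charAt(j - 1), match, mismatch)) {
                outA.append(a.charAt(i - 1));
                outB.append(b.charAt(j - 1));
                i--;
                j--;
            } else if (i > 0 && score[i][j] == score[i - 1][j] + gap) {
                outA.append(a.charAt(i - 1));
                outB.append('-');
                i--;
            } else {
                outA.append('-');
                outB.append(b.charAt(j - 1));
                j--;
            }
        }
        return new Result(outA.reverse().toString(), outB.reverse().toString(), score[n][m]);
    }

    public static void main(String[] args) {
        Result r = align("ACGT", "AGT", 1, -1, -2);
        System.out.println(r.alignedA);
        System.out.println(r.alignedB);
        System.out.println("score = " + r.score);
    }
}
```

A real implementation along the lines described in the thread would replace substitution() with a lookup into a parsed matrix file and would operate on BioJava symbols rather than plain characters.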