From rejarohit2004 at gmail.com Tue May 1 05:25:01 2007
From: rejarohit2004 at gmail.com (rohit reja)
Date: Tue, 1 May 2007 14:55:01 +0530
Subject: [Biojava-l] query
Message-ID: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com>

Hello all,

I am a novice programmer in Java with a good knowledge of core Java.
I want to build a biological database using BioJava.
Is there a need to go for J2EE before learning BioJava?

Is BioJava incorporated in the NetBeans IDE?

Regards

--
Rohit Reja
3rd -B.tech-Bioinformatics
VIT University
Vellore

From heatkent at gmail.com Wed May 2 13:58:14 2007
From: heatkent at gmail.com (Heather Kent)
Date: Wed, 2 May 2007 12:58:14 -0500
Subject: [Biojava-l] ABIFParser question
Message-ID:

I'm currently working on an ABI file parser (I'm working with an extension of the BioJava parser for various reasons) and file writer, and I'm running into some problems. The BioJava ABIFParser class creates a Map of TaggedDataRecords, but you can only get one record at a time and I want to iterate through all the records. I can't access the Map because it is a private variable. Does it have to be a private variable?
So far I am working around it by making large arrays of the tag names to send to the getDataRecord method in the ABIFParser to retrieve records, and creating my own map, but this seems really inefficient. Is there a better way?

Also, I think I found an error in the toString method of the TaggedDataRecord. In all the if statements I use numberOfElements instead of elementLength when reading the DATA_TYPE_ASCII_ARRAY. The elementLength refers only to the number of bytes in one element, and the numberOfElements refers to the number of elements in the item. I don't actually use that method, but when I'm reading other records of type ASCII_ARRAY, or PSTRING or CSTRING, that is how I set it up.

heather

--
Duct tape is like the force. It has a light side, a dark side, and it holds the universe together....
Carl Zwanzig

From rejarohit2004 at gmail.com Thu May 3 12:52:24 2007
From: rejarohit2004 at gmail.com (rohit reja)
Date: Thu, 3 May 2007 22:22:24 +0530
Subject: [Biojava-l] query
Message-ID: <29c042ff0705030952t6f23ec5q1c03eb8bdd33e874@mail.gmail.com>

Hello all,

I want to incorporate BioJava in the NetBeans IDE so that I can use BioJava libraries/modules in my program.

Can anyone elaborate the steps to do that?

--
Rohit Reja
3rd -B.tech-Bioinformatics
VIT University
Vellore

From rejarohit2004 at gmail.com Fri May 4 01:17:41 2007
From: rejarohit2004 at gmail.com (rohit reja)
Date: Fri, 4 May 2007 10:47:41 +0530
Subject: [Biojava-l] query
Message-ID: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com>

Hello all,

I want to incorporate BioJava in the NetBeans IDE so that I can use BioJava libraries/modules in my program.

Can anyone elaborate the steps to do that?

--
Rohit Reja
3rd -B.tech-Bioinformatics
VIT University
Vellore

From bernhard.heinzel at gmail.com Mon May 7 06:57:26 2007
From: bernhard.heinzel at gmail.com (Bernhard Heinzel)
Date: Mon, 7 May 2007 12:57:26 +0200
Subject: [Biojava-l] Generate Graphics with BioJava
Message-ID: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com>

Hi,

I am a total newbie to BioJava...
Is it possible to generate image files (PNG, GIF, JPG) of sequence features?

Thanks in advance
Bernhard

From markjschreiber at gmail.com Tue May 8 05:33:19 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Tue, 8 May 2007 17:33:19 +0800
Subject: [Biojava-l] query
In-Reply-To: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com>
References: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com>
Message-ID: <93b45ca50705080233n38262edag458d81de02d91e6@mail.gmail.com>

Hi -

We don't use J2EE in BioJava, although you could use BioJava in a J2EE app.

BioJava can be easily used in NetBeans (for example, I use NetBeans).
The easiest way to use it in NetBeans is to set up BioJava as a library and then use that library for your project.

- Mark

On 5/1/07, rohit reja wrote:
> Hello all,
> I am a novice programmer in Java with a good knowledge of core Java.
> I want to build a biological database using BioJava.
> Is there a need to go for J2EE before learning BioJava?
>
> Is BioJava incorporated in the NetBeans IDE?
>
> Regards
>
>
> --
> Rohit Reja
> 3rd -B.tech-Bioinformatics
> VIT University
> Vellore
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From markjschreiber at gmail.com Tue May 8 05:35:59 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Tue, 8 May 2007 17:35:59 +0800
Subject: [Biojava-l] query
In-Reply-To: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com>
References: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com>
Message-ID: <93b45ca50705080235l838ff91l56b5a0e0f930c5c3@mail.gmail.com>

Hi -

Minimally you need to create a new library called biojava. Then you need to add the appropriate jar files (take a look at www.biojava.org in the Getting Started section to see which ones you need). You can also add the source code if you have downloaded it, and the javadocs if you download them. Adding the source code and javadocs is not essential, but it is nice to have them available when coding.

- Mark

On 5/4/07, rohit reja wrote:
> Hello all,
> I want to incorporate BioJava in the NetBeans IDE so that I can use
>
> BioJava libraries/modules in my program.
>
> Can anyone elaborate the steps to do that?
>
>
> --
> Rohit Reja
> 3rd -B.tech-Bioinformatics
> VIT University
> Vellore
>

From markjschreiber at gmail.com Wed May 9 02:56:34 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Wed, 9 May 2007 14:56:34 +0800
Subject: [Biojava-l] ABIFParser question
In-Reply-To:
References:
Message-ID: <93b45ca50705082356v230578d3o20f32fe5e30a5c58@mail.gmail.com>

Hi -

Because you have the source code available, it might be easier to copy and paste the code into your class and then modify it as needed.

It may be a bit extreme for the Map to be private. It is probably safe enough for it to be protected. Alternatively, having public or protected methods to get its internal members without allowing modification may be another good approach. If you have a nice solution we can check it in to CVS.

If you believe the getString() method is incorrect, please post a bug report with example code to demonstrate the problem. If you know how to solve it, even better. We can then use the bug report as the basis for a fix and a regression test.

- Mark

On 5/3/07, Heather Kent wrote:
> I'm currently working on an ABI file parser (I'm working with an extension
> of the BioJava parser for various reasons) and file writer, and I'm running
> into some problems. The BioJava ABIFParser class creates a Map of
> TaggedDataRecords, but you can only get one record at a time and I want to
> iterate through all the records. I can't access the Map because it is a
> private variable. Does it have to be a private variable?
> So far I am working around it by making large arrays of the tag names to send
> to the getDataRecord method in the ABIFParser to retrieve records, and
> creating my own map, but this seems really inefficient. Is there a better
> way?
>
> Also, I think I found an error in the toString method of the
> TaggedDataRecord. In all the if statements I use numberOfElements instead
> of elementLength when reading the DATA_TYPE_ASCII_ARRAY. The elementLength
> refers only to the number of bytes in one element, and the numberOfElements
> refers to the number of elements in the item. I don't actually use that
> method, but when I'm reading other records of type
> ASCII_ARRAY, or PSTRING or CSTRING, that is how I set it up.
>
> heather
>
>
> --
> Duct tape is like the force. It has a light side, a dark side, and it holds
> the universe together....
> Carl Zwanzig
>

From markjschreiber at gmail.com Wed May 9 03:43:41 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Wed, 9 May 2007 15:43:41 +0800
Subject: [Biojava-l] Generate Graphics with BioJava
In-Reply-To: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com>
References: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com>
Message-ID: <93b45ca50705090043t1caa2355v4bafd99ef617fc5b@mail.gmail.com>

Hi Bernhard -

It is possible, but it is not as easy as it should be (this is an area where we have considered a redesign).

Essentially, most of the current graphics components of BioJava extend Swing components. The drawing of features etc. is done by BioJava code that draws into the Graphics2D object of the component. This code is usually found in the paintComponent() method.

What you can do is use exactly the same drawing code, but use it to draw into the Graphics2D object of a Java BufferedImage. You then use a Java class called (I think) ImageIO to write it as JPG, PNG, TIFF etc.

There is a nice project called Batik. This has objects that give you access to an implementation of a Graphics2D object.
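
[Editor's sketch of the BufferedImage route described just above; the Batik discussion continues after it. This is a minimal illustration using only the standard JDK classes BufferedImage, Graphics2D and ImageIO. The class name FeatureImageSketch and the method paintFeatures are hypothetical stand-ins: paintFeatures takes the place of whatever BioJava drawing code would normally run inside paintComponent(), and it draws a made-up ruler and feature block rather than real sequence features.]

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class FeatureImageSketch {

    // Hypothetical stand-in for the drawing code that normally lives in
    // paintComponent(); all it needs is a Graphics2D to draw into.
    static void paintFeatures(Graphics2D g, int width, int height) {
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);                     // background
        g.setColor(Color.BLACK);
        g.drawLine(10, height / 2, width - 10, height / 2);  // sequence ruler
        g.setColor(Color.BLUE);
        g.fillRect(40, height / 2 - 8, 120, 16);             // one made-up feature
    }

    // Render off-screen: the Graphics2D comes from a BufferedImage, not a Swing component.
    static BufferedImage render(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            paintFeatures(g, width, height);
        } finally {
            g.dispose();
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage image = render(300, 60);
        File out = new File("features.png");
        // ImageIO.write returns false when no writer exists for the requested format.
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No ImageIO writer found for format: png");
        }
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

[Swapping "png" for another format name should work the same way, provided the running JDK ships a writer for it; checking the boolean returned by ImageIO.write guards against a missing writer.]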
You 'draw' into that Batik Graphics2D object in the normal Java way, and the object then converts everything you 'drew' into SVG, which is useful in web pages or with SVG viewers.

Ultimately BioJava shouldn't really have Swing components. What it should have is code that draws into Graphics2D objects, which could come from Batik or Swing or a BufferedImage or something else entirely. This would provide a much more flexible graphics system.

- Mark

On 5/7/07, Bernhard Heinzel wrote:
> Hi,
>
> I am a total newbie to BioJava...
> Is it possible to generate image files (PNG, GIF, JPG) of sequence features?
>
> Thanks in advance
> Bernhard
>

From markjschreiber at gmail.com Wed May 16 23:28:38 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Thu, 17 May 2007 11:28:38 +0800
Subject: [Biojava-l] [Biojava-dev] Project Contribution.
In-Reply-To: <010201c7980a$86fcae30$0401a8c0@laptop>
References: <93b45ca50705101954j3f899f49r438fe8d587ba027@mail.gmail.com> <010201c7980a$86fcae30$0401a8c0@laptop>
Message-ID: <93b45ca50705162028o337f07d5gd79d838bea9159f2@mail.gmail.com>

Hi -

biojavax contains extensions to the basic BioJava APIs (a bit like java and javax). biojavax does require BioJava because it extends it, and BioJava 1.5, when released, will be made up of both of them.

On the whole we recommend people use the biojavax API where it deprecates the BioJava API, but both are retained for backwards compatibility. Please note though that there are large parts of BioJava that are not deprecated by biojavax and which are still useful and needed.

Hope this helps,

- Mark

On 5/17/07, Jeff Szielenski wrote:
> Also,
>
> BioJava 2 is not BioJavax. I noticed biojava-1.5-beta2 contains both
> biojava and biojavax. So does biojavax use biojava? Why are they in the
> same distribution?
>
> Jeff
>
> -----Original Message-----
> From: Mark Schreiber [mailto:markjschreiber at gmail.com]
> Sent: Thursday, May 10, 2007 6:55 PM
> To: jeff at dvss.net
> Cc: biojava-dev at lists.open-bio.org
> Subject: Re: [Biojava-dev] Project Contribution.
>
>
> Hi -
>
> We need volunteers to convert the example programs in the biojava.org
> cookbook to use the newer biojavax APIs.
>
> - Mark
>
> On 5/11/07, Jeff Szielenski wrote:
> > Hello,
> >
> > I am currently studying bioinformatics at UIC. I am interested in
> > joining an open source project to start applying my knowledge. What, if
> > any, help do you guys need? I have a software engineering background
> > and just finished an introductory course in bioinformatics.
> >
> > Jeff
> >
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
>

From zt_2003 at 163.com Sun May 20 22:07:36 2007
From: zt_2003 at 163.com (zt_2003)
Date: Mon, 21 May 2007 10:07:36 +0800 (CST)
Subject: [Biojava-l] how to set the “token” parameter?
Message-ID: <30959794.2050151179713256766.JavaMail.root@bj163app70.163.com>

I made a custom alphabet, und, but when I call

System.out.println("22222222---"+und.getTokenization("token"));

it gives this error:

There is no tokenization 'token' defined in alphabet UND.

und.getTokenization("default") or und.getTokenization("name") is OK. I can't find any API to set the tokenization 'token'. How do I do it?

From mark.schreiber at novartis.com Mon May 21 00:21:29 2007
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Mon, 21 May 2007 12:21:29 +0800
Subject: [Biojava-l] how to set the “token” parameter?
Message-ID:

Hi -

The name tokenization is provided because BioJava can simply look up the name of the symbol.
For a "token" tokenization you would need to provide a custom mapping for your custom alphabet. Probably the simplest way is to instantiate an org.biojava.bio.seq.io.CharacterTokenization and use the bindSymbol method to bind Symbols to characters.

If you want to register the tokenization with the Alphabet, most BioJava alphabets derive from AbstractAlphabet, which contains the method:

putTokenization(String name, SymbolTokenization tok)

You could then register your tokenization under the name "token". This is only required if other code is going to use the tokenization, as after creating it you already have a reference to it anyway.

- Mark

zt_2003
Sent by: biojava-l-bounces at lists.open-bio.org
05/21/2007 10:07 AM

To: biojava-l at lists.open-bio.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-l] how to set the "token" parameter?

I made a custom alphabet, und, but when I call

System.out.println("22222222---"+und.getTokenization("token"));

it gives this error:

There is no tokenization 'token' defined in alphabet UND.

und.getTokenization("default") or und.getTokenization("name") is OK. I can't find any API to set the tokenization 'token'. How do I do it?

From cheng at tomcheng.com Mon May 21 22:31:59 2007
From: cheng at tomcheng.com (T. Thomas Cheng)
Date: Mon, 21 May 2007 22:31:59 -0400
Subject: [Biojava-l] Another Project Contributor?
Message-ID: <4652561F.8080508@tomcheng.com>

Hi all--

I'm a software developer looking to break into bioinformatics, and (like Jeff, who posted a couple of weeks ago) I'm interested in joining an open source project to expand my horizons and apply what I've learned.
My undergrad degree is in molecular biology, but I've spent the past decade doing more general software development work (and have an MS in comp sci), so I think I'm probably stronger on the software side, but I'd love to start flexing the bio muscles again.

I think updating the Cookbook would be right up my alley for now, so I might jump in and take a look at that, if no one objects.

Anyway, I mostly wanted to just say hello and introduce myself. I look forward to getting more involved in the project and getting to know you all!

--
T. Thomas Cheng
cheng at tomcheng.com
http://www.tomcheng.com

From e.j.blom at rug.nl Tue May 22 08:22:10 2007
From: e.j.blom at rug.nl (Evert-Jan Blom)
Date: Tue, 22 May 2007 14:22:10 +0200
Subject: [Biojava-l] [HMM] detecting several instances of the same motif fails
Message-ID: <4652E072.9060003@rug.nl>

Dear all,

Using a page from the CookBook
http://www.biojava.org/wiki/BioJava:CookBook:DP:HMM we implemented a profile HMM in our application to detect regulatory motif instances. To test, we created a model based on 10 identical sequences (the test sequence was: TGCTGCTGCGGGCCC). The model is subsequently trained using a BaumWelchTrainer and decoded using ScoreType.ODDS, ScoreType.Probability and ScoreType.NullModel.

The sequence we use for testing contains 2 motifs, a perfect motif and a motif with one mismatch:
AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA

The results of the original HMMER package tell me that there are 2 instances of the motif present in the test string, whereas the BioJava package yields very strange results.

Results using ScoreType.ODDS; only the second motif is detected:

{AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA}
Log Odds = 7.65779871993799
i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
m-1 m-2 m-3 m-4 m-5 m-6 d-7 m-8 m-9 m-10 m-11 m-12 m-13 d-14 d-15
i-15 i-15 i-15 i-15 i-15 i-15

Now the second scorer; only the first motif is detected:

Prob = -95.9806747848816
i-0 i-0 i-0 i-0
m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10
i-10 i-10 i-10 i-10 i-10 i-10 i-10
i-10 i-10 i-10 i-10 i-10 i-10 i-10
m-11 i-11 m-12 i-12 i-12 i-12 m-13 m-14 m-15
i-15 i-15 i-15 i-15 i-15

Now the null model, which seems to make no sense at all:

Null = -94.11166855273558
m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10 m-11 m-12 m-13 m-14 m-15
i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15
i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15
i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15

Is there an option to detect the second motif in the same run, just like the original HMMER? Or am I missing some option that is not described in the tutorial?

Thanks in advance

E.J.Blom

From markjschreiber at gmail.com Tue May 22 21:48:32 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Wed, 23 May 2007 09:48:32 +0800
Subject: [Biojava-l] [HMM] detecting several instances of the same motif fails
In-Reply-To: <4652E072.9060003@rug.nl>
References: <4652E072.9060003@rug.nl>
Message-ID: <93b45ca50705221848n4db1761fr552a67b4d68437bd@mail.gmail.com>

Hi -

There are two things going on here. The first is that I believe the profile model presented in BioJava doesn't loop back on itself. I could be wrong; I need to check the code. If this is indeed the case, then the model will not be capable of finding more than one match in a sequence.
This can be easily modified by changing the existing ProfileHMM code in a custom class, or by getting a reference to the MarkovModel and changing its possible transitions.

The other issue is the type of scoring used. ScoreType.Probability calculates the Viterbi path based on the transitions of the model and the emission probabilities of the states. ScoreType.NullModel uses the 'null model', which in your case will be a uniform distribution (essentially random) and so will be meaningless, hence the strange result. The null model would be more meaningful if you wanted to model some biased background.

ScoreType.ODDS is the log odds of the trained model against the null model. It is most useful when the null model is not uniform, e.g. where you want to distinguish a signal from a biased background. It is most often used for proteins, where the background amino acid distribution is anything but uniform.

Hope this helps,

- Mark

On 5/22/07, Evert-Jan Blom wrote:
> Dear all,
>
> Using a page from the CookBook
> http://www.biojava.org/wiki/BioJava:CookBook:DP:HMM we implemented a
> profile HMM
> in our application to detect regulatory motif instances. To test, we
> created a model based on 10 identical sequences
> (the test sequence was: TGCTGCTGCGGGCCC).
> The model is subsequently trained using a BaumWelchTrainer and decoded
> using ScoreType.ODDS, ScoreType.Probability and ScoreType.NullModel.
>
> The sequence we use for testing contains 2 motifs, a perfect motif and a
> motif with one mismatch:
>
> AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA
>
> The results of the original HMMER package tell me that there are 2
> instances of the motif present in the test string whereas the biojava
> package yields very strange results:
>
> results using the ScoreType.ODDS, only the second motif is detected:
>
> {AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA}
> Log Odds = 7.65779871993799
> i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
> i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
> i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0
> m-1 m-2 m-3 m-4 m-5 m-6 d-7 m-8 m-9 m-10 m-11 m-12 m-13 d-14 d-15
> i-15 i-15 i-15 i-15 i-15 i-15
>
> Now the second scorer, only the first motif is detected:
>
> Prob = -95.9806747848816
> i-0 i-0 i-0 i-0
> m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10
> i-10 i-10 i-10 i-10 i-10 i-10 i-10
> i-10 i-10 i-10 i-10 i-10 i-10 i-10
> m-11 i-11 m-12 i-12 i-12 i-12 m-13 m-14 m-15
> i-15 i-15 i-15 i-15 i-15
>
> Now the null model which seems to make no sense at all:
> Null = -94.11166855273558
> m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10 m-11 m-12 m-13 m-14 m-15
> i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15
> i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15
> i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15
>
> Is there an option to detect the second motif in the same run just like
> the original HMMER? Or am I missing some
> option that is not described in the tutorial.
>
> Thanks in advance
>
> E.J.Blom
>

From koeberle at mpiib-berlin.mpg.de Fri May 25 10:03:02 2007
From: koeberle at mpiib-berlin.mpg.de (Christian Köberle)
Date: Fri, 25 May 2007 16:03:02 +0200
Subject: [Biojava-l] KEGG Interface
Message-ID: <4656EC96.4090105@mpiib-berlin.mpg.de>

Hi,

is there an implementation of an interface for the KEGG database in BioJava?
And if it exists, how does it work?

Best regards,
Christian

--
Christian Köberle

Max Planck Institute for Infection Biology
Department of Immunology

Campus Charité Mitte
Charitéplatz 1

10117 Berlin

Tel: +49 30 28 460 562

From markjschreiber at gmail.com Fri May 25 11:41:21 2007
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Fri, 25 May 2007 23:41:21 +0800
Subject: [Biojava-l] KEGG Interface
In-Reply-To: <4656EC96.4090105@mpiib-berlin.mpg.de>
References: <4656EC96.4090105@mpiib-berlin.mpg.de>
Message-ID: <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com>

Hi -

There is not one yet. It would be very useful to have one though, if anyone is interested in developing one. A generic pathway model would also be great.

- Mark

On 5/25/07, Christian Köberle wrote:
> Hi,
>
> is there an implementation of an interface for the KEGG database in BioJava?
> And if it exists, how does it work?
>
> Best regards,
> Christian
>
> --
> Christian Köberle
>
> Max Planck Institute for Infection Biology
> Department of Immunology
>
> Campus Charité Mitte
> Charitéplatz 1
>
> 10117 Berlin
>
> Tel: +49 30 28 460 562
>

From walsh at andrew.cmu.edu Fri May 25 15:05:35 2007
From: walsh at andrew.cmu.edu (Andrew Walsh)
Date: Fri, 25 May 2007 15:05:35 -0400
Subject: [Biojava-l] Secondary Structure data from PDB file
Message-ID: <4657337F.6060203@andrew.cmu.edu>

Do I observe correctly that org.biojava.bio.structure.io.PDBFileParser does not currently handle secondary structure annotation (e.g. HELIX, SHEET, or TURN records), and thus no secondary structure information is ever added to an AminoAcid?

If this is the case, is there any documentation on how such data is supposed to be added to the Map that represents secondary structure data (i.e. what key/value pairs are expected for common secondary structure features)?

Thanks,

Andrew Walsh

From ap3 at sanger.ac.uk Mon May 28 18:43:45 2007
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Mon, 28 May 2007 23:43:45 +0100
Subject: [Biojava-l] Secondary Structure data from PDB file
In-Reply-To: <4657337F.6060203@andrew.cmu.edu>
References: <4657337F.6060203@andrew.cmu.edu>
Message-ID:

Hi Andrew,

> Do I observe correctly that org.biojava.bio.structure.io.PDBFileParser
> does not currently handle secondary structure annotation (e.g. HELIX,
> SHEET, or TURN records),

The parser is fine, but this has been a missing feature so far. To add support for this I just committed a patch to the BioJava CVS. The parser now (optionally) can attach the secondary structure assignment that has been provided by the author of the PDB file to an amino acid.
See also http://biojava.org/wiki/BioJava:CookBook:PDB:read

Andreas

-----------------------------------------------------------------------

Andreas Prlic Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA, UK
+44 (0) 1223 49 6891

The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road London NW12BE.

From koeberle at mpiib-berlin.mpg.de Wed May 30 08:20:21 2007
From: koeberle at mpiib-berlin.mpg.de (Christian Köberle)
Date: Wed, 30 May 2007 14:20:21 +0200
Subject: [Biojava-l] KEGG Interface
In-Reply-To: <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com>
References: <4656EC96.4090105@mpiib-berlin.mpg.de> <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com>
Message-ID: <465D6C05.90504@mpiib-berlin.mpg.de>

Hi,

there is a Java API for KEGG at http://www.genome.jp/kegg/soap/
It is not complete, but it includes a lot of functions to get information from KEGG.

Christian

Mark Schreiber wrote:
> Hi -
>
> There is not one yet. It would be very useful to have one though, if
> anyone is interested in developing one. A generic pathway model would
> also be great.
>
> - Mark
>
> On 5/25/07, Christian Köberle wrote:
>> Hi,
>>
>> is there an implementation of an interface for the KEGG database in BioJava?
>> And if it exists, how does it work?
>>
>> Best regards,
>> Christian
>>
>> --
>> Christian Köberle
>>
>> Max Planck Institute for Infection Biology
>> Department of Immunology
>>
>> Campus Charité Mitte
>> Charitéplatz 1
>>
>> 10117 Berlin
>>
>> Tel: +49 30 28 460 562
>>
>

--
Christian Köberle

Max Planck Institute for Infection Biology
Department of Immunology

Campus Charité Mitte
Charitéplatz 1

10117 Berlin

Tel: +49 30 28 460 562

From rejarohit2004 at gmail.com Thu May 31 10:26:51 2007
From: rejarohit2004 at gmail.com (rohit reja)
Date: Thu, 31 May 2007 19:56:51 +0530
Subject: [Biojava-l] Query
Message-ID: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com>

Hello all,

I am working on HMMER, which runs on a command line interface (CLI).
Now I want to create and integrate a GUI to execute the commands on the CLI.
How can we do this using Java?

Please reply ASAP.

--
Rohit Reja
3rd -B.tech-Bioinformatics
VIT University
Vellore

From tmo at ebi.ac.uk Thu May 31 11:10:58 2007
From: tmo at ebi.ac.uk (Tom Oinn)
Date: Thu, 31 May 2007 16:10:58 +0100
Subject: [Biojava-l] Query
In-Reply-To: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com>
References: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com>
Message-ID: <465EE582.2000908@ebi.ac.uk>

rohit reja wrote:
> Hello all,
>
> I am working on HMMER which runs on a command line interface(CLI).
> Now i want to create and integrate a GUI to execute the commands on the CLI
> .
> How can we do this using java .?

This isn't really a BioJava question (this list is for discussion of the BioJava project and its usage) but...
You can use the system Runtime class in Java to exec external tools. Your application will have to construct the command line, then create a Process object from it; the Process has streams through which you can manipulate stdin and stdout for the process. Google for 'Java Runtime exec' for plenty more information.

Hint - the process will block unless you consume both the stdout and stderr streams, blocking when the buffer for those streams (whose size is OS and Java version dependent) is full. If you get odd behaviour with the application sometimes hanging, this is probably why!

Cheers,

Tom

> Please reply ASAP.

ps - don't ask people to reply ASAP, it just annoys people! The assumption if you ask a question on a mailing list is that you'd like an answer :)

From ola.spjuth at farmbio.uu.se Thu May 31 11:26:49 2007
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Thu, 31 May 2007 17:26:49 +0200
Subject: [Biojava-l] Query
In-Reply-To: <465EE582.2000908@ebi.ac.uk>
References: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com> <465EE582.2000908@ebi.ac.uk>
Message-ID:

Hi,

Look into the Bioclipse project (http://www.bioclipse.net). It is a graphical workbench that has integrated other tools such as ClustalW and Blast. Duplicating such an integration for HMMER seems like a straightforward task. A somewhat limited form of CLI integration can even be set up at runtime in the Bioclipse preferences, but it could very well be extended.

Cheers,

.../Ola Spjuth

On May 31, 2007, at 17:10 , Tom Oinn wrote:

> rohit reja wrote:
>> Hello all,
>>
>> I am working on HMMER which runs on a command line interface(CLI).
>> Now i want to create and integrate a GUI to execute the commands
>> on the CLI
>> .
>> How can we do this using java .?
>
> This isn't really a BioJava question (this list is for discussion
> of the
> BioJava project and its usage) but...
> > You can use the system Runtime class in Java to exec external tools, > your application will have to construct the command line then create a > Process object from this command line which has streams from which you > can manipulate stdin and stdout for the process. Google for 'Java > Runtime exec' for plenty more information. > > Hint - the process will block unless you consume both stdout and > stderr > streams, blocking when the buffer for those streams (which is OS and > Java version dependant) is full. If you get odd behaviour with the > application sometimes hanging this is probably why! > > Cheers, > > Tom > >> Please reply ASAP. > > ps - don't ask people to reply ASAP, it just annoys people! The > assumption if you ask a question on a mailing list is that you'd > like an > answer :) > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l
From rejarohit2004 at gmail.com Tue May 1 09:25:01 2007 From: rejarohit2004 at gmail.com (rohit reja) Date: Tue, 1 May 2007 14:55:01 +0530 Subject: [Biojava-l] query Message-ID: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com> hello all i m a novice programmer in java with a good knowledge of java core i want to build a biological database using Biojava Is there a need to go for J2EE before learning Biojava? Is Biojava incorproated in NetBeans IDE? Regards -- Rohit Reja 3rd -B.tech-Bioinformatics VIT University Vellore From heatkent at gmail.com Wed May 2 17:58:14 2007 From: heatkent at gmail.com (Heather Kent) Date: Wed, 2 May 2007 12:58:14 -0500 Subject: [Biojava-l] ABIFParser question Message-ID: I'm currently working on an ABI file parser (i'm working with an extension of the biojava parser for various reasons) and file writer and running into some problems.
The biojava ABIParser class creates a Map of TaggedDataRecords but you can only get one record at a time and i want to iterate through all the records. I can't access the Map because it is a private variable. Does it have to be a private variable? So far i am working around it by making large arrays of the tagnames to send to the getDataRecord method in the ABIFParser to retrieve records and creating my own map but this seems really inefficient.....is there a better way? Also i think i found an error in the toString method of the TaggedDataRecord....in all the if statements i use numberOfElements instead of elementLength when reading the DATA_TYPE_ASCII_ARRAY..the elementLength refers only to the number of bytes in one element and the numberoOfElements refers to the number of elements in the item....i dont actually use that method but when i'm reading other records of type ASCII_ARRAY, or PSTRING or CSTRING that is how i set it up heather -- Duct tape is like the force. It has a light side, a dark side, and it holds the universe together.... Carl Zwanzig From rejarohit2004 at gmail.com Thu May 3 16:52:24 2007 From: rejarohit2004 at gmail.com (rohit reja) Date: Thu, 3 May 2007 22:22:24 +0530 Subject: [Biojava-l] query Message-ID: <29c042ff0705030952t6f23ec5q1c03eb8bdd33e874@mail.gmail.com> hello all i want to incorporate biojava in netbeans IDE so that i can use biojava libraries/modules in my program . Can anyone elaborate the steps to do that -- Rohit Reja 3rd -B.tech-Bioinformatics VIT University Vellore From rejarohit2004 at gmail.com Fri May 4 05:17:41 2007 From: rejarohit2004 at gmail.com (rohit reja) Date: Fri, 4 May 2007 10:47:41 +0530 Subject: [Biojava-l] query Message-ID: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com> hello all i want to incorporate biojava in netbeans IDE so that i can use biojava libraries/modules in my program . 
Can anyone elaborate the steps to do that -- Rohit Reja 3rd -B.tech-Bioinformatics VIT University Vellore From bernhard.heinzel at gmail.com Mon May 7 10:57:26 2007 From: bernhard.heinzel at gmail.com (Bernhard Heinzel) Date: Mon, 7 May 2007 12:57:26 +0200 Subject: [Biojava-l] Generate Graphics with BioJava Message-ID: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com> Hi, I am a total newbie to BioJava... Is it possible to generate Image Files(PNG,GIF,JPG) of Sequence Features? Thanks in Advance Bernhard From markjschreiber at gmail.com Tue May 8 09:33:19 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Tue, 8 May 2007 17:33:19 +0800 Subject: [Biojava-l] query In-Reply-To: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com> References: <29c042ff0705010225i4a1cad7bsebe2e0a12c5158a4@mail.gmail.com> Message-ID: <93b45ca50705080233n38262edag458d81de02d91e6@mail.gmail.com> Hi - We don't use J2EE in biojava, although you could use biojava in a J2EE app. Biojava can be easily used in NetBeans (for example I use NetBeans). The easiest way to use it in NetBeans is to set up biojava as a library and then use that library for your project. - Mark On 5/1/07, rohit reja wrote: > hello all > i m a novice programmer in java with a good knowledge of java core > i want to build a biological database using Biojava > Is there a need to go for J2EE before learning Biojava? > > Is Biojava incorproated in NetBeans IDE? 
> > Regards > > > -- > Rohit Reja > 3rd -B.tech-Bioinformatics > VIT University > Vellore > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From markjschreiber at gmail.com Tue May 8 09:35:59 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Tue, 8 May 2007 17:35:59 +0800 Subject: [Biojava-l] query In-Reply-To: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com> References: <29c042ff0705032217n47d025aey6024ab8ac56e4f4d@mail.gmail.com> Message-ID: <93b45ca50705080235l838ff91l56b5a0e0f930c5c3@mail.gmail.com> Hi - Minimally you need to create a new library called biojava. Then you need to add the appropriate jar files (take a look at www.biojava.org in the getting started section to see which ones you need). You can also add the source code if you have downloaded it and the javadocs if you download them. Adding the source code and javadocs is not essential but it is nice to have them available when coding. - Mark On 5/4/07, rohit reja wrote: > hello all > i want to incorporate biojava in netbeans IDE so that i can use > > biojava libraries/modules in my program . > > Can anyone elaborate the steps to do that > > > -- > Rohit Reja > 3rd -B.tech-Bioinformatics > VIT University > Vellore > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From markjschreiber at gmail.com Wed May 9 06:56:34 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Wed, 9 May 2007 14:56:34 +0800 Subject: [Biojava-l] ABIFParser question In-Reply-To: References: Message-ID: <93b45ca50705082356v230578d3o20f32fe5e30a5c58@mail.gmail.com> Hi - Because you have the source code available it might be easier to copy and paste the code into your class and then modify as needed. It may be a bit extreme for it to be private. 
It is probably safe enough for it to be protected. Alternatively, public or protected methods that expose its internal members without allowing modification may be another good approach. If you have a nice solution we can check it in to CVS. If you believe the toString() method is incorrect, please post a bug report with example code to demonstrate the problem. If you know how to solve it, even better. We can then use the bug report as the basis for a fix and a regression test. - Mark On 5/3/07, Heather Kent wrote: > I'm currently working on an ABI file parser (i'm working with an extension > of the biojava parser for various reasons) and file writer and running into > some problems. The biojava ABIParser class creates a Map of > TaggedDataRecords but you can only get one record at a time and i want to > iterate through all the records. I can't access the Map because it is a > private variable. Does it have to be a private variable? > So far i am working around it by making large arrays of the tagnames to send > to the getDataRecord method in the ABIFParser to retrieve records and > creating my own map but this seems really inefficient.....is there a better > way? > > Also i think i found an error in the toString method of the > TaggedDataRecord....in all the if statements i use numberOfElements instead > of elementLength when reading the DATA_TYPE_ASCII_ARRAY..the elementLength > refers only to the number of bytes in one element and the numberoOfElements > refers to the number of elements in the item....i dont actually use that > method but when i'm reading other records of type > ASCII_ARRAY, or PSTRING or CSTRING that is how i set it up > > heather > > > -- > Duct tape is like the force. It has a light side, a dark side, and it holds > the universe together....
> Carl Zwanzig > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From markjschreiber at gmail.com Wed May 9 07:43:41 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Wed, 9 May 2007 15:43:41 +0800 Subject: [Biojava-l] Generate Graphics with BioJava In-Reply-To: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com> References: <1a7429ec0705070357v21014f9bt81e2fd45320ea991@mail.gmail.com> Message-ID: <93b45ca50705090043t1caa2355v4bafd99ef617fc5b@mail.gmail.com> Hi Bernhard - It is possible but it is not as easy as it should be (this is an area where we have considered a redesign). Essentially most of the current graphics components of BioJava extend Swing components. The drawing of features etc. is done by biojava code that draws into the Graphics2D object of the component. This code is usually found in the paintComponent() method. What you can do is use exactly the same drawing code but use it to draw into the Graphics2D object of a java BufferedImage. You then use a java class called (I think) ImageIO to write it as JPG, PNG, TIFF etc. There is a nice project called Batik. This has objects that give you access to an implementation of a Graphics2D object. You 'draw' into that object in the normal java way and then the object converts everything you 'drew' into SVG, which is useful in webpages or with SVG viewers. Ultimately biojava shouldn't really have Swing components. What it should have is code that draws into Graphics2D objects which could be from Batik or Swing or a BufferedImage or something else entirely. This would provide a much more flexible graphics system. - Mark On 5/7/07, Bernhard Heinzel wrote: > Hi, > > I am a total newbie to BioJava... > Is it possible to generate Image Files(PNG,GIF,JPG) of Sequence Features?
> > Thanks in Advance > Bernhard > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From markjschreiber at gmail.com Thu May 17 03:28:38 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Thu, 17 May 2007 11:28:38 +0800 Subject: [Biojava-l] [Biojava-dev] Project Contribution. In-Reply-To: <010201c7980a$86fcae30$0401a8c0@laptop> References: <93b45ca50705101954j3f899f49r438fe8d587ba027@mail.gmail.com> <010201c7980a$86fcae30$0401a8c0@laptop> Message-ID: <93b45ca50705162028o337f07d5gd79d838bea9159f2@mail.gmail.com> Hi - biojavax contains extensions to the basic biojava API's. (A bit like java and javax). biojavax does require biojava because it extends it and biojava 1.5 when released will be made up of both of them. On the whole we recommend people use the biojavax API where it deprecates the biojava API but both are retained for backwards compatibility. Please note though that there are large parts of biojava that are not deprecated by biojavax and which are still useful and needed. Hope this helps, - Mark On 5/17/07, Jeff Szielenski wrote: > Also, > > BioJava 2 is not BioJavax. I noticed biojava-1.5-beta2 contains both > biojava and biojavax. So does biojavax use biojava? Why are they in the > same distribution? > > Jeff > > -----Original Message----- > From: Mark Schreiber [mailto:markjschreiber at gmail.com] > Sent: Thursday, May 10, 2007 6:55 PM > To: jeff at dvss.net > Cc: biojava-dev at lists.open-bio.org > Subject: Re: [Biojava-dev] Project Contribution. > > > Hi - > > We need volunteers to convert the example programs in the biojava.org > cookbook to use the newer biojavax API's. > > - Mark > > On 5/11/07, Jeff Szielenski wrote: > > Hello, > > > > I am currently studying bioinformatics at UIC. I am interested in > > joining an open source project to start applying my knowledge. What if > > > any help do you guys need? 
I have a software engineering background > > and just finished an introductory course in bioinformatics. > > > > Jeff > > > > _______________________________________________ > > biojava-dev mailing list > > biojava-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-dev > > > > From zt_2003 at 163.com Mon May 21 02:07:36 2007 From: zt_2003 at 163.com (zt_2003) Date: Mon, 21 May 2007 10:07:36 +0800 (CST) Subject: [Biojava-l] how to set the “token” parameter? Message-ID: <30959794.2050151179713256766.JavaMail.root@bj163app70.163.com> I made a custom alphabet, und, but when I call System.out.println("22222222---"+und.getTokenization("token")); I get this error: There is no tokenization 'token' defined in alphabet UND. und.getTokenization("default") or und.getTokenization("name") is OK. I can't find any API to set the tokenization 'token'. How do I do it? From mark.schreiber at novartis.com Mon May 21 04:21:29 2007 From: mark.schreiber at novartis.com (mark.schreiber at novartis.com) Date: Mon, 21 May 2007 12:21:29 +0800 Subject: [Biojava-l] how to set the “token” parameter? Message-ID: Hi - The name tokenization is provided because biojava can simply look up the name of the symbol. For a "token" tokenization you would need to provide a custom mapping for your custom alphabet. Probably the simplest way is to instantiate an org.biojava.bio.seq.io.CharacterTokenization and use the bindSymbol method to bind Symbols to characters. If you want to register the tokenization with the Alphabet then most Biojava alphabets derive from AbstractAlphabet which contains the method: putTokenization(String name, SymbolTokenization tok) You could then register your tokenization under the name "token". This is only required if other code is going to use the tokenization, since after creating it you already have a reference to it anyway.
- Mark zt_2003 Sent by: biojava-l-bounces at lists.open-bio.org 05/21/2007 10:07 AM To: biojava-l at lists.open-bio.org cc: (bcc: Mark Schreiber/GP/Novartis) Subject: [Biojava-l] how to set the "token" parameter? I made a custom alphabet, und, but when I call System.out.println("22222222---"+und.getTokenization("token")); I get this error: There is no tokenization 'token' defined in alphabet UND. und.getTokenization("default") or und.getTokenization("name") is OK. I can't find any API to set the tokenization 'token'. How do I do it? _______________________________________________ Biojava-l mailing list - Biojava-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-l From cheng at tomcheng.com Tue May 22 02:31:59 2007 From: cheng at tomcheng.com (T. Thomas Cheng) Date: Mon, 21 May 2007 22:31:59 -0400 Subject: [Biojava-l] Another Project Contributor? Message-ID: <4652561F.8080508@tomcheng.com> Hi all-- I'm a software developer looking to break into bioinformatics, and (like Jeff, who posted a couple of weeks ago) I'm interested in joining an open source project to expand my horizons and apply what I've learned. My undergrad degree is in molecular biology, but I've spent the past decade doing more general software development work (and have an MS in comp sci), so I think I'm probably stronger on the software side, but I'd love to start flexing the bio muscles again. I think updating the Cookbook would be right up my alley for now, so I might jump in and take a look at that, if no one objects. Anyway, I mostly wanted to just say hello and introduce myself. I look forward to getting more involved in the project and getting to know you all! -- T.
Thomas Cheng cheng at tomcheng.com http://www.tomcheng.com From e.j.blom at rug.nl Tue May 22 12:22:10 2007 From: e.j.blom at rug.nl (Evert-Jan Blom) Date: Tue, 22 May 2007 14:22:10 +0200 Subject: [Biojava-l] [HMM] detecting several instances of the same motif fails Message-ID: <4652E072.9060003@rug.nl> Dear all, Using a page from the CookBook http://www.biojava.org/wiki/BioJava:CookBook:DP:HMM we implemented a profile HMM in our application to detect regulatory motif instances. To test, we created a model based on 10 identical sequences (the test sequence was: TGCTGCTGCGGGCCC): The model is subsequently trained using a BaumWelchTrainer and decoded using the ScoreType.ODDS, ScoreType.Probability and ScoreType.NullModel The sequence we use for testing contains 2 motifs, a perfect motif and a motif with one mismatch:. AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA The results of the original HMMER package tell me that there are 2 instances of the motif present in the test string whereas the biojava package yields very strange results: results using the ScoreType.ODDS, only the second motif is detected: {AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA} Log Odds = 7.65779871993799 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 i-0 m-1 m-2 m-3 m-4 m-5 m-6 d-7 m-8 m-9 m-10 m-11 m-12 m-13 d-14 d-15 i-15 i-15 i-15 i-15 i-15 i-15 Now the second scorer, only the first motif is detected: Prob = -95.9806747848816 i-0 i-0 i-0 i-0 m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 i-10 m-11 i-11 m-12 i-12 i-12 i-12 m-13 m-14 m-15 i-15 i-15 i-15 i-15 i-15 Now the null model which seems to make no sense at all: Null = -94.11166855273558 m-1 m-2 m-3 m-4 m-5 m-6 m-7 m-8 m-9 m-10 m-11 m-12 m-13 m-14 m-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 i-15 Is there an option to detect the second motif in the same 
run just like the original HMMER? Or am I missing some option that is not described in the tutorial. Thanks in advance E.J.Blom From markjschreiber at gmail.com Wed May 23 01:48:32 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Wed, 23 May 2007 09:48:32 +0800 Subject: [Biojava-l] [HMM] detecting several instances of the same motif fails In-Reply-To: <4652E072.9060003@rug.nl> References: <4652E072.9060003@rug.nl> Message-ID: <93b45ca50705221848n4db1761fr552a67b4d68437bd@mail.gmail.com> Hi - There are two things going on here. The first is that I believe the profile model presented in biojava doesn't loop back on itself. I could be wrong; I need to check the code. If this is indeed the case then the model will not be capable of finding more than one match in a sequence. This can be easily modified by changing the existing ProfileHMM code in a custom class or getting a reference to the MarkovModel and changing its possible transitions. The other issue is the type of scoring used. ScoreType.Probability calculates the Viterbi path based on the transitions of the model and the emission probabilities of the states. ScoreType.NullModel uses the 'null model' which in your case will be a uniform distribution (essentially random) which will be meaningless, hence the strange result. The null model would be more meaningful if you wanted to model some biased background. ScoreType.ODDS is the log odds of the trained model versus the null model. It is most useful when the null model is not uniform, e.g. where you want to distinguish a signal from biased background. It is most often used for proteins where the background amino acid distribution is anything but uniform. Hope this helps, - Mark On 5/22/07, Evert-Jan Blom wrote: > Dear all, > > Using a page from the CookBook > http://www.biojava.org/wiki/BioJava:CookBook:DP:HMM we implemented a > profile HMM > in our application to detect regulatory motif instances.
To test, we > created a model based on 10 identical sequences > (the test sequence was: TGCTGCTGCGGGCCC): > The model is subsequently trained using a BaumWelchTrainer and decoded > using the ScoreType.ODDS, ScoreType.Probability and ScoreType.NullModel > > The sequence we use for testing contains 2 motifs, a perfect motif and a > motif with one mismatch:. > > AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA > > The results of the original HMMER package tell me that there are 2 > instances of the motif present in the test string whereas the biojava > package yields very strange results: > > results using the ScoreType.ODDS, only the second motif is detected: > > {AAAATGCTGCTGCGGGCCCAAAAATGCTGCGGCGGGCCCAAA} > Log Odds = 7.65779871993799 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > i-0 > m-1 > m-2 > m-3 > m-4 > m-5 > m-6 > d-7 > m-8 > m-9 > m-10 > m-11 > m-12 > m-13 > d-14 > d-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > > Now the second scorer, only the first motif is detected: > > Prob = -95.9806747848816 > i-0 > i-0 > i-0 > i-0 > m-1 > m-2 > m-3 > m-4 > m-5 > m-6 > m-7 > m-8 > m-9 > m-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > i-10 > m-11 > i-11 > m-12 > i-12 > i-12 > i-12 > m-13 > m-14 > m-15 > i-15 > i-15 > i-15 > i-15 > i-15 > > Now the null model which seems to make no sense at all: > Null = -94.11166855273558 > m-1 > m-2 > m-3 > m-4 > m-5 > m-6 > m-7 > m-8 > m-9 > m-10 > m-11 > m-12 > m-13 > m-14 > m-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > i-15 > > Is there an option to detect the second motif in the same run just like > the original HMMER? Or am I missing some > option that is not described in the tutorial. 
> > Thanks in advance > > E.J.Blom > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From koeberle at mpiib-berlin.mpg.de Fri May 25 14:03:02 2007 From: koeberle at mpiib-berlin.mpg.de (Christian Köberle) Date: Fri, 25 May 2007 16:03:02 +0200 Subject: [Biojava-l] KEGG Interface Message-ID: <4656EC96.4090105@mpiib-berlin.mpg.de> Hi, is there an implementation of an interface for the KEGG database in BioJava? And if it exists, how does it work? Best regards, Christian -- Christian Köberle Max Planck Institute for Infection Biology Department of Immunology Campus Charité Mitte Charitéplatz 1 10117 Berlin Tel: +49 30 28 460 562 From markjschreiber at gmail.com Fri May 25 15:41:21 2007 From: markjschreiber at gmail.com (Mark Schreiber) Date: Fri, 25 May 2007 23:41:21 +0800 Subject: [Biojava-l] KEGG Interface In-Reply-To: <4656EC96.4090105@mpiib-berlin.mpg.de> References: <4656EC96.4090105@mpiib-berlin.mpg.de> Message-ID: <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com> Hi - There is not one yet. It would be very useful to have one though if anyone is interested in developing one. A generic pathway model would also be great. - Mark On 5/25/07, Christian Köberle wrote: > Hi, > > is there an implementation of an interface for the KEGG database in BioJava? > And if it exists, how does it work? > > Best regards, > Christian > > -- > Christian Köberle > > Max Planck Institute for Infection Biology > Department of Immunology > > Campus Charité
Mitte > Charitéplatz 1 > > 10117 Berlin > > Tel: +49 30 28 460 562 > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From walsh at andrew.cmu.edu Fri May 25 19:05:35 2007 From: walsh at andrew.cmu.edu (Andrew Walsh) Date: Fri, 25 May 2007 15:05:35 -0400 Subject: [Biojava-l] Secondary Structure data from PDB file Message-ID: <4657337F.6060203@andrew.cmu.edu> Do I observe correctly that org.biojava.bio.structure.io.PDBFileParser does not currently handle secondary structure annotation (e.g. HELIX, SHEET, or TURN records), and thus no secondary structure information is ever added to an AminoAcid? If this is the case, is there any documentation on how such data is supposed to be added to the Map that represents secondary structure data (i.e. what key/value pairs are expected for common secondary structure features)? Thanks, Andrew Walsh From ap3 at sanger.ac.uk Mon May 28 22:43:45 2007 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 28 May 2007 23:43:45 +0100 Subject: [Biojava-l] Secondary Structure data from PDB file In-Reply-To: <4657337F.6060203@andrew.cmu.edu> References: <4657337F.6060203@andrew.cmu.edu> Message-ID: Hi Andrew, > Do I observe correctly that org.biojava.bio.structure.io.PDBFileParser > does not currently handle secondary structure annotation (e.g. HELIX, > SHEET, or TURN records), The parser is fine, but this has been a missing feature so far. To add support for this I just committed a patch to the biojava CVS. The parser now (optionally) can attach the secondary structure assignment that has been provided by the author of the PDB file to an amino acid.
See also http://biojava.org/wiki/BioJava:CookBook:PDB:read Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road London NW12BE. From koeberle at mpiib-berlin.mpg.de Wed May 30 12:20:21 2007 From: koeberle at mpiib-berlin.mpg.de (Christian Köberle) Date: Wed, 30 May 2007 14:20:21 +0200 Subject: [Biojava-l] KEGG Interface In-Reply-To: <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com> References: <4656EC96.4090105@mpiib-berlin.mpg.de> <93b45ca50705250841q7eccdc21vbfc4086c3a429482@mail.gmail.com> Message-ID: <465D6C05.90504@mpiib-berlin.mpg.de> Hi, there is a Java API for KEGG at http://www.genome.jp/kegg/soap/. It is not complete, but it includes a lot of functions for getting information from KEGG. Christian Mark Schreiber wrote: > Hi - > > There is not one yet. It would be very useful to have one though if > anyone is interested in developing one. A generic pathway model would > also be great. > > - Mark > > On 5/25/07, Christian Köberle wrote: >> Hi, >> >> is there a implementation of a interface for KEGG database in BIO-JAVA? >> And if it exist how dos it work? >> >> Best regards, >> Christian >> >> -- >> Christian Köberle >> >> Max Planck Institute for Infection Biology >> Department of Immunology >> >> Campus Charité Mitte >> Charitéplatz 1 >> >> 10117 Berlin >> >> Tel: +49 30 28 460 562 >> >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > -- Christian Köberle Max Planck Institute for Infection Biology Department of Immunology Campus Charité
Mitte Charitéplatz 1 10117 Berlin Tel: +49 30 28 460 562 From rejarohit2004 at gmail.com Thu May 31 14:26:51 2007 From: rejarohit2004 at gmail.com (rohit reja) Date: Thu, 31 May 2007 19:56:51 +0530 Subject: [Biojava-l] Query Message-ID: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com> Hello all, I am working on HMMER which runs on a command line interface(CLI). Now i want to create and integrate a GUI to execute the commands on the CLI . How can we do this using java .? Please reply ASAP. -- Rohit Reja 3rd -B.tech-Bioinformatics VIT University Vellore From tmo at ebi.ac.uk Thu May 31 15:10:58 2007 From: tmo at ebi.ac.uk (Tom Oinn) Date: Thu, 31 May 2007 16:10:58 +0100 Subject: [Biojava-l] Query In-Reply-To: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com> References: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com> Message-ID: <465EE582.2000908@ebi.ac.uk> rohit reja wrote: > Hello all, > > I am working on HMMER which runs on a command line interface(CLI). > Now i want to create and integrate a GUI to execute the commands on the CLI > . > How can we do this using java .? This isn't really a BioJava question (this list is for discussion of the BioJava project and its usage) but...
You can use the system Runtime class in Java to exec external tools, your application will have to construct the command line then create a Process object from this command line which has streams from which you can manipulate stdin and stdout for the process. Google for 'Java Runtime exec' for plenty more information. Hint - the process will block unless you consume both stdout and stderr streams, blocking when the buffer for those streams (which is OS and Java version dependant) is full. If you get odd behaviour with the application sometimes hanging this is probably why! Cheers, Tom > Please reply ASAP. ps - don't ask people to reply ASAP, it just annoys people! The assumption if you ask a question on a mailing list is that you'd like an answer :) From ola.spjuth at farmbio.uu.se Thu May 31 15:26:49 2007 From: ola.spjuth at farmbio.uu.se (Ola Spjuth) Date: Thu, 31 May 2007 17:26:49 +0200 Subject: [Biojava-l] Query In-Reply-To: <465EE582.2000908@ebi.ac.uk> References: <29c042ff0705310726y76e1f5cfn78aebc4392652ec@mail.gmail.com> <465EE582.2000908@ebi.ac.uk> Message-ID: Hi, Look into the Bioclipse project (http://www.bioclipse.net). It is a graphical workbench that has integrated other tools such as ClustalW and Blast. Duplicating such an integration for HMMER seems like a straightforward task. Somewhat limited CLI is even done at runtime in the Bioclipse preferences, but it could very well be extended. Cheers, .../Ola Spjuth On May 31, 2007, at 17:10 , Tom Oinn wrote: > rohit reja wrote: >> Hello all, >> >> I am working on HMMER which runs on a command line interface(CLI). >> Now i want to create and integrate a GUI to execute the commands >> on the CLI >> . >> How can we do this using java .? > > This isn't really a BioJava question (this list is for discussion > of the > BioJava project and its usage) but... 
> > You can use the system Runtime class in Java to exec external tools, > your application will have to construct the command line then create a > Process object from this command line which has streams from which you > can manipulate stdin and stdout for the process. Google for 'Java > Runtime exec' for plenty more information. > > Hint - the process will block unless you consume both stdout and > stderr > streams, blocking when the buffer for those streams (which is OS and > Java version dependant) is full. If you get odd behaviour with the > application sometimes hanging this is probably why! > > Cheers, > > Tom > >> Please reply ASAP. > > ps - don't ask people to reply ASAP, it just annoys people! The > assumption if you ask a question on a mailing list is that you'd > like an > answer :) > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l
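Tom's hint about consuming both stdout and stderr can be sketched in plain Java roughly as follows. This is only a minimal sketch under stated assumptions, not BioJava code and not anything posted in the thread: the names ExternalTool, StreamGobbler and Result are invented for illustration, and the command run in main (the JVM's own "java -version") is a stand-in for whatever HMMER command line the GUI would build. Each stream is drained on its own thread so that the child process cannot stall on a full pipe buffer.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class ExternalTool {

    /** Hypothetical helper: reads one stream to the end on its own thread. */
    static final class StreamGobbler extends Thread {
        private final InputStream in;
        private final StringBuilder text = new StringBuilder();

        StreamGobbler(InputStream in) {
            this.in = in;
        }

        @Override
        public void run() {
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    text.append(line).append('\n');
                }
            } catch (IOException e) {
                text.append("[stream read failed: ").append(e.getMessage()).append("]\n");
            }
        }

        /** Call only after join(), which makes the collected text safely visible. */
        String text() {
            return text.toString();
        }
    }

    /** Hypothetical holder for the exit code and everything the process wrote. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs the command and drains stdout and stderr concurrently. */
    static Result run(String... command) throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec(command);
        process.getOutputStream().close(); // nothing is sent to the child's stdin here

        StreamGobbler out = new StreamGobbler(process.getInputStream());
        StreamGobbler err = new StreamGobbler(process.getErrorStream());
        out.start();
        err.start();

        int exitCode = process.waitFor();
        out.join(); // wait until both readers have seen end of stream
        err.join();
        return new Result(exitCode, out.text(), err.text());
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command: the JVM that is running this class, asked for its version.
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        Result result = run(java, "-version");
        System.out.println("exit code: " + result.exitCode);
        System.out.print(result.stdout);
        System.out.print(result.stderr);
    }
}
```

In a Swing GUI, run(...) would normally be called from a background thread rather than the event dispatch thread, so the window stays responsive while the external tool is working.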