From juberpatel at gmail.com Mon Nov 10 08:54:43 2008
From: juberpatel at gmail.com (juber patel)
Date: Mon, 10 Nov 2008 19:24:43 +0530
Subject: [Biojava-dev] want to contribute...
Message-ID:

Hello people,

I have been lurking on this list for a long time, hoping that I would be able to contribute soon. Now that version 3 has been started from scratch, I think this is the best time to get involved as a programmer. It may not be easy given my day job, but please give me some ideas. Is there a web page or document about BioJava 3?

--
Juber Patel
http://juberpatel.googlepages.com

From holland at eaglegenomics.com Mon Nov 10 09:02:19 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Mon, 10 Nov 2008 14:02:19 +0000
Subject: [Biojava-dev] want to contribute...
In-Reply-To:
References:
Message-ID:

Thanks for joining in! The general idea is here:

Right now we could do with an implementation of a decent Blast parser, one that parses the entire Blast XML format into an object model reflecting that format.

cheers,
Richard

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From Ekta.Jain at icr.ac.uk Tue Nov 11 14:02:55 2008
From: Ekta.Jain at icr.ac.uk (Ekta Jain)
Date: Tue, 11 Nov 2008 19:02:55 +0000
Subject: [Biojava-dev] Does BioJava have a PID parser?
Message-ID:

Hello all,

I am wondering whether BioJava has a PID XML parser. The PID format is mainly used for pathway data, such as that provided by NCI.

Many thanks,
Ekta

The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.

This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network.

From holland at eaglegenomics.com Tue Nov 11 14:40:31 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Tue, 11 Nov 2008 19:40:31 +0000
Subject: [Biojava-dev] Does BioJava have a PID parser?
In-Reply-To:
References:
Message-ID:

Not that I'm aware of, unless it's very well hidden!

If you end up writing one for yourself, would you consider contributing it back to BioJava for others to use? Such additions are always very welcome and appreciated.

cheers,
Richard
--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From Ekta.Jain at icr.ac.uk Tue Nov 11 15:29:07 2008
From: Ekta.Jain at icr.ac.uk (Ekta Jain)
Date: Tue, 11 Nov 2008 20:29:07 +0000
Subject: [Biojava-dev] Does BioJava have a PID parser?
Message-ID:

Absolutely :). I will get back to you to find out how the code can be added to BioJava once it's ready.

Many thanks,
Ekta

From felipe.albrecht at gmail.com Sat Nov 15 13:02:11 2008
From: felipe.albrecht at gmail.com (Felipe Albrecht)
Date: Sat, 15 Nov 2008 16:02:11 -0200
Subject: [Biojava-dev] Does biojava can calculate evalue ?
In-Reply-To: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com>
References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com>
Message-ID:

Hello,

I have Java source code that calculates the E-value, H, K and lambda. I have only tested it with nucleotides, and the score matrix should use 1 or -1 as its scores.

If it is of interest to anyone, I can send it.

Thank you,
Felipe Albrecht

On Sat, Aug 30, 2008 at 12:48 PM, simpleyrx wrote:
> Dear experts,
>
> I am developing a sequence search program. My program can now calculate the
> score value, and I want to provide an expectation value (like the BLAST
> E-value) to the user. I do not know how to do this step. Can BioJava do it?
> Thank you in advance.
>
> --
> Student

From simpleyrx at 163.com Sat Nov 15 19:22:23 2008
From: simpleyrx at 163.com (simpleyrx)
Date: Sun, 16 Nov 2008 08:22:23 +0800 (CST)
Subject: [Biojava-dev] Reply:Re: Does biojava can calculate evalue ?
In-Reply-To:
References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com>
Message-ID: <13860930.332041226794943111.JavaMail.coremail@app143.163.com>

I am very interested in the code. Could you send me a copy? Thank you.

--
Renxiang Yan
Ph.D. Candidate Student (2007)
E-mail: simpleyrx at 163.com
Mobile phone: +86-13811458000
Tel. +86-(0)10-62734412
     +86-(0)10-80973092
Address: College of Biological Sciences, China Agricultural University,
No.2, Yuanmingyuan West Rd., Haidian district, 100094 Beijing, China

From felipe.albrecht at gmail.com Sat Nov 15 19:57:56 2008
From: felipe.albrecht at gmail.com (Felipe Albrecht)
Date: Sat, 15 Nov 2008 22:57:56 -0200
Subject: [Biojava-dev] Reply:Re: Does biojava can calculate evalue ?
In-Reply-To: <13860930.332041226794943111.JavaMail.coremail@app143.163.com>
References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> <13860930.332041226794943111.JavaMail.coremail@app143.163.com>
Message-ID:

Hello,

I put the source at http://www.pih.bio.br/src/Statistics.java

If you modify this source, please send me the updated version so that I can use it.

Felipe Albrecht
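
For readers following the E-value question, here is a minimal sketch of the Karlin-Altschul calculation for ungapped alignments, E = K * m * n * exp(-lambda * S). It is not the code from Statistics.java, which is not reproduced in this thread. The class and method names are hypothetical, equal base frequencies (0.25 each) are an assumption, and K is left as a caller-supplied parameter because estimating it is more involved than solving for lambda.

```java
/**
 * Sketch of Karlin-Altschul statistics for ungapped nucleotide alignments
 * scored with a simple match/mismatch scheme. All names are hypothetical.
 */
final class KarlinAltschulSketch {

    // Assumption: the four bases are equally likely, so two random bases
    // match with probability 4 * 0.25 * 0.25 = 0.25.
    private static final double P_MATCH = 0.25;
    private static final double P_MISMATCH = 0.75;

    /** f(lambda) = P_MATCH * e^(lambda*match) + P_MISMATCH * e^(lambda*mismatch) - 1. */
    private static double f(double lambda, int match, int mismatch) {
        return P_MATCH * Math.exp(lambda * match)
             + P_MISMATCH * Math.exp(lambda * mismatch) - 1.0;
    }

    /** Finds the positive root of f by bisection. */
    static double solveLambda(int match, int mismatch) {
        if (match <= 0 || mismatch >= 0
                || P_MATCH * match + P_MISMATCH * mismatch >= 0) {
            throw new IllegalArgumentException(
                "need match > 0, mismatch < 0 and a negative expected score");
        }
        double lo = 1e-9;   // f is negative just above zero for such scores
        double hi = 1.0;
        while (f(hi, match, mismatch) < 0) {
            hi *= 2;        // grow the bracket until f changes sign
        }
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            if (f(mid, match, mismatch) < 0) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    /** E = K * m * n * exp(-lambda * S); K must be supplied by the caller. */
    static double eValue(double rawScore, double k, long queryLen, long dbLen,
                         double lambda) {
        return k * queryLen * dbLen * Math.exp(-lambda * rawScore);
    }

    public static void main(String[] args) {
        double lambda = solveLambda(1, -1);
        double k = 0.1; // placeholder only, not an estimated value
        System.out.println("lambda = " + lambda);
        System.out.println("E = " + eValue(30, k, 500, 1_000_000, lambda));
    }
}
```

With +1/-1 scoring and equal base frequencies, the equation 0.25 * x + 0.75 / x = 1 (where x = exp(lambda)) has the positive solution x = 3, so lambda is ln 3. Other scoring schemes or base compositions would need the probabilities adjusted accordingly.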
From holland at eaglegenomics.com Tue Nov 18 22:36:27 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Wed, 19 Nov 2008 03:36:27 +0000
Subject: [Biojava-dev] BioJava 3 code usage examples
Message-ID:

I've posted some brief HOWTOs here on how to use the new code as it progresses:

http://biojava.org/wiki/BioJava3:HowTo

Hopefully the examples will make the new approach a bit clearer. The FASTA parser example does need simplifying even further, through a standard file parser utility class to go in the core module, which has yet to be written. But the examples on the page are a start and show you what the convenience methods would have to implement internally (hint, hint!).

cheers,
Richard

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From me at hongyu.org Wed Nov 19 00:20:07 2008
From: me at hongyu.org (Hongyu Zhang)
Date: Tue, 18 Nov 2008 21:20:07 -0800 (PST)
Subject: [Biojava-dev] BioJava 3 code usage examples
References:
Message-ID: <935207.30877.qm@web51405.mail.re2.yahoo.com>

Hi Richard,

Thanks for your great work! I noticed from your examples that you decided to continue to use the Symbol object-based model to represent sequences, even though the BioJava3 design page ( http://biojava.org/wiki/BioJava3_Design ) says:

"Sequences are perfectly happy as Strings unless you want to do complex things like store base quality information, and only at that point should you want to convert them into more complex object models."

The original BioJava tutorial ( http://biojava.org/wiki/BioJava:Tutorial:Symbols_and_SymbolLists#Doesn.27t_this_all_waste_memory.3F ) discussed the difference in memory use between the Symbol object-based and String-based sequence representations, but it didn't address the speed issue.
One of the advantages of the Java String library is that it was optimized using native machine code, so I think a Symbol object-based sequence representation would be slower than a String-based one for certain operations such as substring search.

Let me know if I missed something. Thanks!

Best,

Hongyu Zhang, Ph.D.
Ceres Inc., Thousand Oaks, CA
Cell: 805-405-5394
Fax: 866-447-8750

From holland at eaglegenomics.com Wed Nov 19 07:00:17 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Wed, 19 Nov 2008 12:00:17 +0000
Subject: [Biojava-dev] BioJava 3 code usage examples
In-Reply-To: <935207.30877.qm@web51405.mail.re2.yahoo.com>
References: <935207.30877.qm@web51405.mail.re2.yahoo.com>
Message-ID:

Hello. Thanks for your feedback.

You are right that we've continued to provide a Symbol-based alphabet/symbol structure, but it is no longer a central concept, nor are you required to use it. You'll notice that when FASTA is read using the new parser, it reads the sequence from the FASTA file as a simple String (actually, a CharSequence). If you want to work with it as a String/CharSequence and don't want to convert it into Symbols/Lists, you can do so. This is the big change from the existing BioJava way of doing things, which automatically converts everything into the BioJava object model instead of giving the user the choice of what to do with it. This change is consistent with the part of the design document you quote in your email.

So, this gives users the choice of whether they want to work with the sequences directly as Strings/CharSequences, or whether they want to convert them into Symbols/Lists. Users can then tailor their choice depending on locally observed speed/memory usage issues should they so wish.

cheers,
Richard

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From me at hongyu.org Thu Nov 20 17:26:07 2008
From: me at hongyu.org (Hongyu Zhang)
Date: Thu, 20 Nov 2008 14:26:07 -0800 (PST)
Subject: [Biojava-dev] BioJava 3 code usage examples
Message-ID: <644724.10477.qm@web51411.mail.re2.yahoo.com>

Hi Richard,

I spent some time reading the code today. I found that you have packaged the biojava3 modules in a different style from the old version. I guess that some of the reasons are related to the new design philosophy and some are related to the Maven software (I am new to Maven).
The things that are not clear to me are:

1) It doesn't seem that you want to avoid name conflicts with the old version, because you are continuing to use the package name "org.biojava.*" instead of "org.biojava3.*".

2) The old biojava version arranges sequence-related classes in a hierarchical fashion, while in the new version you put the FASTA parsing classes directly under a first-level node "org.biojava.fasta" rather than under "org.biojava.seq" as before. There are tens of popular file formats in the bioinformatics world, so will all of them crowd the first-level nodes under the root package?

3) The source files are in much deeper paths now. For example, for the FASTA parser the path is "src/main/java/org/biojava/fasta", as opposed to the common style "src/org/biojava/fasta", so I am wondering why it is necessary to add "main/java" in the middle of the path.

4) It is interesting to see that you keep the source code of each sub-package separately, so whenever I need to browse the code of some related classes in Windows Explorer or a Unix shell, I need to go up and down by clicking or typing many more times. The NetBeans IDE alleviates this problem a little. I understand the idea of separating independent packages in the new design, but I am wondering whether the current very fine separation of classes goes too far.

I am not familiar with the new design, so forgive my ignorance. Thanks for your time.

Hongyu Zhang, Ph.D.
Ceres Inc., Thousand Oaks, CA
Cell: 805-405-5394
Fax: 866-447-8750

________________________________
From: Hongyu Zhang
To: holland at eaglegenomics.com
Sent: Wednesday, November 19, 2008 10:55:06 AM
Subject: Re: [Biojava-dev] BioJava 3 code usage examples

Thanks for the quick response, Richard. I will dive deeper into your code.

Best,

Hongyu Zhang, Ph.D.
Ceres Inc., Thousand Oaks, CA
Cell: 805-405-5394
Fax: 866-447-8750

From holland at eaglegenomics.com Thu Nov 20 20:02:37 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Thu, 20 Nov 2008 20:02:37 -0500
Subject: [Biojava-dev] BioJava 3 code usage examples
In-Reply-To: <644724.10477.qm@web51411.mail.re2.yahoo.com>
References: <644724.10477.qm@web51411.mail.re2.yahoo.com>
Message-ID:

Hello. Thanks for spending the time looking into the code. It's a bit like a virtual code review session for me... you've made some very useful comments.

> 1) It doesn't seem that you want to avoid name conflicts with the old
> version because you are continuing using the package name "org.biojava.*"
> instead of "org.biojava3.*"

Yes, I agree, I think they should be changed to org.biojava3. I will do this when I get 5 minutes spare.
They should be in org.biojava3 because one of the stated goals was to be able to write a biojava-biojava3 mapper module if someone needed that to happen, and the use of org.biojava in the new code would preclude that.

> 2) The old biojava version arranges sequence related classes in a
> hierarchical fashion, while in the new version you put the FASTA parsing
> classes directly under a first level node "org.biojava.fasta" rather than
> under the "org.biojava.seq" as before. There are tens of popular file
> formats in the bioinformatics world, so will all of them crowd the first
> level nodes under the root package?

That's also a good suggestion. I'll move it. It makes no difference to the physical structure of the project on disk, but it would make class and package browsing easier to do, especially in JavaDocs.

> 3) The source files are now in much deeper paths now, for example for the
> FASTA parser, the path is "src/main/java/org/biojava/fasta", as opposed to
> the common style "src/org/biojava/fasta", so I am wondering why it is
> necessary to add "main/java" in the middle of the path.

This is because of Maven. The src folder contains both source code and test code, and within each you can have multiple programming languages. The default behaviour is to store source code in a main folder and test code in a test folder, and under each to have a subfolder for the programming language. That seems sensible to me, so I went with it. It also allows the inclusion of resource folders at the same level as the java folder, the contents of which automatically get built into the resulting jars as top-level classpath elements.

> 4) It is interesting to see that you put the source codes of all the
> sub-packages separately, so whenever I need to browse the codes of some
> related classes in Windows explorer or Unix shell, I really need to go up
> and down by clicking or typing many more times. NetBeans IDE alleviated this
> problem a little bit.
> I understand the idea of separating independent
> packages in the new design, but I am wondering whether the current very fine
> separation of classes went too far.

I did this for clarity's sake as to which source code relates to which module. You can see this really easily if they're separated, whereas if they were all in one tree, it would be hard to know which module a particular class ended up in. Also, if you have one source tree, you have to remember to update the Maven configs to split different bits of it into different jars. By simply keeping them separate, this happens automatically.

Thanks again for your feedback!

cheers,
Richard

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From mark.schreiber at novartis.com Thu Nov 20 22:40:53 2008
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 21 Nov 2008 11:40:53 +0800
Subject: [Biojava-dev] BioJava 3 code usage examples
In-Reply-To:
Message-ID:

biojava-dev-bounces at lists.open-bio.org wrote on 11/21/2008 09:02:37 AM:

> Hello. Thanks for spending the time looking into the code. It's a bit
> like a virtual code review session for me... you've made some very
> useful comments.
>
> > 1) It doesn't seem that you want to avoid name conflicts with the old
> > version because you are continuing using the package name "org.biojava.*"
> > instead of "org.biojava3.*"
>
> Yes, I agree, I think they should be changed to org.biojava3. I will
> do this when I get 5 minutes spare. They should be in org.biojava3
> because one of the stated goals was to be able to write a
> biojava-biojava3 mapper module if someone needed that to happen, and
> the use of org.biojava in the new code would preclude that.

I think it would be a very good idea to differentiate this from old code via the packages. Strictly speaking you are supposed to use a domain that you own.
I don't know if biojava3.org is available or if open-bio wants to obtain it. It's not critical but it would prevent possible issues.

Alternatives would be to use org.biojava.v3.*

or you could use the open-bio domain as in org.open-bio.biojava3

- Mark

_________________________

CONFIDENTIALITY NOTICE

The information contained in this e-mail message is intended only for the exclusive use of the individual or entity named above and may contain information that is privileged, confidential or exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivery of the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by e-mail and delete the material from any computer. Thank you.

From holland at eaglegenomics.com Thu Nov 20 22:57:24 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Thu, 20 Nov 2008 22:57:24 -0500
Subject: [Biojava-dev] BioJava 3 code usage examples
In-Reply-To:
References:
Message-ID:

I like the domain registration idea better. I'll find out what the folks at open-bio think.

cheers,
Richard
> -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From bugzilla-daemon at portal.open-bio.org Fri Nov 21 05:48:59 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:48:59 -0500 Subject: [Biojava-dev] [Bug 2679] New: NEXUS parse fails on extraneous comma (TREES/TRANSLATE) Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2679 Summary: NEXUS parse fails on extraneous comma (TREES/TRANSLATE) Product: BioJava Version: live (CVS source) Platform: PC OS/Version: Windows XP Status: NEW Severity: major Priority: P1 Component: bio AssignedTo: biojava-dev at biojava.org ReportedBy: keesey at gmail.com If the last item in a TRANSLATE command of a TREES block ends with a comma, the parser fails with the following exception: Error: org.biojava.bio.seq.io.ParseException: Found unexpected token = in TREES block A properly formatted NEXUS file should not have a comma here, but there are many files which do. The parser should see the semicolon and assume the command has ended. Example TREES block (from TreeBASE, accession M474): BEGIN TREES; [! 1 trees.
TreeBASE accession#: Tree1118 ] TRANSLATE 1 'Nyctereutes_procyonoides', 2 'Urocyon_cinereoargenteus', 3 'Pseudalopex_gymnocercus', 4 'Chrysocyon_brachyurus', 5 'Pseudalopex_sechurae', 6 'Pseudalopex_culpaeus', 7 'Pseudalopex_vetulus', 8 'Atelocynus_microtis', 9 'Pseudalopex_griseus', 10 'Dusicyon_australis', 11 'Urocyon_littoralis', 12 'Vulpes_bengalensis', 13 'Speothos_venaticus', 14 'Otocyon_megalotis', 15 'Vulpes_ferrilata', 16 'Vulpes_rueppelli', 17 'Canis_mesomelas', 18 'Cerdocyon_thous', 19 'Alopex_lagopus', 20 'Vulpes_pallida', 21 'Canis_simensis', 22 'Vulpes_corsac', 23 'Canis_latrans', 24 'Lycaon_pictus', 25 'Canis_adustus', 26 'Vulpes_vulpes', 27 'Vulpes_chama', 28 'Vulpes_velox', 29 'Vulpes_zerda', 30 'Canis_aureus', 31 'Cuon_alpinus', 32 'Pseudalopex', 33 'Canis_rufus', 34 'Vulpes_cana', 35 'Canis_lupus', 36 'Canidae', 37 'Canis', ; TREE 'Fig._9' = [&R] (((((((19,28),((22,15),(16,26))),(34,29)),(12,27,20)),(2,11)),14),((8,(((25,30),17),(23,(35,33),21)),18,4,31,(10,((6,9,3,5),7)),24,13),1)); END; -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Nov 21 05:49:45 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:49:45 -0500 Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE) In-Reply-To: Message-ID: <200811211049.mALAnjes020244@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2679 ------- Comment #1 from keesey at gmail.com 2008-11-21 05:49 EST ------- The next comment is the full NEXUS file from the example. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Fri Nov 21 05:50:10 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:50:10 -0500 Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE) In-Reply-To: Message-ID: <200811211050.mALAoAwx020305@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2679 ------- Comment #2 from keesey at gmail.com 2008-11-21 05:50 EST ------- #NEXUS [File created by TreeBASE: 1/29/99 19:24:10] [Matrix accession#: M474] BEGIN DATA; DIMENSIONS NTAX=34 NCHAR=223; [!This data set was downloaded from TreeBASE, a prototype relational database of phylogenetic knowledge. TreeBASE has been supported by the NSF, Harvard University, and UC Davis. Please do not remove this acknowledgment from the Nexus file. TreeBASE ?? 1994-1999. Study reference: Bininda-Emonds, O. R. P., J. L. Gittleman, and A. Purvis. 1999. Building larger phylogenies using supertrees: a complete phylogeny of the extant Carnivora (Mammalia). Biological Reviews, in press. Study accession number = S355 Matrix accession number = M474 ] FORMAT MISSING = ? GAP = - INTERLEAVE ; MATRIX 'Vulpes_zerda' 1?000111000????0000?1100000000?1001???????001100000?????10000000000000111000??????1000000????????111 'Vulpes_vulpes' 1?110000000?00??????1100000000?1001???????110000000?1???10000000000000110101??????1100000???0000?110 'Vulpes_velox' 1?111000000????0000?1100000000?1001???????110000000?????10000000000000110110??????1100000????????110 'Vulpes_rueppelli' 1???????????????????1100000000?1001???????110000000?????10000000000000110101??????1100000????????110 'Vulpes_pallida' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 'Vulpes_ferrilata' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 
'Vulpes_corsac' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000????????110 'Vulpes_chama' 1???????????????????1100000000?1001???????110000000?????10000000000000110000??????1100000????????100 'Vulpes_cana' 1???????????????????1100000000?1001???????110000000?????10000000000000111000??????1100000????????111 'Vulpes_bengalensis' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 'Urocyon_littoralis' 1??????????????????????????????1001???????001110000???????????????????????????????1010000??????????? 'Urocyon_cinereoargenteus' 1?000110000?????????1010000000?1001?0?000?001110000?????00000000000000000000??????1010000?0?1000?000 'Speothos_venaticus' 1?000100111?????????0001100110?0000?????????????????????11100011100000000000?1100?0001000??????????? 'Pseudalopex_vetulus' 1???????????????????0001111000?1000???????001001010??????????????????????????0000?0001000??????????? 'Pseudalopex_sechurae' 1???????????????????0001111000?1010???????001001010??????????????????????????0010?0001101??????????? 'Pseudalopex_gymnocercus' 1???????????????????0001111000?1010???????001001010?????11111000000000000000?0010?0001101??????????? 'Pseudalopex_griseus' 1???????????????????0001111000?1010???????001001010?????11111100000000000000?0010?0001101??????????? 'Pseudalopex_culpaeus' 1???????????????????0001111000?1010???????001001010?????11111100000000000000?0011?0001101??????????? 'Otocyon_megalotis' 1?000111000?????????1010000000?0000???????100000000?????11000000000000000000??????1000000??????????? 'Nyctereutes_procyonoides' 1?100000000?????????0001100101?0000???????100000000?1???10000000000000100000???????????????????????? 'Lycaon_pictus' 1??????????????1000????????????0000???000???????????????11100011100000000000??????0001000???1110???? 'Dusicyon_australis' 1???????????????????0001110000?1000???????001001010??????????????????????????0011?0001100??????????? 
'Cuon_alpinus' 1??????????????????????????????0000?????????????????????11100010010000000000??????0001000??????????? 'Chrysocyon_brachyurus' 1?000100110?????????0001000000?0000???????001001100?????11100011000000000000?0000?0001000??????????? 'Cerdocyon_thous' 1?000100111?????????0001100101?1000?????????????????????11110000000000000000?1000?0001000??????????? 'Canis_simensis' 1??????????????????????????????1100???111?001000001???0?11100010011011000000??????0001110??????????? 'Canis_rufus' 1??????????????????????????????1100???????001000001???0???????????????????????????0001110??????????? 'Canis_mesomelas' 1???????????10?1100????????????1100???100?001000001???0?11100010010000000000??????0001110??????????? 'Canis_lupus' 1?000100100?11?1111????????????1100?1?111?001000001?0?1?11100010011010000000??????0001110?1?1111???? 'Canis_latrans' 1???????????11?1111????????????1100?1?111?001000001???1?11100010011011000000??????0001110?1?1111???? 'Canis_aureus' 1??????????????1110????????????1100???110?001000001???0?11100010011100000000??????0001110???1100???? 'Canis_adustus' 1??????????????1000????????????1100???000?001000001???0?11100010011100000000??????0001110??????????? 'Atelocynus_microtis' 1???????????????????0001100110?1000???????001001100??????????????????????????1100?0001000??????????? 
'Alopex_lagopus' 1?111000000????????????????????1000???????110000000?????10000000000000110110??????1000000????????110 'Vulpes_zerda' 0000?11111000?11110000?00000010??????????10000?01???????????????????100000?????0001?1000001010000000 'Vulpes_vulpes' 1011?11110100?11101101?00000010????????????????01?0000110???00001???100010?????0001?1000001000000000 'Vulpes_velox' 1100?11110111?11101110?00000011????????????????01???????????00001???100010?????0001?1000001000000000 'Vulpes_rueppelli' 1011?11110110?11101101?????????????????????????01???????????00001???100010?????0001?1000001010100000 'Vulpes_pallida' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001011000000 'Vulpes_ferrilata' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001100000000 'Vulpes_corsac' 1010?11110100?11101100?????????????????????????01???????????00001???100010?????0001?1000001100000000 'Vulpes_chama' 0000?11100000?11101000?00000010????????????????01???????????00001???100010?????0001?1000001011000000 'Vulpes_cana' 0000?11111000?11110000?????????????????????????01???????????00001???100010?????0001?1000001010100000 'Vulpes_bengalensis' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001011000000 'Urocyon_littoralis' ?????????????????????????????????????????11000?01???????????????????100100?????0010?1000001000010000 'Urocyon_cinereoargenteus' 0000?10000000?10000000?00000000??????????11000?01???????????????????100100?????0010?1000001000010000 'Speothos_venaticus' ???????????????????????11001000??????????00100?10?0000000???????????000001?????0000?0000000000000000 'Pseudalopex_vetulus' ???????????????????????11110000??????????00101?01???????????????????101000?????0100?1000000000001101 'Pseudalopex_sechurae' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001101 'Pseudalopex_gymnocercus' 
?????????????????????????????????????????00101?01?1100000???????????101000?????0100?1000000000001110 'Pseudalopex_griseus' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001100 'Pseudalopex_culpaeus' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001110 'Otocyon_megalotis' ?????11000000?11000000?00000000????????????????00?0000001???????????100000?????0000?0000000000000000 'Nyctereutes_procyonoides' ???????????????????????00000000????????????????01?0000001???????????100000?????0000?0000000000000000 'Lycaon_pictus' ???????????????????????11001100??????????00100?10?0000000???????????000001?????0000?0000000000000000 'Dusicyon_australis' ?????????????????????????????????????????00101?01???????????????????100000?????0000?1000000000001000 'Cuon_alpinus' ???????????????????????????????????????????????10???????????????????000001?????0000?0000000000000000 'Chrysocyon_brachyurus' ???????????????????????11100000??????????00100?01?0000100???????????100000?????0000?0000000000000000 'Cerdocyon_thous' ???????????????????????11110000????????????????01???????????????????100000?????0000?1000000000001000 'Canis_simensis' ?????????????????????????????????????????00110?01???????????????????110000?????1000?1110000000000000 'Canis_rufus' ??????????????????????????????????100?10?00110?01?????????1?11100?1?110000???1?1000?1100110000000000 'Canis_mesomelas' ???????????????????????11001000?1?011?11?00110?01?1110000???11010???110000?0???1000?1111000000000000 'Canis_lupus' ?????00000000?00000000?11001100?1?100????00110?01?1001000?1?11100?1?110000???1?1000?1100110000000000 'Canis_latrans' ???????????????????????11001100???010?00?00110?01?1001000???10000?1?110000???1?1000?1100100000000000 'Canis_aureus' ??????????????????????????????????011?11?00110?01???????????11010?0?110000?1?0?1000?1110000000000000 'Canis_adustus' 
?????????????????????????????????????????00110?01?1110000???11010???110000?1???1000?1111000000000000 'Atelocynus_microtis' ?????????????????????????????????????????00100?01???????????????????100000?????0000?1000000000001000 'Alopex_lagopus' 1100?11111111?11101110?00000011?0??????????????01?0000110???00001???100000?????0000?1000001000000000 'Vulpes_zerda' ?????????11000000000000 'Vulpes_vulpes' ?????????11000000000000 'Vulpes_velox' ?????????11000000000000 'Vulpes_rueppelli' ??????????????????????? 'Vulpes_pallida' ?????????11000000000000 'Vulpes_ferrilata' ????????????????????0?? 'Vulpes_corsac' ?????????11000000000000 'Vulpes_chama' ?????????11000000000000 'Vulpes_cana' ???????????????????0??? 'Vulpes_bengalensis' ?????????11000000000000 'Urocyon_littoralis' ??????????????????????? 'Urocyon_cinereoargenteus' ???0?????10100000000000 'Speothos_venaticus' ?????????00010001101101 'Pseudalopex_vetulus' ?????????00010001101000 'Pseudalopex_sechurae' ?????????00010001000000 'Pseudalopex_gymnocercus' ?????????00010001000000 'Pseudalopex_griseus' ?????????00010001000000 'Pseudalopex_culpaeus' ?????????00010001110000 'Otocyon_megalotis' ?1???????10100000000000 'Nyctereutes_procyonoides' ?????????00010001101110 'Lycaon_pictus' ?0???????00011010000000 'Dusicyon_australis' ?????????00010001110000 'Cuon_alpinus' ?????????00011010000000 'Chrysocyon_brachyurus' ?????????00010001101000 'Cerdocyon_thous' ?????????00010001101110 'Canis_simensis' ?????????00011100000000 'Canis_rufus' ?????1?1?00011100000000 'Canis_mesomelas' ?????????00011100000000 'Canis_lupus' ?1?1?1?1?00011100000000 'Canis_latrans' ???1?1?0?00011100000000 'Canis_aureus' ?????????00011100000000 'Canis_adustus' ?????????00011100000000 'Atelocynus_microtis' ?????????00010001101101 'Alopex_lagopus' ?????????11000000000000 ; END; BEGIN ASSUMPTIONS; OPTIONS DEFTYPE = unord PolyTcount = MINSTEPS ; END; BEGIN TREEBASE; END; BEGIN TREES; [! 1 trees. 
TreeBASE accession#: Tree1118 ] TRANSLATE 1 'Nyctereutes_procyonoides', 2 'Urocyon_cinereoargenteus', 3 'Pseudalopex_gymnocercus', 4 'Chrysocyon_brachyurus', 5 'Pseudalopex_sechurae', 6 'Pseudalopex_culpaeus', 7 'Pseudalopex_vetulus', 8 'Atelocynus_microtis', 9 'Pseudalopex_griseus', 10 'Dusicyon_australis', 11 'Urocyon_littoralis', 12 'Vulpes_bengalensis', 13 'Speothos_venaticus', 14 'Otocyon_megalotis', 15 'Vulpes_ferrilata', 16 'Vulpes_rueppelli', 17 'Canis_mesomelas', 18 'Cerdocyon_thous', 19 'Alopex_lagopus', 20 'Vulpes_pallida', 21 'Canis_simensis', 22 'Vulpes_corsac', 23 'Canis_latrans', 24 'Lycaon_pictus', 25 'Canis_adustus', 26 'Vulpes_vulpes', 27 'Vulpes_chama', 28 'Vulpes_velox', 29 'Vulpes_zerda', 30 'Canis_aureus', 31 'Cuon_alpinus', 32 'Pseudalopex', 33 'Canis_rufus', 34 'Vulpes_cana', 35 'Canis_lupus', 36 'Canidae', 37 'Canis', ; TREE 'Fig._9' = [&R] (((((((19,28),((22,15),(16,26))),(34,29)),(12,27,20)),(2,11)),14),((8,(((25,30),17),(23,(35,33),21)),18,4,31,(10,((6,9,3,5),7)),24,13),1)); END; -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Nov 21 05:51:52 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:51:52 -0500 Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE) In-Reply-To: Message-ID: <200811211051.mALApqKl020508@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2679 ------- Comment #3 from keesey at gmail.com 2008-11-21 05:51 EST ------- The version is 1.6 (there is no place to specify that). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
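The tolerant behaviour requested in Bug 2679 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the BioJava NEXUS parser: the class `TranslateSketch` and the method `parseTranslate` are hypothetical names, and the tokenizing is deliberately simplified (it assumes labels contain no commas or semicolons). The point it shows is that an empty entry left behind by a trailing comma is skipped, and the semicolon alone ends the command.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TranslateSketch {

    /**
     * Parses the text that follows the TRANSLATE keyword, up to and including
     * the terminating semicolon, into a key -> label table.
     * Hypothetical helper for illustration only; not part of BioJava.
     * Simplification: labels are assumed to contain no ',' or ';'.
     */
    static Map<String, String> parseTranslate(String body) {
        int end = body.indexOf(';');
        if (end < 0) {
            throw new IllegalArgumentException("TRANSLATE command is not terminated by ';'");
        }
        Map<String, String> table = new LinkedHashMap<String, String>();
        for (String entry : body.substring(0, end).split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                // An empty entry is what a stray trailing comma leaves behind:
                // tolerate it instead of failing.
                continue;
            }
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed TRANSLATE entry: " + trimmed);
            }
            String label = parts[1].trim();
            // Strip the single quotes that surround quoted labels.
            if (label.length() >= 2 && label.startsWith("'") && label.endsWith("'")) {
                label = label.substring(1, label.length() - 1);
            }
            table.put(parts[0], label);
        }
        return table;
    }

    public static void main(String[] args) {
        // Last three entries of the TreeBASE example, including the extra comma.
        String body = " 35 'Canis_lupus', 36 'Canidae', 37 'Canis', ;";
        System.out.println(parseTranslate(body));
    }
}
```

A real fix would live in the parser's token handling rather than in string splitting, but the same rule applies there: a comma followed directly by the semicolon should close the command.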
From bugzilla-daemon at portal.open-bio.org Tue Nov 25 18:31:57 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Nov 2008 18:31:57 -0500 Subject: [Biojava-dev] [Bug 2687] New: UniProt: Tags in feature continuation lines may be lost. Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2687 Summary: UniProt: Tags in feature continuation lines may be lost. Product: BioJava Version: unspecified Platform: PC OS/Version: Windows XP Status: NEW Severity: major Priority: P2 Component: seq.io AssignedTo: biojava-dev at biojava.org ReportedBy: jan at biochemfusion.com Using BioJava 1.6.1 on Windows XP to read in a UniProt file and write it back out. Main code that does this: BufferedReader br = new BufferedReader(new FileReader(args[0])); SimpleNamespace ns = new SimpleNamespace("biojava"); RichSequenceIterator rsi = RichSequence.IOTools.readUniProt(br, ns); RichSequence rs = rsi.nextRichSequence(); RichSequence.IOTools.writeUniProt(System.out, rs, ns); When reading in this heavily abridged version of FA9_BOVIN from www.uniprot.org it works: ID FA9_BOVIN Reviewed; 416 AA. AC P00741; FT CHAIN 1 416 Coagulation factor IX. FT CARBOHYD 53 53 O-linked (Glc...). FT /FTId=CAR_000008. FT Extra information. SQ SEQUENCE 416 AA; 46785 MW; 34A7DFE916330662 CRC64; YNSGKLEEFV RGNLERECKE EKCSFEEARE VFENTEKTTE FWKQYVDGDQ CESNPCLNGG MCKDDINSYE CWCQAGFEGT NCELDATCSI KNGRCKQFCK RDTDNKVVCS CTDGYRLAED QKSCEPAVPF PCGRVSVSHI SKKLTRAETI FSNTNYENSS EAEIIWDNVT QSNQSFDEFS RVVGGEDAER GQFPWQVLLH GEIAAFCGGS IVNEKWVVTA AHCIKPGVKI TVVAGEHNTE KPEPTEQKRN VIRAIPYHSY NASINKYSHD IALLELDEPL ELNSYVTPIC IADRDYTNIF SKFGYGYVSG WGKVFNRGRS ASILQYLKVP LVDRATCLRS TKFSIYSHMF CAGYHEGGKD SCQGDSGGPH VTEVEGTSFL TGIISWGEEC AMKGKYGIYT KVSRYVNWIK EKTKLT // However, when the extra information has been tagged with a slash it is lost: ID FA9_BOVIN Reviewed; 416 AA. AC P00741; FT CHAIN 1 416 Coagulation factor IX. FT CARBOHYD 53 53 O-linked (Glc...). FT /FTId=CAR_000008. 
FT /NB=Extra information. SQ SEQUENCE 416 AA; 46785 MW; 34A7DFE916330662 CRC64; YNSGKLEEFV RGNLERECKE EKCSFEEARE VFENTEKTTE FWKQYVDGDQ CESNPCLNGG MCKDDINSYE CWCQAGFEGT NCELDATCSI KNGRCKQFCK RDTDNKVVCS CTDGYRLAED QKSCEPAVPF PCGRVSVSHI SKKLTRAETI FSNTNYENSS EAEIIWDNVT QSNQSFDEFS RVVGGEDAER GQFPWQVLLH GEIAAFCGGS IVNEKWVVTA AHCIKPGVKI TVVAGEHNTE KPEPTEQKRN VIRAIPYHSY NASINKYSHD IALLELDEPL ELNSYVTPIC IADRDYTNIF SKFGYGYVSG WGKVFNRGRS ASILQYLKVP LVDRATCLRS TKFSIYSHMF CAGYHEGGKD SCQGDSGGPH VTEVEGTSFL TGIISWGEEC AMKGKYGIYT KVSRYVNWIK EKTKLT // BioPerl 1.5.2 copes with both situations without problems. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Nov 25 18:35:42 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Nov 2008 18:35:42 -0500 Subject: [Biojava-dev] [Bug 2687] UniProt: Tags in feature continuation lines may be lost. In-Reply-To: Message-ID: <200811252335.mAPNZg6D004170@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2687 ------- Comment #1 from jan at biochemfusion.com 2008-11-25 18:35 EST ------- To clarify: In the first case the information in all feature lines is correctly preserved. In the second case the last FT line "/NB=Extra information." is lost. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From holland at eaglegenomics.com Wed Nov 26 11:04:56 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Wed, 26 Nov 2008 16:04:56 +0000 Subject: [Biojava-dev] BugZilla! Message-ID: <492D73A8.4020401@eaglegenomics.com> Hi all, There have been a few new bugs reported on BugZilla recently.
Whilst the development of the new BJ features is fun, it's really important to maintain the existing code and keep on top of the issues reported. If anyone has a few minutes spare, or would like to learn BioJava and doesn't know where to start, then zapping a few virtual creepy crawlies is an excellent way to get involved and make a contribution. Here's the list of the currently outstanding problems: http://bugzilla.open-bio.org/buglist.cgi?product=BioJava&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED If you see one you think you can fix, or might like to try to investigate, go for it! All you have to do is assign the bug to yourself in order to take ownership of it so that people don't end up working on the same thing without knowing. cheers, Richard -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From bugzilla-daemon at portal.open-bio.org Wed Nov 26 12:44:45 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Nov 2008 12:44:45 -0500 Subject: [Biojava-dev] [Bug 2603] StringIndexOutOfBoundsException while parsing blastresult In-Reply-To: Message-ID: <200811261744.mAQHijbY009725@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2603 tbanks at agr.gc.ca changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #7 from tbanks at agr.gc.ca 2008-11-26 12:44 EST ------- Fixed. The change to the blast result format (sometime around v2.2.15) made the parser unable to determine where one result ended and the next began. Now the code can recognize the two ways a result can end: the start of a new result, or the statistics for the blast and database.
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mark.schreiber at novartis.com Wed Nov 26 21:33:05 2008 From: mark.schreiber at novartis.com (mark.schreiber at novartis.com) Date: Thu, 27 Nov 2008 10:33:05 +0800 Subject: [Biojava-dev] [Biojava-l] BugZilla! In-Reply-To: <492D73A8.4020401@eaglegenomics.com> Message-ID: Hi - I would concur with this. BioJava is volunteer-based and requires people to volunteer before things happen. In my experience bug reports get resolved most rapidly if the person reporting the bug can do as much of their own investigation as possible. Have a good look at the stack trace and try using a debugger. You can rapidly narrow down the underlying cause. Even if you are not confident about the best way to fix the problem, armed with all the relevant information someone more experienced will be able to tell you if you are on the right track. Unit tests are also good here. For one thing they will set up a situation to replicate your bug. They will also provide a way to make sure the bug doesn't happen again in future releases. Finally, the more unit tests there are the more confident you can be that your proposed bug fix won't break anything else. Helping fix stuff in BioJava will definitely improve your programming skills and greatly increase your knowledge of how the API works, so by helping others you are probably helping yourself more. - Mark biojava-l-bounces at lists.open-bio.org wrote on 11/27/2008 12:04:56 AM: > Hi all, > > There's been a few new bugs reported on BugZilla recently. Whilst the > development of the new BJ features is fun, it's really important to > maintain the existing code and keep on top of the issues reported.
> > If there's anyone with a few minutes spare, or would like to learn > BioJava and don't know where to start - then zapping a few virtual > creepy crawlies is an excellent way to get involved and make a contribution. > > Here's the list of the currently outstanding problems: > > http://bugzilla.open-bio.org/buglist.cgi? > product=BioJava&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED > > If you see one you think you can fix, or might like to try to > investigate, go for it! All you have to do is assign the bug to yourself > in order to take ownership of it so that people don't end up working on > the same thing without knowing. > > cheers, > Richard > > -- > Richard Holland, BSc MBCS > Finance Director, Eagle Genomics Ltd > M: +44 7500 438846 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l
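Mark's point about tests that replicate a bug can be illustrated with Bug 2687 (the lost "/NB=..." feature line). The sketch below is a stand-in under stated assumptions: `FtQualifierSketch` and `qualifiers` are hypothetical names rather than BioJava API, and the column spacing of the FT lines is assumed, since the report as archived here has lost its original whitespace. A real regression test would instead read the record with `RichSequence.IOTools.readUniProt`, as in the bug report, and check the resulting feature annotations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FtQualifierSketch {

    /**
     * Collects every "/TAG=value." qualifier found on FT continuation lines,
     * returned as {tag, value} pairs in file order.
     * Hypothetical helper for illustration only; not part of BioJava.
     */
    static List<String[]> qualifiers(List<String> lines) {
        List<String[]> found = new ArrayList<String[]>();
        for (String line : lines) {
            if (!line.startsWith("FT")) {
                continue; // not a feature-table line
            }
            String text = line.substring(2).trim();
            if (!text.startsWith("/")) {
                continue; // a feature header or free-text continuation
            }
            int eq = text.indexOf('=');
            if (eq < 0) {
                continue; // a slash without "=" is not treated as a tag here
            }
            String tag = text.substring(1, eq);
            String value = text.substring(eq + 1);
            if (value.endsWith(".")) {
                value = value.substring(0, value.length() - 1);
            }
            found.add(new String[] {tag, value});
        }
        return found;
    }

    public static void main(String[] args) {
        // The failing case from the bug report: two tagged continuation lines.
        List<String> record = Arrays.asList(
                "FT   CARBOHYD     53     53       O-linked (Glc...).",
                "FT                                /FTId=CAR_000008.",
                "FT                                /NB=Extra information.");
        List<String[]> q = qualifiers(record);
        if (q.size() != 2) {
            throw new AssertionError("expected both qualifiers, found " + q.size());
        }
        if (!"NB".equals(q.get(1)[0]) || !"Extra information".equals(q.get(1)[1])) {
            throw new AssertionError("the /NB qualifier was not preserved");
        }
        System.out.println("both qualifiers preserved");
    }
}
```

Keeping a check like this next to the fix means that a later change which drops the second tagged line again fails immediately instead of being rediscovered by a user.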
From talk2ali at gmail.com Fri Nov 21 20:38:26 2008 From: talk2ali at gmail.com (Muhammad Ali) Date: Fri, 21 Nov 2008 20:38:26 -0500 Subject: [Biojava-dev] NCBI Blast XML Parser update In-Reply-To: <59a41c430808191936sb139efcwff92065cffe8f49@mail.gmail.com> References: <59a41c430808191936sb139efcwff92065cffe8f49@mail.gmail.com> Message-ID: Hi, Sorry for the delay, I got caught up with my other project. Anyway, I'm here now and I've attached the patch file with this email. Please take a look and let me know if changes are required. With regards to testing, there is a BlastParser.java file in the /demos/blastxml folder that I've been using to test on my end. It takes a Blast output file as its input. If you want, I can provide some Blast output file with multiple iteration entries for use as a sample input file for testing. I think I'll do without SVN write access. I don't expect to be working on other parts of biojava right now, unless someone else in my lab complains about something else being broken ;). But if I do make any local patches, I'll definitely try and get them back to you guys. Cheers, Ali. On Tue, Aug 19, 2008 at 9:36 PM, Andreas Prlic wrote: > Hi Ali, > > that's good news, can you send your patch to this list, so we can have > a look? At this stage it would be also great to provide a new Junit > test that makes sure that the parsing works ok and to demonstrate the > new feature. If you provide more patches in the future we can also set > up write access to the SVN repository. > > Andreas > > On Tue, Aug 19, 2008 at 6:10 AM, Muhammad Ali wrote: >> Hello, >> >> The current version of the BlastXMLParser (used for parsing NCBI BLAST >> output files) is not handling multiple Iteration entries in the file >> correctly. It lumps them all together, resulting in a loss of >> search-specific parameters. The end result is a single >> SeqSimilaritySearchResult object. 
The expected output should be one >> SeqSimilaritySearchResult for each Iteration entry in the file. >> >> I've fixed this issue on my locally checked out copy by modifying a >> few files in the org.biojava.bio.program.sax.blastxml package. I'm >> interested in submitting the updated code back to the main repository. >> Can someone tell me how I can go about doing so? >> >> Also the javadocs for the BlastXMLParser seem to be outdated. They >> mention malformed XML being generated by NCBI, but that doesn't seem >> to be the case anymore. >> >> Thanks, >> Ali. >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: changes.patch Type: application/octet-stream Size: 34641 bytes Desc: not available URL:
From Ekta.Jain at icr.ac.uk Tue Nov 11 20:29:07 2008 From: Ekta.Jain at icr.ac.uk (Ekta Jain) Date: Tue, 11 Nov 2008 20:29:07 +0000 Subject: [Biojava-dev] Does BioJava have a PID parser? Message-ID: Absolutely :). Will get back to find out how the code can be added to BioJava once it's ready. Many Thanks Ekta >>> "Richard Holland" 11/11/08 7:40 PM >>> Not that I'm aware of, unless it's very well hidden! If you end up writing one for yourself, would you consider contributing it back to BioJava for others to use? Such additions are always very welcome and appreciated. cheers, Richard 2008/11/11 Ekta Jain : > Hello All, > > Wondering if BioJava has a PID xml parser ?? PID format is mainly used > for pathway data such as provided from NCI. > > Many Thanks > > Ekta > > The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. > > This e-mail message is confidential and for use by the addressee only.
If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. From felipe.albrecht at gmail.com Sat Nov 15 18:02:11 2008 From: felipe.albrecht at gmail.com (Felipe Albrecht) Date: Sat, 15 Nov 2008 16:02:11 -0200 Subject: [Biojava-dev] Does biojava can calculate evalue ? In-Reply-To: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> Message-ID: Hello, I have a source code in java where I calculate the Evalue, H, K and lambda. I only tested the source with nucleotidies and the score matrix should have a 1 or -1 in your score. If it is interesting to someone, I can send it. Thank you, Felipe Albrecht On Sat, Aug 30, 2008 at 12:48 PM, simpleyrx wrote: > > Dear experts, > > I develop a sequence search program. Now , my program can calculate the > score value ,and I want to provide a expectation value ( like blast evalue) > to user. I do not know how to do in this step. Can biojava do it ? Thank you > in advanced.
> > > -- > > > Student > > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > >
From simpleyrx at 163.com Sun Nov 16 00:22:23 2008 From: simpleyrx at 163.com (simpleyrx) Date: Sun, 16 Nov 2008 08:22:23 +0800 (CST) Subject: [Biojava-dev] Reply:Re: Does biojava can calculate evalue ? In-Reply-To: References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> Message-ID: <13860930.332041226794943111.JavaMail.coremail@app143.163.com> I am very interested in the code. Could you send a copy of the code to me ? thank you. -- Renxiang Yan Ph.D. Candidate Student (2007) E-mail: simpleyrx at 163.com Mobile phone:+86-13811458000 Tel.
+86-(0)10-62734412 +86-(0)10-80973092 Address: College of Biological Sciences, China Agricultural University, No.2, Yuanmingyuan West Rd., Haidian district, 100094 Beijing, China ?2008-11-16?"Felipe Albrecht" ??? Hello, I have a source code in java where I calculate the Evalue, H, K and lambda. I only tested the source with nucleotidies and the score matrix should have a 1 or -1 in your score. If it is interesting to someone, I can send it. Thank you, Felipe Albrecht On Sat, Aug 30, 2008 at 12:48 PM, simpleyrx wrote: Dear experts, I develop a sequence search program. Now , my program can calculate the score value ,and I want to provide a expectation value ( like blast evalue) to user. I do not know how to do in this step. Can biojava do it ? Thank you in advanced. -- Student _______________________________________________ biojava-dev mailing list biojava-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-dev
From felipe.albrecht at gmail.com Sun Nov 16 00:57:56 2008 From: felipe.albrecht at gmail.com (Felipe Albrecht) Date: Sat, 15 Nov 2008 22:57:56 -0200 Subject: [Biojava-dev] Reply:Re: Does biojava can calculate evalue ? In-Reply-To: <13860930.332041226794943111.JavaMail.coremail@app143.163.com> References: <11620077.684601220107698211.JavaMail.coremail@bj163app60.163.com> <13860930.332041226794943111.JavaMail.coremail@app143.163.com> Message-ID: Hello, I put the source at http://www.pih.bio.br/src/Statistics.java if you modify this source, send to me the update version to use it. Felipe Albrecht 2008/11/15 simpleyrx > > I am very interested in the code. Could you send a copy of the code to me > ? > thank you. > > -- > > Renxiang Yan > Ph.D. Candidate Student (2007) > E-mail: simpleyrx at 163.com > Mobile phone:+86-13811458000 > > Tel. +86-(0)10-62734412 > > +86-(0)10-80973092 > > > > Address: College of Biological Sciences, > China Agricultural University, > No.2, Yuanmingyuan West Rd., > Haidian district, 100094 > Beijing, China > > ?2008-11-16?"Felipe Albrecht" ??? > > Hello, > > I have a source code in java where I calculate the Evalue, H, K and lambda. > I only tested the source with nucleotidies and the score matrix should have > a 1 or -1 in your score. > > If it is interesting to someone, I can send it. > > Thank you, > > Felipe Albrecht > > On Sat, Aug 30, 2008 at 12:48 PM, simpleyrx wrote: > >> >> Dear experts, >> >> I develop a sequence search program.
Now , my program can calculate the >> score value ,and I want to provide a expectation value ( like blast evalue) >> to user. I do not know how to do in this step. Can biojava do it ? Thank you >> in advanced. >> >> >> -- >> >> >> Student >> >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> >> > > >
From holland at eaglegenomics.com Wed Nov 19 03:36:27 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Wed, 19 Nov 2008 03:36:27 +0000 Subject: [Biojava-dev] BioJava 3 code usage examples Message-ID: I've posted some brief HOWTOs here on how to use the new code as it progresses: http://biojava.org/wiki/BioJava3:HowTo Hopefully the examples will make the new approach a bit clearer. The FASTA parser example does need simplifying even further, through a standard file parser utility class to go in the core module which has yet to be written. But the examples on the page are a start and show you what the convenience methods would have to implement internally (hint, hint!). cheers, Richard -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From me at hongyu.org Wed Nov 19 05:20:07 2008 From: me at hongyu.org (Hongyu Zhang) Date: Tue, 18 Nov 2008 21:20:07 -0800 (PST) Subject: [Biojava-dev] BioJava 3 code usage examples References: Message-ID: <935207.30877.qm@web51405.mail.re2.yahoo.com> Hi Richard, Thanks for your great work!
I noticed from your examples that you decided to continue to use the Symbol object-based model to represent sequences even though the BioJava3 design page ( http://biojava.org/wiki/BioJava3_Design ) says "Sequences are perfectly happy as Strings unless you want to do complex things like store base quality information, and only at that point should you want to convert them into more complex object models." The original BioJava tutorial ( http://biojava.org/wiki/BioJava:Tutorial:Symbols_and_SymbolLists#Doesn.27t_this_all_waste_memory.3F ) discussed the memory space difference between the Symbol object-based sequence representation and the String-based sequence representation, but it didn't address the speed issue. One of the advantages of the Java String library is that it was optimized using native machine code, so I think a Symbol object-based sequence representation would be slower than a String-based sequence representation for certain operations such as substring search. Let me know if I missed something. Thanks! Best, Hongyu Zhang, Ph.D. Ceres Inc., Thousand Oaks, CA Cell: 805-405-5394 Fax: 866-447-8750
From holland at eaglegenomics.com Wed Nov 19 12:00:17 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Wed, 19 Nov 2008 12:00:17 +0000 Subject: [Biojava-dev] BioJava 3 code usage examples In-Reply-To: <935207.30877.qm@web51405.mail.re2.yahoo.com> References: <935207.30877.qm@web51405.mail.re2.yahoo.com> Message-ID: Hello. Thanks for your feedback. You are right that we've continued to provide a Symbol-based alphabet/symbol structure, but it is no longer a central concept nor is it required to use it. You'll notice that when FASTA is read using the new parser, it reads the sequence from the FASTA file as a simple String (actually, a CharSequence). If you want to work with it as a String/CharSequence and don't want to convert it into Symbols/Lists, you can do so.
This is the big change from the existing BioJava way of doing things, which automatically converts everything into the BioJava object model instead of giving the user the choice of what to do with it. This change is consistent with the part of the design document you quote in your email. So, this is giving users the choice of whether they want to work with the sequences directly as Strings/CharSequences, or whether they want to convert them into Symbols/Lists. Users can then tailor their choice depending on locally observed speed/memory usage issues should they so wish. cheers, Richard 2008/11/19 Hongyu Zhang : > Hi Richard, > > Thanks for your great work! I noticed from your examples that you decided to continue to use the Symbol object-based model to represent sequences even though the BioJava3 design page ( http://biojava.org/wiki/BioJava3_Design ) says > "Sequences are perfectly happy as Strings unless you want to do complex > things like store base quality information, and only at that point > should you want to convert them into more complex object models." > > > The original BioJava tutorial ( http://biojava.org/wiki/BioJava:Tutorial:Symbols_and_SymbolLists#Doesn.27t_this_all_waste_memory.3F ) discussed the memory space difference between the Symbol object-based sequence representation and the String-based sequence representation, but it didn't address the speed issue. One of the advantages of the Java String library is that it was optimized using native machine code, so I think a Symbol object-based sequence representation would be slower than a String-based sequence representation for certain operations such as substring search. > > Let me know if I missed something. Thanks! > > Best, > > Hongyu Zhang, Ph.D.
> Ceres Inc., Thousand Oaks, CA > Cell: 805-405-5394 > Fax: 866-447-8750 > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/
From me at hongyu.org Thu Nov 20 22:26:07 2008 From: me at hongyu.org (Hongyu Zhang) Date: Thu, 20 Nov 2008 14:26:07 -0800 (PST) Subject: [Biojava-dev] BioJava 3 code usage examples Message-ID: <644724.10477.qm@web51411.mail.re2.yahoo.com> Hi Richard, I spent some time reading the code today. I found that you had packed the biojava3 modules in a different style from the old version. I guess that some of the reasons are related to the new design philosophy and some are related to the Maven software (I am new to Maven). The things that are not clear to me are:
1) It doesn't seem that you want to avoid name conflicts with the old version because you are continuing to use the package name "org.biojava.*" instead of "org.biojava3.*"
2) The old biojava version arranges sequence related classes in a hierarchical fashion, while in the new version you put the FASTA parsing classes directly under a first level node "org.biojava.fasta" rather than under the "org.biojava.seq" as before. There are tens of popular file formats in the bioinformatics world, so will all of them crowd the first level nodes under the root package?
3) The source files are now in much deeper paths, for example for the FASTA parser, the path is "src/main/java/org/biojava/fasta", as opposed to the common style "src/org/biojava/fasta", so I am wondering why it is necessary to add "main/java" in the middle of the path.
4) It is interesting to see that you put the source code of all the sub-packages separately, so whenever I need to browse the code of some related classes in Windows Explorer or a Unix shell, I really need to go up and down by clicking or typing many more times. The NetBeans IDE alleviated this problem a little bit. I understand the idea of separating independent packages in the new design, but I am wondering whether the current very fine separation of classes went too far. I am not familiar with the new design, so forgive my ignorance. Thanks for your time. Hongyu Zhang, Ph.D. Ceres Inc., Thousand Oaks, CA Cell: 805-405-5394 Fax: 866-447-8750 ________________________________ From: Hongyu Zhang To: holland at eaglegenomics.com Sent: Wednesday, November 19, 2008 10:55:06 AM Subject: Re: [Biojava-dev] BioJava 3 code usage examples Thanks for the quick response, Richard. I will dive deeper into your code. Best, Hongyu Zhang, Ph.D. Ceres Inc., Thousand Oaks, CA Cell: 805-405-5394 Fax: 866-447-8750 ________________________________ From: Richard Holland To: Hongyu Zhang Cc: biojava-dev Sent: Wednesday, November 19, 2008 4:00:17 AM Subject: Re: [Biojava-dev] BioJava 3 code usage examples Hello. Thanks for your feedback. You are right that we've continued to provide a Symbol-based alphabet/symbol structure, but it is no longer a central concept nor is it required to use it. You'll notice that when FASTA is read using the new parser, it reads the sequence from the FASTA file as a simple String (actually, a CharSequence). If you want to work with it as a String/CharSequence and don't want to convert it into Symbols/Lists, you can do so. This is the big change from the existing BioJava way of doing things, which automatically converts everything into the BioJava object model instead of giving the user the choice of what to do with it. This change is consistent with the part of the design document you quote in your email.
So, this is giving users the choice of whether they want to work with the sequences directly as Strings/CharSequences, or whether they want to convert them into Symbols/Lists. Users can then tailor their choice depending on locally observed speed/memory usage issues should they so wish. cheers, Richard 2008/11/19 Hongyu Zhang : > Hi Richard, > > Thanks for your great work! I noticed from your examples that you decided to continue to use the Symbol object-based model to represent sequences even though in the Biojava3 design page ( http://biojava.org/wiki/BioJava3_Design ) it said > "Sequences are perfectly happy as Strings unless you want to do complex > things like store base quality information, and only at that point > should you want to convert them into more complex object models." > > > The original Biojava tutorial ( http://biojava.org/wiki/BioJava:Tutorial:Symbols_and_SymbolLists#Doesn.27t_this_all_waste_memory.3F ) discussed the memoery space difference between Symbol object-based sequence representation and String-based sequence representation, but it didn't address speed issue. One of the advantages of Java String library is that it was optimized using native machine codes, so I think an Sybmol object-based sequence representation would be slower than String-based sequence representation for certain operations such as substring search. > > Let me know if I missed something. Thanks! > > Best, > > Hongyu Zhang, Ph.D. 
> Ceres Inc., Thousand Oaks, CA > Cell: 805-405-5394 > Fax: 866-447-8750 > _______________________________________________ > biojava-dev mailing list > biojava-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-dev > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From holland at eaglegenomics.com Fri Nov 21 01:02:37 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Thu, 20 Nov 2008 20:02:37 -0500 Subject: [Biojava-dev] BioJava 3 code usage examples In-Reply-To: <644724.10477.qm@web51411.mail.re2.yahoo.com> References: <644724.10477.qm@web51411.mail.re2.yahoo.com> Message-ID: Hello. Thanks for spending the time looking into the code. It's a bit like a virtual code review session for me... you've made some very useful comments. > 1) It doesn't seem that you want to avoid name conflicts with the old > version because you are continuing using the package name "org.biojava.*" > instead of "org.biojava3.*" Yes, I agree, I think they should be changed to org.biojava3. I will do this when I get 5 minutes spare. They should be in org.biojava3 because one of the stated goals was to be able to write a biojava-biojava3 mapper module if someone needed that to happen, and the use of org.biojava in the new code would preclude that. > 2) The old biojava version arranges sequence related classes in a > hierarchical fashion, while in the new version you put the FASTA parsing > classes directly under a first level node "org.biojava.fasta" rather than > under the "org.biojava.seq" as before. There are tens of popular file > formats in the bioinformatics world, so will all of them crowd the first > level nodes under the root package? That's also a good suggestion. I'll move it. It makes no difference to the physical structure of the project on disk, but it would make class and package browsing easier to do, especially in JavaDocs. 
> 3) The source files are now in much deeper paths now, for example for the > FASTA parser, the path is "src/main/java/org/biojava/fasta", as opposed to > the common style "src/org/biojava/fasta", so I am wondering why it is > necessary to add "main/java" in the middle of the path. This is because of Maven. The src folder contains both source code and test code, and within each you can have multiple programming languages. The default behaviour, to store source code in a main folder and test code in a test folder, and under each have a subfolder for the programming language, seems sensible to me so I went with it. It also allows the inclusion of resource folders at the same level as the java folder, the contents of which automatically get built into the resulting jars as top-level classpath elements. > 4) It is interesting to see that you put the source codes of all the > sub-packages separately, so whenever I need to browse the codes of some > related classes in Windows explorer or Unix shell, I really need to go up > and down by clicking or typing many more times. Netbean IDE alleviated this > problem a little bit. I understand the idea of seperating independent > packages in the new design, but I am wondering whether the current very fine > seperation of classes went too far. I did this for clarity's sake as to which source code related to which module - you can see really easily if they're separated, whereas if they were all in one tree, it would be hard to know which module a particular class ended up in. Also if you have one source tree, you have to remember to update the Maven configs to split different bits of it into different jars. By simply keeping them separate, this happens automatically. Thanks again for your feedback! 
cheers, Richard -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From mark.schreiber at novartis.com Fri Nov 21 03:40:53 2008 From: mark.schreiber at novartis.com (mark.schreiber at novartis.com) Date: Fri, 21 Nov 2008 11:40:53 +0800 Subject: [Biojava-dev] BioJava 3 code usage examples In-Reply-To: Message-ID: biojava-dev-bounces at lists.open-bio.org wrote on 11/21/2008 09:02:37 AM: > Hello. Thanks for spending the time looking into the code. It's a bit > like a virtual code review session for me... you've made some very > useful comments. > > > 1) It doesn't seem that you want to avoid name conflicts with the old > > version because you are continuing using the package name "org.biojava.*" > > instead of "org.biojava3.*" > > Yes, I agree, I think they should be changed to org.biojava3. I will > do this when I get 5 minutes spare. They should be in org.biojava3 > because one of the stated goals was to be able to write a > biojava-biojava3 mapper module if someone needed that to happen, and > the use of org.biojava in the new code would preclude that. > I think it would be a very good idea to differentiate this from old code via the packages. Strictly speaking you are supposed to use a domain that you own. I don't know if biojava3.org is available or if open-bio wants to obtain it. It's not critical but it would prevent possible issues. Alternatives would be to use org.biojava.v3.* or you could use the open-bio domain as in org.open-bio.biojava3 - Mark _________________________ CONFIDENTIALITY NOTICE The information contained in this e-mail message is intended only for the exclusive use of the individual or entity named above and may contain information that is privileged, confidential or exempt from disclosure under applicable law. 
If the reader of this message is not the intended recipient, or the employee or agent responsible for delivery of the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by e-mail and delete the material from any computer. Thank you. From holland at eaglegenomics.com Fri Nov 21 03:57:24 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Thu, 20 Nov 2008 22:57:24 -0500 Subject: [Biojava-dev] BioJava 3 code usage examples In-Reply-To: References: Message-ID: I like the domain registration idea better. I'll find out what the folks at open-bio think. cheers, Richard 2008/11/20 : > > biojava-dev-bounces at lists.open-bio.org wrote on 11/21/2008 09:02:37 AM: > >> Hello. Thanks for spending the time looking into the code. It's a bit >> like a virtual code review session for me... you've made some very >> useful comments. >> >> > 1) It doesn't seem that you want to avoid name conflicts with the old >> > version because you are continuing using the package name >> > "org.biojava.*" >> > instead of "org.biojava3.*" >> >> Yes, I agree, I think they should be changed to org.biojava3. I will >> do this when I get 5 minutes spare. They should be in org.biojava3 >> because one of the stated goals was to be able to write a >> biojava-biojava3 mapper module if someone needed that to happen, and >> the use of org.biojava in the new code would preclude that. >> > > I think it would be a very good idea to differentiate this from old code via > the packages. Strictly speaking you are supposed to use a domain that you > own. I don't know if biojava3.org is available or if open-bio wants to > obtain it. It's not critical but it would prevent possible issues. 
> > Alternatives would be to use org.biojava.v3.* > > or you could use the open-bio domain as in org.open-bio.biojava3 > > - Mark > > _________________________ > > CONFIDENTIALITY NOTICE > > The information contained in this e-mail message is intended only for the > exclusive use of the individual or entity named above and may contain > information that is privileged, confidential or exempt from disclosure under > applicable law. If the reader of this message is not the intended recipient, > or the employee or agent responsible for delivery of the message to the > intended recipient, you are hereby notified that any dissemination, > distribution or copying of this communication is strictly prohibited. If you > have received this communication in error, please notify the sender > immediately by e-mail and delete the material from any computer. Thank you. > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From bugzilla-daemon at portal.open-bio.org Fri Nov 21 10:48:59 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:48:59 -0500 Subject: [Biojava-dev] [Bug 2679] New: NEXUS parse fails on extraneous comma (TREES/TRANSLATE) Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2679 Summary: NEXUS parse fails on extraneous comma (TREES/TRANSLATE) Product: BioJava Version: live (CVS source) Platform: PC OS/Version: Windows XP Status: NEW Severity: major Priority: P1 Component: bio AssignedTo: biojava-dev at biojava.org ReportedBy: keesey at gmail.com If the last item in a TRANSLATE command of a TREES node ends with a comma, the parser fails with the following exception: Error: org.biojava.bio.seq.io.ParseException: Found unexpected token = in TREES block A properly-formatted NEXUS file should not have a comma here, but there are many files which do. 
The parser should see the semicolon and assume the command has ended. Example TREES block (from TreeBASE, accession M474): BEGIN TREES; [! 1 trees. TreeBASE accession#: Tree1118 ] TRANSLATE 1 'Nyctereutes_procyonoides', 2 'Urocyon_cinereoargenteus', 3 'Pseudalopex_gymnocercus', 4 'Chrysocyon_brachyurus', 5 'Pseudalopex_sechurae', 6 'Pseudalopex_culpaeus', 7 'Pseudalopex_vetulus', 8 'Atelocynus_microtis', 9 'Pseudalopex_griseus', 10 'Dusicyon_australis', 11 'Urocyon_littoralis', 12 'Vulpes_bengalensis', 13 'Speothos_venaticus', 14 'Otocyon_megalotis', 15 'Vulpes_ferrilata', 16 'Vulpes_rueppelli', 17 'Canis_mesomelas', 18 'Cerdocyon_thous', 19 'Alopex_lagopus', 20 'Vulpes_pallida', 21 'Canis_simensis', 22 'Vulpes_corsac', 23 'Canis_latrans', 24 'Lycaon_pictus', 25 'Canis_adustus', 26 'Vulpes_vulpes', 27 'Vulpes_chama', 28 'Vulpes_velox', 29 'Vulpes_zerda', 30 'Canis_aureus', 31 'Cuon_alpinus', 32 'Pseudalopex', 33 'Canis_rufus', 34 'Vulpes_cana', 35 'Canis_lupus', 36 'Canidae', 37 'Canis', ; TREE 'Fig._9' = [&R] (((((((19,28),((22,15),(16,26))),(34,29)),(12,27,20)),(2,11)),14),((8,(((25,30),17),(23,(35,33),21)),18,4,31,(10,((6,9,3,5),7)),24,13),1)); END; -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Nov 21 10:49:45 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:49:45 -0500 Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE) In-Reply-To: Message-ID: <200811211049.mALAnjes020244@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2679 ------- Comment #1 from keesey at gmail.com 2008-11-21 05:49 EST ------- The next comment is the full NEXUS file from the example.
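The tolerant behaviour requested in Bug 2679 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class TranslateSketch and the method parseTranslate are hypothetical names, not part of BioJava's NEXUS parser, and the input is assumed to be the text between the TRANSLATE keyword and the terminating semicolon. Labels containing commas inside quotes are not handled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not BioJava API: parses a TRANSLATE command body.
class TranslateSketch {

    /**
     * Turns "1 'A', 2 'B', " into an ordered key -> label map.
     * Empty entries, such as the one left by a trailing comma before
     * the semicolon, are skipped instead of raising an error.
     * Limitation: quoted labels that themselves contain commas would be
     * split incorrectly; a real tokenizer would be needed for those.
     */
    static Map<String, String> parseTranslate(String body) {
        Map<String, String> table = new LinkedHashMap<String, String>();
        for (String entry : body.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate "37 'Canis', ;"
            }
            // First token is the key, the remainder is the taxon label.
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                        "Malformed TRANSLATE entry: " + trimmed);
            }
            String label = parts[1].trim();
            // Strip one pair of surrounding single quotes, if present.
            if (label.length() >= 2
                    && label.startsWith("'") && label.endsWith("'")) {
                label = label.substring(1, label.length() - 1);
            }
            table.put(parts[0], label);
        }
        return table;
    }

    public static void main(String[] args) {
        // The last two entries of the example block, with the stray comma.
        System.out.println(parseTranslate("36 'Canidae', 37 'Canis', "));
    }
}
```

Skipping empty entries keeps well-formed files working unchanged, while files with the stray comma simply yield the same table they would have produced without it.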
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Nov 21 10:50:10 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 21 Nov 2008 05:50:10 -0500 Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE) In-Reply-To: Message-ID: <200811211050.mALAoAwx020305@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2679 ------- Comment #2 from keesey at gmail.com 2008-11-21 05:50 EST ------- #NEXUS [File created by TreeBASE: 1/29/99 19:24:10] [Matrix accession#: M474] BEGIN DATA; DIMENSIONS NTAX=34 NCHAR=223; [!This data set was downloaded from TreeBASE, a prototype relational database of phylogenetic knowledge. TreeBASE has been supported by the NSF, Harvard University, and UC Davis. Please do not remove this acknowledgment from the Nexus file. TreeBASE ?? 1994-1999. Study reference: Bininda-Emonds, O. R. P., J. L. Gittleman, and A. Purvis. 1999. Building larger phylogenies using supertrees: a complete phylogeny of the extant Carnivora (Mammalia). Biological Reviews, in press. Study accession number = S355 Matrix accession number = M474 ] FORMAT MISSING = ? GAP = - INTERLEAVE ; MATRIX 'Vulpes_zerda' 1?000111000????0000?1100000000?1001???????001100000?????10000000000000111000??????1000000????????111 'Vulpes_vulpes' 1?110000000?00??????1100000000?1001???????110000000?1???10000000000000110101??????1100000???0000?110 'Vulpes_velox' 1?111000000????0000?1100000000?1001???????110000000?????10000000000000110110??????1100000????????110 'Vulpes_rueppelli' 1???????????????????1100000000?1001???????110000000?????10000000000000110101??????1100000????????110 'Vulpes_pallida' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 
'Vulpes_ferrilata' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 'Vulpes_corsac' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000????????110 'Vulpes_chama' 1???????????????????1100000000?1001???????110000000?????10000000000000110000??????1100000????????100 'Vulpes_cana' 1???????????????????1100000000?1001???????110000000?????10000000000000111000??????1100000????????111 'Vulpes_bengalensis' 1???????????????????1100000000?1001???????110000000???????????????????????????????1100000??????????? 'Urocyon_littoralis' 1??????????????????????????????1001???????001110000???????????????????????????????1010000??????????? 'Urocyon_cinereoargenteus' 1?000110000?????????1010000000?1001?0?000?001110000?????00000000000000000000??????1010000?0?1000?000 'Speothos_venaticus' 1?000100111?????????0001100110?0000?????????????????????11100011100000000000?1100?0001000??????????? 'Pseudalopex_vetulus' 1???????????????????0001111000?1000???????001001010??????????????????????????0000?0001000??????????? 'Pseudalopex_sechurae' 1???????????????????0001111000?1010???????001001010??????????????????????????0010?0001101??????????? 'Pseudalopex_gymnocercus' 1???????????????????0001111000?1010???????001001010?????11111000000000000000?0010?0001101??????????? 'Pseudalopex_griseus' 1???????????????????0001111000?1010???????001001010?????11111100000000000000?0010?0001101??????????? 'Pseudalopex_culpaeus' 1???????????????????0001111000?1010???????001001010?????11111100000000000000?0011?0001101??????????? 'Otocyon_megalotis' 1?000111000?????????1010000000?0000???????100000000?????11000000000000000000??????1000000??????????? 'Nyctereutes_procyonoides' 1?100000000?????????0001100101?0000???????100000000?1???10000000000000100000???????????????????????? 'Lycaon_pictus' 1??????????????1000????????????0000???000???????????????11100011100000000000??????0001000???1110???? 
'Dusicyon_australis' 1???????????????????0001110000?1000???????001001010??????????????????????????0011?0001100??????????? 'Cuon_alpinus' 1??????????????????????????????0000?????????????????????11100010010000000000??????0001000??????????? 'Chrysocyon_brachyurus' 1?000100110?????????0001000000?0000???????001001100?????11100011000000000000?0000?0001000??????????? 'Cerdocyon_thous' 1?000100111?????????0001100101?1000?????????????????????11110000000000000000?1000?0001000??????????? 'Canis_simensis' 1??????????????????????????????1100???111?001000001???0?11100010011011000000??????0001110??????????? 'Canis_rufus' 1??????????????????????????????1100???????001000001???0???????????????????????????0001110??????????? 'Canis_mesomelas' 1???????????10?1100????????????1100???100?001000001???0?11100010010000000000??????0001110??????????? 'Canis_lupus' 1?000100100?11?1111????????????1100?1?111?001000001?0?1?11100010011010000000??????0001110?1?1111???? 'Canis_latrans' 1???????????11?1111????????????1100?1?111?001000001???1?11100010011011000000??????0001110?1?1111???? 'Canis_aureus' 1??????????????1110????????????1100???110?001000001???0?11100010011100000000??????0001110???1100???? 'Canis_adustus' 1??????????????1000????????????1100???000?001000001???0?11100010011100000000??????0001110??????????? 'Atelocynus_microtis' 1???????????????????0001100110?1000???????001001100??????????????????????????1100?0001000??????????? 
'Alopex_lagopus' 1?111000000????????????????????1000???????110000000?????10000000000000110110??????1000000????????110 'Vulpes_zerda' 0000?11111000?11110000?00000010??????????10000?01???????????????????100000?????0001?1000001010000000 'Vulpes_vulpes' 1011?11110100?11101101?00000010????????????????01?0000110???00001???100010?????0001?1000001000000000 'Vulpes_velox' 1100?11110111?11101110?00000011????????????????01???????????00001???100010?????0001?1000001000000000 'Vulpes_rueppelli' 1011?11110110?11101101?????????????????????????01???????????00001???100010?????0001?1000001010100000 'Vulpes_pallida' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001011000000 'Vulpes_ferrilata' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001100000000 'Vulpes_corsac' 1010?11110100?11101100?????????????????????????01???????????00001???100010?????0001?1000001100000000 'Vulpes_chama' 0000?11100000?11101000?00000010????????????????01???????????00001???100010?????0001?1000001011000000 'Vulpes_cana' 0000?11111000?11110000?????????????????????????01???????????00001???100010?????0001?1000001010100000 'Vulpes_bengalensis' ???????????????????????????????????????????????01???????????00001???100010?????0001?1000001011000000 'Urocyon_littoralis' ?????????????????????????????????????????11000?01???????????????????100100?????0010?1000001000010000 'Urocyon_cinereoargenteus' 0000?10000000?10000000?00000000??????????11000?01???????????????????100100?????0010?1000001000010000 'Speothos_venaticus' ???????????????????????11001000??????????00100?10?0000000???????????000001?????0000?0000000000000000 'Pseudalopex_vetulus' ???????????????????????11110000??????????00101?01???????????????????101000?????0100?1000000000001101 'Pseudalopex_sechurae' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001101 'Pseudalopex_gymnocercus' 
?????????????????????????????????????????00101?01?1100000???????????101000?????0100?1000000000001110 'Pseudalopex_griseus' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001100 'Pseudalopex_culpaeus' ?????????????????????????????????????????00101?01???????????????????101000?????0100?1000000000001110 'Otocyon_megalotis' ?????11000000?11000000?00000000????????????????00?0000001???????????100000?????0000?0000000000000000 'Nyctereutes_procyonoides' ???????????????????????00000000????????????????01?0000001???????????100000?????0000?0000000000000000 'Lycaon_pictus' ???????????????????????11001100??????????00100?10?0000000???????????000001?????0000?0000000000000000 'Dusicyon_australis' ?????????????????????????????????????????00101?01???????????????????100000?????0000?1000000000001000 'Cuon_alpinus' ???????????????????????????????????????????????10???????????????????000001?????0000?0000000000000000 'Chrysocyon_brachyurus' ???????????????????????11100000??????????00100?01?0000100???????????100000?????0000?0000000000000000 'Cerdocyon_thous' ???????????????????????11110000????????????????01???????????????????100000?????0000?1000000000001000 'Canis_simensis' ?????????????????????????????????????????00110?01???????????????????110000?????1000?1110000000000000 'Canis_rufus' ??????????????????????????????????100?10?00110?01?????????1?11100?1?110000???1?1000?1100110000000000 'Canis_mesomelas' ???????????????????????11001000?1?011?11?00110?01?1110000???11010???110000?0???1000?1111000000000000 'Canis_lupus' ?????00000000?00000000?11001100?1?100????00110?01?1001000?1?11100?1?110000???1?1000?1100110000000000 'Canis_latrans' ???????????????????????11001100???010?00?00110?01?1001000???10000?1?110000???1?1000?1100100000000000 'Canis_aureus' ??????????????????????????????????011?11?00110?01???????????11010?0?110000?1?0?1000?1110000000000000 'Canis_adustus' 
?????????????????????????????????????????00110?01?1110000???11010???110000?1???1000?1111000000000000 'Atelocynus_microtis' ?????????????????????????????????????????00100?01???????????????????100000?????0000?1000000000001000 'Alopex_lagopus' 1100?11111111?11101110?00000011?0??????????????01?0000110???00001???100000?????0000?1000001000000000 'Vulpes_zerda' ?????????11000000000000 'Vulpes_vulpes' ?????????11000000000000 'Vulpes_velox' ?????????11000000000000 'Vulpes_rueppelli' ??????????????????????? 'Vulpes_pallida' ?????????11000000000000 'Vulpes_ferrilata' ????????????????????0?? 'Vulpes_corsac' ?????????11000000000000 'Vulpes_chama' ?????????11000000000000 'Vulpes_cana' ???????????????????0??? 'Vulpes_bengalensis' ?????????11000000000000 'Urocyon_littoralis' ??????????????????????? 'Urocyon_cinereoargenteus' ???0?????10100000000000 'Speothos_venaticus' ?????????00010001101101 'Pseudalopex_vetulus' ?????????00010001101000 'Pseudalopex_sechurae' ?????????00010001000000 'Pseudalopex_gymnocercus' ?????????00010001000000 'Pseudalopex_griseus' ?????????00010001000000 'Pseudalopex_culpaeus' ?????????00010001110000 'Otocyon_megalotis' ?1???????10100000000000 'Nyctereutes_procyonoides' ?????????00010001101110 'Lycaon_pictus' ?0???????00011010000000 'Dusicyon_australis' ?????????00010001110000 'Cuon_alpinus' ?????????00011010000000 'Chrysocyon_brachyurus' ?????????00010001101000 'Cerdocyon_thous' ?????????00010001101110 'Canis_simensis' ?????????00011100000000 'Canis_rufus' ?????1?1?00011100000000 'Canis_mesomelas' ?????????00011100000000 'Canis_lupus' ?1?1?1?1?00011100000000 'Canis_latrans' ???1?1?0?00011100000000 'Canis_aureus' ?????????00011100000000 'Canis_adustus' ?????????00011100000000 'Atelocynus_microtis' ?????????00010001101101 'Alopex_lagopus' ?????????11000000000000 ; END; BEGIN ASSUMPTIONS; OPTIONS DEFTYPE = unord PolyTcount = MINSTEPS ; END; BEGIN TREEBASE; END; BEGIN TREES; [! 1 trees. 
TreeBASE accession#: Tree1118 ]

TRANSLATE
1 'Nyctereutes_procyonoides',
2 'Urocyon_cinereoargenteus',
3 'Pseudalopex_gymnocercus',
4 'Chrysocyon_brachyurus',
5 'Pseudalopex_sechurae',
6 'Pseudalopex_culpaeus',
7 'Pseudalopex_vetulus',
8 'Atelocynus_microtis',
9 'Pseudalopex_griseus',
10 'Dusicyon_australis',
11 'Urocyon_littoralis',
12 'Vulpes_bengalensis',
13 'Speothos_venaticus',
14 'Otocyon_megalotis',
15 'Vulpes_ferrilata',
16 'Vulpes_rueppelli',
17 'Canis_mesomelas',
18 'Cerdocyon_thous',
19 'Alopex_lagopus',
20 'Vulpes_pallida',
21 'Canis_simensis',
22 'Vulpes_corsac',
23 'Canis_latrans',
24 'Lycaon_pictus',
25 'Canis_adustus',
26 'Vulpes_vulpes',
27 'Vulpes_chama',
28 'Vulpes_velox',
29 'Vulpes_zerda',
30 'Canis_aureus',
31 'Cuon_alpinus',
32 'Pseudalopex',
33 'Canis_rufus',
34 'Vulpes_cana',
35 'Canis_lupus',
36 'Canidae',
37 'Canis',
;
TREE 'Fig._9' = [&R] (((((((19,28),((22,15),(16,26))),(34,29)),(12,27,20)),(2,11)),14),((8,(((25,30),17),(23,(35,33),21)),18,4,31,(10,((6,9,3,5),7)),24,13),1));

END;

From bugzilla-daemon at portal.open-bio.org Fri Nov 21 10:51:52 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 21 Nov 2008 05:51:52 -0500
Subject: [Biojava-dev] [Bug 2679] NEXUS parse fails on extraneous comma (TREES/TRANSLATE)
In-Reply-To:
Message-ID: <200811211051.mALApqKl020508@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2679

------- Comment #3 from keesey at gmail.com 2008-11-21 05:51 EST -------
version is 1.6 (no place to specify that)
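
The behaviour requested in Bug 2679 (treat the semicolon as the end of the TRANSLATE command even when the last entry is followed by a comma) can be sketched roughly as below. This is only an illustration under stated assumptions: LenientTranslate and its parse method are hypothetical names, not part of BioJava, and the sketch assumes that taxon labels contain no commas or semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper, NOT part of BioJava: parses the body of a NEXUS
 * TRANSLATE command and tolerates a stray comma before the closing ';'.
 */
class LenientTranslate {

    /**
     * @param command the text that follows the TRANSLATE keyword, up to and
     *                including the terminating semicolon
     * @return token-to-label map in file order
     */
    static Map<String, String> parse(String command) {
        int end = command.indexOf(';');
        if (end < 0) {
            throw new IllegalArgumentException("TRANSLATE command has no terminating ';'");
        }
        Map<String, String> labels = new LinkedHashMap<String, String>();
        // Simplification: assumes labels contain no commas.
        for (String entry : command.substring(0, end).split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // empty entry left behind by an extraneous comma
            }
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed TRANSLATE entry: " + trimmed);
            }
            labels.put(parts[0], stripQuotes(parts[1]));
        }
        return labels;
    }

    /** Removes one pair of surrounding single quotes, if present. */
    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        // Last two entries of the example above, including the trailing comma.
        System.out.println(parse("36 'Canidae', 37 'Canis', ;")); // prints {36=Canidae, 37=Canis}
    }
}
```

Skipping empty entries rather than rejecting them keeps well-formed files parsing exactly as before, while files with the extra comma yield the same map.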
From bugzilla-daemon at portal.open-bio.org Tue Nov 25 23:31:57 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 25 Nov 2008 18:31:57 -0500
Subject: [Biojava-dev] [Bug 2687] New: UniProt: Tags in feature continuation lines may be lost.
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2687

           Summary: UniProt: Tags in feature continuation lines may be lost.
           Product: BioJava
           Version: unspecified
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: major
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: jan at biochemfusion.com

Using BioJava 1.6.1 on Windows XP to read in a UniProt file and write it back out.
Main code that does this:

    BufferedReader br = new BufferedReader(new FileReader(args[0]));
    SimpleNamespace ns = new SimpleNamespace("biojava");
    RichSequenceIterator rsi = RichSequence.IOTools.readUniProt(br, ns);
    RichSequence rs = rsi.nextRichSequence();
    RichSequence.IOTools.writeUniProt(System.out, rs, ns);

When reading in this heavily abridged version of FA9_BOVIN from www.uniprot.org it works:

ID   FA9_BOVIN               Reviewed;         416 AA.
AC   P00741;
FT   CHAIN         1    416       Coagulation factor IX.
FT   CARBOHYD     53     53       O-linked (Glc...).
FT                                /FTId=CAR_000008.
FT                                Extra information.
SQ   SEQUENCE   416 AA;  46785 MW;  34A7DFE916330662 CRC64;
     YNSGKLEEFV RGNLERECKE EKCSFEEARE VFENTEKTTE FWKQYVDGDQ CESNPCLNGG
     MCKDDINSYE CWCQAGFEGT NCELDATCSI KNGRCKQFCK RDTDNKVVCS CTDGYRLAED
     QKSCEPAVPF PCGRVSVSHI SKKLTRAETI FSNTNYENSS EAEIIWDNVT QSNQSFDEFS
     RVVGGEDAER GQFPWQVLLH GEIAAFCGGS IVNEKWVVTA AHCIKPGVKI TVVAGEHNTE
     KPEPTEQKRN VIRAIPYHSY NASINKYSHD IALLELDEPL ELNSYVTPIC IADRDYTNIF
     SKFGYGYVSG WGKVFNRGRS ASILQYLKVP LVDRATCLRS TKFSIYSHMF CAGYHEGGKD
     SCQGDSGGPH VTEVEGTSFL TGIISWGEEC AMKGKYGIYT KVSRYVNWIK EKTKLT
//

However, when the extra information has been tagged with a slash it is lost:

ID   FA9_BOVIN               Reviewed;         416 AA.
AC   P00741;
FT   CHAIN         1    416       Coagulation factor IX.
FT   CARBOHYD     53     53       O-linked (Glc...).
FT                                /FTId=CAR_000008.
FT                                /NB=Extra information.
SQ   SEQUENCE   416 AA;  46785 MW;  34A7DFE916330662 CRC64;
     YNSGKLEEFV RGNLERECKE EKCSFEEARE VFENTEKTTE FWKQYVDGDQ CESNPCLNGG
     MCKDDINSYE CWCQAGFEGT NCELDATCSI KNGRCKQFCK RDTDNKVVCS CTDGYRLAED
     QKSCEPAVPF PCGRVSVSHI SKKLTRAETI FSNTNYENSS EAEIIWDNVT QSNQSFDEFS
     RVVGGEDAER GQFPWQVLLH GEIAAFCGGS IVNEKWVVTA AHCIKPGVKI TVVAGEHNTE
     KPEPTEQKRN VIRAIPYHSY NASINKYSHD IALLELDEPL ELNSYVTPIC IADRDYTNIF
     SKFGYGYVSG WGKVFNRGRS ASILQYLKVP LVDRATCLRS TKFSIYSHMF CAGYHEGGKD
     SCQGDSGGPH VTEVEGTSFL TGIISWGEEC AMKGKYGIYT KVSRYVNWIK EKTKLT
//

BioPerl 1.5.2 copes with both situations without problems.

From bugzilla-daemon at portal.open-bio.org Tue Nov 25 23:35:42 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 25 Nov 2008 18:35:42 -0500
Subject: [Biojava-dev] [Bug 2687] UniProt: Tags in feature continuation lines may be lost.
In-Reply-To:
Message-ID: <200811252335.mAPNZg6D004170@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2687

------- Comment #1 from jan at biochemfusion.com 2008-11-25 18:35 EST -------
To clarify: In the first case the information in all feature lines is correctly preserved. In the second case the last FT line "/NB=Extra information." is lost.

From holland at eaglegenomics.com Wed Nov 26 16:04:56 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Wed, 26 Nov 2008 16:04:56 +0000
Subject: [Biojava-dev] BugZilla!
Message-ID: <492D73A8.4020401@eaglegenomics.com>

Hi all,

There's been a few new bugs reported on BugZilla recently.
Whilst the development of the new BJ features is fun, it's really important to maintain the existing code and keep on top of the issues reported. If there's anyone with a few minutes spare, or would like to learn BioJava and don't know where to start - then zapping a few virtual creepy crawlies is an excellent way to get involved and make a contribution. Here's the list of the currently outstanding problems: http://bugzilla.open-bio.org/buglist.cgi?product=BioJava&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED If you see one you think you can fix, or might like to try to investigate, go for it! All you have to do is assign the bug to yourself in order to take ownership of it so that people don't end up working on the same thing without knowing. cheers, Richard -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From bugzilla-daemon at portal.open-bio.org Wed Nov 26 17:44:45 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Nov 2008 12:44:45 -0500 Subject: [Biojava-dev] [Bug 2603] StringIndexOutOfBoundsException while parsing blastresult In-Reply-To: Message-ID: <200811261744.mAQHijbY009725@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2603 tbanks at agr.gc.ca changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #7 from tbanks at agr.gc.ca 2008-11-26 12:44 EST ------- Fixed. The change to the blast result format (sometime around v2.2.15) made the parser unable to determine where the end of one result finished and the next began. Now the code can recognize the two ways a result can end; the start of a new result or the statistics for the blast and database. 
From mark.schreiber at novartis.com Thu Nov 27 02:33:05 2008
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Thu, 27 Nov 2008 10:33:05 +0800
Subject: [Biojava-dev] [Biojava-l] BugZilla!
In-Reply-To: <492D73A8.4020401@eaglegenomics.com>
Message-ID:

Hi -

I would concur with this. BioJava is volunteer based and requires people to volunteer before things happen.

In my experience bug reports get resolved most rapidly if the person reporting the bug can do as much of their own investigation as possible. Have a good look at the stack trace and try using a debugger. You can rapidly narrow down the underlying cause. Even if you are not confident about the best way to fix the problem, armed with all the relevant information someone more experienced will be able to tell you if you are on the right track.

Unit tests are also good here. For one thing they will set up a situation to replicate your bug. They will also provide a way to make sure the bug doesn't happen again in future releases. Finally, the more unit tests there are the more confident you can be that your proposed bug fix won't break anything else.

Helping fix stuff in BioJava will definitely improve your programming skills and greatly increase your knowledge of how the API works so by helping others you are probably helping yourself more.

- Mark

biojava-l-bounces at lists.open-bio.org wrote on 11/27/2008 12:04:56 AM:

> Hi all,
>
> There's been a few new bugs reported on BugZilla recently. Whilst the
> development of the new BJ features is fun, it's really important to
> maintain the existing code and keep on top of the issues reported.
> > If there's anyone with a few minutes spare, or would like to learn > BioJava and don't know where to start - then zapping a few virtual > creepy crawlies is an excellent way to get involved and make a contribution. > > Here's the list of the currently outstanding problems: > > http://bugzilla.open-bio.org/buglist.cgi? > product=BioJava&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED > > If you see one you think you can fix, or might like to try to > investigate, go for it! All you have to do is assign the bug to yourself > in order to take ownership of it so that people don't end up working on > the same thing without knowing. > > cheers, > Richard > > -- > Richard Holland, BSc MBCS > Finance Director, Eagle Genomics Ltd > M: +44 7500 438846 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l _________________________ CONFIDENTIALITY NOTICE The information contained in this e-mail message is intended only for the exclusive use of the individual or entity named above and may contain information that is privileged, confidential or exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivery of the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by e-mail and delete the material from any computer. Thank you. 
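
As a rough illustration of the point about tests that set up a situation to replicate a bug, the following plain-Java sketch feeds the feature lines from Bug 2687 to a small reader and checks that the slash-tagged continuation line survives. Everything here is hypothetical: FeatureLines is a stand-in written for this example, not BioJava's UniProt parser, and it assumes the classic UniProt column layout in which a continuation line has blanks where the feature key would be. A real regression test would call RichSequence.IOTools instead and would live in the project's unit test suite.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, NOT BioJava code: groups UniProt FT lines by feature. */
class FeatureLines {

    /** Returns one list per feature; continuation lines are appended to the feature they follow. */
    static List<List<String>> read(String record) {
        List<List<String>> features = new ArrayList<List<String>>();
        for (String line : record.split("\n")) {
            if (!line.startsWith("FT")) {
                continue; // only feature-table lines matter here
            }
            // Assumption: a continuation line is blank in the feature-key column.
            boolean continuation = line.length() > 5 && line.charAt(5) == ' ';
            String text = line.substring(2).trim();
            if (continuation && !features.isEmpty()) {
                features.get(features.size() - 1).add(text);
            } else {
                List<String> feature = new ArrayList<String>();
                feature.add(text);
                features.add(feature);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Abridged from the second record in the bug report.
        String record =
              "ID   FA9_BOVIN               Reviewed;         416 AA.\n"
            + "FT   CARBOHYD     53     53       O-linked (Glc...).\n"
            + "FT                                /FTId=CAR_000008.\n"
            + "FT                                /NB=Extra information.\n"
            + "//\n";
        List<List<String>> features = read(record);
        // The regression check: the tagged continuation line must not be dropped.
        if (!features.get(0).contains("/NB=Extra information.")) {
            throw new AssertionError("tagged continuation line was lost");
        }
        System.out.println(features);
    }
}
```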
From talk2ali at gmail.com Sat Nov 22 01:38:26 2008 From: talk2ali at gmail.com (Muhammad Ali) Date: Fri, 21 Nov 2008 20:38:26 -0500 Subject: [Biojava-dev] NCBI Blast XML Parser update In-Reply-To: <59a41c430808191936sb139efcwff92065cffe8f49@mail.gmail.com> References: <59a41c430808191936sb139efcwff92065cffe8f49@mail.gmail.com> Message-ID: Hi, Sorry for the delay, I got caught up with my other project. Anyway, I'm here now and I've attached the patch file with this email. Please take a look and let me know if changes are required. With regards to testing, there is a BlastParser.java file in the /demos/blastxml folder that I've been using to test on my end. It takes a Blast output file as its input. If you want, I can provide some Blast output file with multiple iteration entries for use as a sample input file for testing. I think I'll do without SVN write access. I don't expect to be working on other parts of biojava right now, unless someone else in my lab complains about something else being broken ;). But if I do make any local patches, I'll definitely try and get them back to you guys. Cheers, Ali. On Tue, Aug 19, 2008 at 9:36 PM, Andreas Prlic wrote: > Hi Ali, > > that's good news, can you send your patch to this list, so we can have > a look? At this stage it would be also great to provide a new Junit > test that makes sure that the parsing works ok and to demonstrate the > new feature. If you provide more patches in the future we can also set > up write access to the SVN repository. > > Andreas > > On Tue, Aug 19, 2008 at 6:10 AM, Muhammad Ali wrote: >> Hello, >> >> The current version of the BlastXMLParser (used for parsing NCBI BLAST >> output files) is not handling multiple Iteration entries in the file >> correctly. It lumps them all together, resulting in a loss of >> search-specific parameters. The end result is a single >> SeqSimilaritySearchResult object. 
The expected output should be one >> SeqSimilaritySearchResult for each Iteration entry in the file. >> >> I've fixed this issue on my locally checked out copy by modifying a >> few files in the org.biojava.bio.program.sax.blastxml package. I'm >> interested in submitting the updated code back to the main repository. >> Can someone tell me how I can go about doing so? >> >> Also the javadocs for the BlastXMLParser seem to be outdated. They >> mention malformed XML being generated by NCBI, but that doesn't seem >> to be the case anymore. >> >> Thanks, >> Ali. >> _______________________________________________ >> biojava-dev mailing list >> biojava-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-dev >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: changes.patch Type: application/octet-stream Size: 34641 bytes Desc: not available URL:
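
For readers without the attachment, the behaviour described above (one result per Iteration entry rather than one lumped result) can be pictured with the following sketch. It is not the patch and not BioJava's SAX-based BlastXMLParser: IterationSplitter is a hypothetical class that uses the JDK's DOM parser and simply collects the Hit_id values under each Iteration element. The inline sample is a cut-down document using element names from the NCBI BLAST XML format; it omits the DOCTYPE declaration that real BLAST output carries.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical illustration only: groups BLAST XML hits by Iteration. */
class IterationSplitter {

    /** Returns, for each Iteration element in document order, the Hit_id values it contains. */
    static List<List<String>> hitsPerIteration(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse BLAST XML", e);
        }
        List<List<String>> results = new ArrayList<List<String>>();
        NodeList iterations = doc.getElementsByTagName("Iteration");
        for (int i = 0; i < iterations.getLength(); i++) {
            Element iteration = (Element) iterations.item(i);
            // Search only below this Iteration, so hits are never mixed across searches.
            NodeList ids = iteration.getElementsByTagName("Hit_id");
            List<String> hits = new ArrayList<String>();
            for (int j = 0; j < ids.getLength(); j++) {
                hits.add(ids.item(j).getTextContent());
            }
            results.add(hits);
        }
        return results;
    }

    public static void main(String[] args) {
        String xml =
              "<BlastOutput><BlastOutput_iterations>"
            + "<Iteration><Iteration_iter-num>1</Iteration_iter-num><Iteration_hits>"
            + "<Hit><Hit_id>hitA</Hit_id></Hit><Hit><Hit_id>hitB</Hit_id></Hit>"
            + "</Iteration_hits></Iteration>"
            + "<Iteration><Iteration_iter-num>2</Iteration_iter-num><Iteration_hits>"
            + "<Hit><Hit_id>hitC</Hit_id></Hit>"
            + "</Iteration_hits></Iteration>"
            + "</BlastOutput_iterations></BlastOutput>";
        System.out.println(hitsPerIteration(xml)); // prints [[hitA, hitB], [hitC]]
    }
}
```

Each inner list plays the role that a separate SeqSimilaritySearchResult would play in the patched parser; search-specific parameters could be attached per Iteration in the same way.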