From gwaldon at geneinfinity.org Thu Mar 1 00:01:38 2007
From: gwaldon at geneinfinity.org (george waldon)
Date: Wed, 28 Feb 2007 21:01:38 -0800
Subject: [Biojava-dev] Problem with RichSequence.IOTools.readGenbankDNA
Message-ID: <200703010501.l2151dk5093954@mmm1924.dulles19-verio.com>

Hi,

I have a simple problem with the genbank parser. It returns a RichSequence with a description that sometimes contains newline characters derived from the initial genbank file (keyword DEFINITION and following description, which encompasses one or several lines). This is particularly annoying when converting to other file formats.

I am not familiar with the parsers; maybe someone could volunteer to correct this problem.

Thanks,
George

From mark.schreiber at novartis.com Thu Mar 1 00:23:46 2007
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Thu, 1 Mar 2007 13:23:46 +0800
Subject: [Biojava-dev] Problem with RichSequence.IOTools.readGenbankDNA
Message-ID:

Hi George -

I think someone will fix it promptly, but to make sure it gets done and doesn't happen again, could you submit a bug report with an example accession?

We recently decided that fixes to bug reports should be accompanied by JUnit tests, so that the bug gets tested with every new build and the same issues do not creep back in.

Thanks,

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com
www.dengueinfo.org
phone +65 6722 2973
fax +65 6722 2910

_______________________________________________
biojava-dev mailing list
biojava-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-dev

From bugzilla-daemon at portal.open-bio.org Thu Mar 1 21:58:50 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Mar 2007 21:58:50 -0500
Subject: [Biojava-dev] [Bug 2223] New: newline characters in RichSequence description
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2223

           Summary: newline characters in RichSequence description
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

The biojavax genbank parser returns a RichSequence with a description that sometimes contains newline characters derived from the initial genbank file (keyword DEFINITION and following description, which encompasses one or several lines). This is particularly annoying when converting to other file formats. It might be the same with EMBL files.

This is reproduced by parsing any genbank file whose DEFINITION entry spans multiple lines with RichSequence.IOTools.readGenbankDNA, e.g. try AAB31603.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Mar 2 05:12:05 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 2 Mar 2007 05:12:05 -0500
Subject: [Biojava-dev] [Bug 2223] newline characters in RichSequence description
In-Reply-To:
Message-ID: <200703021012.l22AC51i013336@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2223

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

------- Comment #1 from holland at ebi.ac.uk 2007-03-02 05:12 EST -------
This newline stuff is annoying, admittedly, but it is deliberate. There are a number of formats out there which rely on newlines being preserved within comment or description text, such as the Genbank-like format produced by Vectorbase. Also, some comments or descriptions are meaningless if the newlines are not preserved (consider a description or comment block that makes an ordered list of points, one per line, but those points do not take up the whole line - if the newlines were ignored and the block reformatted on output then the points would run into each other).

I originally had this code to drop the newlines, but there were too many problems due to the above situations, and so I made it preserve them. Granted this makes transfers to other formats a little messy, sometimes introducing newlines where they are not necessary, but it preserves things much better.

If you need the newlines taken out, a simple search-and-replace on the text would do it:

    mydesc = mydesc.replaceAll("\n","");

If you then need to update the record before formatting it you can update (or replace) the comment object involved.
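A note on the search-and-replace suggested in comment #1: replacing "\n" with an empty string can join the last word of one line to the first word of the next when the line break was the only separator. The sketch below is one possible variant, not BioJava code; the class and method names are made up for illustration, and it uses only java.lang.String. It collapses each line break, together with any surrounding indentation, into a single space.

```java
// Hypothetical helper (not part of BioJava): flattens a multi-line description
// for export to formats that expect a single-line description.
class DescriptionCleaner {

    // Replace every run of whitespace that contains a line break with one
    // space, then trim, so words on adjacent lines do not run together.
    static String flatten(String description) {
        if (description == null) {
            return null;
        }
        return description.replaceAll("\\s*[\\r\\n]+\\s*", " ").trim();
    }

    public static void main(String[] args) {
        // Illustrative input only: two lines with continuation indentation.
        String desc = "first line of a definition\n            second line.";
        System.out.println(flatten(desc));
    }
}
```

Because some records rely on their line breaks, as the comment explains, a helper like this is probably best applied only at export time and to a copy of the description, leaving the parsed record untouched.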
From holland at ebi.ac.uk Mon Mar 5 04:32:14 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Mon, 05 Mar 2007 09:32:14 +0000
Subject: [Biojava-dev] Google Summer of Code
Message-ID: <45EBE39E.9010402@ebi.ac.uk>

The Phyloinformatics Hackathon group, which aims to foster collaboration among the Bio* toolkit developers (see http://phyloinformatics.net/ ), is preparing to apply for the Google Summer of Code program. The page for collecting ideas etc is at

http://phyloinformatics.net/Phyloinformatics_Summer_of_Code_2007

This is mostly a stub right now but will rapidly (have to) be fleshed out over the next couple of days (the deadline for application is March 9). Please feel free to add ideas that you have directly (wiki registration is open), or email them to Hilmar Lapp (hlapp at duke dot edu) who is managing the program.

If we are accepted, we'll (hopefully) have students over the summer, some of whom will possibly work on BioPerl and BioJava-related projects. These may be newcomers to BioJava, newcomers to distributed OSS development, or even programming newbies ... Given the helping hand that this community has readily extended to newbies in the past, I'm hoping that you'll help us help them overcome the initial barriers, too. Also, this offers us a chance to get work done on BioJava and find future contributors with the financial and logistic help of Google.

If anyone is willing to go beyond that and would be willing to help out as a mentor (or back-up mentor), that'd be great; just send me an email.

cheers,
Richard

From darin.london at duke.edu Tue Mar 6 11:03:59 2007
From: darin.london at duke.edu (Darin London)
Date: Tue, 06 Mar 2007 11:03:59 -0500
Subject: [Biojava-dev] Announcing BOSC 2007
Message-ID: <45ED90EF.7030000@duke.edu>

The BOSC Organizing Committee is proud to announce BOSC 2007, taking place in Vienna, Austria on July 19th and 20th. The conference this year promises to be exciting, as the BOSC developers attempt to define and solve currently intractable problems in Bioinformatics. Please refer to the following website for complete information and requests for submissions. Thank you, and we hope to see you in Vienna.

http://open-bio.org/wiki/BOSC_2007

The BOSC Organizing Committee

Please pass this email on to anyone who would be interested.

From bugzilla-daemon at portal.open-bio.org Tue Mar 13 20:48:31 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 13 Mar 2007 20:48:31 -0400
Subject: [Biojava-dev] [Bug 2234] New: Error during "rich" conversion of genbank to EMBL format
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

           Summary: Error during "rich" conversion of genbank to EMBL format
           Product: BioJava
           Version: unspecified
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

An EMBL file generated by RichSequence.IOTools.writeEMBL is not read by RichSequence.IOTools.readEMBLDNA.

To reproduce, get AY847516 from NCBI (a CTLA-4 mAb); read with RichSequence.IOTools.readGenbankDNA, write with writeEMBL, and then read again with readEMBLDNA. An IO exception is thrown at the first line with the message "error during parsing". The first line has a suspect null data class:

    ID AY847516; linear; mRNA; null; PRI; 362 BP

The problem does not appear when AY847516 is queried from EMBL, written and read in EMBL format:

    ID AY847516; linear; mRNA; STD; HUM; 362 BP

Thanks,
George

From bugzilla-daemon at portal.open-bio.org Wed Mar 14 04:23:55 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 14 Mar 2007 04:23:55 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703140823.l2E8NtJ0031964@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-14 04:23 EST -------
Thanks for pointing this out. It was due to the data class being an EMBL-specific thing and not being populated when reading from other formats such as Genbank. I have made EMBLFormat use a default value of STD for the data class in these cases. This should solve the problem.
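The fallback described in comment #1 can be pictured with a small sketch. This is not the EMBLFormat source: the class, method and parameter names are invented for illustration, and the three spaces after "ID" are an assumption about column layout, since whitespace is collapsed in this archive. The field order follows the current EMBL ID line layout (accession; SV version; topology; molecule type; data class; division; length BP.).

```java
// Hypothetical sketch of writing an EMBL ID line with a default data class.
// None of these names come from BioJava.
class EmblIdLine {

    // Data class used when the record came from a format (e.g. Genbank)
    // that has no such field.
    static final String DEFAULT_DATA_CLASS = "STD";

    static String format(String accession, int version, boolean circular,
                         String moltype, String dataClass, String division,
                         int length) {
        // Never let a null reach the output as the literal text "null".
        String dc = (dataClass == null) ? DEFAULT_DATA_CLASS : dataClass;
        return "ID   " + accession + "; SV " + version + "; "
                + (circular ? "circular" : "linear") + "; "
                + moltype + "; " + dc + "; " + division + "; "
                + length + " BP.";
    }

    public static void main(String[] args) {
        // Values taken from the ID line quoted in the bug report; the
        // data class is deliberately left null.
        System.out.println(format("AY847516", 1, false, "mRNA", null, "PRI", 362));
    }
}
```

Choosing the default at write time, rather than storing STD on the sequence when it is read, keeps records that never pass through EMBL output free of an EMBL-specific annotation.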
From bugzilla-daemon at portal.open-bio.org Wed Mar 14 14:12:51 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 14 Mar 2007 14:12:51 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703141812.l2EICp56032597@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

gwaldon at geneinfinity.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

------- Comment #2 from gwaldon at geneinfinity.org 2007-03-14 14:12 EST -------
Thanks Richard for the fix, but it seems that other problems were uncovered.

First, doing the read/write/read operation with EMBL format only, I got "Bad ID line found":

    ID AY847516; linear; mRNA; STD; HUM; 362 BP.

I added the version by hand:

    ID AY847516; SV 1; linear; mRNA; STD; HUM; 362 BP.

and I got "Current bioentry already has a version". I removed the SV line, and I got "Bad date line found...", which corresponds to the following line:

    DT 01-JUN-2005 (Rel. 84 Created)

Finally, I added a comma after 84, and the file parsed without error. In the process, I noticed that the first reference lost the last character of the last RT line (missing 's' at the end of lymphocytes); note that the second reference does not have RT lines.

Secondly, doing the read/write/read operation from Genbank to EMBL format, I got a similar progression; the bad date line now has a 0 instead of 84, but adding the comma after 0 resolved the reading problem. Then I got a new error:

    Could not read sequence For input string: "GI:61815557"

which is generated by the following FT line:

    FT /db_xref="taxon:GI:61815557"

I did not investigate further at this point. I noticed that the above reference is now complete, but is incomplete when I do the read/write/read operation from EMBL to Genbank. Therefore, this is probably a read problem in the EMBL parser.

From zbhuang2005 at hotmail.com Wed Mar 14 15:08:03 2007
From: zbhuang2005 at hotmail.com (Huang Zhibin)
Date: Wed, 14 Mar 2007 19:08:03 +0000
Subject: [Biojava-dev] tandem repeats in biojava
Message-ID:

Hi,

I just wonder whether biojava implements a function for finding exact/approximate tandem repeats in DNA. I thought it would be a typical DNA function, but I could not find anything about it in biojava.

Hope someone can give me some hints. Thanks

Best,
zbhuang

From joel.pitt at gmail.com Wed Mar 14 17:13:00 2007
From: joel.pitt at gmail.com (Joel Pitt)
Date: Thu, 15 Mar 2007 10:13:00 +1300
Subject: [Biojava-dev] tandem repeats in biojava
In-Reply-To:
References:
Message-ID:

Hi zbhuang,

When I last checked, a couple of years ago, there wasn't anything like that as part of the core biojava.

I'm the author of http://repeatfinder.sourceforge.net/ which is an approximate tandem/exact repeatfinder using our own searching algorithm.

I was planning to port it into biojava (as well as build a nice GUI for it), but as is usually the case, I haven't had the time. The RepeatFinder source code is available though, so feel free to port it yourself ;)

Cheers,
Joel

--
-Joel

"Unless you try to do something beyond what you have mastered, you will never grow." -C.R. Lawton

From bugzilla-daemon at portal.open-bio.org Wed Mar 14 19:02:49 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 14 Mar 2007 19:02:49 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703142302.l2EN2ngu013096@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

------- Comment #3 from holland at ebi.ac.uk 2007-03-14 19:02 EST -------
Good grief, what a can of worms.... I'll get onto it on Friday when I'm back in the office!

From bugzilla-daemon at portal.open-bio.org Thu Mar 15 06:11:58 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 15 Mar 2007 06:11:58 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703151011.l2FABw47005920@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

------- Comment #4 from holland at ebi.ac.uk 2007-03-15 06:11 EST -------
I've identified the problem areas in the EMBLFormat code. I will fix them tomorrow.

Note to self:

    if (format.equals(EMBL_FORMAT)) {
        // accession; SV version; circular/linear; moltype; dataclass; division; length BP.
        locusLine.append(rs.getAccession());
        locusLine.append("; ");
--- INSERT "SV versionLine;" here
        locusLine.append(rs.getCircular()?"circular; ":"linear; ");
        locusLine.append(moltype);

    // version line
--- WRAP in test - only execute if EMBL_PRE87_FORMAT
    if (versionLine!=null) StringTools.writeKeyValueLine(VERSION_TAG, versionLine, 5, this.getLineWidth(), null, VERSION_TAG, this.getPrintStream());
    else StringTools.writeKeyValueLine(VERSION_TAG, accession+"."+rs.getVersion(), 5, this.getLineWidth(), null, VERSION_TAG, this.getPrintStream());
    this.getPrintStream().println(DELIMITER_TAG+" ");

    // date line
--- INSERT comma before space before Created
    StringTools.writeKeyValueLine(DATE_TAG, (cdat==null?udat:cdat)+" (Rel. "+(crel==null?"0":crel)+" Created)", 5, this.getLineWidth(), null, DATE_TAG, this.getPrintStream());

--- CHECK that leading space chomped first - does this record start in space then "? It ends in "; on last line but does not start on same line
    if (key.equals(TITLE_TAG)) {
        if (val.length()>1) {
            if (val.endsWith(";")) val = val.substring(0,val.length()-1); // chomp semicolon
            if (val.endsWith("\"")) val = val.substring(1,val.length()-2); // chomp quotes
            title = val;
        } else title=null; // single semi-colon indicates no title
    }

    // add-in other dbxrefs where present
    for (Iterator j = f.getRankedCrossRefs().iterator(); j.hasNext(); ) {
        RankedCrossRef rcr = (RankedCrossRef)j.next();
        CrossRef cr = rcr.getCrossRef();
--- REMOVE taxon: prefix - shouldn't be there...
        StringTools.writeKeyValueLine(FEATURE_TAG, "/db_xref=\"taxon:"+cr.getDbname()+":"+cr.getAccession()+"\"", 21, this.getLineWidth(), null, FEATURE_TAG, this.getPrintStream());
    }
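The title handling quoted in the note is a plausible source of the lost final character reported in comment #2: once the semicolon is removed, substring(1, val.length()-2) drops the closing quote and the character before it. The sketch below shows one way the chomping could be written. It is an illustration with a made-up class name, not the fix that was committed, and it assumes the whole title value has already been joined into one string.

```java
// Hypothetical sketch (not the committed BioJava fix): strip the trailing
// semicolon and the surrounding double quotes from an EMBL RT value.
class TitleChomper {

    // Returns the bare title, or null when the value is a lone semicolon,
    // which marks a reference without a title.
    static String chomp(String val) {
        if (val == null) {
            return null;
        }
        String v = val.trim(); // remove leading/trailing spaces first
        if (v.endsWith(";")) {
            v = v.substring(0, v.length() - 1).trim(); // chomp semicolon
        }
        if (v.isEmpty()) {
            return null; // nothing left: no title
        }
        if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
            // substring's end index is exclusive, so length()-1 removes
            // only the closing quote and keeps the last title character.
            v = v.substring(1, v.length() - 1);
        }
        return v;
    }

    public static void main(String[] args) {
        // Illustrative value, not the actual AY847516 title.
        System.out.println(chomp("  \"activated lymphocytes\";"));
    }
}
```

Checking both startsWith and endsWith before removing the quotes avoids cutting a character from a value that carries only one of the two quotes.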
From bugzilla-daemon at portal.open-bio.org Fri Mar 16 05:50:27 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 16 Mar 2007 05:50:27 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703160950.l2G9oR2a005733@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #5 from holland at ebi.ac.uk 2007-03-16 05:50 EST -------
Fixes made and committed.

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 14:38:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 14:38:34 -0400
Subject: [Biojava-dev] [Bug 2244] New: uniprot files do not load
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

           Summary: uniprot files do not load
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

Exceptions are thrown when the biojavax parser attempts to read UniProt files.

Reading P01717 or P07724 (the latter initially reported by Sofia Burvall) with RichSequence.IOTools.readUniProt throws:

    Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:393)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)

Reading P13346 throws:

    Caused by: org.biojava.bio.seq.io.ParseException: Bad date line found: 01-JAN-1990 (Rel. 13, Created)
        at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:349)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 15:07:57 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 15:07:57 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703211907.l2LJ7v5u010410@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |holland at ebi.ac.uk
             Status|NEW                         |ASSIGNED

------- Comment #1 from holland at ebi.ac.uk 2007-03-21 15:07 EST -------
I don't understand what the problem is. If I download UniProt files for the 3 accessions you mention from the official UniProt website at http://www.pir.uniprot.org/ then I get files that parse correctly.

The date lines shown in your example are not in UniProt format. They are EMBL format date lines. Where are you getting your UniProt files from? This sounds like a bug in the software at the site you are getting them from, not a bug in BioJava.

Please let me know where you are getting the files from and I will investigate a bit more.
From bugzilla-daemon at portal.open-bio.org Wed Mar 21 16:15:56 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 16:15:56 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212015.l2LKFu2x014547@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #2 from gwaldon at geneinfinity.org 2007-03-21 16:15 EST -------
Hi Richard,

We must have a mix-up of some sort. The bad date line was reported when I tried to parse the file included in the email from Sofia (it was posted on biojava-l on March 16, 2007). Indeed, this file is different from the one I get at the UniProt web site. Reading Sofia's email, it is possible that the incorrect file was in fact generated by biojavax (in the same kind of read/write/read operation as the problem that was corrected recently).

I am going to investigate the other exception. The 2 files were from the correct source (http://www.expasy.org/uniprot/P01717, click on "View entry in original UniProtKB/Swiss-Prot format", etc.). I'll get back on this later.

- George
From bugzilla-daemon at portal.open-bio.org Wed Mar 21 16:27:15 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 16:27:15 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212027.l2LKRFrc015247@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #3 from gwaldon at geneinfinity.org 2007-03-21 16:27 EST -------
Well, the 2 files are respectively from:

http://www.expasy.org/cgi-bin/get-sprot-entry?P07724
http://www.expasy.org/cgi-bin/get-sprot-entry?P01717

The ArrayIndexOutOfBoundsException is thrown for the following lines:

P07724:
    DR ProDom [Domain structure / List of seq. sharing at least 1 domain ]

P01717:
    DR HOVERGEN [Family / Alignment / Tree]

Hope this helps!

- George

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 17:53:26 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 17:53:26 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212153.l2LLrQ8X020150@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #4 from holland at ebi.ac.uk 2007-03-21 17:53 EST -------
Thanks for that extra info! I'll investigate further over the next couple of days and get back to you when I've got a solution.
From bugzilla-daemon at portal.open-bio.org Thu Mar 22 08:19:20 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 22 Mar 2007 08:19:20 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703221219.l2MCJKIM028661@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #5 from holland at ebi.ac.uk 2007-03-22 08:19 EST -------
I have investigated the two problem DR lines and found that they do not match the current UniProt file format specifications as defined at http://www.expasy.org/sprot/userman.html#DR_line . There is no indication in the UniProt file format specification as to what a DR line in this format might actually mean.

As these files came from UniProt themselves, I think the best thing to do is for you to contact UniProt and raise a bug report with them directly indicating that their website is producing files that do not conform to their own file format standards.

BJX can only parse files which follow the official format definition. There's only so much flexibility we can build in! :)

I'll leave this bug as ASSIGNED for now so that you can add any comments regarding what UniProt say about their bug.
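Even when a file is out of spec, a parser can report the offending line instead of failing with an ArrayIndexOutOfBoundsException. The sketch below is a guess at the failure mode, not the UniProtFormat source: it assumes the DR line is split on semicolons (the user manual describes DR lines as a database name followed by semicolon-separated identifiers and a closing period) and that the parser then reads a field that is not there. The class name is made up, IllegalArgumentException stands in for BioJava's ParseException so the example stays self-contained, and the well-formed line in main uses placeholder values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of defensive DR-line parsing; not BioJava code.
class DrLineParser {

    // Splits "DR   DATABASE; ID1; ID2." into its fields. A line without
    // at least a database name and one identifier is rejected with a
    // message that quotes the line.
    static List<String> parse(String line) {
        if (line == null || !line.startsWith("DR")) {
            throw new IllegalArgumentException("Not a DR line: " + line);
        }
        String body = line.substring(2).trim();
        if (body.endsWith(".")) {
            body = body.substring(0, body.length() - 1); // drop final period
        }
        String[] parts = body.split(";");
        if (parts.length < 2) {
            // Check the field count before indexing, so the caller gets a
            // readable error rather than an ArrayIndexOutOfBoundsException.
            throw new IllegalArgumentException("Bad DR line found: " + line);
        }
        List<String> fields = new ArrayList<String>();
        for (String part : parts) {
            fields.add(part.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Placeholder database name and identifiers, for illustration only.
        System.out.println(parse("DR   SOMEDB; ID1; ID2."));
        try {
            parse("DR   HOVERGEN [Family / Alignment / Tree]");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A message of this kind would have pointed directly at the two lines that George had to locate by hand.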
From bugzilla-daemon at portal.open-bio.org Mon Mar 26 04:23:09 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 26 Mar 2007 04:23:09 -0400
Subject: [Biojava-dev] [Bug 2250] New: Genbank format error
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2250

           Summary: Genbank format error
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: mark.schreiber at novartis.com

I think that the following error was caused by the last fix to GenbankFormat:

    Format_object=org.biojavax.bio.seq.io.GenbankFormat
    Accession=null
    Id=null
    Comments=Bad locus line
    Parse_block=LOCUS NM_182008 1629 bp mRNA linear INV 23-MAR-2007
    Stack trace follows ....
        at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:323)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
        ... 2 more

From bugzilla-daemon at portal.open-bio.org Mon Mar 26 10:41:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 26 Mar 2007 10:41:40 -0400
Subject: [Biojava-dev] [Bug 2250] Genbank format error
In-Reply-To:
Message-ID: <200703261441.l2QEfent022095@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2250

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-26 10:41 EST -------
Fixed. Now testing before commit.

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 00:47:10 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 00:47:10 -0400
Subject: [Biojava-dev] [Bug 2253] New: NullPointerException in MultiSourceCompoundRichLocation
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

           Summary: NullPointerException in MultiSourceCompoundRichLocation
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

More exactly, the exception occurs in the blockIterator method inherited from CompoundRichLocation, where the "sortedMembers" variable is null. This set is actually never instantiated; see the constructor.

Also, I don't understand the function of the "sortedMembers" list; why not sort the "members" list from the start instead?

- George

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 01:27:41 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 01:27:41 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703290527.l2T5Rfu3010839@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #6 from gwaldon at geneinfinity.org 2007-03-29 01:27 EST -------
The problem is limited to the Expasy server. I am trying to get in touch with someone over there.

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 04:54:26 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 04:54:26 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703290854.l2T8sQkH019389@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-29 04:54 EST -------
The only way that sortedMembers can be null is if the object is loaded in via Hibernate. This is a bug which I have now fixed in CVS.

PS. In order to reproduce output files that are largely the same as the input files these locations are read from, whilst simultaneously providing a sensible iterator for the user, it is important to maintain two separate orders. If the user ever calls sort() on the location, the two orders become the same.

PPS. There is NO test in BJX for this bug as it is Hibernate-related and therefore almost impossible to write a test case for!
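The "two separate orders" idea from the PS can be sketched in a few lines. This is a simplified illustration, not the CompoundRichLocation implementation: the class and its nested Block type are hypothetical, and only the names members, sortedMembers, blockIterator and sort are borrowed from the discussion. Both lists are created in the constructor, so neither can be null when the object is built this way.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: keep members in file order for output and a
// sorted copy for iteration. Not BioJava code.
class TwoOrderLocation {

    // Minimal stand-in for a location block.
    static final class Block {
        final int start;
        final int end;

        Block(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private final List<Block> members;       // order as read from the file
    private final List<Block> sortedMembers; // ordered by start position

    TwoOrderLocation(List<Block> input) {
        // Initialise both lists up front so neither is ever null.
        this.members = new ArrayList<Block>(input);
        this.sortedMembers = new ArrayList<Block>(input);
        Collections.sort(this.sortedMembers, new Comparator<Block>() {
            public int compare(Block a, Block b) {
                return a.start < b.start ? -1 : (a.start > b.start ? 1 : 0);
            }
        });
    }

    // File order, used when writing the record back out.
    List<Block> getMembers() {
        return Collections.unmodifiableList(members);
    }

    // Sorted order, used when the caller walks along the sequence.
    Iterator<Block> blockIterator() {
        return Collections.unmodifiableList(sortedMembers).iterator();
    }

    // After an explicit sort the two orders are the same.
    void sort() {
        members.clear();
        members.addAll(sortedMembers);
    }
}
```

Objects restored by a persistence layer may bypass such a constructor, which is the situation the comment describes for Hibernate; a lazy null check in the accessor would be one way to cover that path.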
From bugzilla-daemon at portal.open-bio.org Thu Mar 29 12:08:42 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 29 Mar 2007 12:08:42 -0400 Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation In-Reply-To: Message-ID: <200703291608.l2TG8gWo008372@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2253 gwaldon at geneinfinity.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |VERIFIED ------- Comment #2 from gwaldon at geneinfinity.org 2007-03-29 12:08 EST ------- The problem was in MultiSourceCompoundRichLocation and not its parent class. This resulted in the second set to be null from both public constructors. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Mar 29 13:13:40 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 29 Mar 2007 13:13:40 -0400 Subject: [Biojava-dev] [Bug 2255] New: Problems with read and write of EMBL files Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2255 Summary: Problems with read and write of EMBL files Product: BioJava Version: live (CVS source) Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: seq.io AssignedTo: biojava-dev at biojava.org ReportedBy: gwaldon at geneinfinity.org I just noticed two problems with EMBL files. During read operations, the first keyword (KW line) is not parsed. Also during write operations, the sequence line (SQ) is missing the word "Sequence" in the output. 
- George -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From gwaldon at geneinfinity.org Fri Mar 30 01:28:51 2007 From: gwaldon at geneinfinity.org (george waldon) Date: Thu, 29 Mar 2007 22:28:51 -0700 Subject: [Biojava-dev] Exception in MassCalc.getMass Message-ID: <20070330052851.16176.qmail@mmm1924.dulles19-verio.com> I got a NullPointerException in MassCalc.getMass that I traced to the presence of an ambiguity symbol X in the peptide sequence. The origin lies in SimpleSymbolPropertyTable.getDoubleValue. The ambiguity symbol is first validated (through its matches in AbstractAlphabet) and the exception is thrown when we try to find a value associated with a null key in the property map. First, I think we should redirect this exception to the checked IllegalSymbolException to tell the user that something is going wrong. Concerning calculating masses, I see two needs: the need to estimate the mass of polypeptides containing ambiguity symbols, and the need to obtain either an exact mass or an IllegalSymbolException when ambiguity occurs. I propose to add a new method MassCalc.getEstimatedMass to do the first part and to modify the existing method so that an IllegalSymbolException is thrown whenever a non-atomic symbol is encountered. Mass values for ambiguity symbols would be added to the ResidueProperty.xml file and taken as the simple average of the atomic match values. - George From mark.schreiber at novartis.com Fri Mar 30 01:43:05 2007 From: mark.schreiber at novartis.com (mark.schreiber at novartis.com) Date: Fri, 30 Mar 2007 13:43:05 +0800 Subject: [Biojava-dev] Exception in MassCalc.getMass Message-ID: Hi George - Sounds like a good idea. Can you check it in? And add a JUnit test??? You probably don't need to add ambiguity weights to the XML as you can decompose a basis symbol into its components, look up each and get the average.
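A minimal sketch of that decompose-and-average idea, written in plain Java rather than against the BioJava API. Everything here is an assumption made for illustration: the class EstimatedMassSketch, the use of Character keys for residues, IllegalArgumentException standing in for BioJava's checked IllegalSymbolException, and the rounded residue masses in the usage note.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of MassCalc.
final class EstimatedMassSketch {
    /**
     * Averages the masses of the atomic residues an ambiguity symbol matches.
     * Fails loudly instead of letting a null lookup surface as a
     * NullPointerException later on.
     */
    static double averageMass(Set<Character> atomicMatches, Map<Character, Double> massTable) {
        if (atomicMatches.isEmpty()) {
            throw new IllegalArgumentException("symbol has no atomic matches");
        }
        double sum = 0.0;
        for (Character residue : atomicMatches) {
            Double mass = massTable.get(residue);
            if (mass == null) {
                throw new IllegalArgumentException("no mass known for residue " + residue);
            }
            sum += mass;
        }
        return sum / atomicMatches.size();
    }
}
```

For example, the IUPAC ambiguity code B (Asp or Asn) would be passed as the set {D, N} together with a table holding approximate residue masses such as D = 115.03 and N = 114.04. Because the average is computed from whatever set of matches is supplied, no per-ambiguity entry is needed in the XML file.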
In fact there are so many possible amino acid ambiguities that it would be a very bad idea to list them all in the XML file. Although IUPAC only has about 3 amino acid ambiguities, BioJava's BasisSymbols can in fact represent any and every possible combination of ambiguity (although there is no obvious way to tokenize them to text), so you would need to allow for this in your getEstimatedMass method. Thanks for spotting this, - Mark Mark Schreiber Research Investigator (Bioinformatics) Novartis Institute for Tropical Diseases (NITD) 10 Biopolis Road #05-01 Chromos Singapore 138670 www.nitd.novartis.com phone +65 6722 2973 fax +65 6722 2910 "george waldon" Sent by: biojava-dev-bounces at lists.open-bio.org 03/30/2007 01:28 PM Please respond to george waldon To: biojava-dev at biojava.org cc: (bcc: Mark Schreiber/GP/Novartis) Subject: [Biojava-dev] Exception in MassCalc.getMass I got a NullPointerException in MassCalc.getMass that I traced to the presence of an ambiguity symbol X in the peptide sequence. The origin lies in SimpleSymbolPropertyTable.getDoubleValue. The ambiguity symbol is first validated (through its matches in AbstractAlphabet) and the exception is thrown when we try to find a value associated with a null key in the property map. First, I think we should redirect this exception to the checked IllegalSymbolException to tell the user that something is going wrong. Concerning calculating masses, I see two needs: the need to estimate the mass of polypeptides containing ambiguity symbols, and the need to obtain either an exact mass or an IllegalSymbolException when ambiguity occurs. I propose to add a new method MassCalc.getEstimatedMass to do the first part and to modify the existing method so that an IllegalSymbolException is thrown whenever a non-atomic symbol is encountered. Mass values for ambiguity symbols would be added to the ResidueProperty.xml file and taken as the simple average of the atomic match values.
- George _______________________________________________ biojava-dev mailing list biojava-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biojava-dev From bugzilla-daemon at portal.open-bio.org Fri Mar 30 04:51:20 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 04:51:20 -0400 Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation In-Reply-To: Message-ID: <200703300851.l2U8pKbZ016735@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2253 ------- Comment #3 from holland at ebi.ac.uk 2007-03-30 04:51 EST ------- Indeed you're right. I just checked again and the bugfix I applied should have fixed this too. Could you confirm that? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From holland at ebi.ac.uk Fri Mar 30 04:53:02 2007 From: holland at ebi.ac.uk (Richard Holland) Date: Fri, 30 Mar 2007 09:53:02 +0100 Subject: [Biojava-dev] ParseExceptions Message-ID: <460CCFEE.4030609@ebi.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all. Can the person who recently modified the BJX EMBLFormat to throw ParseExceptions using the ParseException.newMessage() method please check all their code to make sure that it is correct. I have noticed that every occurrence of this in EMBLFormat was implemented incorrectly and never threw the generated message. I have fixed it there, but have no idea how many other places this might have happened. 
I have replaced all occurrences with two lines similar to this: String message = ParseException.newMessage(....); throw new ParseException(message); cheers, Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGDM/u4C5LeMEKA/QRAkf3AJ4opHh8SDa+jUIoNnKTc3XgdJzLAACfcci5 LyA41NgjkULPvdy4AeALOhw= =PXKG -----END PGP SIGNATURE----- From bugzilla-daemon at portal.open-bio.org Fri Mar 30 06:29:41 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 06:29:41 -0400 Subject: [Biojava-dev] [Bug 2255] Problems with read and write of EMBL files In-Reply-To: Message-ID: <200703301029.l2UATf4w020994@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2255 holland at ebi.ac.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #1 from holland at ebi.ac.uk 2007-03-30 06:29 EST ------- Fixed in CVS. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Mar 30 11:39:11 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 11:39:11 -0400 Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation In-Reply-To: Message-ID: <200703301539.l2UFdB29001755@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2253 ------- Comment #4 from gwaldon at geneinfinity.org 2007-03-30 11:39 EST ------- Yes, it is fixed. I'll try to get the time to write a test on equality of MultiSourceCompoundRichLocation - this is how the bug was revealed. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Mar 30 11:45:38 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 11:45:38 -0400 Subject: [Biojava-dev] [Bug 2244] uniprot files do not load In-Reply-To: Message-ID: <200703301545.l2UFjcY8002022@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2244 gwaldon at geneinfinity.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #7 from gwaldon at geneinfinity.org 2007-03-30 11:45 EST ------- Following is the comment of Severine Duvaud from Expasy. The links she mentioned are found on the NiceProt View; here is an example link: http://www.expasy.org/cgi-bin/niceprot.pl?P01717 ====================================================================== When clicking on "View entry in original UniProtKB/Swiss-Prot format", you get an "enhanced" view of the original entry (compared to the "raw" entry view as pointed out by the other link), notably with implicit links. The standard entry, in (hopefully) regular Swiss-Prot format, can be found when clicking on "View entry in raw text format (no link)" or by downloading uniprot_sprot.dat on our FTP server. Sorry for the confusion. Best regards, Severine -- ************************************** Severine Duvaud, Swiss-Prot Group Swiss Institute of Bioinformatics CMU, 1 rue Michel Servet CH - 1211 Geneva 4 Switzerland Tel.
(+41) 22 379 58 25 Fax (+41) 22 379 58 58 Severine.Duvaud at isb-sib.ch http://www.expasy.org ************************************** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Mar 30 11:46:09 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 11:46:09 -0400 Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation In-Reply-To: Message-ID: <200703301546.l2UFk96q002090@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2253 ------- Comment #5 from holland at ebi.ac.uk 2007-03-30 11:46 EST ------- Great. If you get time to come up with a test, can you add a note to this bug and then close the bug! I'll leave it as verified/fixed in the meantime. Tests should go in an appropriate package under the tests/ folder in biojava-live - you'll get the picture I'm sure just by looking around there to see where other tests have gone (e.g. org.biojavax.bio.seq.io for all BJX sequence i/o related tests). Please do create a new test package if you can't find one that seems appropriate. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Fri Mar 30 11:49:40 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 30 Mar 2007 11:49:40 -0400 Subject: [Biojava-dev] [Bug 2244] uniprot files do not load In-Reply-To: Message-ID: <200703301549.l2UFnekY002285@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2244 holland at ebi.ac.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED ------- Comment #8 from holland at ebi.ac.uk 2007-03-30 11:49 EST ------- Thanks for that. I'm glad I'm not going mad! :) So it seems that Expasy is providing slightly confusing links, and the best route to get files from them for programmatic parsing is to get raw text files wherever possible rather than cutting-and-pasting the enhanced-with-HTML-links versions that they show by default when browsing their database. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From amit.p at ocimumbio.com Tue Mar 13 02:38:52 2007 From: amit.p at ocimumbio.com (Amit Dattatreya Parhar) Date: Tue, 13 Mar 2007 06:38:52 -0000 Subject: [Biojava-dev] ESD megabace trace file parser Message-ID: <004a01c76537$adca6e40$fc01a8c0@ocimumbio.net> Hi, I need a .esd file parser. Can anybody help me out? At least with information on how to parse this type of trace file.
thanks in advance Amit Bioinformatics Analyst amit.p at ocimumbio.com Ocimum Biosolutions ...enabling R&D 6th Floor, Reliance Classic, Road No.1, Banjara Hills, Hyderabad - 500 034, A.P, India Business Phone: 04055627203,ext-132 | Fax: 04055627205 From renauxc.isat at gmail.com Mon Mar 26 09:18:08 2007 From: renauxc.isat at gmail.com (Caroline Renaux) Date: Mon, 26 Mar 2007 15:18:08 +0200 Subject: [Biojava-dev] org.biojava.bio.symbol.UkkonenSuffixTree.class BUG Message-ID: <401c5fc20703260618t3f38eb86yd7d66043399ba69e@mail.gmail.com> Hello, I recently used the org.biojava.bio.symbol package in a Java application, and in particular the UkkonenSuffixTree class. When I want to add a set of sequences to the tree, I separate them with the '$' separator character, but it doesn't work. When the second sequence is processed I get a NullPointerException in the jumpTo method at the line: arrivedAt=(SuffixNode)currentNode.children.get(new Character(source.charAt(from))); I don't understand what I could have done wrong. Thank you in advance for your reply. RENAUX C.
From bugzilla-daemon at portal.open-bio.org Fri Mar 2 10:12:05 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 2 Mar 2007 05:12:05 -0500 Subject: [Biojava-dev] [Bug 2223] newline characters in RichSequence description In-Reply-To: Message-ID: <200703021012.l22AC51i013336@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2223 holland at ebi.ac.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |WONTFIX ------- Comment #1 from holland at ebi.ac.uk 2007-03-02 05:12 EST ------- This newline stuff is annoying, admittedly, but it is deliberate. There are a number of formats out there which rely on newlines being preserved within comment or description text, such as the Genbank-like format produced by Vectorbase. Also, some comments or descriptions are meaningless if the newlines are not preserved (consider a description or comment block that makes an ordered list of points, one per line, but those points do not take up the whole line - if the newlines were ignored and the block reformatted on output then the points would run into each other). I originally had this code to drop the newlines, but there were too many problems due to the above situations, and so I made it preserve them. Granted this makes transfers to other formats a little messy, sometimes introducing newlines where they are not necessary, but it preserves things much better. If you need the newlines taken out, a simple search-and-replace on the text would do it: mydesc = mydesc.replaceAll("\n",""); If you then need to update the record before formatting it you can update (or replace) the comment object involved. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
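A slightly more defensive variant of the search-and-replace suggested in comment #1, as a sketch only. Replacing "\n" with an empty string can fuse the last word of one line with the first word of the next, and it leaves any "\r" from Windows line endings in place. The class and method names below are hypothetical, and the choice to collapse each line break into a single space is an assumption about what a caller converting to another format would want.

```java
// Hypothetical helper; not part of BioJava.
final class DescriptionCleaner {
    /**
     * Collapses every run of line breaks, together with the whitespace
     * around it, into one space, then trims the ends.
     */
    static String flattenDescription(String description) {
        if (description == null) {
            return null;
        }
        return description.replaceAll("\\s*[\\r\\n]+\\s*", " ").trim();
    }

    public static void main(String[] args) {
        String multiLine = "Homo sapiens clone 1\n            mRNA, partial cds.\r\n";
        System.out.println(flattenDescription(multiLine));
    }
}
```

The flattened string can then be written back to the record before formatting, as described above; descriptions whose line breaks carry meaning (such as one list item per line) should be left alone.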
From holland at ebi.ac.uk Mon Mar 5 09:32:14 2007 From: holland at ebi.ac.uk (Richard Holland) Date: Mon, 05 Mar 2007 09:32:14 +0000 Subject: [Biojava-dev] Google Summer of Code Message-ID: <45EBE39E.9010402@ebi.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 The Phyloinformatics Hackathon group, which aims to foster collaboration among the Bio* toolkit developers (see http://phyloinformatics.net/ ), is preparing to apply for the Google Summer of Code program. The page for collecting ideas etc is at http://phyloinformatics.net/Phyloinformatics_Summer_of_Code_2007 This is mostly a stub right now but will rapidly (have to) be fleshed out over the next couple of days (the deadline for application is March 9). Please feel free to add ideas that you have directly (wiki registration is open), or email them to Hilmar Lapp (hlapp at duke dot edu) who is managing the program. If we are accepted, we'll (hopefully) have students over the summer, some of which will possibly work on BioPerl and BioJava-related projects. These may be newcomers to BioJava, newcomers to distributed OSS development, or even programming newbies ... Given the helping hand that this community has readily extended to newbies in the past, I'm hoping that you'll help us help them overcome the initial barriers, too. Also, this offers us a chance to get work done on BioJava and find future contributors with the financial and logistic help of Google. If anyone is willing to go beyond that and would be willing to help out as a mentor (or back-up mentor), that'd be great; just send me an email. 
cheers, Richard -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD4DBQFF6+Oe4C5LeMEKA/QRAp6EAJ9K+Y4Bb1s6zLiaovmApEymvs0eIgCWMJce U+iKoYrFqwBH+aqlrSOXAw== =lSpn -----END PGP SIGNATURE----- From darin.london at duke.edu Tue Mar 6 16:03:59 2007 From: darin.london at duke.edu (Darin London) Date: Tue, 06 Mar 2007 11:03:59 -0500 Subject: [Biojava-dev] Announcing BOSC 2007 Message-ID: <45ED90EF.7030000@duke.edu> The BOSC Organizing Committee are proud to announce BOSC 2007, occurring in Vienna, Austria on July 19th, 20th. The conference this year promises to be exciting, as the BOSC developers attempt to define and solve currently intractable problems in Bioinformatics. Please refer to the following website for complete information, and requests for submissions. Thank you, and we hope to see you in Vienna. http://open-bio.org/wiki/BOSC_2007 The BOSC organizing Committee Please pass this email on to anyone that would be interested. From bugzilla-daemon at portal.open-bio.org Wed Mar 14 00:48:31 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 13 Mar 2007 20:48:31 -0400 Subject: [Biojava-dev] [Bug 2234] New: Error during "rich" conversion of genbank to EMBL format Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2234 Summary: Error during "rich" conversion of genbank to EMBL format Product: BioJava Version: unspecified Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: seq.io AssignedTo: biojava-dev at biojava.org ReportedBy: gwaldon at geneinfinity.org An EMBL file generated by RichSequence.IOTools.writeEMBL is not read by RichSequence.IOTools.readEMBLDNA. To reproduce, get AY847516 from NCBI (a CTLA-4 mAb); read with RichSequence.IOTools.readGenbankDNA, write with writeEMBL, and then read again with readEMBLDNA. 
An IO exception is thrown at the first line with the message "error during parsing". The first line has a suspect null data class (ID AY847516; linear; mRNA; null; PRI; 362 BP). The problem does not appear when AY847516 is queried from EMBL, written and read in EMBL format (ID AY847516; linear; mRNA; STD; HUM; 362 BP). Thanks, George -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Mar 14 08:23:55 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Mar 2007 04:23:55 -0400 Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format In-Reply-To: Message-ID: <200703140823.l2E8NtJ0031964@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2234 holland at ebi.ac.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #1 from holland at ebi.ac.uk 2007-03-14 04:23 EST ------- Thanks for pointing this out. It was due to the data class being an EMBL-specific thing and not being populated when reading from other formats such as Genbank. I have made EMBLFormat use a default value of STD for the data class in these cases. This should solve the problem. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
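The fix described in comment #1 amounts to substituting a default when a format-specific field was never populated. The sketch below illustrates that idea only; EmblIdLine and its format method are hypothetical and are not the EMBLFormat code. The field order (accession; SV version; topology; molecule type; data class; division; length) is assumed from the current EMBL flat-file ID line layout.

```java
// Hypothetical helper; not part of BioJava's EMBLFormat.
final class EmblIdLine {
    // Fallback for records read from formats that carry no data class.
    static final String DEFAULT_DATA_CLASS = "STD";

    static String format(String accession, int version, boolean circular,
                         String moltype, String dataClass, String division, int length) {
        String dc = (dataClass == null || dataClass.isEmpty()) ? DEFAULT_DATA_CLASS : dataClass;
        return "ID   " + accession + "; SV " + version + "; "
                + (circular ? "circular" : "linear") + "; "
                + moltype + "; " + dc + "; " + division + "; " + length + " BP.";
    }

    public static void main(String[] args) {
        // A record read from Genbank has no data class, so the default is used.
        System.out.println(format("AY847516", 1, false, "mRNA", null, "PRI", 362));
    }
}
```

Applying the default at write time keeps the in-memory record faithful to its source while still emitting an ID line with no literal "null" in it.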
From bugzilla-daemon at portal.open-bio.org Wed Mar 14 18:12:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Mar 2007 14:12:51 -0400 Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format In-Reply-To: Message-ID: <200703141812.l2EICp56032597@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2234 gwaldon at geneinfinity.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED Resolution|FIXED | ------- Comment #2 from gwaldon at geneinfinity.org 2007-03-14 14:12 EST ------- Thanks Richard for the fix, but it seems that other problems were uncovered. First, doing the read/write/read operation with EMBL format only, I got "Bad ID line found": ID AY847516; linear; mRNA; STD; HUM; 362 BP. I added the version by hand: ID AY847516; SV 1; linear; mRNA; STD; HUM; 362 BP. and I got "Current bioentry already has a version". I removed the SV line, and I got "Bad date line found...", which corresponds to the following line: DT 01-JUN-2005 (Rel. 84 Created) Finally, I added a comma after 84, and the file parsed without error. In the process, I noticed that the first reference lost the last character of the last RT line (missing 's' at the end of lymphocytes); note that the second reference does not have RT lines. Secondly, doing the read/write/read operation from Genbank to EMBL format, I got a similar progression; the bad date line now has a 0 instead of 84, but adding the comma after 0 resolved the reading problem. Then I got a new error: "Could not read sequence For input string: "GI:61815557"", which is generated by the following FT line: FT /db_xref="taxon:GI:61815557" I did not investigate further at this point. I noticed that the above reference is now complete, but is incomplete when I do the read/write/read operation from EMBL to Genbank.
Therefore, this is probably a read problem in the EMBL parser.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From zbhuang2005 at hotmail.com Wed Mar 14 19:08:03 2007
From: zbhuang2005 at hotmail.com (Huang Zhibin)
Date: Wed, 14 Mar 2007 19:08:03 +0000
Subject: [Biojava-dev] tandem repeats in biojava
Message-ID:

Hi,

I wonder whether BioJava implements detection of exact/approximate tandem repeats in DNA. I thought it was a typical DNA function, but I could not find anything about it in BioJava.

Hope someone can give me some hints. Thanks.

Best,
zbhuang

From joel.pitt at gmail.com Wed Mar 14 21:13:00 2007
From: joel.pitt at gmail.com (Joel Pitt)
Date: Thu, 15 Mar 2007 10:13:00 +1300
Subject: [Biojava-dev] tandem repeats in biojava
In-Reply-To:
References:
Message-ID:

Hi zbhuang,

When I last checked, a couple of years ago, there wasn't anything like that in the core BioJava. I'm the author of http://repeatfinder.sourceforge.net/, an approximate tandem/exact repeat finder that uses our own searching algorithm.

I was planning to port it into BioJava (as well as build a nice GUI for it) but, as is usually the case, I haven't had the time. The RepeatFinder source code is available though, so feel free to port it yourself ;)

Cheers,
Joel

On 3/15/07, Huang Zhibin wrote:
> Hi,
> I just wonder whether biojava implements the function of exact/approximate
> tandem repeats in DNA. I thought it is a typical function of dna, but I
> could not find anything about it in biojava.
> Hope someone can give me some hints. Thanks
> Best,
> zbhuang

--
-Joel

"Unless you try to do something beyond what you have mastered, you will never grow." -C.R. Lawton

From bugzilla-daemon at portal.open-bio.org Wed Mar 14 23:02:49 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 14 Mar 2007 19:02:49 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703142302.l2EN2ngu013096@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

------- Comment #3 from holland at ebi.ac.uk 2007-03-14 19:02 EST -------
Good grief, what a can of worms... I'll get onto it on Friday when I'm back in the office!

From bugzilla-daemon at portal.open-bio.org Thu Mar 15 10:11:58 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 15 Mar 2007 06:11:58 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703151011.l2FABw47005920@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

------- Comment #4 from holland at ebi.ac.uk 2007-03-15 06:11 EST -------
I've identified the problem areas in the EMBLFormat code. I will fix them tomorrow.

Note to self:

        if (format.equals(EMBL_FORMAT)) {
            // accession; SV version; circular/linear; moltype; dataclass; division; length BP.
            locusLine.append(rs.getAccession());
            locusLine.append("; ");
--- INSERT "SV versionLine;" here
            locusLine.append(rs.getCircular()?"circular; ":"linear; ");
            locusLine.append(moltype);

        // version line
--- WRAP in test - only execute if EMBL_PRE87_FORMAT
        if (versionLine!=null) StringTools.writeKeyValueLine(VERSION_TAG, versionLine, 5, this.getLineWidth(), null, VERSION_TAG, this.getPrintStream());
        else StringTools.writeKeyValueLine(VERSION_TAG, accession+"."+rs.getVersion(), 5, this.getLineWidth(), null, VERSION_TAG, this.getPrintStream());
        this.getPrintStream().println(DELIMITER_TAG+" ");

        // date line
--- INSERT comma before space before Created
        StringTools.writeKeyValueLine(DATE_TAG, (cdat==null?udat:cdat)+" (Rel. "+(crel==null?"0":crel)+" Created)", 5, this.getLineWidth(), null, DATE_TAG, this.getPrintStream());

--- CHECK that leading space chomped first - does this record start in space then "? It ends in "; on last line but does not start on same line
            if (key.equals(TITLE_TAG)) {
                if (val.length()>1) {
                    if (val.endsWith(";")) val = val.substring(0,val.length()-1); // chomp semicolon
                    if (val.endsWith("\"")) val = val.substring(1,val.length()-2); // chomp quotes
                    title = val;
                } else title=null; // single semi-colon indicates no title
            }

        // add-in other dbxrefs where present
        for (Iterator j = f.getRankedCrossRefs().iterator(); j.hasNext(); ) {
            RankedCrossRef rcr = (RankedCrossRef)j.next();
            CrossRef cr = rcr.getCrossRef();
--- REMOVE taxon: prefix - shouldn't be there...
            StringTools.writeKeyValueLine(FEATURE_TAG, "/db_xref=\"taxon:"+cr.getDbname()+":"+cr.getAccession()+"\"", 21, this.getLineWidth(), null, FEATURE_TAG, this.getPrintStream());
        }
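
[Editorial sketch, not part of the original thread] The title-chomping concern in the note above can be illustrated in isolation. The class and method names below (TitleChomp, chompTitle) are hypothetical and this is not the fix that was committed to EMBLFormat. It assumes the intent is: trim whitespace first, drop a trailing semicolon, then strip one pair of surrounding quotes. Note that in the quoted excerpt, substring(1, val.length()-2) also drops the character before the closing quote; the sketch uses length()-1 on the assumption that only the quotes should go.

```java
// Hypothetical helper, for illustration only; not the BioJava implementation.
final class TitleChomp {
    private TitleChomp() {}

    /** Returns the cleaned title, or null when the value holds no title. */
    static String chompTitle(String raw) {
        if (raw == null) return null;
        String val = raw.trim();                        // chomp leading/trailing space first
        if (val.endsWith(";")) {
            val = val.substring(0, val.length() - 1).trim(); // chomp semicolon
        }
        if (val.length() >= 2 && val.startsWith("\"") && val.endsWith("\"")) {
            val = val.substring(1, val.length() - 1);   // chomp exactly one pair of quotes
        }
        return val.isEmpty() ? null : val;              // a lone semicolon means no title
    }

    public static void main(String[] args) {
        System.out.println(chompTitle(" \"A title\";"));
    }
}
```

A value consisting of a single semicolon yields null, matching the "single semi-colon indicates no title" comment in the excerpt.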
From bugzilla-daemon at portal.open-bio.org Fri Mar 16 09:50:27 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 16 Mar 2007 05:50:27 -0400
Subject: [Biojava-dev] [Bug 2234] Error during "rich" conversion of genbank to EMBL format
In-Reply-To:
Message-ID: <200703160950.l2G9oR2a005733@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2234

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #5 from holland at ebi.ac.uk 2007-03-16 05:50 EST -------
Fixes made and committed.

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 18:38:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 14:38:34 -0400
Subject: [Biojava-dev] [Bug 2244] New: uniprot files do not load
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

           Summary: uniprot files do not load
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

Exceptions are thrown when the biojavax parser attempts to read UniProt files.

Reading P01717 or P07724 (the latter initially reported by Sofia Burvall) with RichSequence.IOTools.readUniProt throws:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:393)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)

Reading P13346 throws:

Caused by: org.biojava.bio.seq.io.ParseException: Bad date line found: 01-JAN-1990 (Rel. 13, Created)
        at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:349)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 19:07:57 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 15:07:57 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703211907.l2LJ7v5u010410@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |holland at ebi.ac.uk
             Status|NEW                         |ASSIGNED

------- Comment #1 from holland at ebi.ac.uk 2007-03-21 15:07 EST -------
I don't understand what the problem is. If I download UniProt files for the 3 accessions you mention from the official UniProt website at http://www.pir.uniprot.org/ then I get files that parse correctly. The date lines shown in your example are not in UniProt format. They are EMBL format date lines.

Where are you getting your UniProt files from? This sounds like a bug in the software at the site you are getting them from, not a bug in BioJava. Please let me know where you are getting the files from and I will investigate a bit more.
From bugzilla-daemon at portal.open-bio.org Wed Mar 21 20:15:56 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 16:15:56 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212015.l2LKFu2x014547@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #2 from gwaldon at geneinfinity.org 2007-03-21 16:15 EST -------
Hi Richard,

We must have a mix-up of some sort. The bad date line was reported when I tried to parse the file included in the email from Sofia (it was posted on biojava-1 on March 16, 2007). Indeed, this file is different from the one I get at the UniProt web site. Reading Sofia's email, it is possible that the incorrect file was in fact generated by biojavax (the same kind of read/write/read problem that was corrected recently).

I am going to investigate the other exception. The 2 files were from the correct source (http://www.expasy.org/uniprot/P01717, click on "View entry in original UniProtKB/Swiss-Prot format", etc.). I'll get back to you on this later.

- George
From bugzilla-daemon at portal.open-bio.org Wed Mar 21 20:27:15 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 16:27:15 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212027.l2LKRFrc015247@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #3 from gwaldon at geneinfinity.org 2007-03-21 16:27 EST -------
Well, the 2 files are respectively from:

http://www.expasy.org/cgi-bin/get-sprot-entry?P07724
http://www.expasy.org/cgi-bin/get-sprot-entry?P01717

The ArrayIndexOutOfBoundsExceptions are thrown for the lines:

P07724: DR ProDom [Domain structure / List of seq. sharing at least 1 domain ]
P01717: DR HOVERGEN [Family / Alignment / Tree]

Hope this helps!

- George

From bugzilla-daemon at portal.open-bio.org Wed Mar 21 21:53:26 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 21 Mar 2007 17:53:26 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703212153.l2LLrQ8X020150@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #4 from holland at ebi.ac.uk 2007-03-21 17:53 EST -------
Thanks for that extra info! I'll investigate further over the next couple of days and get back to you when I've got a solution.
From bugzilla-daemon at portal.open-bio.org Thu Mar 22 12:19:20 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 22 Mar 2007 08:19:20 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703221219.l2MCJKIM028661@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #5 from holland at ebi.ac.uk 2007-03-22 08:19 EST -------
I have investigated the two problem DR lines and found that they do not match the current UniProt file format specifications as defined at http://www.expasy.org/sprot/userman.html#DR_line . There is no indication in the UniProt file format specification as to what a DR line in this format might actually mean.

As these files came from UniProt themselves, I think the best thing to do is for you to contact UniProt and raise a bug report with them directly indicating that their website is producing files that do not conform to their own file format standards.

BJX can only parse files which follow the official format definition. There's only so much flexibility we can build in! :)

I'll leave this bug as ASSIGNED for now so that you can add any comments regarding what UniProt say about their bug.
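
[Editorial sketch, not part of the original thread] One way a parser can reject a malformed DR line with a descriptive message, instead of failing with an ArrayIndexOutOfBoundsException, is to check the field count before indexing. The sketch below is not how UniProtFormat is implemented; the class name DrLineCheck and the method parseDrFields are hypothetical, and it assumes only that a well-formed DR value consists of at least two semicolon-separated fields ending in a full stop. The sample value in main is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validation helper, for illustration only.
final class DrLineCheck {
    private DrLineCheck() {}

    /** Splits the text that follows the "DR" tag into trimmed fields, or rejects it. */
    static List<String> parseDrFields(String value) {
        String v = value.trim();
        if (v.endsWith(".")) {
            v = v.substring(0, v.length() - 1);     // drop the terminating full stop
        }
        String[] parts = v.split(";");
        if (parts.length < 2) {
            // Fail with a readable message rather than indexing past the array end.
            throw new IllegalArgumentException(
                    "Bad DR line, expected 'DATABASE; IDENTIFIER; ...' but found: " + value);
        }
        List<String> fields = new ArrayList<String>();
        for (String p : parts) {
            fields.add(p.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up, well-formed example value.
        System.out.println(parseDrFields("EMBL; X00001; -."));
    }
}
```

With this check, the two ExPASy lines quoted in Comment #3 would be reported as bad DR lines rather than surfacing as an array index error.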
From bugzilla-daemon at portal.open-bio.org Mon Mar 26 08:23:09 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 26 Mar 2007 04:23:09 -0400
Subject: [Biojava-dev] [Bug 2250] New: Genbank format error
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2250

           Summary: Genbank format error
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: mark.schreiber at novartis.com

I think that the following error was caused by the last fix to GenbankFormat:

Format_object=org.biojavax.bio.seq.io.GenbankFormat
Accession=null
Id=null
Comments=Bad locus line
Parse_block=LOCUS NM_182008 1629 bp mRNA linear INV 23-MAR-2007
Stack trace follows ....

        at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:323)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
        ... 2 more

From bugzilla-daemon at portal.open-bio.org Mon Mar 26 14:41:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 26 Mar 2007 10:41:40 -0400
Subject: [Biojava-dev] [Bug 2250] Genbank format error
In-Reply-To:
Message-ID: <200703261441.l2QEfent022095@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2250

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-26 10:41 EST -------
Fixed. Now testing before commit.

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 04:47:10 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 00:47:10 -0400
Subject: [Biojava-dev] [Bug 2253] New: NullPointerException in MultiSourceCompoundRichLocation
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

           Summary: NullPointerException in MultiSourceCompoundRichLocation
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

More exactly, the exception occurs in the blockIterator method inherited from CompoundRichLocation, where the "sortedMembers" variable is null. This set is actually never instantiated; see the constructor.

Also, I don't understand the purpose of the "sortedMembers" list; why not sort the "members" list from the start instead?

- George

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 05:27:41 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 01:27:41 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703290527.l2T5Rfu3010839@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

------- Comment #6 from gwaldon at geneinfinity.org 2007-03-29 01:27 EST -------
The problem is limited to the Expasy server. I am trying to get in touch with someone over there.

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 08:54:26 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 04:54:26 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703290854.l2T8sQkH019389@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-29 04:54 EST -------
The only way that sortedMembers can be null is if the object is loaded in via Hibernate. This is a bug which I have now fixed in CVS.

PS. In order to reproduce output files that are largely the same as the input files these locations are read from, whilst simultaneously providing a sensible iterator for the user, it is important to maintain two separate orders. If the user ever calls sort() on the location, the two orders become the same.

PPS. There is NO test in BJX for this bug as it is Hibernate-related and therefore almost impossible to write a test case for!
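
[Editorial sketch, not part of the original thread] The "two separate orders" design described in Comment #1 can be sketched with plain collections. This is not the CompoundRichLocation code; the class TwoOrderBlocks and its methods are hypothetical, and blocks are reduced to {start, end} integer pairs. The sorted view is built lazily on first use, which is one possible way (an assumption, not the committed fix) to avoid a null field when an object is populated without its normal constructor logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of keeping file order and sorted order side by side.
final class TwoOrderBlocks {
    private final List<int[]> members;   // order as read from the file: {start, end}
    private List<int[]> sortedMembers;   // built on demand, so callers never see null

    TwoOrderBlocks(List<int[]> blocks) {
        this.members = new ArrayList<int[]>(blocks);
    }

    private List<int[]> sorted() {
        if (sortedMembers == null) {
            List<int[]> copy = new ArrayList<int[]>(members);
            Collections.sort(copy, new Comparator<int[]>() {
                public int compare(int[] a, int[] b) {
                    return a[0] < b[0] ? -1 : (a[0] > b[0] ? 1 : 0); // by start position
                }
            });
            sortedMembers = copy;
        }
        return sortedMembers;
    }

    /** Iterates blocks in ascending start order, whatever the file order was. */
    Iterator<int[]> blockIterator() {
        return sorted().iterator();
    }

    /** Blocks in the order they were read, used when writing the record back out. */
    List<int[]> fileOrder() {
        return Collections.unmodifiableList(members);
    }

    /** After sort(), the file order and the sorted order are the same. */
    void sort() {
        List<int[]> s = new ArrayList<int[]>(sorted());
        members.clear();
        members.addAll(s);
    }
}
```

Keeping the original order untouched until sort() is called is what lets a written file stay close to the file that was read.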
From bugzilla-daemon at portal.open-bio.org Thu Mar 29 16:08:42 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 12:08:42 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703291608.l2TG8gWo008372@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

gwaldon at geneinfinity.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED

------- Comment #2 from gwaldon at geneinfinity.org 2007-03-29 12:08 EST -------
The problem was in MultiSourceCompoundRichLocation and not its parent class. It left the second set null after either of the public constructors.

From bugzilla-daemon at portal.open-bio.org Thu Mar 29 17:13:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Mar 2007 13:13:40 -0400
Subject: [Biojava-dev] [Bug 2255] New: Problems with read and write of EMBL files
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2255

           Summary: Problems with read and write of EMBL files
           Product: BioJava
           Version: live (CVS source)
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: gwaldon at geneinfinity.org

I just noticed two problems with EMBL files. During read operations, the first keyword (KW line) is not parsed. Also, during write operations, the sequence line (SQ) is missing the word "Sequence" in the output.

- George

From gwaldon at geneinfinity.org Fri Mar 30 05:28:51 2007
From: gwaldon at geneinfinity.org (george waldon)
Date: Thu, 29 Mar 2007 22:28:51 -0700
Subject: [Biojava-dev] Exception in MassCalc.getMass
Message-ID: <20070330052851.16176.qmail@mmm1924.dulles19-verio.com>

I got a NullPointerException in MassCalc.getMass that I traced to the presence of an ambiguity symbol X in the peptide sequence. The origin lies in SimpleSymbolPropertyTable.getDoubleValue. The ambiguity symbol is first validated (through its matches in AbstractAlphabet) and the exception is thrown when we try to find a value associated with a null key in the property map. First, I think we should redirect this exception to the checked IllegalSymbolException to tell the user that something is going wrong.

Concerning calculating masses, I see two needs: the need to estimate the mass of polypeptides containing ambiguity symbols, and the need to obtain either an exact mass or an IllegalSymbolException when ambiguity occurs. I propose to add a new method MassCalc.getEstimatedMass to do the first part and to modify the existing method so that an IllegalSymbolException is thrown whenever a non-atomic symbol is encountered. Mass values for ambiguity symbols would be added to the ResidueProperty.xml file and taken as the simple average of the atomic match values.

- George

From mark.schreiber at novartis.com Fri Mar 30 05:43:05 2007
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 30 Mar 2007 13:43:05 +0800
Subject: [Biojava-dev] Exception in MassCalc.getMass
Message-ID:

Hi George -

Sounds like a good idea. Can you check it in? And add a JUnit test?

You probably don't need to add ambiguity weights to the XML, as you can decompose a basis symbol into its components, look up each and get the average. In fact there are so many possible amino acid ambiguities that it would be a very bad idea to list them all in the XML file. Although IUPAC only has about 3 amino acid ambiguities, BioJava's BasisSymbols can in fact represent any and every possible combination of ambiguity (although there is no obvious way to tokenize them to text), so you would need to allow for this in your getEstimatedMass method.

Thanks for spotting this,

- Mark

Mark Schreiber
Research Investigator (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

"george waldon"
Sent by: biojava-dev-bounces at lists.open-bio.org
03/30/2007 01:28 PM
Please respond to george waldon

To: biojava-dev at biojava.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-dev] Exception in MassCalc.getMass

I got a NullPointerException in MassCalc.getMass that I traced to the presence of an ambiguity symbol X in the peptide sequence. The origin lies in SimpleSymbolPropertyTable.getDoubleValue. The ambiguity symbol is first validated (through its matches in AbstractAlphabet) and the exception is thrown when we try to find a value associated with a null key in the property map. First, I think we should redirect this exception to the checked IllegalSymbolException to tell the user that something is going wrong.

Concerning calculating masses, I see two needs: the need to estimate the mass of polypeptides containing ambiguity symbols, and the need to obtain either an exact mass or an IllegalSymbolException when ambiguity occurs. I propose to add a new method MassCalc.getEstimatedMass to do the first part and to modify the existing method so that an IllegalSymbolException is thrown whenever a non-atomic symbol is encountered. Mass values for ambiguity symbols would be added to the ResidueProperty.xml file and taken as the simple average of the atomic match values.
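
[Editorial sketch, not part of the original thread] The averaging idea discussed in this exchange (decompose an ambiguity symbol into its atomic matches, look up each mass, take the mean) can be sketched without the BioJava API. Everything below is an assumption made for illustration: the class EstimatedMassSketch, its method names, the use of single characters for symbols, and the small lookup tables. The residue masses are approximate average values, not read from ResidueProperties.xml, and the water mass of the full peptide is deliberately left out. IllegalArgumentException stands in for BioJava's IllegalSymbolException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained illustration; not MassCalc and not the proposed patch.
final class EstimatedMassSketch {
    // Approximate average residue masses in daltons, for illustration only.
    private static final Map<Character, Double> MASS = new HashMap<Character, Double>();
    // Ambiguity code -> atomic symbols it matches (B = Asp or Asn).
    private static final Map<Character, String> AMBIGUITY = new HashMap<Character, String>();
    static {
        MASS.put('G', 57.0519);
        MASS.put('A', 71.0788);
        MASS.put('N', 114.1038);
        MASS.put('D', 115.0886);
        AMBIGUITY.put('B', "DN");
    }

    private EstimatedMassSketch() {}

    /** Exact mass for atomic symbols; mean of the matches for ambiguity symbols if allowed. */
    static double residueMass(char symbol, boolean allowAmbiguity) {
        Double exact = MASS.get(symbol);
        if (exact != null) {
            return exact;
        }
        String matches = AMBIGUITY.get(symbol);
        if (matches == null || !allowAmbiguity) {
            // Report the problem explicitly instead of failing on a null lookup.
            throw new IllegalArgumentException("No exact mass for symbol: " + symbol);
        }
        double sum = 0.0;
        for (int i = 0; i < matches.length(); i++) {
            sum += MASS.get(matches.charAt(i));
        }
        return sum / matches.length();
    }

    /** Sums residue masses over a sequence; the terminal water is not added here. */
    static double sumResidueMasses(String sequence, boolean allowAmbiguity) {
        double total = 0.0;
        for (int i = 0; i < sequence.length(); i++) {
            total += residueMass(sequence.charAt(i), allowAmbiguity);
        }
        return total;
    }
}
```

Calling sumResidueMasses with allowAmbiguity set to false plays the role of the strict method in the proposal, while true corresponds to the estimated one. Computing the average on the fly, rather than storing it, follows the suggestion in the reply above that listing every possible ambiguity in the XML file would not scale.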
- George

_______________________________________________
biojava-dev mailing list
biojava-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-dev

From bugzilla-daemon at portal.open-bio.org Fri Mar 30 08:51:20 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 04:51:20 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703300851.l2U8pKbZ016735@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

------- Comment #3 from holland at ebi.ac.uk 2007-03-30 04:51 EST -------
Indeed you're right. I just checked again and the bugfix I applied should have fixed this too. Could you confirm that?

From holland at ebi.ac.uk Fri Mar 30 08:53:02 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Fri, 30 Mar 2007 09:53:02 +0100
Subject: [Biojava-dev] ParseExceptions
Message-ID: <460CCFEE.4030609@ebi.ac.uk>

Hi all.

Can the person who recently modified the BJX EMBLFormat to throw ParseExceptions using the ParseException.newMessage() method please check all their code to make sure that it is correct. I have noticed that every occurrence of this in EMBLFormat was implemented incorrectly and never threw the generated message. I have fixed it there, but have no idea how many other places this might have happened.

I have replaced all occurrences with two lines similar to this:

    String message = ParseException.newMessage(....);
    throw new ParseException(message);

cheers,
Richard

From bugzilla-daemon at portal.open-bio.org Fri Mar 30 10:29:41 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 06:29:41 -0400
Subject: [Biojava-dev] [Bug 2255] Problems with read and write of EMBL files
In-Reply-To:
Message-ID: <200703301029.l2UATf4w020994@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2255

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from holland at ebi.ac.uk 2007-03-30 06:29 EST -------
Fixed in CVS.

From bugzilla-daemon at portal.open-bio.org Fri Mar 30 15:39:11 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 11:39:11 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703301539.l2UFdB29001755@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

------- Comment #4 from gwaldon at geneinfinity.org 2007-03-30 11:39 EST -------
Yes, it is fixed. I'll try to find the time to write a test on equality of MultiSourceCompoundRichLocation; this is how the bug was revealed.

From bugzilla-daemon at portal.open-bio.org Fri Mar 30 15:45:38 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 11:45:38 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703301545.l2UFjcY8002022@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

gwaldon at geneinfinity.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #7 from gwaldon at geneinfinity.org 2007-03-30 11:45 EST -------
Following is the comment of Severine Duvaud from Expasy. The links she mentions are found on the NiceProt view; here is an example link: http://www.expasy.org/cgi-bin/niceprot.pl?P01717

======================================================================
When clicking on "View entry in original UniProtKB/Swiss-Prot format", you get an "enhanced" view of the original entry (compared to the "raw" entry view as pointed out by the other link), notably with implicit links.

The standard entry, in (hopefully) regular Swiss-Prot format can be found when clicking on "View entry in raw text format (no link)" or by downloading uniprot_sprot.dat on our FTP server.

Sorry for the confusion.

Best regards,
Severine

--
**************************************
Severine Duvaud, Swiss-Prot Group
Swiss Institute of Bioinformatics
CMU, 1 rue Michel Servet
CH - 1211 Geneva 4
Switzerland
Tel. (+41) 22 379 58 25
Fax (+41) 22 379 58 58
Severine.Duvaud at isb-sib.ch
http://www.expasy.org
**************************************

From bugzilla-daemon at portal.open-bio.org Fri Mar 30 15:46:09 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 11:46:09 -0400
Subject: [Biojava-dev] [Bug 2253] NullPointerException in MultiSourceCompoundRichLocation
In-Reply-To:
Message-ID: <200703301546.l2UFk96q002090@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2253

------- Comment #5 from holland at ebi.ac.uk 2007-03-30 11:46 EST -------
Great. If you get time to come up with a test, can you add a note to this bug and then close the bug! I'll leave it as verified/fixed in the meantime.

Tests should go in an appropriate package under the tests/ folder in biojava-live - you'll get the picture I'm sure just by looking around there to see where other tests have gone (e.g. org.biojavax.bio.seq.io for all BJX sequence i/o related tests). Please do create a new test package if you can't find one that seems appropriate.
From bugzilla-daemon at portal.open-bio.org Fri Mar 30 15:49:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 30 Mar 2007 11:49:40 -0400
Subject: [Biojava-dev] [Bug 2244] uniprot files do not load
In-Reply-To:
Message-ID: <200703301549.l2UFnekY002285@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2244

holland at ebi.ac.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

------- Comment #8 from holland at ebi.ac.uk 2007-03-30 11:49 EST -------
Thanks for that. I'm glad I'm not going mad! :)

So it seems that Expasy is providing slightly confusing links, and the best route to get files from them for programmatic parsing is to get raw text files wherever possible rather than cutting and pasting the enhanced-with-HTML-links versions that they show by default when browsing their database.

From amit.p at ocimumbio.com Tue Mar 13 06:38:52 2007
From: amit.p at ocimumbio.com (Amit Dattatreya Parhar)
Date: Tue, 13 Mar 2007 06:38:52 -0000
Subject: [Biojava-dev] ESD megabace trace file parser
Message-ID: <004a01c76537$adca6e40$fc01a8c0@ocimumbio.net>

Hi,

I need a .esd file parser. Can anybody help me out, or at least give me information on how to parse this type of trace file?

Thanks in advance,

Amit
Bioinformatics Analyst
amit.p at ocimumbio.com
Ocimum Biosolutions ...enabling R&D
6th Floor, Reliance Classic, Road No.1, Banjara Hills, Hyderabad - 500 034, A.P, India
Business Phone: 04055627203,ext-132 | Fax: 04055627205

From renauxc.isat at gmail.com Mon Mar 26 13:18:08 2007
From: renauxc.isat at gmail.com (Caroline Renaux)
Date: Mon, 26 Mar 2007 15:18:08 +0200
Subject: [Biojava-dev] org.biojava.bio.symbol.UkkonenSuffixTree.class BUG
Message-ID: <401c5fc20703260618t3f38eb86yd7d66043399ba69e@mail.gmail.com>

Hello,

I recently used the org.biojava.bio.symbol package, and in particular the UkkonenSuffixTree class, in a Java application. When I try to add a set of sequences to the tree, separating them with the '$' separator character, it does not work. While the second sequence is being processed I get a NullPointerException in the jumpTo method, at the line:

    arrivedAt=(SuffixNode)currentNode.children.get(new Character(source.charAt(from)));

I don't understand what I could have done wrong.

Thank you in advance for your reply.

RENAUX C.