From tiagoantao at gmail.com Mon Oct 3 18:12:18 2011
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 3 Oct 2011 23:12:18 +0100
Subject: [Biojava-l] VCF parser
Message-ID:

Hi,

I wonder if there is a VCF parser in either Python or Java? Either I
am being dumb at searching (probably) or nothing exists?

Thanks,
Tiago

--
"If you want to get laid, go to college. If you want an education, go
to the library." - Frank Zappa

From biojava at hannes.oib.com Wed Oct 5 03:41:57 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Wed, 5 Oct 2011 09:41:57 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
Message-ID:

Hello!

I tried to follow http://biojava.org/wiki/BioJava:CookBook3:MSA, but
when I run it, I get:

java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.biojava3.alignment.Alignments.getListFromFutures(Alignments.java:282)
    at org.biojava3.alignment.Alignments.runPairwiseScorers(Alignments.java:602)
    at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:173)

What could I be doing wrong?
(
on the cookbook page, there is also an import missing:
import org.biojava3.alignment.Alignments;
)
-> then the cookbook runs, but my code does not

    private static void processFile(String filename) {
        try {
            FileInputStream inStream = new FileInputStream(filename);
            FastaReader<DNASequence, NucleotideCompound> fastaReader =
                    new FastaReader<DNASequence, NucleotideCompound>(
                            inStream,
                            new GenericFastaHeaderParser<DNASequence, NucleotideCompound>(),
                            new DNASequenceCreator(DNACompoundSet.getDNACompoundSet()));
            LinkedHashMap<String, DNASequence> b = fastaReader.process();

            List<DNASequence> sequences = new ArrayList<DNASequence>();
            for (Entry<String, DNASequence> entry : b.entrySet()) {
                if (sequences.size() < 5) {
                    sequences.add(entry.getValue());
                }
                System.out.println(entry.getValue());
            }

            Profile<DNASequence, NucleotideCompound> profile =
                    Alignments.getMultipleSequenceAlignment(sequences);
            System.out.printf("Clustalw:%n%s%n", profile);

            ConcurrencyTools.shutdown();
        } catch (Exception ex) {
            Logger.getLogger(App.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

From biojava at hannes.oib.com Wed Oct 5 04:50:33 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Wed, 5 Oct 2011 10:50:33 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

Hi!

I read the sequence from a fasta file.

    FileInputStream inStream = new FileInputStream(filename);
    FastaReader<DNASequence, NucleotideCompound> fastaReader =
            new FastaReader<DNASequence, NucleotideCompound>(
                    inStream,
                    new GenericFastaHeaderParser<DNASequence, NucleotideCompound>(),
                    new DNASequenceCreator(DNACompoundSet.getDNACompoundSet()));
    LinkedHashMap<String, DNASequence> b = fastaReader.process();

and then use

    List<DNASequence> sequences = new ArrayList<DNASequence>();
    for (Entry<String, DNASequence> entry : b.entrySet()) {
        sequences.add(entry.getValue());
    }

to get the required list of DNA sequences.

I noticed in an earlier discussion, there was some talk about this too
(3-4 months ago, perhaps) and something about a possible fix in SVN.
When will it be released on the maven server?

Hannes

On Wed, Oct 5, 2011 at 10:46, Hashem Koohy wrote:
> Hi Hannes,
> It seems to me it doesn't like your dna Sequence!
> Is your sequence in the following format?
>
> Sequence dnaSeq = DNATools.createDNASequence("acccgggttttacagt", "id");
>
> Good luck
> Hashem
>
> -------------------------------
> Hashem Koohy
> PhD
> Postdoctoral Fellow,
> Sanger Institute,
> Cambridge

From andreas at sdsc.edu Wed Oct 5 13:21:19 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 5 Oct 2011 10:21:19 -0700
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

Hi Hannes,

> I noticed in an earlier discussion, there was some talk about this too
> (3-4 months ago, perhaps) and something about a possible fix in SVN.
> when will it be released on the maven server?

What version are you on? We released BioJava 3.0.2 on Maven just recently.

Andreas

From sbliven at ucsd.edu Wed Oct 5 13:39:42 2011
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Wed, 5 Oct 2011 10:39:42 -0700
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

In the current SVN code (and therefore probably Biojava 3.0.2),
CookbookMSA.java includes that import statement but is otherwise
identical to the wiki version. It runs just fine for me. Probably
updating to Biojava 3.0.2 will fix the null pointer exception.

This does highlight the problem of keeping the wiki synchronized with
the current BioJava version. Ideally, part of the release process could
include automated updating of the wiki cookbook from the SVN code, with
older versions available as a reference.
However, I'm not sure that anyone would be willing to set up such a
complex release process.

-Spencer

From amr_alhossary at hotmail.com Wed Oct 5 14:26:16 2011
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Wed, 5 Oct 2011 20:26:16 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

I agree with you 100%.
I have met problems this week in working with the SPICE code too, to the
extent that I delayed the whole idea till I have more suitable time to
reimplement it myself.

Amr

From biojava at hannes.oib.com Wed Oct 5 15:22:13 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Wed, 5 Oct 2011 21:22:13 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

2011/10/5 Spencer Bliven :
> In the current SVN code (and therefore probably Biojava 3.0.2),
> CookbookMSA.java includes that import statement but is otherwise identical
> to the wiki version. It runs just fine for me. Probably updating to Biojava
> 3.0.2 will fix the null pointer exception.

I installed it via Maven just last week - so I guess it should be on
3.0.2 - I'll check tomorrow.

Anyhow, the problem isn't the wiki (I already updated that, btw) but
the fact that it seems to work with Protein Sequences, but when I use
my DNA sequences, it breaks.
If you go to my first post, I copied my code there (just add a wrapper
main that supplies a valid fasta file as parameter)

Hannes

From andreas at sdsc.edu Wed Oct 5 15:30:41 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 5 Oct 2011 12:30:41 -0700
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

Can you provide the fasta file?
otherwise this is difficult to reproduce...

Andreas

From biojava at hannes.oib.com Wed Oct 5 15:34:38 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Wed, 5 Oct 2011 21:34:38 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

On Wed, Oct 5, 2011 at 21:30, Andreas Prlic wrote:
> Can you provide the fasta file? otherwise this is difficult to reproduce...
>
> Andreas

Unfortunately, the files in question are under NDA - does it work with
other fasta files? I could not get it to work with the files I tried.

Hannes

From andreas at sdsc.edu Wed Oct 5 20:29:11 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 5 Oct 2011 17:29:11 -0700
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

> Unfortunately, the files in question are under NDA - does it work with
> other fasta files? I could not get it to work with the files I tried.

I just wrote a junit test for DNA alignments and it works for me.

DNA alignments by default are using the nuc-4_4 substitution matrix
for the alignment. It contains the following columns.

A T G C S W R Y K M B V H D N

Does your FASTA file contain any characters that are not in this list?

Andreas

From biojava at hannes.oib.com Thu Oct 6 03:32:59 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Thu, 6 Oct 2011 09:32:59 +0200
Subject: [Biojava-l] losing meta-info after multiple sequence alignment
Message-ID:

Hi again!

So, my MSA is now working, thanks for the help so far.

What I ran into now is that most of the meta-information of a Sequence
seems to get lost during the MSA step.

e.g.: I'd like to attach the string from the fasta file (description,
header, whatever it is called) to the sequence and keep it attached for
further use after the MSA step.

1) shouldn't the fasta reader automatically populate this info? which
is the correct field (OriginalHeader or Description or something else)?

2) the MSA step seems to throw away all meta info. I attached the
string as OriginalHeader before starting the MSA, and afterwards it is
"null"

Thanks,
Hannes

From biojava at hannes.oib.com Thu Oct 6 04:07:17 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Thu, 6 Oct 2011 10:07:17 +0200
Subject: [Biojava-l] losing meta-info after multiple sequence alignment
In-Reply-To:
References:
Message-ID:

On Thu, Oct 6, 2011 at 09:32, Hannes Brandstätter-Müller wrote:
> Hi again!
> What I ran into now is that most of the meta-information of a Sequence
> seems to get lost during the MSA step.

Okay, that was something caused by following another cookbook script
(that, unfortunately, has absolutely no docs or comments) - I found
the getOriginalSequence() method, can work with that. Thanks!
Hannes

From biojava at hannes.oib.com Thu Oct 6 06:37:23 2011
From: biojava at hannes.oib.com (Hannes Brandstätter-Müller)
Date: Thu, 6 Oct 2011 12:37:23 +0200
Subject: [Biojava-l] losing meta-info after multiple sequence alignment
In-Reply-To:
References:
Message-ID:

Thanks, I'll try to add what I find out. It's a wiki after all. I'll
just ask the mailing list if things are unclear before I add stuff to
the wiki.

One thing that bugged me just now, and since I can't find documentation
on it:

Why is a sequence indexed by 1-(n+1) instead of 0-n? That's rather
un-java-like, especially since you just get an OutOfBoundsException,
and the range is not specified in the javadoc, or I could not find it
easily in the complex class hierarchy.

Hannes

From HWillis at scripps.edu Thu Oct 6 06:32:10 2011
From: HWillis at scripps.edu (Scooter Willis)
Date: Thu, 6 Oct 2011 06:32:10 -0400
Subject: [Biojava-l] losing meta-info after multiple sequence alignment
In-Reply-To:
Message-ID:

Hannes

As you can tell we need to improve the cookbook examples. Since you are
going through that process would welcome any contributions you can make.

Thanks

Scooter

From andreas.prlic at gmail.com Thu Oct 6 11:35:22 2011
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Thu, 6 Oct 2011 08:35:22 -0700
Subject: [Biojava-l] BioJava Help
In-Reply-To:
References:
Message-ID:

Hi Omer,

We are having problems with the anonymous SVN server quite often. I
recommend trying the git copy at github, or their SVN interface, or
using Maven to install it.

http://www.biojava.org/wiki/CVS_to_SVN_Migration

Andreas

On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote:
> Dear Andreas,
>
> I wish to install BioJava for my eclipse IDE.
> I followed all the instructions in the website
> http://www.biojava.org/wiki/BioJava3_eclipse.
> I currently have problems with the last step. I create a new Maven
> project, but when I try to type /biojava/biojava-live/trunk in the SCM
> URL, I get an "invalid URL" error.
> Please let me know what seems to be the problem and how can I fix it.
>
> Thanks much!
> omer
>
> --
> Omer Eilam
> Complex Network Systems
> Prof. Eytan Ruppin
> Tel-Aviv University
> http://cns.cs.tau.ac.il/

From sbliven at ucsd.edu Thu Oct 6 14:29:06 2011
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Thu, 6 Oct 2011 11:29:06 -0700
Subject: [Biojava-l] losing meta-info after multiple sequence alignment
In-Reply-To:
References:
Message-ID:

I think that 1-based indexing was chosen because that's what's used in
genome databases like GenBank. Thus gene offsets from outside data
sources can be used directly without subtracting 1. That said, almost
every time I write code using sequences I get off-by-one errors, so I
understand your frustration. We should definitely improve documentation
so that every method that takes 1-based indexes is clearly marked.

-Spencer

From andreas.prlic at gmail.com Sun Oct 9 21:58:43 2011
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Sun, 9 Oct 2011 18:58:43 -0700
Subject: [Biojava-l] BioJava Help
In-Reply-To:
References:
Message-ID:

Hi Omer,

Our recommendation nowadays for working with Blast is to get XML
output and simply parse that. Depending on what you need that will
give you more details than the parser that is in the BioJava 1.8
(legacy) project. (if you still want to give that one a try, there is
a cookbook page for it)

Andreas

On Sun, Oct 9, 2011 at 2:26 AM, Omer Eilam wrote:
> I eventually succeeded in importing Biojava.
> I read in the paper that there is a parser for BLAST - where can I get
> more information/API on this?
>
> Thanks!
> omer
>
> On Thu, Oct 6, 2011 at 5:35 PM, Andreas Prlic wrote:
>> Hi Omer,
>>
>> We are having problems with the anonymous SVN server quite often. I
>> recommend trying the git copy at github, or their SVN interface, or
>> using Maven to install it.
>>
>> http://www.biojava.org/wiki/CVS_to_SVN_Migration
>>
>> Andreas
>>
>> On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote:
>>> Dear Andreas,
>>>
>>> I wish to install BioJava for my eclipse IDE.
>>> I followed all the instructions in the website
>>> http://www.biojava.org/wiki/BioJava3_eclipse.
>>> I currently have problems with the last step.
>>> I create a new Maven
>>> project, but when I try to type /biojava/biojava-live/trunk in the SCM
>>> URL, I get an "invalid URL" error.
>>> Please let me know what seems to be the problem and how can I fix it.
>>>
>>> Thanks much!
>>> omer
>>>
>>> --
>>> Omer Eilam
>>> Complex Network Systems
>>> Prof. Eytan Ruppin
>>> Tel-Aviv University
>>> http://cns.cs.tau.ac.il/
>>>
>>
>
> --
> Omer Eilam
> Complex Network Systems
> Prof. Eytan Ruppin
> Tel-Aviv University
> http://cns.cs.tau.ac.il/

From shakunb at uom.ac.mu Mon Oct 10 05:03:34 2011
From: shakunb at uom.ac.mu (Shakuntala Baichoo)
Date: Mon, 10 Oct 2011 13:03:34 +0400
Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 5
In-Reply-To:
References:
Message-ID:

Does anyone know how to find codon usage from data in a genbank file or
embl file.

Thanks
Shakun

From tariq_cp at hotmail.com Mon Oct 10 12:27:27 2011
From: tariq_cp at hotmail.com (Muhammad Tariq Pervez)
Date: Mon, 10 Oct 2011 16:27:27 +0000
Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 2
In-Reply-To:
References:
Message-ID:

Yes, the issue was also faced by me. I also highlighted the solution.
The solution has also been uploaded/updated in SVN.

Muhammad Tariq Pervez
PhD Scholar

> Today's Topics:
>
>    1. NullPointerException when using
>       Alignments.getMultipleSequenceAlignment (Hannes Brandstätter-Müller)
>    2. Re: NullPointerException when using
>       Alignments.getMultipleSequenceAlignment (Hannes Brandstätter-Müller)

From xue.vin at gmail.com Mon Oct 10 20:39:52 2011
From: xue.vin at gmail.com (Vincent Xue)
Date: Mon, 10 Oct 2011 20:39:52 -0400
Subject: [Biojava-l] Stockholm Parser Implementation?
Message-ID:

Hi,

I was wondering if there has been any work on developing a Stockholm 1.0
parser?

Thanks!

From andreas.prlic at gmail.com Tue Oct 11 18:51:48 2011
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Tue, 11 Oct 2011 15:51:48 -0700
Subject: [Biojava-l] BioJava Help
In-Reply-To:
References:
Message-ID:

Hi Omer,

please keep such questions on the public list. Chances are high that if
you are struggling with something, somebody else is having the same
problem as well.
I was actually talking about BioJava 1.8, the previous version of BioJava. It is still available and the cookbook page that I meant is here: http://biojava.org/wiki/BioJava:CookBook:Blast:Parser About Maven: It is probably best to start with a new project in your favorite IDE and set it up as a Maven project that depends on BioJava (i.e. add the BioJava repository to the pom configuration). If you think this is difficult, we can/should set up a documentation page that explains those first steps. Andreas On Tue, Oct 11, 2011 at 6:34 AM, Omer Eilam wrote: > Are you referring to the BlastXMLQuery class? because I see in the API > that it contains only one method. > Also, I am not familiar with using Maven, I see the various biojava > packages in the eclipse project explorer, but I don't know how to > create a new class (i.e. do I need imports and stuff?) > > Thanks again! > omer > > On Mon, Oct 10, 2011 at 3:58 AM, Andreas Prlic wrote: >> Hi Omer, >> >> Our recommendation nowadays for working with Blast is to get XML >> output and simply parse that. Depending on what you need that will >> give you more details than the parser that is in the BioJava 1.8 >> (legacy) project. (if you still want to give that one a try, there is >> a cookbook page for it) >> >> Andreas >> >> >> On Sun, Oct 9, 2011 at 2:26 AM, Omer Eilam wrote: >>> I eventually succeeded in importing Biojava. >>> I read in the paper that there is a parser for BLAST - where can I get >>> more information/API on this? >>> >>> Thanks! >>> omer >>> >>> On Thu, Oct 6, 2011 at 5:35 PM, Andreas Prlic wrote: >>>> Hi Omer, >>>> >>>> We are having problems with the anonymous SVN server quite often. I >>>> recommend trying the git copy at github, or their SVN interface, or >>>> using Maven to install it. >>>> >>>> http://www.biojava.org/wiki/CVS_to_SVN_Migration >>>> >>>> Andreas >>>> >>>> On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote: >>>>> Dear Andreas, >>>>> >>>>> I wish to install BioJava for my eclipse IDE.
>>>>> I followed all the instructions in the website >>>>> http://www.biojava.org/wiki/BioJava3_eclipse. >>>>> I currently have problems with the last step. I create a new Maven >>>>> project, but when I try to type /biojava/biojava-live/trunk in the SCM >>>>> URL, I get an "invalid URL" error. >>>>> Please let me know what seems to be the problem and how can I fix it. >>>>> >>>>> Thanks much! >>>>> omer >>>>> >>>>> -- >>>>> Omer Eilam >>>>> Complex Network Systems >>>>> Prof. Eytan Ruppin >>>>> Tel-Aviv University >>>>> http://cns.cs.tau.ac.il/ >>>>> >>>> >>> >>> >>> >>> -- >>> Omer Eilam >>> Complex Network Systems >>> Prof. Eytan Ruppin >>> Tel-Aviv University >>> http://cns.cs.tau.ac.il/ >>> >> > > > > -- > Omer Eilam > Complex Network Systems > Prof. Eytan Ruppin > Tel-Aviv University > http://cns.cs.tau.ac.il/ >
From andreas.prlic at gmail.com Wed Oct 12 12:00:14 2011 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Wed, 12 Oct 2011 09:00:14 -0700 Subject: [Biojava-l] User defined substitution matrix. In-Reply-To: References: Message-ID: Hi Canan, Please subscribe to the biojava-l mailing list for sending messages to the list. Details for how to do that can be found here: http://biojava.org/wiki/BioJava:MailingLists About your question: getResourceAsStream is just opening a stream to a file that is bundled together with a jar file. There should be no need for you to call that to get a matrix. If you want to use any of the (many) standard substitution matrices that are supported by BioJava you just need to call SubstitutionMatrixHelper.getMatrixFromAAINDEX(nameOfMatrix) If you want to work with a matrix that is not yet in the AAINDEX collection of matrices, it depends on how the matrix is represented in the file. If it is in the same style as in AAINDEX, you can use the DefaultAAIndexProvider to parse your custom matrix. Andreas On Tue, Oct 11, 2011 at 11:09 PM, Canan Has wrote: > Dear Dr.
Prlic, > Sorry to bother you, but my mails to biojava forum-nabble are not being > answered. Even though I have already subscribed, I am getting alerts saying > your message is in pending list, because you are not subscribed. Therefore, > I decided to ask my question to you directly. I hope you will answer to me. > Because, it is so important and urgent. > Simply, I want to create my own substitution matrix object. I examined the > source code of SubstitutionMatrixHelper and the related others. I did some > modifications and got a null pointer exception from getResourceAsStream(). > I tried to find out where the directory for getResourceAsStream() has been > set. At first, I thought the file directory is the resources folder in > alignment folder. Then, I deleted one of the matrices - pam250 and tried to > initialize it by calling getPam250() and matrix was created. This leads me > to think that the developer introduced ftp.ncbi url to getResourceAsStream > to take matrices by name. > Can you show a way to create my own? Or you can tell me and I can change > where the url is given and recompile the code. > Thanks in advance > Canan Has > Research Assistant & MSc Student > Computational Biology & Bioinformatics Lab > Molecular Biology and Genetics Department > Izmir Institute of Technology >
From andreas at sdsc.edu Wed Oct 12 12:05:58 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 12 Oct 2011 09:05:58 -0700 Subject: [Biojava-l] Stockholm Parser Implementation? In-Reply-To: References: Message-ID: Hi Vincent, I am not aware of a parser for that as part of Biojava. It would be great to have, though. If you want to make a contribution, you would be more than welcome ... Andreas On Mon, Oct 10, 2011 at 5:39 PM, Vincent Xue wrote: > Hi, > > I was wondering if there has been any work on developing a Stockholm > 1.0 parser? > > Thanks!
> _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l >
From jprocter at compbio.dundee.ac.uk Fri Oct 14 05:07:01 2011 From: jprocter at compbio.dundee.ac.uk (Jim Procter) Date: Fri, 14 Oct 2011 10:07:01 +0100 Subject: [Biojava-l] Stockholm Parser Implementation? In-Reply-To: References: Message-ID: <4E97FBB5.5090402@compbio.dundee.ac.uk> On 12/10/2011 17:05, Andreas Prlic wrote: > Hi Vincent, > > I am not aware of parser for that as part of Biojava. It would be > great to have, though. If you want to make a contribution, you would > be more than welcome ... I thought Jules worked on this - he took a look at Jalview's one and promptly wrote a better one, IIRC. However, whether it got into bj3 is another matter! Perhaps asking him nicely would work :) Jim.
From kern3020 at gmail.com Mon Oct 17 20:49:17 2011 From: kern3020 at gmail.com (John Kern) Date: Mon, 17 Oct 2011 17:49:17 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? Message-ID: Hello, I am new to Biojava and bioinformatics in general. I noticed your latest release is 3.0.2 but the tutorial refers to the 1.8 release ( http://biojava.org/wiki/BioJava:Tutorial). Does this imply the tutorial is out of sync with the latest release? Regards, John
From andreas at sdsc.edu Mon Oct 17 23:11:16 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 17 Oct 2011 20:11:16 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? In-Reply-To: References: Message-ID: Hi John, This tutorial was written for what is currently the legacy 1.8 release. It would be great if somebody would volunteer and provide such a tutorial for the current 3.X code base ... Andreas On Mon, Oct 17, 2011 at 5:49 PM, John Kern wrote: > Hello, > > I am new to Biojava and bioinformatics in general.
I noticed your latest > release is 3.0.2 but the tutorial refers to the 1.8 release ( > http://biojava.org/wiki/BioJava:Tutorial). Does this imply the tutorial is > out of sync with the latest release? > > Regards, > John > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- ----------------------------------------------------------------------- Dr. Andreas Prlic Senior Scientist, RCSB PDB Protein Data Bank University of California, San Diego (+1) 858.246.0526 -----------------------------------------------------------------------
From biojava at hannes.oib.com Tue Oct 18 05:46:44 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Tue, 18 Oct 2011 11:46:44 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller wrote: > Hi again! > > I am quite happy with the Multiple Sequence Alignment, but I noticed > that there seems to be a limit of 132 Sequences that are present in > the final alignment - is this some kind of hardcoded limit, or can I > work around that somehow? > > Hannes > Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in fasta format. Is there a way to work around that limit? Hannes
From kern3020 at gmail.com Tue Oct 18 10:42:54 2011 From: kern3020 at gmail.com (John Kern) Date: Tue, 18 Oct 2011 07:42:54 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? In-Reply-To: References: Message-ID: Hello Andreas, On Mon, Oct 17, 2011 at 8:11 PM, Andreas Prlic wrote: > It would be great if somebody would volunteer and provide > such a tutorial for the current 3.X code base ... Great. Sign me up. I guess the way to proceed is to learn 3.x via the cookbook and examples. Then return to the tutorial.
-jk
From andreas at sdsc.edu Tue Oct 18 17:01:05 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 18 Oct 2011 14:01:05 -0700 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hi Hannes, did you try to increase memory settings for your JVM? e.g. -Xmx500M Andreas On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller wrote: > On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller > wrote: >> Hi again! >> >> I am quite happy with the Multiple Sequence Alignment, but I noticed >> that there seems to be a limit of 132 Sequences that are present in >> the final alignment - is this some kind of hardcoded limit, or can I >> work around that somehow? >> >> Hannes >> > > Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in > fasta format. Is there a way to work around that limit? > > Hannes > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l >
From biojava at hannes.oib.com Wed Oct 19 00:36:19 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 19 Oct 2011 06:36:19 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hi Andreas, I will try that later today if that makes any difference; I ran a larger alignment batch overnight, and I noticed that this limit seems to have been a coincidence; HOWEVER, the aligned sequences are always not as many as the input sequences, is this caused by memory constraints or how can I influence that? Hannes On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: > Hi Hannes, > > did you try to increase memory settings for your JVM? e.g. -Xmx500M > > Andreas > > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller > wrote: >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller >> wrote: >>> Hi again!
>>> >>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>> that there seems to be a limit of 132 Sequences that are present in >>> the final alignment - is this some kind of hardcoded limit, or can I >>> work around that somehow? >>> >>> Hannes >>> >> >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >> fasta format. Is there a way to work around that limit? >> >> Hannes >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >
From biojava at hannes.oib.com Wed Oct 19 03:32:25 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 19 Oct 2011 09:32:25 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: I'm currently running another test, now with even more memory for java (500M) - it looks fine now so far. I'll re-check it later with the other files that gave me some problems, and will report back later today. I had an "out of heap" exception when I tried it with the default memory settings, and with 256M it seems to have swallowed some sequences - I'll re-check and help you reproduce. It would be really bad if the code would swallow sequences without error messages when running out of memory, so I'll make sure I have proof :D Hannes On Wed, Oct 19, 2011 at 09:22, Spencer Bliven wrote: > Hannes, > > There should not be a limit on the number of sequences, nor should you be > running into a memory problem. The FastaParser should be able to read > thousands of sequences, since it is used for genome FASTA files as well as > multiple alignments. My guess would be either a malformed FASTA file > (perhaps a problem with line endings?), or else a problem with the code to > generate the MultipleAlignment. Can you post some code snippets?
> > -Spencer > > On Tue, Oct 18, 2011 at 21:36, Hannes Brandstätter-Müller > wrote: >> >> Hi Andreas, >> >> I will try that later today if that makes any difference; I ran a >> larger alignment batch overnight, and I noticed that this limit seems >> to have been a coincidence; HOWEVER, the aligned sequences are always >> not as many as the input sequences, is this caused by memory >> constraints or how can I influence that? >> >> Hannes >> >> On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: >> > Hi Hannes, >> > >> > did you try to increase memory settings for your JVM? e.g. -Xmx500M >> > >> > Andreas >> > >> > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller >> > wrote: >> >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller >> >> wrote: >> >>> Hi again! >> >>> >> >>> I am quite happy with the Multiple Sequence Alignment, but I noticed >> >>> that there seems to be a limit of 132 Sequences that are present in >> >>> the final alignment - is this some kind of hardcoded limit, or can I >> >>> work around that somehow? >> >>> >> >>> Hannes >> >>> >> >> >> >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >> >> fasta format. Is there a way to work around that limit? >> >> >> >> Hannes >> >> >> >> _______________________________________________ >> >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> >> > >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l > >
From sbliven at ucsd.edu Wed Oct 19 03:22:55 2011 From: sbliven at ucsd.edu (Spencer Bliven) Date: Wed, 19 Oct 2011 00:22:55 -0700 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hannes, There should not be a limit on the number of sequences, nor should you be running into a memory problem.
The FastaParser should be able to read thousands of sequences, since it is used for genome FASTA files as well as multiple alignments. My guess would be either a malformed FASTA file (perhaps a problem with line endings?), or else a problem with the code to generate the MultipleAlignment. Can you post some code snippets? -Spencer On Tue, Oct 18, 2011 at 21:36, Hannes Brandstätter-Müller <biojava at hannes.oib.com> wrote: > Hi Andreas, > > I will try that later today if that makes any difference; I ran a > larger alignment batch overnight, and I noticed that this limit seems > to have been a coincidence; HOWEVER, the aligned sequences are always > not as many as the input sequences, is this caused by memory > constraints or how can I influence that? > > Hannes > > On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: > > Hi Hannes, > > > > did you try to increase memory settings for your JVM? e.g. -Xmx500M > > > > Andreas > > > > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller > > wrote: > >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller > >> wrote: > >>> Hi again! > >>> > >>> I am quite happy with the Multiple Sequence Alignment, but I noticed > >>> that there seems to be a limit of 132 Sequences that are present in > >>> the final alignment - is this some kind of hardcoded limit, or can I > >>> work around that somehow? > >>> > >>> Hannes > >>> > >> > >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in > >> fasta format. Is there a way to work around that limit?
> >> > >> Hannes > >> > >> _______________________________________________ > >> Biojava-l mailing list - Biojava-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/biojava-l > >> > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l >
From jvb at Cs.Nott.AC.UK Wed Oct 19 07:15:26 2011 From: jvb at Cs.Nott.AC.UK (jvb at Cs.Nott.AC.UK) Date: 19 Oct 2011 12:15:26 +0100 Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil ? Message-ID: <201110191215.aa17789@pat.Cs.Nott.AC.UK> Hello, I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, even though it appears in the JavaDocs: http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html What is its status? Can I get it, and should I rely on it if I can? Thanks, Jon
From p.v.troshin at dundee.ac.uk Wed Oct 19 11:13:44 2011 From: p.v.troshin at dundee.ac.uk (Peter Troshin) Date: Wed, 19 Oct 2011 16:13:44 +0100 Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil ? In-Reply-To: <201110191215.aa17789@pat.Cs.Nott.AC.UK> References: <201110191215.aa17789@pat.Cs.Nott.AC.UK> Message-ID: <4E9EE928.4050506@dundee.ac.uk> Hi Jon, This class is part of the protein disorder prediction JAR and a recent addition to BioJava. You are welcome to use it if it suits your needs. Bear in mind though that the FASTA file reader from this class reads the content of the whole FASTA file at once, i.e. if you are working with large FASTA files you will want to use something else instead. I've got a stream-based FASTA reader if you need one and if there is not one in BioJava already. I would imagine the functionality from this class is not going to disappear overnight, but it may and perhaps should be merged with other FASTA parsers in BioJava once somebody has time to do this.
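The difference between reading a whole FASTA file at once and reading it as a stream, as described in the message above, can be illustrated with a minimal sketch. This is not the reader Peter mentions and not part of BioJava: the class name StreamingFastaSketch and the RecordHandler callback are hypothetical, and only JDK classes are used. Each record is handed to the callback as soon as it is complete, so only one record is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a BioJava class): reads FASTA records one at a time. */
public class StreamingFastaSketch {

    /** Called once per record, so the caller decides what to keep in memory. */
    public interface RecordHandler {
        void handle(String header, String sequence);
    }

    /** Streams records from the reader to the handler and returns the record count. */
    public static int read(Reader in, RecordHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String header = null;                       // header of the record being built
        StringBuilder sequence = new StringBuilder();
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;                           // skip blank lines
            }
            if (line.charAt(0) == '>') {
                if (header != null) {               // emit the previous record
                    handler.handle(header, sequence.toString());
                    count++;
                }
                header = line.substring(1).trim();
                sequence.setLength(0);
            } else if (header != null) {
                sequence.append(line);              // sequence may span several lines
            }
            // residue lines before the first header are ignored
        }
        if (header != null) {                       // emit the last record
            handler.handle(header, sequence.toString());
            count++;
        }
        return count;                               // 0 for empty input, no exception
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">seq1 test\nACGT\nTTGA\n>seq2\nGGCC\n";
        final List<String> seen = new ArrayList<String>();
        int n = read(new StringReader(fasta), new RecordHandler() {
            public void handle(String header, String sequence) {
                seen.add(header + "=" + sequence);
            }
        });
        System.out.println(n + " records: " + seen);
        // prints: 2 records: [seq1 test=ACGTTTGA, seq2=GGCC]
    }
}
```

For a real file the same method could be called with a FileReader; the callback style is only one possible design, an iterator would work equally well.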
Regards, Peter On 19/10/2011 12:15, jvb at cs.nott.ac.uk wrote: > Hello, > > I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, > even though it appears in the JavaDocs: > http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html > > What is it's status? Can I get it, and should it rely on it if I can? > > Thanks, > > Jon > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From khalil.elmazouari at gmail.com Wed Oct 19 14:36:28 2011 From: khalil.elmazouari at gmail.com (Khalil El Mazouari) Date: Wed, 19 Oct 2011 20:36:28 +0200 Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 12 In-Reply-To: References: Message-ID: Hi Hannes, just did a MSA test with 521 seq... and it works. It must be a memory issue. try something like: java -Xmx1g -jar yourApp.jar args... If you don't have enough RAM, try with 500m as suggested by Andreas, Regards, Khalil On 19 Oct 2011, at 18:00, biojava-l-request at lists.open-bio.org wrote: > Send Biojava-l mailing list submissions to > biojava-l at lists.open-bio.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://lists.open-bio.org/mailman/listinfo/biojava-l > or, via email, send a message with subject or body 'help' to > biojava-l-request at lists.open-bio.org > > You can reach the person managing the list at > biojava-l-owner at lists.open-bio.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Biojava-l digest..." > > > Today's Topics: > > 1. Re: Multiple Sequence Alignment - Limits? (Andreas Prlic) > 2. Re: Multiple Sequence Alignment - Limits? > (Hannes Brandst?tter-M?ller) > 3. Re: Multiple Sequence Alignment - Limits? > (Hannes Brandst?tter-M?ller) > 4. Re: Multiple Sequence Alignment - Limits? (Spencer Bliven) > 5. Status of org.biojava3.data.sequence.SequenceUtil ? > (jvb at Cs.Nott.AC.UK) > 6. 
Re: Status of org.biojava3.data.sequence.SequenceUtil ? > (Peter Troshin) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 18 Oct 2011 14:01:05 -0700 > From: Andreas Prlic > Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits? > To: Hannes Brandst?tter-M?ller > Cc: biojava-l > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Hi Hannes, > > did you try to increase memory settings for your JVM? e.g. -Xmx500M > > Andreas > > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller > wrote: >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller >> wrote: >>> Hi again! >>> >>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>> that there seems to be a limit of 132 Sequences that are present in >>> the final alignment - is this some kind of hardcoded limit, or can I >>> work around that somehow? >>> >>> Hannes >>> >> >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >> fasta format. Is there a way to work around that limit? >> >> Hannes >> >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > > > > ------------------------------ > > Message: 2 > Date: Wed, 19 Oct 2011 06:36:19 +0200 > From: Hannes Brandst?tter-M?ller > Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits? > To: Andreas Prlic > Cc: biojava-l > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Hi Andreas, > > I will try that later today if that makes any difference; I ran a > larger alignment batch overnight, and I noticed that this limit seems > to have been a coincidence; HOWEVER, the aligned sequences are always > not as many as the input sequences, is this caused by memory > constraints or how can I influence that? 
> > Hannes > > On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: >> Hi Hannes, >> >> did you try to increase memory settings for your JVM? ?e.g. -Xmx500M >> >> Andreas >> >> On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller >> wrote: >>> On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller >>> wrote: >>>> Hi again! >>>> >>>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>>> that there seems to be a limit of 132 Sequences that are present in >>>> the final alignment - is this some kind of hardcoded limit, or can I >>>> work around that somehow? >>>> >>>> Hannes >>>> >>> >>> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >>> fasta format. Is there a way to work around that limit? >>> >>> Hannes >>> >>> _______________________________________________ >>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>> >> > > > > ------------------------------ > > Message: 3 > Date: Wed, 19 Oct 2011 09:32:25 +0200 > From: Hannes Brandst?tter-M?ller > Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits? > To: Spencer Bliven > Cc: biojava-l > Message-ID: > > Content-Type: text/plain; charset=windows-1252 > > I'm currently running another test, now with even more memory for java > (500M) - it looks fine now so far. I'll re-check it later with the > other files that gave me some problems, and will report back later > today. > > I had a "out of heap" exception when I tried it with the default > memory settings, and with 256M it seems to have swallowed some > sequences - I'll re-check and help you reproduce. It would be really > bad if the code would swallow sequences without error messages when > running out of memory, so I'll make sure I have proof :D > > Hannes > > On Wed, Oct 19, 2011 at 09:22, Spencer Bliven wrote: >> Hannes? >> >> There should not be a limit on the number of sequences, nor should you be >> running into a memory problem. 
The FastaParser should be able to read >> thousands of sequences, since it is used for genome FASTA files as well as >> multiple alignments. My guess would be either a malformed FASTA file >> (perhaps a problem with line endings?), or else a problem with the code to >> generate the MultipleAlignment. Can you post some code snippets? >> >> -Spencer >> >> On Tue, Oct 18, 2011 at 21:36, Hannes Brandst?tter-M?ller >> wrote: >>> >>> Hi Andreas, >>> >>> I will try that later today if that makes any difference; I ran a >>> larger alignment batch overnight, and I noticed that this limit seems >>> to have been a coincidence; HOWEVER, the aligned sequences are always >>> not as many as the input sequences, is this caused by memory >>> constraints or how can I influence that? >>> >>> Hannes >>> >>> On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: >>>> Hi Hannes, >>>> >>>> did you try to increase memory settings for your JVM? ?e.g. -Xmx500M >>>> >>>> Andreas >>>> >>>> On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller >>>> wrote: >>>>> On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller >>>>> wrote: >>>>>> Hi again! >>>>>> >>>>>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>>>>> that there seems to be a limit of 132 Sequences that are present in >>>>>> the final alignment - is this some kind of hardcoded limit, or can I >>>>>> work around that somehow? >>>>>> >>>>>> Hannes >>>>>> >>>>> >>>>> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >>>>> fasta format. Is there a way to work around that limit? 
>>>>> >>>>> Hannes >>>>> >>>>> _______________________________________________ >>>>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>>> >>>> >>> >>> _______________________________________________ >>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> > > > > ------------------------------ > > Message: 4 > Date: Wed, 19 Oct 2011 00:22:55 -0700 > From: Spencer Bliven > Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits? > To: Hannes Brandst?tter-M?ller > Cc: biojava-l > Message-ID: > > Content-Type: text/plain; charset=UTF-8 > > Hannes? > > There should not be a limit on the number of sequences, nor should you be > running into a memory problem. The FastaParser should be able to read > thousands of sequences, since it is used for genome FASTA files as well as > multiple alignments. My guess would be either a malformed FASTA file > (perhaps a problem with line endings?), or else a problem with the code to > generate the MultipleAlignment. Can you post some code snippets? > > -Spencer > > On Tue, Oct 18, 2011 at 21:36, Hannes Brandst?tter-M?ller < > biojava at hannes.oib.com> wrote: > >> Hi Andreas, >> >> I will try that later today if that makes any difference; I ran a >> larger alignment batch overnight, and I noticed that this limit seems >> to have been a coincidence; HOWEVER, the aligned sequences are always >> not as many as the input sequences, is this caused by memory >> constraints or how can I influence that? >> >> Hannes >> >> On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: >>> Hi Hannes, >>> >>> did you try to increase memory settings for your JVM? e.g. -Xmx500M >>> >>> Andreas >>> >>> On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller >>> wrote: >>>> On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller >>>> wrote: >>>>> Hi again! 
>>>>> >>>>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>>>> that there seems to be a limit of 132 Sequences that are present in >>>>> the final alignment - is this some kind of hardcoded limit, or can I >>>>> work around that somehow? >>>>> >>>>> Hannes >>>>> >>>> >>>> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >>>> fasta format. Is there a way to work around that limit? >>>> >>>> Hannes >>>> >>>> _______________________________________________ >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >>> >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > > > > ------------------------------ > > Message: 5 > Date: 19 Oct 2011 12:15:26 +0100 > From: jvb at Cs.Nott.AC.UK > Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil > ? > To: biojava-l > Message-ID: <201110191215.aa17789 at pat.Cs.Nott.AC.UK> > Content-Type: text/plain; format=flowed; charset=ISO-8859-1 > > Hello, > > I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, even > though it appears in the JavaDocs: > http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html > > What is it's status? Can I get it, and should it rely on it if I can? > > Thanks, > > Jon > > > > > ------------------------------ > > Message: 6 > Date: Wed, 19 Oct 2011 16:13:44 +0100 > From: Peter Troshin > Subject: Re: [Biojava-l] Status of > org.biojava3.data.sequence.SequenceUtil ? > To: jvb at cs.nott.ac.uk > Cc: biojava-l at lists.open-bio.org > Message-ID: <4E9EE928.4050506 at dundee.ac.uk> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Hi Jon, > > This class is a part of protein disorder prediction JAR and a recent > addition to BioJava. You are welcome to use if it suits your needs. 
> Bear in mind though that the FASTA file reader from this class reads the
> content of the whole FASTA file at once, i.e. if you are working with
> large FASTA files you will want to use something else instead. I've got
> a Stream based FASTA reader if you need one and if there is not one in
> BioJava already.
> I would imagine the functionality from this class is not going to
> disappear overnight, but it may and perhaps should be merged with other
> FASTA parsers in BioJava once somebody has time to do this.
>
> Regards,
> Peter
>
>
> On 19/10/2011 12:15, jvb at cs.nott.ac.uk wrote:
>> Hello,
>>
>> I can't find a jar containing org.biojava3.data.sequence.SequenceUtil,
>> even though it appears in the JavaDocs:
>> http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html
>>
>> What is its status? Can I get it, and should I rely on it if I can?
>>
>> Thanks,
>>
>> Jon
>>
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>
>
> ------------------------------
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
>
> End of Biojava-l Digest, Vol 105, Issue 12
> ******************************************

From biojava at hannes.oib.com Thu Oct 20 08:29:14 2011
From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=)
Date: Thu, 20 Oct 2011 14:29:14 +0200
Subject: [Biojava-l] Undeclared/uncaught exception in Fasta Parser
Message-ID:

If you feed the fasta parser (code from cookbook) with an empty file, you get a

java.lang.ArrayIndexOutOfBoundsException: 0
 at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:111)
 at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:60)
 at
org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:136) Code: FileInputStream inStream = new FileInputStream(inputFileName); FastaReader fastaReader = new FastaReader( inStream, new GenericFastaHeaderParser(), new DNASequenceCreator(DNACompoundSet.getDNACompoundSet())); LinkedHashMap originalSeqs = fastaReader.process(); //boom! I believe this exception should either be caught and converted to an IOException or declared. Hannes ps: I also ran into an OutOfMemoryError, I tried MSA with approx. 1000+ sequences :D even though I gave java 4000M - are there any metrics I can use before starting the MSA to determine If it can work? java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.biojava3.alignment.Alignments.getProgressiveAlignment(Alignments.java:560) at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:187) at ... 
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.Integer.valueOf(Integer.java:601) at org.biojava3.core.sequence.location.SimplePoint.getPosition(SimplePoint.java:55) at org.biojava3.alignment.SimpleAlignedSequence.isGap(SimpleAlignedSequence.java:231) at org.biojava3.alignment.SimpleAlignedSequence.getSequenceIndexAt(SimpleAlignedSequence.java:205) at org.biojava3.alignment.SimpleAlignedSequence.getCompoundAt(SimpleAlignedSequence.java:270) at org.biojava3.alignment.SimpleProfile.getCompoundsAt(SimpleProfile.java:231) at org.biojava3.alignment.SimpleProfile.getCompoundCountsAt(SimpleProfile.java:217) at org.biojava3.alignment.SimpleProfile.getCompoundWeightsAt(SimpleProfile.java:249) at org.biojava3.alignment.template.AbstractProfileProfileAligner.reset(AbstractProfileProfileAligner.java:235) at org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:215) at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:270) at org.biojava3.alignment.template.AbstractProfileProfileAligner.getPair(AbstractProfileProfileAligner.java:158) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:54) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:38) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at 
org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:210) at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:270) at org.biojava3.alignment.template.AbstractProfileProfileAligner.getPair(AbstractProfileProfileAligner.java:158) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:54) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:38) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.Integer.valueOf(Integer.java:601) at org.biojava3.core.sequence.location.SimplePoint.getPosition(SimplePoint.java:55) at org.biojava3.alignment.SimpleAlignedSequence.isGap(SimpleAlignedSequence.java:231) at org.biojava3.alignment.SimpleAlignedSequence.getSequenceIndexAt(SimpleAlignedSequence.java:205) at org.biojava3.alignment.SimpleAlignedSequence.getCompoundAt(SimpleAlignedSequence.java:270) at org.biojava3.alignment.SimpleProfile.getCompoundsAt(SimpleProfile.java:231) at org.biojava3.alignment.SimpleProfile.getCompoundCountsAt(SimpleProfile.java:217) at org.biojava3.alignment.SimpleProfile.getCompoundWeightsAt(SimpleProfile.java:249) at org.biojava3.alignment.template.AbstractProfileProfileAligner.reset(AbstractProfileProfileAligner.java:235) at org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:215) ... 
9 more

From kern3020 at gmail.com Thu Oct 20 12:19:27 2011
From: kern3020 at gmail.com (John Kern)
Date: Thu, 20 Oct 2011 09:19:27 -0700
Subject: [Biojava-l] Multiple Sequence Alignment - Limits?
Message-ID:

Hello Hannes,

When programs run out of memory it can be insightful to look at them
via the operating system. It helps to determine if the program
consumes all the memory or is hitting another limit. These comments
are specific to Linux. All versions of UNIX have similar shells. If
you are running Windows, this is not relevant.

Process size can be limited by a shell on unix systems.

$ man sh
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

On my system, there are no memory limits. What are the results on your
system?

If the shell is not limiting your process and, as others have pointed
out, the JVM isn't limited, the program top can be very insightful.

http://www.kernelhardware.org/linux-top-command/

When the program is running out of memory, does top confirm you have
no more memory?

Regards,
John

From kern3020 at gmail.com Thu Oct 20 16:12:39 2011
From: kern3020 at gmail.com (John Kern)
Date: Thu, 20 Oct 2011 13:12:39 -0700
Subject: [Biojava-l] demos from 1.8
Message-ID:

Hello,

The 1.8 tutorial refers to a demos directory.
"Additionally, a number of small demo programs can be found in the
demos directory of the BioJava source distribution."

Would it make sense to migrate them to the 3.x APIs?

I downloaded the aggregate tar ball for 1.8.1. I do not see it there.
Where can I find it?
Thanks, -john From andreas at sdsc.edu Thu Oct 20 17:19:27 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 20 Oct 2011 14:19:27 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: Hi John, several of the 3.X modules have a demo directory that give examples how to work with them. I believe in the old code base there was also a demo directory somewhere, not sure if it gets bundled in the aggregate tar ball though. Andreas On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote: > Hello, > > The 1.8 tutorial refers to a demos directory. > "Additionally, a number of small demo programs can be found in the > demos directory of the BioJava source distribution." > > Would it make sense to migrate them to the 3.x APIs? > > I downloaded the aggregate tar ball for 1.8.1. ?I do not see it there. > Where can I find it? > > Thanks, > -john > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From kern3020 at gmail.com Thu Oct 20 17:33:56 2011 From: kern3020 at gmail.com (John Kern) Date: Thu, 20 Oct 2011 14:33:56 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: Hello Andreas, I checked out the 3.x branch from subversion. I see two demo directories. jkern at ubuntu:~/src/java/biojava-trunk/biojava/biojava3-core/src/main/java/org/biojava3/core/sequence$ find ~/src/java/biojava-trunk/ -name demo /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure-gui/src/main/java/demo /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure/src/main/java/demo I found no corresponding directories in the 1.8.1 tar ball. One of the articles (http://biojava.org/wiki/BioJava:Tutorial:Dynamic_programming_examples) refers to a source file called Dice.java. Would it make sense to migrate it to the new source? If so, do you know where I can find it? 
-jk On Thu, Oct 20, 2011 at 2:19 PM, Andreas Prlic wrote: > Hi John, > > several of the 3.X modules have a demo directory that give examples > how to work with them. I believe in the old code base there was also a > demo directory somewhere, not sure if it gets bundled in the aggregate > tar ball though. > > Andreas > > > On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote: >> Hello, >> >> The 1.8 tutorial refers to a demos directory. >> "Additionally, a number of small demo programs can be found in the >> demos directory of the BioJava source distribution." >> >> Would it make sense to migrate them to the 3.x APIs? >> >> I downloaded the aggregate tar ball for 1.8.1. ?I do not see it there. >> Where can I find it? >> >> Thanks, >> -john >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > From andreas at sdsc.edu Thu Oct 20 17:37:44 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 20 Oct 2011 14:37:44 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: I think that most of those demos are heavily dependent on 1.8 code and would need some major refactoring to work for 3.x. The best reference to get started is the Cookbook ... (and we probably need more examples there for 3.X ) Andreas On Thu, Oct 20, 2011 at 2:33 PM, John Kern wrote: > Hello Andreas, > > I checked out the 3.x branch from subversion. I see two demo directories. > > jkern at ubuntu:~/src/java/biojava-trunk/biojava/biojava3-core/src/main/java/org/biojava3/core/sequence$ > find ~/src/java/biojava-trunk/ -name demo > /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure-gui/src/main/java/demo > /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure/src/main/java/demo > > I found no corresponding directories in the 1.8.1 tar ball. 
>
> One of the articles
> (http://biojava.org/wiki/BioJava:Tutorial:Dynamic_programming_examples)
> refers to a source file called Dice.java. Would it make sense to
> migrate it to the new source? If so, do you know where I can find it?
>
> -jk
>
>
> On Thu, Oct 20, 2011 at 2:19 PM, Andreas Prlic wrote:
>> Hi John,
>>
>> several of the 3.X modules have a demo directory that give examples
>> how to work with them. I believe in the old code base there was also a
>> demo directory somewhere, not sure if it gets bundled in the aggregate
>> tar ball though.
>>
>> Andreas
>>
>>
>> On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote:
>>> Hello,
>>>
>>> The 1.8 tutorial refers to a demos directory.
>>> "Additionally, a number of small demo programs can be found in the
>>> demos directory of the BioJava source distribution."
>>>
>>> Would it make sense to migrate them to the 3.x APIs?
>>>
>>> I downloaded the aggregate tar ball for 1.8.1. I do not see it there.
>>> Where can I find it?
>>>
>>> Thanks,
>>> -john
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>
>

From kern3020 at gmail.com Tue Oct 25 20:55:39 2011
From: kern3020 at gmail.com (John Kern)
Date: Tue, 25 Oct 2011 17:55:39 -0700
Subject: [Biojava-l] demo directory in API
Message-ID:

Hello,

While reviewing the BioJava 3.x API
(http://www.biojava.org/docs/api/index.html), I noticed the demo
directory Andreas mentioned. I see it in neither the 3.0.2 tarball nor
the subversion checkout from the trunk. If there is a demo directory
for the current source base, I would love to see it.

-jk

From andreas at sdsc.edu Tue Oct 25 21:00:44 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 25 Oct 2011 18:00:44 -0700
Subject: [Biojava-l] demo directory in API
In-Reply-To:
References:
Message-ID:

it is in the structure module.
Andreas

On Tue, Oct 25, 2011 at 5:55 PM, John Kern wrote:
> Hello,
>
> While reviewing the BioJava 3.x API
> (http://www.biojava.org/docs/api/index.html), I noticed the demo
> directory Andreas mentioned. I see it in neither the 3.0.2 tarball nor
> the subversion checkout from the trunk. If there is a demo directory
> for the current source base, I would love to see it.
>
> -jk
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From kern3020 at gmail.com Wed Oct 26 19:00:46 2011
From: kern3020 at gmail.com (John Kern)
Date: Wed, 26 Oct 2011 16:00:46 -0700
Subject: [Biojava-l] What's a good sequence to highlight features in BioJava?
Message-ID:

Hello,

Based on reviewing the 1.8 version of the tutorial, I want to write a
section in the tutorial about features, locations and annotations.
Would someone recommend a good sequence(s) to work with?

Thanks,
-John

From andreas at sdsc.edu Thu Oct 27 13:44:03 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 27 Oct 2011 10:44:03 -0700
Subject: [Biojava-l] What's a good sequence to highlight features in BioJava?
In-Reply-To:
References:
Message-ID:

Hi John,

Are you working on the 1.8 tutorial? It would be better to focus
documentation efforts on the 3.X branch! We declared the 1.8 code base
to be legacy and all developmental efforts are happening on the 3.X
trunk.

Andreas

On Wed, Oct 26, 2011 at 4:00 PM, John Kern wrote:
> Hello,
>
> Based on reviewing the 1.8 version of the tutorial, I want to write a
> section in the tutorial about features, locations and annotations.
> Would someone recommend a good sequence(s) to work with?
>
> Thanks,
> -John
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From kern3020 at gmail.com Thu Oct 27 15:20:54 2011
From: kern3020 at gmail.com (John Kern)
Date: Thu, 27 Oct 2011 12:20:54 -0700
Subject: [Biojava-l] What's a good sequence to highlight features in BioJava?
In-Reply-To:
References:
Message-ID:

Hello Andreas,

I was still looking at it at a very high level. I want to understand
the appropriate workflow for BioJava. I am a software engineer. I do
not share your background in biology but I want to learn as much as I
possibly can. I am reading a college-level introduction to biology.
Additional suggestions for my reading list would be appreciated.

On Thu, Oct 27, 2011 at 10:44 AM, Andreas Prlic wrote:
> Are you working on the 1.8 tutorial? It would be better to focus
> documentation efforts on the 3.X branch! We declared the 1.8 code base
> to be legacy and all developmental efforts are happening on the 3.X
> trunk.

Thanks. I will do that.

Sincerely,
John

From tiagoantao at gmail.com Mon Oct 3 22:12:18 2011
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 3 Oct 2011 23:12:18 +0100
Subject: [Biojava-l] VCF parser
Message-ID:

Hi,

I wonder if there is a VCF parser in either Python or Java? Either I
am being dumb at searching (probably) or nothing exists?

Thanks,
Tiago

--
"If you want to get laid, go to college. If you want an education, go
to the library." - Frank Zappa

From biojava at hannes.oib.com Wed Oct 5 07:41:57 2011
From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=)
Date: Wed, 5 Oct 2011 09:41:57 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
Message-ID:

Hello!
I tried to follow http://biojava.org/wiki/BioJava:CookBook3:MSA, but when I run it, I get: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.biojava3.alignment.Alignments.getListFromFutures(Alignments.java:282) at org.biojava3.alignment.Alignments.runPairwiseScorers(Alignments.java:602) at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:173) What could I be doing wrong? ( on the cookbook page, there is also an import missing: import org.biojava3.alignment.Alignments; ) -> then the cookbook runs, but my code does not private static void processFile(String filename) { try { FileInputStream inStream = new FileInputStream(filename); FastaReader fastaReader = new FastaReader( inStream, new GenericFastaHeaderParser(), new DNASequenceCreator(DNACompoundSet.getDNACompoundSet())); LinkedHashMap b = fastaReader.process(); List sequences = new ArrayList(); for (Entry entry : b.entrySet()) { if (sequences.size() < 5) { sequences.add(entry.getValue()); } System.out.println(entry.getValue()); } Profile profile = Alignments.getMultipleSequenceAlignment(sequences); System.out.printf("Clustalw:%n%s%n", profile); ConcurrencyTools.shutdown(); } catch (Exception ex) { Logger.getLogger(App.class.getName()).log(Level.SEVERE, null, ex); } } From biojava at hannes.oib.com Wed Oct 5 08:50:33 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 5 Oct 2011 10:50:33 +0200 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: Hi! I read the sequence from a fasta file. 
FileInputStream inStream = new FileInputStream(filename); FastaReader fastaReader = new FastaReader( inStream, new GenericFastaHeaderParser(), new DNASequenceCreator(DNACompoundSet.getDNACompoundSet())); LinkedHashMap b = fastaReader.process(); and then use List sequences = new ArrayList(); for (Entry entry : b.entrySet()) { sequences.add(entry.getValue()); } to get the required list of DNA sequences. I noticed in an earlier discussion, there was some talk about this too (3-4 months ago, perhaps) and something about a possible fix in SVN. when will it be released on the maven server? Hannes On Wed, Oct 5, 2011 at 10:46, Hashem Koohy wrote: > Hi Hannes, > It seems to me it doesn't like your ?dna Sequence! > Is your sequence in the following format? > > Sequence dnaSeq = DNATools.createDNASequence("acccgggttttacagt", "id"); > > Good luck > Hashem > > On 05/10/2011 08:41, "Hannes Brandst?tter-M?ller" > wrote: > >> Hello! >> >> I tried to follow http://biojava.org/wiki/BioJava:CookBook3:MSA, but >> when I run it, I get: >> >> java.util.concurrent.ExecutionException: java.lang.NullPointerException >> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) >> at java.util.concurrent.FutureTask.get(FutureTask.java:83) >> at org.biojava3.alignment.Alignments.getListFromFutures(Alignments.java:282) >> at org.biojava3.alignment.Alignments.runPairwiseScorers(Alignments.java:602) >> at >> org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java >> :173) >> >> ?What could I be doing wrong? >> >> ( >> on the cookbook page, there is also an import missing: >> import org.biojava3.alignment.Alignments; >> ) >> -> then the cookbook runs, but my code does not >> >> private static void processFile(String filename) { >> ? ? ? ? try { >> ? ? ? ? ? ? FileInputStream inStream = new FileInputStream(filename); >> ? ? ? ? ? ? FastaReader fastaReader = >> ? ? ? ? ? ? ? ? ? ? new FastaReader( >> ? ? ? ? ? ? ? ? ? ? inStream, >> ? ? ? ? ? ? ? ? ? ? 
new GenericFastaHeaderParser> NucleotideCompound>(), >> ? ? ? ? ? ? ? ? ? ? new >> DNASequenceCreator(DNACompoundSet.getDNACompoundSet())); >> ? ? ? ? ? ? LinkedHashMap b = fastaReader.process(); >> >> ? ? ? ? ? ? List sequences = new ArrayList(); >> ? ? ? ? ? ? for (Entry entry : b.entrySet()) { >> ? ? ? ? ? ? ? ? if (sequences.size() < 5) { >> ? ? ? ? ? ? ? ? ? ? sequences.add(entry.getValue()); >> ? ? ? ? ? ? ? ? } >> ? ? ? ? ? ? ? ? System.out.println(entry.getValue()); >> ? ? ? ? ? ? } >> >> ? ? ? ? ? ? Profile profile = >> Alignments.getMultipleSequenceAlignment(sequences); >> ? ? ? ? ? ? System.out.printf("Clustalw:%n%s%n", profile); >> >> ? ? ? ? ? ? ConcurrencyTools.shutdown(); >> ? ? ? ? } catch (Exception ex) { >> ? ? ? ? ? ? Logger.getLogger(App.class.getName()).log(Level.SEVERE, null, ex); >> ? ? ? ? } >> ? ? } >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l > > ------------------------------- > Hashem Koohy > PhD > Postdoctoral Fellow, > Sanger Institute, > Cambridge > Mobile: 07515425433 > > > > > -- > ?The Wellcome Trust Sanger Institute is operated by Genome Research > ?Limited, a charity registered in England with number 1021457 and a > ?company registered in England with number 2742969, whose registered > ?office is 215 Euston Road, London, NW1 2BE. > From andreas at sdsc.edu Wed Oct 5 17:21:19 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 5 Oct 2011 10:21:19 -0700 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: Hi Hannes, > I noticed in an earlier discussion, there was some talk about this too > (3-4 months ago, perhaps) and something about a possible fix in SVN. > when will it be released on the maven server? What version are you on? We released Maven 3.0.2 just recently.. 
Andreas

From sbliven at ucsd.edu Wed Oct 5 17:39:42 2011
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Wed, 5 Oct 2011 10:39:42 -0700
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

In the current SVN code (and therefore probably BioJava 3.0.2),
CookbookMSA.java includes that import statement but is otherwise identical
to the wiki version. It runs just fine for me. Probably updating to BioJava
3.0.2 will fix the null pointer exception.

This does highlight the problem of keeping the wiki synchronized with the
current BioJava version. Ideally, part of the release process could include
automated updating of the wiki cookbook from the SVN code, with older
versions available as a reference. However, I'm not sure that anyone would
be willing to set up such a complex release process.

-Spencer

On Wed, Oct 5, 2011 at 10:21, Andreas Prlic wrote:

> Hi Hannes,
>
> > I noticed in an earlier discussion, there was some talk about this too
> > (3-4 months ago, perhaps) and something about a possible fix in SVN.
> > when will it be released on the maven server?
>
> What version are you on? We released Maven 3.0.2 just recently..
>
> Andreas
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From amr_alhossary at hotmail.com Wed Oct 5 18:26:16 2011
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Wed, 5 Oct 2011 20:26:16 +0200
Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment
In-Reply-To:
References:
Message-ID:

I agree with you 100%. I have met problems this week in working with the
SPICE code too, to the extent that I delayed the whole idea till I have more
suitable time to reimplement it myself.
Amr -------------------------------------------------- From: "Spencer Bliven" Sent: Wednesday, October 05, 2011 7:39 PM To: "Andreas Prlic" Cc: Subject: Re: [Biojava-l] NullPointerException when usingAlignments.getMultipleSequenceAlignment > In the current SVN code (and therefor probably Biojava 3.0.2), > CookbookMSA.java includes that import statement but is otherwise identical > to the wiki version. It runs just fine for me. Probably updating to > Biojava > 3.0.2 will fix the null pointer exception. > > This does highlight the problem of keeping the wiki synchronized with the > current BioJava version. Ideally, part of the release process could > include > automated updating of the wiki cookbook from the SVN code, with older > versions available as a reference. However, I'm not sure that anyone would > be willing to set up such a complex release process. > > -Spencer > > On Wed, Oct 5, 2011 at 10:21, Andreas Prlic wrote: > >> Hi Hannes, >> >> > I noticed in an earlier discussion, there was some talk about this too >> > (3-4 months ago, perhaps) and something about a possible fix in SVN. >> > when will it be released on the maven server? >> >> What version are you on? We released Maven 3.0.2 just recently.. 
>> >> Andreas >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From biojava at hannes.oib.com Wed Oct 5 19:22:13 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 5 Oct 2011 21:22:13 +0200 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: 2011/10/5 Spencer Bliven : > In the current SVN code (and therefor probably Biojava 3.0.2), > CookbookMSA.java includes that import statement but is otherwise identical > to the wiki version. It runs just fine for me. Probably updating to Biojava > 3.0.2 will fix the null pointer exception. I installed it via Maven just last week - so I guess it should be on 3.0.2 - I'll check tomorrow. Anyhow, the problem isn't the wiki (I already updated that, btw) but the fact that it seems to work with Protein Sequences, but when I use my DNA sequences, it breaks. If you go to my first post, I copied my code there (just add a wrapper main that supplies a valid fasta file as parameter) Hannes From andreas at sdsc.edu Wed Oct 5 19:30:41 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 5 Oct 2011 12:30:41 -0700 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: Can you provide the fasta file? otherwise this is difficult to reproduce... Andreas On Wed, Oct 5, 2011 at 12:22 PM, Hannes Brandst?tter-M?ller wrote: > 2011/10/5 Spencer Bliven : >> In the current SVN code (and therefor probably Biojava 3.0.2), >> CookbookMSA.java includes that import statement but is otherwise identical >> to the wiki version. It runs just fine for me. 
Probably updating to Biojava >> 3.0.2 will fix the null pointer exception. > > I installed it via Maven just last week - so I guess it should be on > 3.0.2 - I'll check tomorrow. > > Anyhow, the problem isn't the wiki (I already updated that, btw) but > the fact that it seems to work with Protein Sequences, but when I use > my DNA sequences, it breaks. > If you go to my first post, I copied my code there (just add a wrapper > main that supplies a valid fasta file as parameter) > > Hannes > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From biojava at hannes.oib.com Wed Oct 5 19:34:38 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 5 Oct 2011 21:34:38 +0200 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: On Wed, Oct 5, 2011 at 21:30, Andreas Prlic wrote: > Can you provide the fasta file? otherwise this is difficult to reproduce... > > Andreas Unfortunately, the files in question are under NDA - does it work with other fasta files? I could not get it to work with the files I tried. Hannes From andreas at sdsc.edu Thu Oct 6 00:29:11 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 5 Oct 2011 17:29:11 -0700 Subject: [Biojava-l] NullPointerException when using Alignments.getMultipleSequenceAlignment In-Reply-To: References: Message-ID: > Unfortunately, the files in question are under NDA - does it work with > other fasta files? I could not get it to work with the files I tried. I just wrote a junit test for DNA alignments and it works for me. DNA alignments by default are using the nuc-4_4 substitution matrix for the alignment. It contains the following columns. A T G C S W R Y K M B V H D N Does your FASTA file contain any characters that are not in this list? 
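
The check asked about above can be sketched in plain Java, without BioJava on the classpath. This is only an illustration: the class name Nuc44Check and its method are made up for this example and are not part of BioJava, the allowed symbols are simply the column list quoted in this message, and folding lowercase letters to uppercase before checking is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not a BioJava class): finds residues that are not nuc-4_4 columns. */
final class Nuc44Check {

    // Column symbols of the nuc-4_4 matrix as listed in the message above.
    private static final String ALLOWED = "ATGCSWRYKMBVHDN";

    /** Returns the distinct unsupported characters, in order of first occurrence. */
    static Set<Character> unsupported(String sequence) {
        Set<Character> bad = new LinkedHashSet<Character>();
        // Assumption: lowercase residues are treated like their uppercase form.
        for (char c : sequence.toUpperCase(Locale.ROOT).toCharArray()) {
            if (ALLOWED.indexOf(c) < 0) {
                bad.add(c);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println(unsupported("ACGTNNRY")); // prints []
        System.out.println(unsupported("ACGU-X*"));  // prints [U, -, X, *]
    }
}
```

Running something like this over each sequence string before the alignment would show whether gap characters, U, X or other symbols are present in the input.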
Andreas From biojava at hannes.oib.com Thu Oct 6 07:32:59 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Thu, 6 Oct 2011 09:32:59 +0200 Subject: [Biojava-l] losing meta-info after multiple sequence alignment Message-ID: Hi again! So, my MSA is now working, thanks for the help so far. What I ran into now is that most of the meta-information of a Sequence seems to get lost during the MSA step. e.g.: I'd like to attach the string from the fasta file (description, header, whatever it is called) to the sequence and keep it attached for further use after the MSA step. 1) shouldn't the fasta reader automatically populate this info? which is the correct field (OriginalHeader or Description or something else)? 2) the MSA step seems to throw away all meta info. I attached the string as OriginalHeader before starting the MSA, and afterwards it is "null" Thanks, Hannes From biojava at hannes.oib.com Thu Oct 6 08:07:17 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Thu, 6 Oct 2011 10:07:17 +0200 Subject: [Biojava-l] losing meta-info after multiple sequence alignment In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 09:32, Hannes Brandst?tter-M?ller wrote: > Hi again! > What I ran into now is that most of the meta-information of a Sequence > seems to get lost during the MSA step. Okay, that was something caused by following another cookbook script (that, unfortunately, has absolutely no docs or comments) - I found the getOriginalSequence() method, can work with that. Thanks! Hannes From biojava at hannes.oib.com Thu Oct 6 10:37:23 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Thu, 6 Oct 2011 12:37:23 +0200 Subject: [Biojava-l] losing meta-info after multiple sequence alignment In-Reply-To: References: Message-ID: Thanks, I'll try to add what I find out. It's a wiki after all. 
I'll just ask the mailing list if things are unclear before I add stuff to the wiki. One thing that bugged me just now, and since I can't find documentation on it: Why is a sequence indexed by 1-(n+1) instead of 0-n? That's rather un-java-like, especially since you just get an OutOfBoundsException, and the range is not specified in the javadoc, or I could not find it easily in the complex class hierarchy. Hannes 2011/10/6 Scooter Willis : > Hannes > > As you can tell we need to improve the cookbook examples. Since you are > going through that process, we would welcome any contributions you can make. > > Thanks > > Scooter > > On 10/6/11 4:07 AM, "Hannes Brandstätter-Müller" > wrote: > >>On Thu, Oct 6, 2011 at 09:32, Hannes Brandstätter-Müller >> wrote: >>> Hi again! >>> What I ran into now is that most of the meta-information of a Sequence >>> seems to get lost during the MSA step. >> >>Okay, that was something caused by following another cookbook script >>(that, unfortunately, has absolutely no docs or comments) - I found >>the getOriginalSequence() method, can work with that. Thanks! >> >>Hannes >> >>_______________________________________________ >>Biojava-l mailing list - Biojava-l at lists.open-bio.org >>http://lists.open-bio.org/mailman/listinfo/biojava-l > > From HWillis at scripps.edu Thu Oct 6 10:32:10 2011 From: HWillis at scripps.edu (Scooter Willis) Date: Thu, 6 Oct 2011 06:32:10 -0400 Subject: [Biojava-l] losing meta-info after multiple sequence alignment In-Reply-To: Message-ID: Hannes As you can tell we need to improve the cookbook examples. Since you are going through that process, we would welcome any contributions you can make. Thanks Scooter On 10/6/11 4:07 AM, "Hannes Brandstätter-Müller" wrote: >On Thu, Oct 6, 2011 at 09:32, Hannes Brandstätter-Müller > wrote: >> Hi again! >> What I ran into now is that most of the meta-information of a Sequence >> seems to get lost during the MSA step.
> >Okay, that was something caused by following another cookbook script >(that, unfortunately, has absolutely no docs or comments) - I found >the getOriginalSequence() method, can work with that. Thanks! > >Hannes > >_______________________________________________ >Biojava-l mailing list - Biojava-l at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/biojava-l From andreas.prlic at gmail.com Thu Oct 6 15:35:22 2011 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Thu, 6 Oct 2011 08:35:22 -0700 Subject: [Biojava-l] BioJava Help In-Reply-To: References: Message-ID: Hi Omer, We are having problems with the anonymous SVN server quite often. I recommend trying the git copy at github, or their SVN interface, or using Maven to install it. http://www.biojava.org/wiki/CVS_to_SVN_Migration Andreas On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote: > Dear Andreas, > > I wish to install BioJava for my eclipse IDE. > I followed all the instructions in the website > http://www.biojava.org/wiki/BioJava3_eclipse. > I currently have problems with the last step. I create a new Maven > project, but when I try to type /biojava/biojava-live/trunk in the SCM > URL, I get an "invalid URL" error. > Please let me know what seems to be the problem and how can I fix it. > > Thanks much! > omer > > -- > Omer Eilam > Complex Network Systems > Prof. Eytan Ruppin > Tel-Aviv University > http://cns.cs.tau.ac.il/ > From sbliven at ucsd.edu Thu Oct 6 18:29:06 2011 From: sbliven at ucsd.edu (Spencer Bliven) Date: Thu, 6 Oct 2011 11:29:06 -0700 Subject: [Biojava-l] losing meta-info after multiple sequence alignment In-Reply-To: References: Message-ID: I think that 1-based indexing was chosen because that's what's used in genome databases like GenBank. Thus gene offsets from outside data sources can be used directly without subtracting 1. That said, almost every time I write code using sequences I get off-by-one errors, so I understand your frustration. 
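As a rough illustration of the 1-based convention described above, the following plain-Java sketch (no BioJava classes) converts 1-based, inclusive coordinates of the kind found in GenBank feature locations into the 0-based, end-exclusive arguments that `String.substring` expects. The class `OneBased` and its two methods are hypothetical helpers written for this example; they are not part of the BioJava API.

```java
public class OneBased {
    /** Returns the residue at a 1-based position; valid positions run from 1 to seq.length(). */
    public static char residueAt(String seq, int pos) {
        if (pos < 1 || pos > seq.length()) {
            throw new IndexOutOfBoundsException(
                    "position " + pos + " is outside 1.." + seq.length());
        }
        return seq.charAt(pos - 1); // shift to Java's 0-based index
    }

    /** Returns the subsequence for a 1-based, inclusive range such as the location "3..6". */
    public static String subSequence(String seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException(
                    "range " + start + ".." + end + " is outside 1.." + seq.length());
        }
        // Only the start shifts by one: the inclusive 1-based end equals the exclusive 0-based end.
        return seq.substring(start - 1, end);
    }

    public static void main(String[] args) {
        String seq = "ACGTACGT";
        System.out.println(residueAt(seq, 1));      // first residue
        System.out.println(subSequence(seq, 3, 6)); // positions 3 to 6, inclusive
    }
}
```

Keeping the conversion in one place like this is one way to avoid scattering `- 1` adjustments through calling code, which is where off-by-one errors tend to creep in.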
We should definitely improve documentation so that every method that takes 1-based indexes is clearly marked. -Spencer On Thu, Oct 6, 2011 at 03:37, Hannes Brandstätter-Müller < biojava at hannes.oib.com> wrote: > Thanks, I'll try to add what I find out. It's a wiki after all. I'll > just ask the mailing list if things are unclear before I add stuff to > the wiki. > > One thing that bugged me just now, and since I can't find documentation on > it: > > Why is a sequence indexed by 1-(n+1) instead of 0-n? That's rather > un-java-like, especially since you just get an OutOfBoundsException, > and the range is not specified in the javadoc, or I could not find it > easily in the complex class hierarchy. > > Hannes > > 2011/10/6 Scooter Willis : > > Hannes > > > > As you can tell we need to improve the cookbook examples. Since you are > > going through that process, we would welcome any contributions you can make. > > > > Thanks > > > > Scooter > > > > On 10/6/11 4:07 AM, "Hannes Brandstätter-Müller" > > > wrote: > > > >>On Thu, Oct 6, 2011 at 09:32, Hannes Brandstätter-Müller > >> wrote: > >>> Hi again! > >>> What I ran into now is that most of the meta-information of a Sequence > >>> seems to get lost during the MSA step. > >> > >>Okay, that was something caused by following another cookbook script > >>(that, unfortunately, has absolutely no docs or comments) - I found > >>the getOriginalSequence() method, can work with that. Thanks!
> >> > >>Hannes > >> > >>_______________________________________________ > >>Biojava-l mailing list - Biojava-l at lists.open-bio.org > >>http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas.prlic at gmail.com Mon Oct 10 01:58:43 2011 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Sun, 9 Oct 2011 18:58:43 -0700 Subject: [Biojava-l] BioJava Help In-Reply-To: References: Message-ID: Hi Omer, Our recommendation nowadays for working with Blast is to get XML output and simply parse that. Depending on what you need that will give you more details than the parser that is in the BioJava 1.8 (legacy) project. (if you still want to give that one a try, there is a cookbook page for it) Andreas On Sun, Oct 9, 2011 at 2:26 AM, Omer Eilam wrote: > I eventually succeeded in importing Biojava. > I read in the paper that there is a parser for BLAST - where can I get > more information/API on this? > > Thanks! > omer > > On Thu, Oct 6, 2011 at 5:35 PM, Andreas Prlic wrote: >> Hi Omer, >> >> We are having problems with the anonymous SVN server quite often. I >> recommend trying the git copy at github, or their SVN interface, or >> using Maven to install it. >> >> http://www.biojava.org/wiki/CVS_to_SVN_Migration >> >> Andreas >> >> On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote: >>> Dear Andreas, >>> >>> I wish to install BioJava for my eclipse IDE. >>> I followed all the instructions in the website >>> http://www.biojava.org/wiki/BioJava3_eclipse. >>> I currently have problems with the last step. I create a new Maven >>> project, but when I try to type /biojava/biojava-live/trunk in the SCM >>> URL, I get an "invalid URL" error. >>> Please let me know what seems to be the problem and how can I fix it. >>> >>> Thanks much! 
>>> omer >>> >>> -- >>> Omer Eilam >>> Complex Network Systems >>> Prof. Eytan Ruppin >>> Tel-Aviv University >>> http://cns.cs.tau.ac.il/ >>> >> > > > > -- > Omer Eilam > Complex Network Systems > Prof. Eytan Ruppin > Tel-Aviv University > http://cns.cs.tau.ac.il/ > From shakunb at uom.ac.mu Mon Oct 10 09:03:34 2011 From: shakunb at uom.ac.mu (Shakuntala Baichoo) Date: Mon, 10 Oct 2011 13:03:34 +0400 Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 5 In-Reply-To: References: Message-ID: Does anyone know how to find codon usage from data in a genbank file or embl file. Thanks Shakun Email Disclaimer: This email and all its contents are subject to the disclaimer at http://www.uom.ac.mu/emaildisclaimer From tariq_cp at hotmail.com Mon Oct 10 16:27:27 2011 From: tariq_cp at hotmail.com (Muhammad Tariq Pervez) Date: Mon, 10 Oct 2011 16:27:27 +0000 Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 2 In-Reply-To: References: Message-ID: Yes, the issue was also faced by me. I also highlighted the solution. The solution has also been uploaded/updated in SVN. Muhammad Tariq Pervez PhD Scholar > From: biojava-l-request at lists.open-bio.org > Subject: Biojava-l Digest, Vol 105, Issue 2 > To: biojava-l at lists.open-bio.org > Date: Wed, 5 Oct 2011 12:00:04 -0400 > > Today's Topics: > > 1. NullPointerException when using > Alignments.getMultipleSequenceAlignment (Hannes Brandstätter-Müller) > 2. Re: NullPointerException when using > Alignments.getMultipleSequenceAlignment (Hannes Brandstätter-Müller) From xue.vin at gmail.com Tue Oct 11 00:39:52 2011 From: xue.vin at gmail.com (Vincent Xue) Date: Mon, 10 Oct 2011 20:39:52 -0400 Subject: [Biojava-l] Stockholm Parser Implementation? Message-ID: Hi, I was wondering if there has been any work on developing a Stockholm 1.0 parser? Thanks! From andreas.prlic at gmail.com Tue Oct 11 22:51:48 2011 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Tue, 11 Oct 2011 15:51:48 -0700 Subject: [Biojava-l] BioJava Help In-Reply-To: References: Message-ID: Hi Omer, please keep such questions on the public list. Chances are high that if you are struggling with something, somebody else is having the same problem as well.
I was actually talking about BioJava 1.8, the previous version of BioJava. It is still available and the cookbook page that I meant is here: http://biojava.org/wiki/BioJava:CookBook:Blast:Parser About Maven: It is probably best to start with a new project in your favorite IDE and set it up as a Maven project that depends on BioJava (i.e. add the BioJava repository to the pom configuration). If you think this is difficult, we can/should set up a documentation page that explains those first steps. Andreas On Tue, Oct 11, 2011 at 6:34 AM, Omer Eilam wrote: > Are you referring to the BlastXMLQuery class? because I see in the API > that it contains only one method. > Also, I am not familiar with using Maven, I see the various biojava > packages in the eclipse project explorer, but I don't know how to > create a new class (i.e. do I need imports and stuff?) > > Thanks again! > omer > > On Mon, Oct 10, 2011 at 3:58 AM, Andreas Prlic wrote: >> Hi Omer, >> >> Our recommendation nowadays for working with Blast is to get XML >> output and simply parse that. Depending on what you need that will >> give you more details than the parser that is in the BioJava 1.8 >> (legacy) project. (if you still want to give that one a try, there is >> a cookbook page for it) >> >> Andreas >> >> >> On Sun, Oct 9, 2011 at 2:26 AM, Omer Eilam wrote: >>> I eventually succeeded in importing Biojava. >>> I read in the paper that there is a parser for BLAST - where can I get >>> more information/API on this? >>> >>> Thanks! >>> omer >>> >>> On Thu, Oct 6, 2011 at 5:35 PM, Andreas Prlic wrote: >>>> Hi Omer, >>>> >>>> We are having problems with the anonymous SVN server quite often. I >>>> recommend trying the git copy at github, or their SVN interface, or >>>> using Maven to install it. >>>> >>>> http://www.biojava.org/wiki/CVS_to_SVN_Migration >>>> >>>> Andreas >>>> >>>> On Thu, Oct 6, 2011 at 1:56 AM, Omer Eilam wrote: >>>>> Dear Andreas, >>>>> >>>>> I wish to install BioJava for my eclipse IDE.
>>>>> I followed all the instructions in the website >>>>> http://www.biojava.org/wiki/BioJava3_eclipse. >>>>> I currently have problems with the last step. I create a new Maven >>>>> project, but when I try to type /biojava/biojava-live/trunk in the SCM >>>>> URL, I get an "invalid URL" error. >>>>> Please let me know what seems to be the problem and how can I fix it. >>>>> >>>>> Thanks much! >>>>> omer >>>>> >>>>> -- >>>>> Omer Eilam >>>>> Complex Network Systems >>>>> Prof. Eytan Ruppin >>>>> Tel-Aviv University >>>>> http://cns.cs.tau.ac.il/ >>>>> >>>> >>> >>> >>> >>> -- >>> Omer Eilam >>> Complex Network Systems >>> Prof. Eytan Ruppin >>> Tel-Aviv University >>> http://cns.cs.tau.ac.il/ >>> >> > > > > -- > Omer Eilam > Complex Network Systems > Prof. Eytan Ruppin > Tel-Aviv University > http://cns.cs.tau.ac.il/ > From andreas.prlic at gmail.com Wed Oct 12 16:00:14 2011 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Wed, 12 Oct 2011 09:00:14 -0700 Subject: [Biojava-l] User defined substitution matrix. In-Reply-To: References: Message-ID: Hi Canan, Please subscribe to the biojava-l mailing list for sending messages to the list. Details for how to do that can be found here: http://biojava.org/wiki/BioJava:MailingLists About your question: getResourceAsStream is just opening a stream to a file that is bundled together with a jar file. There should be no need for you to call that to get a matrix. If you want to use any of the (many) standard substitution matrices that are supported by BioJava you just need to call SubstitutionMatrixHelper.getMatrixFromAAINDEX(nameOfMatrix) If you want to work with a matrix that is not yet in the AAINDEX collection of matrices, it depends on how the matrix is represented in the file. If it is in the same style as in AAINDEX, you can use the DefaultAAIndexProvider to parse your custom matrix. Andreas On Tue, Oct 11, 2011 at 11:09 PM, Canan Has wrote: > Dear Dr.
Prlic, > Sorry to bother you, but my mails to biojava forum-nabble are not being > answered. Even though I am already subscribed, I am getting alerts saying > your message is in pending list, because you are not subscribed. Therefore, > I decided to ask my question to you directly. I hope you will answer me, > because it is so important and urgent. > Simply, I want to create my own substitution matrix object. I examined the > source code of SubstitutionMatrixHelper and the related others. I did some > modifications and got a null pointer exception for getResourceAsStream(). > I tried to find out where the directory for getResourceAsStream() has been > set. At first, I thought the file directory is the resources folder in > alignment folder. Then, I deleted one of the matrices - pam250 and tried to > initialize it by calling getPam250() and matrix was created. This leads me > to think that the developer introduced ftp.ncbi url to getResourceAsStream > to take matrices by name. > Can you show a way to create my own? Or you can tell me and I can change > where the url is given and recompile the code. > Thanks in advance > Canan Has > Research Assistant & MSc Student > Computational Biology & Bioinformatics Lab > Molecular Biology and Genetics Department > Izmir Institute of Technology > From andreas at sdsc.edu Wed Oct 12 16:05:58 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 12 Oct 2011 09:05:58 -0700 Subject: [Biojava-l] Stockholm Parser Implementation? In-Reply-To: References: Message-ID: Hi Vincent, I am not aware of a parser for that as part of Biojava. It would be great to have, though. If you want to make a contribution, you would be more than welcome ... Andreas On Mon, Oct 10, 2011 at 5:39 PM, Vincent Xue wrote: > Hi, > > I was wondering if there has been any work on developing a Stockholm > 1.0 parser? > > Thanks!
> _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From jprocter at compbio.dundee.ac.uk Fri Oct 14 09:07:01 2011 From: jprocter at compbio.dundee.ac.uk (Jim Procter) Date: Fri, 14 Oct 2011 10:07:01 +0100 Subject: [Biojava-l] Stockholm Parser Implementation? In-Reply-To: References: Message-ID: <4E97FBB5.5090402@compbio.dundee.ac.uk> On 12/10/2011 17:05, Andreas Prlic wrote: > Hi Vincent, > > I am not aware of a parser for that as part of Biojava. It would be > great to have, though. If you want to make a contribution, you would > be more than welcome ... I thought Jules worked on this - he took a look at Jalview's one and promptly wrote a better one, IIRC. However, whether it got in to bj3 is another matter! Perhaps asking him nicely would work :) Jim. From kern3020 at gmail.com Tue Oct 18 00:49:17 2011 From: kern3020 at gmail.com (John Kern) Date: Mon, 17 Oct 2011 17:49:17 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? Message-ID: Hello, I am new to Biojava and bioinformatics in general. I noticed your latest release is 3.0.2 but the tutorial refers to the 1.8 release ( http://biojava.org/wiki/BioJava:Tutorial). Does this imply the tutorial is out of sync with the latest release? Regards, John From andreas at sdsc.edu Tue Oct 18 03:11:16 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Mon, 17 Oct 2011 20:11:16 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? In-Reply-To: References: Message-ID: Hi John, This Tutorial was written for what is currently the legacy 1.8 release. It would be great if somebody would volunteer and provide such a tutorial for the current 3.X code base ... Andreas On Mon, Oct 17, 2011 at 5:49 PM, John Kern wrote: > Hello, > > I am new to Biojava and bioinformatics in general.
I noticed your latest > release is 3.0.2 but the tutorial refers to the 1.8 release ( > http://biojava.org/wiki/BioJava:Tutorial). Does this imply the tutorial is > out of sync with the latest release? > > Regards, > John > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- ----------------------------------------------------------------------- Dr. Andreas Prlic Senior Scientist, RCSB PDB Protein Data Bank University of California, San Diego (+1) 858.246.0526 ----------------------------------------------------------------------- From biojava at hannes.oib.com Tue Oct 18 09:46:44 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Tue, 18 Oct 2011 11:46:44 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller wrote: > Hi again! > > I am quite happy with the Multiple Sequence Alignment, but I noticed > that there seems to be a limit of 132 Sequences that are present in > the final alignment - is this some kind of hardcoded limit, or can I > work around that somehow? > > Hannes > Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in fasta format. Is there a way to work around that limit? Hannes From kern3020 at gmail.com Tue Oct 18 14:42:54 2011 From: kern3020 at gmail.com (John Kern) Date: Tue, 18 Oct 2011 07:42:54 -0700 Subject: [Biojava-l] is the tutorial in sync with the latest version? In-Reply-To: References: Message-ID: Hello Andreas, On Mon, Oct 17, 2011 at 8:11 PM, Andreas Prlic wrote: > It would be great if somebody would volunteer and provide > such a tutorial for the current 3.X code base ... Great. Sign me up. I guess the way to proceed is to learn 3.x via the cookbook and examples. Then return to the tutorial.
-jk From andreas at sdsc.edu Tue Oct 18 21:01:05 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 18 Oct 2011 14:01:05 -0700 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hi Hannes, did you try to increase memory settings for your JVM? e.g. -Xmx500M Andreas On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller wrote: > On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller > wrote: >> Hi again! >> >> I am quite happy with the Multiple Sequence Alignment, but I noticed >> that there seems to be a limit of 132 Sequences that are present in >> the final alignment - is this some kind of hardcoded limit, or can I >> work around that somehow? >> >> Hannes >> > > Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in > fasta format. Is there a way to work around that limit? > > Hannes > > _______________________________________________ > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From biojava at hannes.oib.com Wed Oct 19 04:36:19 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 19 Oct 2011 06:36:19 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hi Andreas, I will try that later today if that makes any difference; I ran a larger alignment batch overnight, and I noticed that this limit seems to have been a coincidence; HOWEVER, the aligned sequences are always not as many as the input sequences, is this caused by memory constraints or how can I influence that? Hannes On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: > Hi Hannes, > > did you try to increase memory settings for your JVM? ?e.g. -Xmx500M > > Andreas > > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandst?tter-M?ller > wrote: >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandst?tter-M?ller >> wrote: >>> Hi again! 
>>> >>> I am quite happy with the Multiple Sequence Alignment, but I noticed >>> that there seems to be a limit of 132 Sequences that are present in >>> the final alignment - is this some kind of hardcoded limit, or can I >>> work around that somehow? >>> >>> Hannes >>> >> >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >> fasta format. Is there a way to work around that limit? >> >> Hannes >> >> _______________________________________________ >> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > From biojava at hannes.oib.com Wed Oct 19 07:32:25 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Wed, 19 Oct 2011 09:32:25 +0200 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: I'm currently running another test, now with even more memory for java (500M) - it looks fine now so far. I'll re-check it later with the other files that gave me some problems, and will report back later today. I had a "out of heap" exception when I tried it with the default memory settings, and with 256M it seems to have swallowed some sequences - I'll re-check and help you reproduce. It would be really bad if the code would swallow sequences without error messages when running out of memory, so I'll make sure I have proof :D Hannes On Wed, Oct 19, 2011 at 09:22, Spencer Bliven wrote: > Hannes? > > There should not be a limit on the number of sequences, nor should you be > running into a memory problem. The FastaParser should be able to read > thousands of sequences, since it is used for genome FASTA files as well as > multiple alignments. My guess would be either a malformed FASTA file > (perhaps a problem with line endings?), or else a problem with the code to > generate the MultipleAlignment. Can you post some code snippets? 
> > -Spencer > > On Tue, Oct 18, 2011 at 21:36, Hannes Brandstätter-Müller > wrote: >> >> Hi Andreas, >> >> I will try that later today if that makes any difference; I ran a >> larger alignment batch overnight, and I noticed that this limit seems >> to have been a coincidence; HOWEVER, the aligned sequences are always >> not as many as the input sequences, is this caused by memory >> constraints or how can I influence that? >> >> Hannes >> >> On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: >> > Hi Hannes, >> > >> > did you try to increase memory settings for your JVM? e.g. -Xmx500M >> > >> > Andreas >> > >> > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller >> > wrote: >> >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller >> >> wrote: >> >>> Hi again! >> >>> >> >>> I am quite happy with the Multiple Sequence Alignment, but I noticed >> >>> that there seems to be a limit of 132 Sequences that are present in >> >>> the final alignment - is this some kind of hardcoded limit, or can I >> >>> work around that somehow? >> >>> >> >>> Hannes >> >>> >> >> >> >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in >> >> fasta format. Is there a way to work around that limit? >> >> >> >> Hannes >> >> >> >> _______________________________________________ >> >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> >> > >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l > > From sbliven at ucsd.edu Wed Oct 19 07:22:55 2011 From: sbliven at ucsd.edu (Spencer Bliven) Date: Wed, 19 Oct 2011 00:22:55 -0700 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? In-Reply-To: References: Message-ID: Hannes, There should not be a limit on the number of sequences, nor should you be running into a memory problem.
The FastaParser should be able to read thousands of sequences, since it is used for genome FASTA files as well as multiple alignments. My guess would be either a malformed FASTA file (perhaps a problem with line endings?), or else a problem with the code to generate the MultipleAlignment. Can you post some code snippets? -Spencer On Tue, Oct 18, 2011 at 21:36, Hannes Brandstätter-Müller <biojava at hannes.oib.com> wrote: > Hi Andreas, > > I will try that later today if that makes any difference; I ran a > larger alignment batch overnight, and I noticed that this limit seems > to have been a coincidence; HOWEVER, the aligned sequences are always > not as many as the input sequences, is this caused by memory > constraints or how can I influence that? > > Hannes > > On Tue, Oct 18, 2011 at 23:01, Andreas Prlic wrote: > > Hi Hannes, > > > > did you try to increase memory settings for your JVM? e.g. -Xmx500M > > > > Andreas > > > > On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller > > wrote: > >> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller > >> wrote: > >>> Hi again! > >>> > >>> I am quite happy with the Multiple Sequence Alignment, but I noticed > >>> that there seems to be a limit of 132 Sequences that are present in > >>> the final alignment - is this some kind of hardcoded limit, or can I > >>> work around that somehow? > >>> > >>> Hannes > >>> > >> > >> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in > >> fasta format. Is there a way to work around that limit?
> >> > >> Hannes > >> > >> _______________________________________________ > >> Biojava-l mailing list - Biojava-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/biojava-l > >> > > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From jvb at Cs.Nott.AC.UK Wed Oct 19 11:15:26 2011 From: jvb at Cs.Nott.AC.UK (jvb at Cs.Nott.AC.UK) Date: 19 Oct 2011 12:15:26 +0100 Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil ? Message-ID: <201110191215.aa17789@pat.Cs.Nott.AC.UK> Hello, I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, even though it appears in the JavaDocs: http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html What is its status? Can I get it, and should I rely on it if I can? Thanks, Jon From p.v.troshin at dundee.ac.uk Wed Oct 19 15:13:44 2011 From: p.v.troshin at dundee.ac.uk (Peter Troshin) Date: Wed, 19 Oct 2011 16:13:44 +0100 Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil ? In-Reply-To: <201110191215.aa17789@pat.Cs.Nott.AC.UK> References: <201110191215.aa17789@pat.Cs.Nott.AC.UK> Message-ID: <4E9EE928.4050506@dundee.ac.uk> Hi Jon, This class is part of the protein disorder prediction JAR and a recent addition to BioJava. You are welcome to use it if it suits your needs. Bear in mind, though, that the FASTA file reader from this class reads the content of the whole FASTA file at once, so if you are working with large FASTA files you will want to use something else instead. I've got a stream-based FASTA reader if you need one and if there is not one in BioJava already. I would imagine the functionality from this class is not going to disappear overnight, but it may and perhaps should be merged with other FASTA parsers in BioJava once somebody has time to do this.
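To illustrate the difference between reading a whole FASTA file at once and reading it as a stream, here is a minimal sketch that uses only plain JDK classes. It is not the reader mentioned above and not a BioJava API: the class name StreamingFasta and the choice to return each record as a {header, sequence} pair are assumptions made for the example. Only one record is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical helper (not part of BioJava): yields one FASTA record at a time. */
class StreamingFasta implements Iterator<String[]> {
    private final BufferedReader in;
    private String pendingHeader; // header of the next record, without the leading '>'

    StreamingFasta(Reader reader) {
        this.in = new BufferedReader(reader);
        try {
            String line;
            // Skip anything before the first header; empty input leaves pendingHeader null.
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    pendingHeader = line.substring(1).trim();
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return pendingHeader != null;
    }

    /** Returns {header, sequence} for the next record. */
    @Override
    public String[] next() {
        if (pendingHeader == null) {
            throw new NoSuchElementException();
        }
        String header = pendingHeader;
        pendingHeader = null;
        StringBuilder seq = new StringBuilder();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Remember the next record's header and stop reading for now.
                    pendingHeader = line.substring(1).trim();
                    break;
                }
                seq.append(line.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] { header, seq.toString() };
    }

    public static void main(String[] args) {
        // Small in-memory input; a FileReader would be used the same way.
        StreamingFasta records = new StreamingFasta(new StringReader(">seq1\nACGT\nAC\n>seq2\nTTGA\n"));
        while (records.hasNext()) {
            String[] record = records.next();
            System.out.println(record[0] + " length=" + record[1].length());
        }
    }
}
```

An empty input simply produces an iterator with no records, so a caller does not need a special case for empty files.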
Regards, Peter On 19/10/2011 12:15, jvb at cs.nott.ac.uk wrote: > Hello, > > I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, > even though it appears in the JavaDocs: > http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html > > What is it's status? Can I get it, and should it rely on it if I can? > > Thanks, > > Jon > > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From khalil.elmazouari at gmail.com Wed Oct 19 18:36:28 2011 From: khalil.elmazouari at gmail.com (Khalil El Mazouari) Date: Wed, 19 Oct 2011 20:36:28 +0200 Subject: [Biojava-l] Biojava-l Digest, Vol 105, Issue 12 In-Reply-To: References: Message-ID: Hi Hannes, just did a MSA test with 521 seq... and it works. It must be a memory issue. try something like: java -Xmx1g -jar yourApp.jar args... If you don't have enough RAM, try with 500m as suggested by Andreas, Regards, Khalil
From biojava at hannes.oib.com Thu Oct 20 12:29:14 2011 From: biojava at hannes.oib.com (=?ISO-8859-1?Q?Hannes_Brandst=E4tter=2DM=FCller?=) Date: Thu, 20 Oct 2011 14:29:14 +0200 Subject: [Biojava-l] Undeclared/uncaught exception in Fasta Parser Message-ID: If you feed the fasta parser (code from cookbook) with an empty file, you get a java.lang.ArrayIndexOutOfBoundsException: 0 at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:111) at org.biojava3.core.sequence.io.GenericFastaHeaderParser.parseHeader(GenericFastaHeaderParser.java:60) at
org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:136)

Code:

FileInputStream inStream = new FileInputStream(inputFileName);
FastaReader<DNASequence, NucleotideCompound> fastaReader = new FastaReader<DNASequence, NucleotideCompound>(
        inStream,
        new GenericFastaHeaderParser<DNASequence, NucleotideCompound>(),
        new DNASequenceCreator(DNACompoundSet.getDNACompoundSet()));
LinkedHashMap<String, DNASequence> originalSeqs = fastaReader.process(); // boom!

I believe this exception should either be caught and converted to an IOException or declared.

Hannes

ps: I also ran into an OutOfMemoryError when I tried an MSA with approx. 1000+ sequences :D even though I gave Java 4000M - are there any metrics I can use before starting the MSA to determine if it can work?

java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
at java.util.concurrent.FutureTask.get(FutureTask.java:111)
at org.biojava3.alignment.Alignments.getProgressiveAlignment(Alignments.java:560)
at org.biojava3.alignment.Alignments.getMultipleSequenceAlignment(Alignments.java:187)
at ...
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.Integer.valueOf(Integer.java:601) at org.biojava3.core.sequence.location.SimplePoint.getPosition(SimplePoint.java:55) at org.biojava3.alignment.SimpleAlignedSequence.isGap(SimpleAlignedSequence.java:231) at org.biojava3.alignment.SimpleAlignedSequence.getSequenceIndexAt(SimpleAlignedSequence.java:205) at org.biojava3.alignment.SimpleAlignedSequence.getCompoundAt(SimpleAlignedSequence.java:270) at org.biojava3.alignment.SimpleProfile.getCompoundsAt(SimpleProfile.java:231) at org.biojava3.alignment.SimpleProfile.getCompoundCountsAt(SimpleProfile.java:217) at org.biojava3.alignment.SimpleProfile.getCompoundWeightsAt(SimpleProfile.java:249) at org.biojava3.alignment.template.AbstractProfileProfileAligner.reset(AbstractProfileProfileAligner.java:235) at org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:215) at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:270) at org.biojava3.alignment.template.AbstractProfileProfileAligner.getPair(AbstractProfileProfileAligner.java:158) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:54) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:38) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at 
org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:210) at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:270) at org.biojava3.alignment.template.AbstractProfileProfileAligner.getPair(AbstractProfileProfileAligner.java:158) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:54) at org.biojava3.alignment.template.CallableProfileProfileAligner.call(CallableProfileProfileAligner.java:38) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.Integer.valueOf(Integer.java:601) at org.biojava3.core.sequence.location.SimplePoint.getPosition(SimplePoint.java:55) at org.biojava3.alignment.SimpleAlignedSequence.isGap(SimpleAlignedSequence.java:231) at org.biojava3.alignment.SimpleAlignedSequence.getSequenceIndexAt(SimpleAlignedSequence.java:205) at org.biojava3.alignment.SimpleAlignedSequence.getCompoundAt(SimpleAlignedSequence.java:270) at org.biojava3.alignment.SimpleProfile.getCompoundsAt(SimpleProfile.java:231) at org.biojava3.alignment.SimpleProfile.getCompoundCountsAt(SimpleProfile.java:217) at org.biojava3.alignment.SimpleProfile.getCompoundWeightsAt(SimpleProfile.java:249) at org.biojava3.alignment.template.AbstractProfileProfileAligner.reset(AbstractProfileProfileAligner.java:235) at org.biojava3.alignment.template.AbstractProfileProfileAligner.isReady(AbstractProfileProfileAligner.java:215) ... 
9 more From kern3020 at gmail.com Thu Oct 20 16:19:27 2011 From: kern3020 at gmail.com (John Kern) Date: Thu, 20 Oct 2011 09:19:27 -0700 Subject: [Biojava-l] Multiple Sequence Alignment - Limits? Message-ID: Hello Hannes, When programs run out of memory it can be insightful to look at them via the operating system. It helps to determine if the program consumes all the memory or is hitting another limit. These comments are specific to Linux. All versions of UNIX have similar shells. If you are running Windows, this is not relevant. Process size can be limited by a shell on Unix systems.
$ man sh
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
On my system, there are no limits. What are the results on your system? If the shell is not limiting your process and, as others have pointed out, the JVM isn't limited, the program top can be very insightful. http://www.kernelhardware.org/linux-top-command/ When the program is running out of memory, does top confirm you have no more memory? Regards, John From kern3020 at gmail.com Thu Oct 20 20:12:39 2011 From: kern3020 at gmail.com (John Kern) Date: Thu, 20 Oct 2011 13:12:39 -0700 Subject: [Biojava-l] demos from 1.8 Message-ID: Hello, The 1.8 tutorial refers to a demos directory. "Additionally, a number of small demo programs can be found in the demos directory of the BioJava source distribution." Would it make sense to migrate them to the 3.x APIs? I downloaded the aggregate tar ball for 1.8.1. I do not see it there. Where can I find it?
Thanks, -john From andreas at sdsc.edu Thu Oct 20 21:19:27 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 20 Oct 2011 14:19:27 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: Hi John, several of the 3.X modules have a demo directory that gives examples of how to work with them. I believe in the old code base there was also a demo directory somewhere, not sure if it gets bundled in the aggregate tar ball though. Andreas On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote: > Hello, > > The 1.8 tutorial refers to a demos directory. > "Additionally, a number of small demo programs can be found in the > demos directory of the BioJava source distribution." > > Would it make sense to migrate them to the 3.x APIs? > > I downloaded the aggregate tar ball for 1.8.1. I do not see it there. > Where can I find it? > > Thanks, > -john > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From kern3020 at gmail.com Thu Oct 20 21:33:56 2011 From: kern3020 at gmail.com (John Kern) Date: Thu, 20 Oct 2011 14:33:56 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: Hello Andreas, I checked out the 3.x branch from subversion. I see two demo directories. jkern at ubuntu:~/src/java/biojava-trunk/biojava/biojava3-core/src/main/java/org/biojava3/core/sequence$ find ~/src/java/biojava-trunk/ -name demo /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure-gui/src/main/java/demo /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure/src/main/java/demo I found no corresponding directories in the 1.8.1 tar ball. One of the articles (http://biojava.org/wiki/BioJava:Tutorial:Dynamic_programming_examples) refers to a source file called Dice.java. Would it make sense to migrate it to the new source? If so, do you know where I can find it?
-jk On Thu, Oct 20, 2011 at 2:19 PM, Andreas Prlic wrote: > Hi John, > > several of the 3.X modules have a demo directory that give examples > how to work with them. I believe in the old code base there was also a > demo directory somewhere, not sure if it gets bundled in the aggregate > tar ball though. > > Andreas > > > On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote: >> Hello, >> >> The 1.8 tutorial refers to a demos directory. >> "Additionally, a number of small demo programs can be found in the >> demos directory of the BioJava source distribution." >> >> Would it make sense to migrate them to the 3.x APIs? >> >> I downloaded the aggregate tar ball for 1.8.1. I do not see it there. >> Where can I find it? >> >> Thanks, >> -john >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > From andreas at sdsc.edu Thu Oct 20 21:37:44 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 20 Oct 2011 14:37:44 -0700 Subject: [Biojava-l] demos from 1.8 In-Reply-To: References: Message-ID: I think that most of those demos are heavily dependent on 1.8 code and would need some major refactoring to work for 3.x. The best reference to get started is the Cookbook ... (and we probably need more examples there for 3.X) Andreas On Thu, Oct 20, 2011 at 2:33 PM, John Kern wrote: > Hello Andreas, > > I checked out the 3.x branch from subversion. I see two demo directories. > > jkern at ubuntu:~/src/java/biojava-trunk/biojava/biojava3-core/src/main/java/org/biojava3/core/sequence$ > find ~/src/java/biojava-trunk/ -name demo > /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure-gui/src/main/java/demo > /home/jkern/src/java/biojava-trunk/biojava/biojava3-structure/src/main/java/demo > > I found no corresponding directories in the 1.8.1 tar ball.
> > One of the articles > (http://biojava.org/wiki/BioJava:Tutorial:Dynamic_programming_examples) > refers to a source file called Dice.java. Would it make sense to > migrate it to the new source? If so, do you know where I can find it? > > -jk > > > On Thu, Oct 20, 2011 at 2:19 PM, Andreas Prlic wrote: >> Hi John, >> >> several of the 3.X modules have a demo directory that give examples >> how to work with them. I believe in the old code base there was also a >> demo directory somewhere, not sure if it gets bundled in the aggregate >> tar ball though. >> >> Andreas >> >> >> On Thu, Oct 20, 2011 at 1:12 PM, John Kern wrote: >>> Hello, >>> >>> The 1.8 tutorial refers to a demos directory. >>> "Additionally, a number of small demo programs can be found in the >>> demos directory of the BioJava source distribution." >>> >>> Would it make sense to migrate them to the 3.x APIs? >>> >>> I downloaded the aggregate tar ball for 1.8.1. I do not see it there. >>> Where can I find it? >>> >>> Thanks, >>> -john >>> _______________________________________________ >>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>> >> > From kern3020 at gmail.com Wed Oct 26 00:55:39 2011 From: kern3020 at gmail.com (John Kern) Date: Tue, 25 Oct 2011 17:55:39 -0700 Subject: [Biojava-l] demo directory in API Message-ID: Hello, While reviewing the BioJava 3.x API (http://www.biojava.org/docs/api/index.html), I noticed the demo directory Andreas mentioned. I see it in neither the 3.0.2 tarball nor the Subversion checkout of the trunk. If there is a demo directory for the current source base, I would love to see it. -jk From andreas at sdsc.edu Wed Oct 26 01:00:44 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 25 Oct 2011 18:00:44 -0700 Subject: [Biojava-l] demo directory in API In-Reply-To: References: Message-ID: It is in the structure module.
Andreas On Tue, Oct 25, 2011 at 5:55 PM, John Kern wrote: > Hello, > > While reviewing the BioJava 3.x API > (http://www.biojava.org/docs/api/index.html), I noticed the demo > directory Andreas mentioned. I see it in neither the 3.0.2 tarball nor > the Subversion checkout of the trunk. If there is a demo directory > for the current source base, I would love to see it. > > -jk > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From kern3020 at gmail.com Wed Oct 26 23:00:46 2011 From: kern3020 at gmail.com (John Kern) Date: Wed, 26 Oct 2011 16:00:46 -0700 Subject: [Biojava-l] What's a good sequence to highlight features in BioJava? Message-ID: Hello, Based on my review of the 1.8 version of the tutorial, I want to write a section in the tutorial about features, locations and annotations. Could someone recommend a good sequence (or sequences) to work with? Thanks, -John From andreas at sdsc.edu Thu Oct 27 17:44:03 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Thu, 27 Oct 2011 10:44:03 -0700 Subject: [Biojava-l] What's a good sequence to highlight features in BioJava? In-Reply-To: References: Message-ID: Hi John, Are you working on the 1.8 tutorial? It would be better to focus documentation efforts on the 3.X branch! We declared the 1.8 code base to be legacy and all development efforts are happening on the 3.X trunk. Andreas On Wed, Oct 26, 2011 at 4:00 PM, John Kern wrote: > Hello, > > Based on my review of the 1.8 version of the tutorial, I want to write a > section in the tutorial about features, locations and annotations. > Could someone recommend a good sequence (or sequences) to work with?
> > Thanks, > -John > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From kern3020 at gmail.com Thu Oct 27 19:20:54 2011 From: kern3020 at gmail.com (John Kern) Date: Thu, 27 Oct 2011 12:20:54 -0700 Subject: [Biojava-l] What's a good sequence to highlight features in BioJava? In-Reply-To: References: Message-ID: Hello Andreas, I was still looking at it at a very high level. I want to understand the appropriate workflow for BioJava. I am a software engineer. I do not share your background in biology but I want to learn as much as I can. I am reading a college-level introduction to biology. Additional suggestions for my reading list would be appreciated. On Thu, Oct 27, 2011 at 10:44 AM, Andreas Prlic wrote: > Are you working on the 1.8 tutorial? It would be better to focus > documentation efforts on the 3.X branch! We declared the 1.8 code base > to be legacy and all development efforts are happening on the 3.X > trunk. Thanks. I will do that. Sincerely, John