From hlk.dogan at gmail.com  Wed Jun 27 03:01:07 2012
From: hlk.dogan at gmail.com (Haluk Dogan)
Date: Wed, 27 Jun 2012 10:01:07 +0300
Subject: [Biojava-l] reading fasta file out of memory error
Message-ID:

Hi,

I have a 1.8 GB FASTA file, and I was trying to read it with the following
code, as suggested on the examples page.

LinkedHashMap seqs =
    FastaReaderHelper.readFastaDNASequence(new File(args[0]));

I don't get any error for small files, but it gives the following error
for big files.

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:2746)
    at java.util.ArrayList.ensureCapacity(ArrayList.java:187)
    at org.biojava3.core.sequence.storage.ArrayListSequenceReader.setContents(ArrayListSequenceReader.java:187)
    at org.biojava3.core.sequence.template.AbstractSequence.<init>(AbstractSequence.java:88)
    at org.biojava3.core.sequence.DNASequence.<init>(DNASequence.java:81)
    at org.biojava3.core.sequence.io.DNASequenceCreator.getSequence(DNASequenceCreator.java:62)
    at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:113)
    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:106)
    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:118)

Is there a more efficient way?

Thanks in advance.

--
HD

From dasarnow at gmail.com  Wed Jun 27 03:44:14 2012
From: dasarnow at gmail.com (Daniel Asarnow)
Date: Wed, 27 Jun 2012 00:44:14 -0700
Subject: [Biojava-l] reading fasta file out of memory error
In-Reply-To:
References:
Message-ID:

Hi,

Have you tried increasing the size of the heap? You can use the -Xmx option
to java, e.g. -Xmx2048m or higher.

The GC overhead error is usually thrown when the constraints of the heap
size force the JVM to spend too much time collecting garbage.

-da

On Wed, Jun 27, 2012 at 12:01 AM, Haluk Dogan wrote:

> Hi,
>
> I have a 1.8 GB FASTA file, and I was trying to read it with the following
> code, as suggested on the examples page.
>
> LinkedHashMap seqs =
>     FastaReaderHelper.readFastaDNASequence(new File(args[0]));
>
> I don't get any error for small files, but it gives the following error
> for big files.
>
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.Arrays.copyOf(Arrays.java:2746)
>     at java.util.ArrayList.ensureCapacity(ArrayList.java:187)
>     at org.biojava3.core.sequence.storage.ArrayListSequenceReader.setContents(ArrayListSequenceReader.java:187)
>     at org.biojava3.core.sequence.template.AbstractSequence.<init>(AbstractSequence.java:88)
>     at org.biojava3.core.sequence.DNASequence.<init>(DNASequence.java:81)
>     at org.biojava3.core.sequence.io.DNASequenceCreator.getSequence(DNASequenceCreator.java:62)
>     at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:113)
>     at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:106)
>     at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:118)
>
> Is there a more efficient way?
>
> Thanks in advance.
>
> --
> HD

From mictadlo at gmail.com  Thu Jun 28 00:17:39 2012
From: mictadlo at gmail.com (Mic)
Date: Thu, 28 Jun 2012 14:17:39 +1000
Subject: [Biojava-l] reading fasta file out of memory error
In-Reply-To:
References:
Message-ID:

Is it possible to read entry by entry rather than reading the whole file
into memory?

On Wed, Jun 27, 2012 at 5:44 PM, Daniel Asarnow wrote:

> Hi,
> Have you tried increasing the size of the heap? You can use the -Xmx option
> to java, e.g. -Xmx2048m or higher.
>
> The GC overhead error is usually thrown when the constraints of the heap
> size force the JVM to spend too much time collecting garbage.
>
> -da

From HWillis at scripps.edu  Thu Jun 28 01:46:55 2012
From: HWillis at scripps.edu (Scooter Willis)
Date: Thu, 28 Jun 2012 01:46:55 -0400
Subject: [Biojava-l] reading fasta file out of memory error
Message-ID: <3F4E46FC-2906-4C29-B053-58177B2457C6@scripps.edu>

Yes, look for the lazyread option in the API.

----- Reply message -----
From: "Mic"
To: "Daniel Asarnow"
Cc: "biojava-l at lists.open-bio.org"
Subject: [Biojava-l] reading fasta file out of memory error
Date: Wed, Jun 27, 2012 9:18 pm

Is it possible to read entry by entry rather than reading the whole file
into memory?

_______________________________________________
Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l
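
For readers of this thread: the "entry by entry" idea asked about above can be sketched with the plain JDK alone. This is only an illustration under stated assumptions, not BioJava's lazy-read option and not the library's API. The class name StreamingFasta and the method forEachRecord are hypothetical names introduced here. The sketch hands each record to a callback as soon as it is complete, so only one record is held on the heap at a time instead of a map of every sequence.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.function.BiConsumer;

// Plain-JDK sketch; none of these names come from BioJava (hypothetical).
public class StreamingFasta {

    // Calls handler once per record with (header without '>', joined sequence lines).
    // Sequence lines that appear before the first header are ignored.
    public static void forEachRecord(BufferedReader in,
                                     BiConsumer<String, String> handler)
            throws IOException {
        String header = null;
        StringBuilder seq = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.charAt(0) == '>') {
                if (header != null) {
                    handler.accept(header, seq.toString()); // emit previous record
                }
                header = line.substring(1);
                seq.setLength(0); // reuse the buffer for the next record
            } else if (header != null) {
                seq.append(line);
            }
        }
        if (header != null) {
            handler.accept(header, seq.toString()); // emit the last record
        }
    }

    // Prints each header and its sequence length for the file named in args[0].
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            forEachRecord(in, (header, seq) ->
                    System.out.println(header + "\t" + seq.length()));
        }
    }
}
```

Peak memory here is bounded by the largest single record, so a file that consists of one very long sequence would still need a large heap (the -Xmx suggestion above) or a handler that processes the residues line by line.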