[Biojava-l] Parsing blast result with a lot of hit
PhxGM Gim
phxgm at hotmail.com
Thu Nov 4 21:06:49 EST 2004
What is the exact message you are receiving from the JVM when it aborts? I'm
*assuming* it's the standard java.lang.OutOfMemoryError. You can increase the
heap size allocated to the JVM at startup of the Java application by
passing a few switches to the JVM invocation (-Xms sets the initial heap
size, -Xmx the maximum). There are complete tutorials on how to set the heap
sizes for the JVMs on the Sun site at java.sun.com. I have used these with
some degree of success when scaling Java apps and hope they are applicable
to your situation.
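
As a quick way to check that a heap switch actually took effect, here is a
minimal sketch using the standard java.lang.Runtime API. The class name
HeapReport and the value 512m are only illustrative assumptions; run it with
something like "java -Xmx512m HeapReport" and compare the reported maximum.

```java
public class HeapReport {
    // Convert a byte count to whole megabytes (integer division).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the ceiling the JVM will attempt to use (roughly -Xmx).
        System.out.println("max heap (MB):     " + toMegabytes(rt.maxMemory()));
        // totalMemory() is what the JVM has reserved so far.
        System.out.println("current heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("free in heap (MB): " + toMegabytes(rt.freeMemory()));
    }
}
```

If the reported maximum does not change when you change -Xmx, the switch is
probably not reaching the JVM that runs your parser (for example, because a
wrapper script launches it).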
Other than that, you can certainly avoid holding all those instances in
memory at once, perhaps by reading them 'on demand' from storage. Clearly you
are going to have to solve the issue either by allocating additional
resources to the JVM or programmatically, by reading data only as needed
instead of loading all of it into memory. As I haven't yet encountered this
particular issue in my own development with biojava, I do not know what
constraints it imposes on a developer's ability to do this.
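
To illustrate the 'on demand' idea in general terms, here is a rough sketch
that visits records one at a time and keeps only a running summary instead of
an ArrayList of every hit. It is not biojava code: the tab-separated
"query, subject, score" layout, the class StreamingHitCounter and the
HitVisitor interface are all hypothetical and exist only for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class StreamingHitCounter {
    // Hypothetical callback: receives one record, which is then discarded.
    interface HitVisitor {
        void visit(String query, String subject, double score);
    }

    // Reads "query<TAB>subject<TAB>score" lines one at a time and hands each
    // to the visitor. Only the current line is ever held in memory.
    static int forEachHit(Reader source, HitVisitor visitor) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t");
                visitor.visit(fields[0], fields[1], Double.parseDouble(fields[2]));
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample data, standing in for a large file on disk.
        String sample = "seq1\tseq2\t95.0\nseq1\tseq3\t40.5\nseq2\tseq3\t88.0\n";
        final double[] best = {Double.NEGATIVE_INFINITY};
        int n = forEachHit(new StringReader(sample), (query, subject, score) -> {
            if (score > best[0]) {
                best[0] = score; // keep a summary, not the record itself
            }
        });
        System.out.println(n + " hits, best score " + best[0]);
    }
}
```

Memory use here depends on the size of one record and of the summary, not on
the number of hits, which is the property you want for a 5000-sequence run.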
Again, I'm going to assume you have a Blast XML output file, which
theoretically should be handled by either the BlastLikeSAXParser or the
BlastXMLParser. From the biojava docs on the BlastLikeSAXParser: "The
biojava Blast-like parsing framework is designed to use minimal memory, so
that in principle, extremely large native outputs can be parsed and XML
ContentHandlers can listen only for small amounts of information."
(http://www.biojava.org/docs/api/org/biojava/bio/program/sax/BlastLikeSAXParser.html.)
You can register ContentHandlers with an 'event driven' SAX parser so that
they respond to events raised by the XML document as it is parsed, instead of
building the whole result in memory. Again, it claims to scale... whether it
does or not is another issue.
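
For what it's worth, here is a small sketch of the event-driven style using
only the SAX classes that ship with the JDK, not the biojava parser itself.
It assumes NCBI-style Blast XML in which each hit is wrapped in a <Hit>
element. The class HitCounter is a made-up name, and I have not verified how
the biojava parsers are wired to a handler, so treat this as the general
pattern rather than the biojava API.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class HitCounter extends DefaultHandler {
    private int hits = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Called once per opening tag as the parser streams through the
        // document; nothing about earlier hits is retained.
        if ("Hit".equals(qName)) {
            hits++;
        }
    }

    int getHits() {
        return hits;
    }

    // Parses the source with the JDK's default SAX parser and returns the
    // number of <Hit> elements seen.
    static int countHits(InputSource source) {
        try {
            HitCounter handler = new HitCounter();
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
            return handler.getHits();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Tiny hand-written fragment standing in for a real Blast XML file.
        String xml = "<Iteration_hits>"
                + "<Hit><Hit_id>seq2</Hit_id></Hit>"
                + "<Hit><Hit_id>seq3</Hit_id></Hit>"
                + "</Iteration_hits>";
        System.out.println("hits: " + countHits(new InputSource(new StringReader(xml))));
    }
}
```

The handler decides what to keep. If you only need a few fields per hit, write
them out (to a file or database) inside the callbacks rather than collecting
them in a list.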
hope this has been of at least some help,
jess vermont
chicago
>From: "Lu Qiang" <luqiang at scbit.org>
>To: "biojava-l at biojava.org" <biojava-l at biojava.org>
>Subject: [Biojava-l] Parsing blast result with a lot of hit
>Date: Thu, 4 Nov 2004 18:42:20 +0000
>
>Hi, Guys,
>
>If we are trying to parse a blast result with a lot of hits, the machine
>will crash, for example when 5000 sequences are blasted against themselves.
>
>This must be caused by an ArrayList storing all results.
>
>How to solve this problem?
>
>regards,
>
>Lu