[Biojava-l] [Biojava-dev] [Fwd: large genbank data]
James Carman
james at carmanconsulting.com
Fri Jul 18 10:45:50 UTC 2008
That is a limitation for string literals, not any string. Correct?
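
For what it is worth, the limit I have in mind comes from the class file format: a string constant is stored with a two-byte length field, so a literal's encoded form cannot exceed 65535 bytes. Strings built at runtime are, as far as I know, bounded only by the int-indexed backing array and the available heap. A minimal sketch to check this follows; the class and method names are made up for illustration, and the length is the one from the AE005174 log line quoted below.

```java
import java.util.Arrays;

public class StringLengthDemo {

    // Builds a string of the requested length at runtime (not a literal),
    // so the class-file constant size limit does not apply.
    public static String buildSequence(int length) {
        char[] bases = new char[length];
        Arrays.fill(bases, 'a');
        return new String(bases);
    }

    public static void main(String[] args) {
        // Sequence length reported by the loader for AE005174.
        String seq = buildSequence(5528445);
        System.out.println(seq.length());
    }
}
```

If this runs cleanly with the heap settings from the original report, the string conversion alone would not explain the PropertyAccessException, and the cause would have to be looked for elsewhere (for example in the getter named in the stack trace).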
On Fri, Jul 18, 2008 at 4:47 AM, Richard Holland
<dicknetherlands at gmail.com> wrote:
> In order to persist to BioSQL, BioJava has to convert the symbol list
> into a string so that it can pass it to JDBC via Hibernate. Therefore
> the maximum length of a sequence you wish to persist to BioSQL is the
> maximum length of a string in Java, which is 65536 (2^16) if you are
> working in a UTF-8 environment.
>
> 2008/7/18 Rey Vincent Babilonia <rvincent at asti.dost.gov.ph>:
>> Hi Mark,
>>
>> What is the maximum sequence length that a RichSequence can handle?
>>
>> java -Xms1024m -Xmx1256m -jar loader.jar
>> .
>> 16:09:00,173 INFO Loader:296 - D:\AE005174.gbk is readable.
>> 16:09:06,704 INFO Loader:326 - Loading sequence AE005174 with identifier
>> 56384585, length 5528445 and alphabet DNA...
>> org.hibernate.PropertyAccessException: Exception occurred inside getter of
>> org.biojavax.bio.seq.SimpleRichSequence.sequenceLength
>>
>> Rey Vincent Babilonia wrote:
>>>
>>> Hi Mark,
>>>
>>> At first it throws an out of memory exception. My workaround is to
>>> subdivide the sequence file into individual GenBank files.
>>>
>>> The error now is that if a GenBank sequence has an 'empty alphabet', it
>>> is not loaded into BioSQL. My workaround is to check whether
>>> sequence.getAlphabet().getName() is DNA.
>>>
>>> Thanks.
>>>
>>> Mark Schreiber wrote:
>>>>
>>>> Hi -
>>>>
>>>> Is the code throwing an exception or running out of memory?
>>>>
>>>> Can you send an example program and the problem you encounter to the
>>>> list?
>>>> - Mark
>>>>
>>>> On Thu, May 29, 2008 at 9:53 AM, Rey Vincent Babilonia
>>>> <rvincent at asti.dost.gov.ph> wrote:
>>>>>
>>>>> -------- Original Message --------
>>>>> Subject: large genbank data
>>>>> Date: Wed, 28 May 2008 18:02:48 +0800
>>>>> From: Rey Vincent Babilonia <rvincent at asti.dost.gov.ph>
>>>>> To: biojava-l at biojava.org
>>>>>
>>>>> Hi,
>>>>>
>>>>> Has anybody tried uploading a large GenBank file (e.g.
>>>>> ftp://bio-mirror.net/biomirror/genbank/gbbct1.seq.gz) to BioSQL?
>>>>> load_seqdatabase.pl from BioPerl can do this. I'm switching to BioJava, and
>>>>> it can't read the data (maybe because it has 30000+ sequences).
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> /**
>>>>> * @author Rey Vincent P. Babilonia
>>>>> * @number +63 2 426 9760 local 1302
>>>>> * @pgp 0x383454CF <at> pgp.mit.edu
>>>>> * @project Philippine Bioinformatics Solutions
>>>>> * @program Philippine e-Science Grid
>>>>> * @division Research and Development Division
>>>>> * @agency Advanced Science and Technology Institute
>>>>> * @url http://www.psigrid.gov.ph
>>>>> */
>>>>>
>>>>> _______________________________________________
>>>>> biojava-dev mailing list
>>>>> biojava-dev at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>>
>>>>
>>>
>>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
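
The workaround Rey describes in the quoted thread, subdividing the multi-record file into individual GenBank files, could be sketched roughly as below. This is plain Java with no BioJava calls, and it relies only on GenBank flat-file records ending with a line containing "//". The class and method names are hypothetical, and anything before the first LOCUS line (such as the release header in gbbct1.seq) would simply end up in the first record.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class GenBankSplitter {

    // Splits GenBank flat-file text into records; each record ends with a "//" line.
    public static List<String> splitRecords(Reader in) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                current.append(line).append('\n');
                if (line.trim().equals("//")) {
                    // Record terminator reached: store the record and start a new one.
                    records.add(current.toString());
                    current.setLength(0);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Keep trailing text that lacks a terminator, unless it is only whitespace.
        if (!current.toString().trim().isEmpty()) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        String twoRecords =
            "LOCUS       REC1\nORIGIN\n        1 acgt\n//\n"
            + "LOCUS       REC2\nORIGIN\n        1 ttga\n//\n";
        List<String> records = splitRecords(new StringReader(twoRecords));
        System.out.println(records.size());
    }
}
```

For a file with 30000+ entries it would probably be better to write each record to its own file (or hand it to the loader) as soon as the "//" line is seen, rather than collecting everything in a list, so that memory use stays at one record at a time.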