[Bioperl-l] Asking for advice on full EMBL extraction

brian li brianli.cas at gmail.com
Thu May 7 05:32:56 UTC 2009


Thank you very much for your offer.

The director of our lab wants me to do the extraction every time a new
release of EMBL is published. I can't push the task to you every time.

I can offer more information about the server I run my script on if needed.

-Brian

On Thu, May 7, 2009 at 1:01 PM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> Sadly, that's the same code as I ran but I had a Data::Dump in the middle.
> Versions of Perl and BioPerl are the same.
> We're running RHEL 5 (kernel 2.6.18-92.1.18.el5) with 16GB RAM
>
> If you get a full script running on a smaller dataset, I could probably run it on the bigger stuff and give you back tab-separated (or is that tab\tseparated ?) data for loading into your db.
>
> --Russell
>
>> -----Original Message-----
>> From: brian li [mailto:brianli.cas at gmail.com]
>> Sent: Thursday, 7 May 2009 4:50 p.m.
>> To: Smithies, Russell
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Asking for advice on full EMBL extraction
>>
>> Dear Russell,
>>
>> My example code is as follows. I have omitted the parsing step, and
>> these lines alone still give me a "Segmentation fault".
>>
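>> # Start of code
>> use strict;
>> use warnings;
>> use Bio::SeqIO;
>>
>> # Open the EMBL flat file as a sequence stream
>> my $seqio = Bio::SeqIO->new(-file   => 'rel_ann_mus_01_r99.dat',
>>                             -format => 'EMBL');
>> my $index = 1;
>> # Read entries one at a time; the parsing step is omitted here
>> while (my $seq = $seqio->next_seq)
>> {
>>     print "Dealing with entry: $index\n";
>>     $index++;
>> }
>> # End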
>>
>> The platform I run this code on:
>> BioPerl 1.6.0
>> Perl 5.8.8
>> Ubuntu 8.04 LTS Server 64-bit version (Linux 2.6.24-23-server)
>>
>> I monitored memory usage while running the code above. There is
>> always around 20GB of free memory left (counting buffers), so I
>> suppose the segfault can't be explained by a memory shortage alone.
>>
>> Brian
>>
>>
>> On Thu, May 7, 2009 at 11:32 AM, Smithies, Russell
>> <Russell.Smithies at agresearch.co.nz> wrote:
>> > Hi Brian,
>> > I hate to say it but it worked OK for me using rel_ann_mus_01_r99.dat.gz and
>> > simple example Bio::SeqIO code from bugzilla.
>> > It's not using more than 1GB memory on our server and doesn't segfault.
>> >
>> > Send me your example code and I'll give it a go if you like.
>> >
>> >
>> > Russell Smithies
>> >
>> > Bioinformatics Applications Developer
>> > T +64 3 489 9085
>> > E  russell.smithies at agresearch.co.nz
>> >
>> > Invermay  Research Centre
>> > Puddle Alley,
>> > Mosgiel,
>> > New Zealand
>> > T  +64 3 489 3809
>> > F  +64 3 489 9174
>> > www.agresearch.co.nz
>> >
>> >
>



