[Bioperl-l] phd format parsing produces infinite memory usage

Jean-Marc FRIGERIO Frigerio at pierroton.inra.fr
Thu Jul 24 16:15:22 UTC 2008


> Date: Wed, 23 Jul 2008 11:08:44 +0200
> From: Jorge.DUARTE at biogemma.com
> Subject: [Bioperl-l] phd format parsing produces infinite memory usage
> To: bioperl-l at lists.open-bio.org
> Message-ID: <OF5A0BD63C.18B9317B-ONC125748F.00318B40-C125748F.00324B8F at LGLimagrain.com>
>
> Content-Type: text/plain; charset="ISO-8859-1"
>
> Hello,
>
> I've been trying to use Bio::SeqIO to parse a phd-like format file.
>
> The script doesn't produce any error (or any output), but the memory usage
> keeps increasing until it reaches its limit (I stopped the process at
> 16 GB of memory).
>
> Is the problem known? Or is my file format wrong?
>
> If I use only the first sequence from my file (below), the script works
> fine. Maybe there is something wrong in the middle of the file. How can
> I print debugging info?
>
> Thanks for any help
>
> jorge


Hi, 
> If I use only the first sequence from my file (below), the script works
> fine
Huh! phd files should contain only one sequence!

The following code works well on your file.


#!/usr/bin/env perl

# vim: filetype=perl fdm=marker fdl=0 fdc=2 fmr=#/*,#*\

use strict;
use warnings;

use Bio::SeqIO;

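# Reader for the phd input file, and a writer that echoes phd to STDOUT.
my $seqin  = Bio::SeqIO->new('-file' => 'phd_test', '-format' => 'phd');
my $seqout = Bio::SeqIO->new('-fh'   => \*STDOUT,   '-format' => 'phd');

# Print the id and bases of each sequence, then write it back out as phd.
while (my $seq = $seqin->next_seq)
{
  print $seq->id, "\n", $seq->seq, "\n";
  $seqout->write_seq($seq);
}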
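
If the file really does hold several phd records concatenated together, one possible workaround is to split it into one file per record before handing each piece to Bio::SeqIO. The sketch below is only an illustration and uses core Perl alone. It assumes that every record opens with a line starting with BEGIN_SEQUENCE, as in phred's phd output. The sub name split_phd_records and the output naming scheme (phd_test.1.phd, phd_test.2.phd, ...) are made up for this example and are not part of BioPerl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Split the text of a concatenated phd file into one string per record.
# Assumption: each record opens with a line starting with BEGIN_SEQUENCE.
sub split_phd_records {
    my ($text) = @_;
    my @records;
    for my $line (split /^/m, $text) {
        # A BEGIN_SEQUENCE line starts a new record.
        push @records, '' if $line =~ /^BEGIN_SEQUENCE\b/;
        # Skip anything that appears before the first record.
        next unless @records;
        $records[-1] .= $line;
    }
    return @records;
}

# When given a file name, write each record to its own numbered file.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $text = do { local $/; <$in> };
    close $in;

    my $n = 0;
    for my $record (split_phd_records($text)) {
        my $out_name = sprintf '%s.%d.phd', $file, ++$n;
        open my $out, '>', $out_name or die "Cannot write $out_name: $!";
        print {$out} $record;
        close $out or die "Cannot close $out_name: $!";
    }
}
```

Saved as, say, split_phd.pl, it could be run as "perl split_phd.pl phd_test"; each resulting file can then be read with '-format' => 'phd' as in the script above.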


