[Bioperl-l] StandAloneFasta and Too many open files

Chris Fields cjfields at illinois.edu
Tue May 11 03:57:18 UTC 2010


Addendum to that last post.

On May 10, 2010, at 10:04 PM, Chris Fields wrote:

> On May 10, 2010, at 9:03 PM, Dimitar Kenanov wrote:
> 
>> Hi guys,
>> yesterday I got the following error:
>> 
>>   'Too many open files at /usr/lib64/perl5/site_perl/5.10.0/Bio/Tools/Run/Alignment/StandAloneFasta.pm line 380'
>> 
>> from the following code:
>> ------------
>>   my $ssout="my_seq_out.txt";
>>   print "SS:$tquery:\n:$tseq:\n";
>>   my @sargs=(
>>       'q' => '',
>>       'E' => '1',
>>       'w' => '100',
>>       'O' => "$ssout",
>>       'program' => "ssearch36",
>>       );
>>   my $fac_ss=Bio::Tools::Run::Alignment::StandAloneFasta->new(@sargs);
>>   $fac_ss->library($tmpseq);
>>   my @sreport=$fac_ss->run($tqtmp);
>> 
>> foreach my $sr (@sreport){
>>     while(my $result=$sr->next_result){
>>         while(my $hit=$result->next_hit){
>>             while(my $hsp=$hit->next_hsp){
>>                 my $iden=$hsp->frac_identical;
>>                 $rv3=$iden;
>> #               print "IDEN:$iden:$rv1\n";
>>             }
>>         }
>>     }
>> }
>> --------------------
>> I am running that code over several thousand HSPs: for each one I get the sequence and then run 'ssearch36' with it against another sequence. I dug around in the StandAloneFasta module but couldn't find where the problem is. There must be many open filehandles somewhere, but I don't know where. I checked the module and couldn't find any such filehandles, so the problem may be in the base modules.
>> I also checked my script for filehandles left open and found none. I did find that I can close SeqIO streams with '$stream->close', which I hadn't seen in the web documentation. So something positive came out of this :) I closed all my SeqIO streams and still had the same problem.
>> Next I commented out the above code and rewrote my script as follows:
>> --------------
>>   my $ssout="my_seq_out.txt";
>>   my @sargs=("ssearch36 -q -E 1 -d 1 $tqtmp $tmpseq > $ssout");
>>   system(@sargs) == 0 or die "system @sargs failed: $?";
>> 
>>   my $sreport=Bio::SearchIO->new(-file => $ssout, -format => 'fasta');
>>   while(my $result=$sreport->next_result){
>> #    print Dumper($result);
>>       while(my $hit=$result->next_hit){
>>           while(my $hsp=$hit->next_hsp){
>> 
>>               my $iden=$hsp->frac_identical;
>>               $rv3=$iden;
>> #              print "IDEN:$iden:$rv1\n";
>>           }
>>       }
>>   }
>> ---------------
>> Fortunately this code avoided the 'Too many open files' error, so the problem was indeed coming from the module or the modules behind it.
>> 
>> I have also read that one can raise the limit on how many files can be open on the system, but I didn't want to touch that for now because I don't know what the implications would be.
>> 
>> OK, that is it. I just wanted to share my experience and report the problem.
>> 
>> Cheers
>> Dimitar
> 
> 
> Seems this is hitting the system ulimit on open files, but it's not immediately apparent how that's happening unless you are caching the IO objects somehow.  Can you file this as a bug, maybe with a fuller test script?  Might give us something to check against.
> 
> chris
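
For a fuller test script, one way to see whether descriptors pile up is to count them before and after each pass through the search loop. The following is only a rough sketch: the helper name open_fd_count is made up for illustration, and it assumes a Linux-style /proc filesystem (it returns undef anywhere else).

```perl
use strict;
use warnings;

# Hypothetical helper: count the file descriptors this process holds open.
# Assumes a Linux-style /proc filesystem; returns undef where it is missing.
sub open_fd_count {
    my $dir = "/proc/$$/fd";
    opendir(my $dh, $dir) or return;
    my @fds = grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);
    # The directory handle itself occupies one descriptor while listing,
    # so subtract it to report only the descriptors held by the caller.
    return scalar(@fds) - 1;
}

# Intended use around one iteration of the loop:
# my $before = open_fd_count();
# ... one StandAloneFasta run plus report parsing ...
# my $after = open_fd_count();
# warn "leaked ", $after - $before, " descriptor(s)\n"
#     if defined $before && $after > $before;
```

A count that grows steadily from one iteration to the next would point at handles that are never closed, and the same script could go into the bug report.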

Dimitar,

I think Peter answered this before; his reply suggests the problem is actually the use of the 'O' option for output.  We can look at possibly just capturing STDOUT instead, but we may not support the use of 'O' if it's as buggy as indicated.

http://groups.google.com/group/bioperl-l/msg/25c17748d1ac6ef4
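
As an illustration of the STDOUT idea only (this is not what StandAloneFasta does today), here is a minimal sketch that reads a command's output through a pipe and closes the handle explicitly on every call. The name run_and_capture is hypothetical, and the file names in the commented call are placeholders for $tqtmp and $tmpseq.

```perl
use strict;
use warnings;

# Hypothetical helper: run an external command and return everything it
# wrote to STDOUT. The list form of the pipe open bypasses the shell, and
# the handle is closed explicitly so repeated calls do not accumulate
# open descriptors.
sub run_and_capture {
    my @cmd = @_;
    open(my $fh, '-|', @cmd)
        or die "cannot start '@cmd': $!";
    my $output = do { local $/; <$fh> };    # slurp all output at once
    # close() on a pipe waits for the child and sets $? to its status.
    close($fh)
        or die "'@cmd' failed: " . ($! ? "close error: $!" : "wait status $?");
    return defined $output ? $output : '';
}

# Placeholder call; file names stand in for $tqtmp and $tmpseq:
# my $report = run_and_capture('ssearch36', '-q', '-E', '1',
#                              'query.fa', 'library.fa');
```

The captured text could then be parsed without a temporary file by opening an in-memory handle on it and passing that to Bio::SearchIO->new with -fh and -format => 'fasta'.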

chris




