[Bioperl-l] SeqIO problem on fasta file with large number of sequences.

Jason Eric Stajich jason@cgt.mc.duke.edu
Tue, 9 Oct 2001 08:15:09 -0400 (EDT)


Kun -

I'm not aware of any place where we allocate tempfiles in the SeqIO system.
I regularly parse files with SeqIO that are 10k+ with no problem, so
perhaps you are doing something else within these loops that
is allocating new tempfiles.
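
One way to check whether something inside the loop is holding on to
handles is to watch the process's descriptor count as records go by.  The
following is only a rough sketch: it assumes a Linux-style /proc
filesystem, and open_fd_count() and process_record() are hypothetical
names standing in for your own code, not anything in bioperl.

```perl
use strict;
use warnings;

# Count the file descriptors this process currently holds.
# Assumption: a Linux-style /proc filesystem; returns undef elsewhere.
sub open_fd_count {
    opendir(my $dh, '/proc/self/fd') or return;
    my @fds = grep { /^\d+$/ } readdir($dh);
    closedir($dh);
    # The directory handle used for the listing is counted too, so the
    # result is one higher than the true total; the offset is constant.
    return scalar @fds;
}

# Hypothetical stand-in for whatever is done with each sequence.
sub process_record {
    my ($record) = @_;
    return length $record;
}

my $baseline = open_fd_count();
for my $i (1 .. 50) {
    process_record(">seq$i\nACGT\n");
    my $now = open_fd_count();
    # A count that keeps climbing points at a handle that is never closed.
    if (defined $now && $now > $baseline) {
        warn "descriptor count grew from $baseline to $now at record $i\n";
        last;
    }
}
```

If the count stays flat with the real per-sequence work dropped in, the
leak is probably somewhere else.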

The tempfile cleanup function in File::Temp is often not called
until the program exits, so you get the confusing situation of running out
of space/free filehandles during the run while the directory is empty once
the program exits.  We have
worked around this by trying to register cleanup in the object destructor,
but I'm not sure what version of bioperl you are running and whether or
not that fix has been implemented properly there.
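
To illustrate the general idea (this is a sketch of the technique, not the
actual code in Bio::Root::IO): if a tempfile is needed per record, either
close and unlink it yourself before moving on, or use the File::Temp
object interface, which removes the file when the object goes out of
scope.  The helper names and the 'bpXXXXXXXX' template below are made up
for the example, and the object interface assumes a reasonably recent
File::Temp.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: use a scratch file for one record and release both
# the filehandle and the file before returning.
sub process_with_scratch {
    my ($record) = @_;
    my ($fh, $filename) = tempfile('bpXXXXXXXX', TMPDIR => 1, UNLINK => 0);
    print {$fh} $record;
    close($fh) or die "close failed on $filename: $!";
    my $size = -s $filename;
    unlink($filename) or die "unlink failed on $filename: $!";
    return ($filename, $size);
}

# Alternative: let the File::Temp object clean up in its destructor.
sub process_with_object {
    my ($record) = @_;
    my $tmp = File::Temp->new(TEMPLATE => 'bpXXXXXXXX', TMPDIR => 1);
    print {$tmp} $record;
    $tmp->flush;
    my $name = $tmp->filename;
    my $size = -s $name;
    # $tmp goes out of scope on return: handle closed, file removed.
    return ($name, $size);
}

# Run more iterations than a typical per-process descriptor limit allows
# (often around 1024, which would fit a failure near record 1002).
for my $i (1 .. 2000) {
    process_with_scratch(">seq$i\nACGT\n");
    process_with_object(">seq$i\nACGT\n");
}
```

Either way no descriptor outlives the iteration that opened it, so the
loop length no longer matters.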

Happy to help chase it down, but it would be helpful if you could submit this
as a bug report and describe your version of perl, bioperl, architecture,
etc.  Is it possible for you to send your complete code so we don't go
chasing too far?

-Jason

On Mon, 8 Oct 2001, Kun Zhang wrote:

> Hello! I'm using SeqIO to read DNA sequences of NCBI's unigene cluster,
> which is distributed as a big fasta file
> (ftp://ftp.ncbi.nlm.nih.gov/pub/schuler/unigene/Hs.seq.uniq.Z), and process
> every gene sequentially. I got the following error message when 1002 (or
> 1003) sequences have been processed.
> =================================================================================
> Error in tempfile() using /tmp/XXXXXXXXXX: Could not create temp file
> /tmp/eo0UwZVasW: Too many open files at
> /usr/lib/perl5/site_perl/5.6.0/Bio/Root/IO.pm line 41
> ==================================================================================
>
> However, I check the /tmp directory, and found not a single temporary
> sequence. Is there any way to work around this problem except for chopping
> the fasta file into several smaller ones? Thanks!
>
> Kun Zhang
> Human Genetics Center
> UT-Houston Health Science Center

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu