[Bioperl-l] reducing time

Chris Dagdigian dag at sonsorol.org
Wed Jan 28 11:18:33 EST 2004


Interestingly enough, there is a thread on the bioclusters list today 
about dealing with filesystems and directories holding large numbers of files.

  Here it is in the archives:
  https://bioinformatics.org/pipermail/bioclusters/2004-January/001404.html

  Tim Cutts also proposed an interesting perl hashing method that should
  handle up to "64 million files or so" -- this could be an option for
  people stuck with ext2 filesystems:
  https://bioinformatics.org/pipermail/bioclusters/2004-January/001408.html
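
  For anyone who doesn't want to chase the archive link: the general
  pattern (a sketch of the idea, not necessarily Tim's actual code) is
  to hash each filename and use the leading hex digits to pick nested
  subdirectories, so no single directory ever holds more than a few
  thousand entries. Two hex characters per level gives 256 x 256 =
  65,536 buckets; at roughly a thousand files per bucket that is in the
  ballpark of the "64 million files or so" figure. Something like:

#!/usr/bin/perl
# Sketch of a hashed directory layout (illustrative names and layout,
# not Tim's actual implementation).
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Path  qw(mkpath);
use File::Copy  qw(move);

# Map a flat filename onto a two-level hashed tree under $root.
sub hashed_path {
    my ($root, $name) = @_;
    my $digest = md5_hex($name);
    # first two hex chars -> level 1, next two -> level 2
    return join('/', $root,
                substr($digest, 0, 2),
                substr($digest, 2, 2),
                $name);
}

# Example: move every file from a flat directory into the tree.
my ($src, $root) = @ARGV;
die "usage: $0 <flat-dir> <tree-root>\n" unless defined $root;
opendir(my $dh, $src) or die "can't open $src: $!";
while (defined(my $name = readdir($dh))) {
    next if $name eq '.' || $name eq '..';
    my $dest = hashed_path($root, $name);
    (my $dir = $dest) =~ s{/[^/]+\z}{};   # strip the filename component
    mkpath($dir) unless -d $dir;
    move("$src/$name", $dest) or die "can't move $name: $!";
}
closedir($dh);

  The cost is one extra MD5 per access and a fixed two-directory
  descent; the win is that the kernel never has to scan a huge
  directory to find an entry.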

  -Chris




Andreas Bernauer wrote:

> Heikki Lehvaslaiho wrote:
> 
>>I could not find anything wrong in your code. 95 seconds per file (you mean
>>per sequence?) is really slow. The problem, according to my tests, is your
>>local system. Maybe you have too many files per directory?
> 
> 
> I had this problem with too many files per directory.  When I use more
> than about 2000 files or so in a single directory, access to a file
> slows down noticeably.  There is nothing you can do about it on ext2
> or ext3 file systems except to avoid putting so many files in a single
> directory (as far as I know).
> 
> 
> Andreas.
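
A quick way to see the slowdown Andreas describes is to time file opens
while a scratch directory grows. A rough, illustrative benchmark (the
paths and counts below are made up, not measurements):

#!/usr/bin/perl
# Time 1000 random opens as a single directory grows (illustrative).
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use File::Path  qw(mkpath rmtree);

my $dir = "/tmp/dirtest.$$";   # hypothetical scratch location
mkpath($dir);

my $count = 0;
for my $n (1_000, 2_000, 5_000, 10_000, 20_000) {
    # grow the directory to $n entries
    while ($count < $n) {
        open(my $fh, '>', "$dir/file$count") or die "create: $!";
        close($fh);
        $count++;
    }
    # time 1000 opens of randomly chosen existing files
    my $t0 = [gettimeofday];
    for (1 .. 1000) {
        my $i = int(rand($count));
        open(my $fh, '<', "$dir/file$i") or die "open: $!";
        close($fh);
    }
    printf "%6d files: %.4f s for 1000 opens\n", $n, tv_interval($t0);
}
rmtree($dir);

On ext2, where directory lookups are a linear scan, you would expect the
per-open cost to climb roughly in proportion to the directory size.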

-- 
Chris Dagdigian, <dag at sonsorol.org>
Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net


