[Bioperl-l] Fishing redundant sequences in FASTA files [Right formatting]
    Adam Sjøgren 
    adsj at novozymes.com
       
    Wed Feb 16 08:01:31 EST 2011
    
    
  
On Tue, 15 Feb 2011 14:25:07 -0600, Chris wrote:
> SHA should work as well, didn't think of that (though I suppose the
> encoding step for either would be rate-limiting?).
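Disk I/O might be the bottleneck rather than the hashing itself. On a 3+
year old desktop I get ~144 MB/s for sha1 and ~217 MB/s for md5 in a
simple test that hashes 1 GiB of zeros from /dev/zero, so no disk reads
are involved: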
  $ dd if=/dev/zero bs=1M count=1024 | sha1sum -
  1024+0 records in
  1024+0 records out
  1073741824 bytes (1.1 GB) copied, 7.44032 s, 144 MB/s
  2a492f15396a6768bcbca016993f4b4c8b0b5307  -
  $ dd if=/dev/zero bs=1M count=1024 | md5sum -
  1024+0 records in
  1024+0 records out
  1073741824 bytes (1.1 GB) copied, 4.94205 s, 217 MB/s
  cd573cfaace07e7949bc0c46028904ff  -
On a reasonably new standard Dell desktop I get ~249 MB/s for sha1 and
~410 MB/s for md5.
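The same comparison can be sketched without dd or a pipe, hashing zero-filled buffers directly in memory. This is only a rough illustration in Python using the standard hashlib module, not what was run above; the function name hash_throughput, the 1 MiB chunk size and the 256 MiB total are assumptions chosen for the example, and the numbers will vary by machine and by hashlib build.

```python
import hashlib
import time


def hash_throughput(algorithm, total_bytes, chunk_size=1024 * 1024):
    """Hash total_bytes of zero bytes in memory.

    Returns (hex digest, throughput in MB/s), where 1 MB = 10**6 bytes,
    the same unit dd reports. The name and defaults are illustrative.
    """
    hasher = hashlib.new(algorithm)
    chunk = bytes(chunk_size)  # zero-filled buffer, like reading /dev/zero
    remaining = total_bytes
    start = time.perf_counter()
    while remaining > 0:
        n = min(chunk_size, remaining)
        hasher.update(chunk[:n])
        remaining -= n
    elapsed = time.perf_counter() - start
    mb_per_s = total_bytes / 1e6 / elapsed if elapsed > 0 else float("inf")
    return hasher.hexdigest(), mb_per_s


if __name__ == "__main__":
    # 256 MiB keeps the run short; raise it for steadier numbers.
    for name in ("sha1", "md5"):
        digest, rate = hash_throughput(name, 256 * 1024 * 1024)
        print(f"{name}: {rate:.0f} MB/s  {digest}")
```

If the rates printed here are well above what the disk holding the FASTA files can deliver, reading the files is the more likely limit than computing the digests.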
  Best regards,
    Adam
-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com
    
    