[Bioperl-l] Fishing redundant sequences in FASTA files [Right formatting]
Adam Sjøgren
adsj at novozymes.com
Wed Feb 16 08:01:31 EST 2011
On Tue, 15 Feb 2011 14:25:07 -0600, Chris wrote:
> SHA should work as well, didn't think of that (though I suppose the
> encoding step for either would be rate-limiting?).
Disk I/O might be the bottleneck rather than the hashing itself. On a 3+
year old desktop I get ~144 MB/s for sha1 and ~217 MB/s for md5 in a simple
test that hashes 1 GiB of zeros from /dev/zero:
$ dd if=/dev/zero bs=1M count=1024 | sha1sum -
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 7.44032 s, 144 MB/s
2a492f15396a6768bcbca016993f4b4c8b0b5307 -
$ dd if=/dev/zero bs=1M count=1024 | md5sum -
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.94205 s, 217 MB/s
cd573cfaace07e7949bc0c46028904ff -
On a reasonably new standard Dell desktop I get ~249 MB/s for sha1 and
~410 MB/s for md5.
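
The same kind of comparison can be sketched in-process, without dd or a
pipe. This is only a rough illustration using Python's standard hashlib
module: the function name hash_throughput and its parameters are made up
for this example, the buffer size is an arbitrary choice, and the numbers
it prints will not match the dd figures above since the setup differs.

```python
import hashlib
import time


def hash_throughput(algorithm, total_mib=64, chunk_mib=1):
    """Hash zero-filled chunks and return (hexdigest, rate in MB/s).

    Hypothetical helper for illustration; not taken from the thread.
    """
    chunk = bytes(chunk_mib * 1024 * 1024)  # zero-filled, like /dev/zero
    n_chunks = total_mib // chunk_mib
    hasher = hashlib.new(algorithm)

    start = time.perf_counter()
    for _ in range(n_chunks):
        hasher.update(chunk)
    elapsed = time.perf_counter() - start

    # dd reports decimal megabytes (10**6 bytes), so do the same here.
    megabytes = n_chunks * len(chunk) / 1e6
    return hasher.hexdigest(), megabytes / elapsed


if __name__ == "__main__":
    for name in ("sha1", "md5"):
        digest, rate = hash_throughput(name)
        print(f"{name}: {rate:.0f} MB/s  {digest}")
```

If the rates printed here are well above what the disk can deliver, reading
the FASTA files is the more likely limit, not the digest.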
Best regards,
Adam