[Biopython-dev] Code to submit: CRC64

Peter biopython-dev at maubp.freeserve.co.uk
Thu Jun 21 13:02:39 UTC 2007


Sebastian Bassi wrote:
> Hello,
>
> I don't have write access to CVS so I post this code here.
> I included CRC64 checksum, it is used in several genomic databases.

Hi Sebastian,

Please could you file an enhancement bug, and attach the code to it -
it makes keeping track of requests and patches much easier.

Could you also give a couple of examples of how you might use this?

In typical usage, does the case of the sequences matter? As it stands
it would be up to the user to adjust the case before calling the CRC64
function.
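
To make the case question concrete, here is a minimal sketch (not the
submitted code). It assumes a bit-by-bit CRC-64 using the polynomial
constant from the patch widened to a single 64-bit value, and a
hypothetical function name crc64_bitwise. It only shows that the same
sequence in a different case gives a different checksum unless the
caller normalises it first:

```python
# Hypothetical stand-in for the submitted function; the 64-bit constant is
# the 0xd8000000 high word from the patch with a zero low word (assumption).
_POLY64REV = 0xD800000000000000


def crc64_bitwise(text):
    """Return a CRC-64 of an ASCII string as an integer (bit-by-bit, no table)."""
    crc = 0
    for byte in text.encode("ascii"):
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial when a 1 falls off.
            if crc & 1:
                crc = (crc >> 1) ^ _POLY64REV
            else:
                crc >>= 1
    return crc


lower = crc64_bitwise("acgtacgt")
upper = crc64_bitwise("ACGTACGT")
normalised = crc64_bitwise("acgtacgt".upper())

print(lower == upper)       # case changes the checksum
print(normalised == upper)  # normalising first makes them agree
```

If database checksums are always computed on upper case sequences, the
function could do the .upper() call itself rather than leave it to the
caller.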

Looking at the code, it looks like it would fail when used on
sequences (Seq objects) where the "letters" are not single characters
(e.g. sequences using the three letter amino acid codes). This is
probably not a big problem.
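
A small sketch of the failure I have in mind, assuming the checksum loop
calls ord() on each letter (an assumption about the patch; the helper
names letter_codes and accepts_letters are made up for illustration):

```python
def letter_codes(letters):
    """Return the integer code of each letter, as a per-letter checksum loop might."""
    return [ord(letter) for letter in letters]


def accepts_letters(letters):
    """Report whether every letter can be passed to ord()."""
    try:
        letter_codes(letters)
    except TypeError:
        # ord() only accepts a single character, so "Met" is rejected.
        return False
    return True


print(accepts_letters("MKV"))                  # one letter amino acid codes
print(accepts_letters(["Met", "Lys", "Val"]))  # three letter amino acid codes
```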

Regarding the implementation, I'm not sure that Bio/utils.py is the
best place for it - anyone?

You introduce a few "top level" variables:

POLY64REVh = 0xd8000000L
CRCTableh = [0] * 256
CRCTablel = [0] * 256
isInitialized = False

I think it would be better if their names started with an underscore
to mark them as "private" to the utils.py module. I would also use a
CRC64 prefix to make it explicit what they are for, given the utils.py
file contains a range of different functions. In particular, the name
isInitialized is too vague and uses mixed case. Maybe:

_CRC64_POLY64REVh = 0xd8000000L
_CRC64_tableh = [0] * 256
_CRC64_tablel = [0] * 256
_CRC64_initialized = False

Peter

P.S. You misspelt recipe in the comments.


