[Bioperl-l] SQ Line

James Gilbert jgrg@sanger.ac.uk
Mon, 31 Jul 2000 17:34:46 +0100 (BST)


Lorenz,

The following code generates SwissProt-style
CRC32 checksums.  I think it is a bit-reversed
variant of a conventional CRC32 (just to be
different, I guess).

I don't know about the molecular weight question
though.

	James

{
    my( @crcTable );
    
    sub generateCRCTable {
        # 10001000001010010010001110000100
        # 32 
        # 0xEDB88320 is the bit-reversed (LSB-first) form of the
        # standard CRC-32 polynomial 0x04C11DB7
        my $poly = 0xEDB88320;
        
        foreach my $i (0..255) {
            my $crc = $i;
            for (my $j=8; $j > 0; $j--) {
                if ($crc & 1) {
                    $crc = ($crc >> 1) ^ $poly;
                }
                else {
                    $crc >>= 1;
                }
            }
            $crcTable[$i] = $crc;
        }
    }

    sub crc32 {
        my( $str ) = @_;

        die "Argument to crc32() must be ref to scalar"
            unless ref($str) eq 'SCALAR';

        generateCRCTable() unless @crcTable;

        my $len = length($$str);

        my $crc = 0xFFFFFFFF;
        for (my $i = 0; $i < $len; $i++) {
            # Get upper case value of each letter
            my $int = ord uc substr $$str, $i, 1;
            $crc = (($crc >> 8) & 0x00FFFFFF) ^ $crcTable[ ($crc ^ $int) & 0xFF ];
        }
        #return sprintf "%X", $crc; # SwissProt format
        return $crc;
    }
}
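
A minimal sketch of how a routine like this might be called and sanity-checked. To keep it self-contained it uses a bitwise stand-in, crc32_bitwise() (a hypothetical name, not part of the code above), which is meant to follow the same steps as crc32(): same polynomial, same 0xFFFFFFFF starting value, input upper-cased, no final XOR. The sequence is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical bitwise equivalent of the table-driven crc32() above:
# same polynomial, same initial value, no final XOR.
sub crc32_bitwise {
    my( $str ) = @_;

    die "Argument to crc32_bitwise() must be ref to scalar"
        unless ref($str) eq 'SCALAR';

    my $crc = 0xFFFFFFFF;
    foreach my $int (unpack 'C*', uc $$str) {
        $crc ^= $int;
        # Process the byte one bit at a time, low bit first
        for (1..8) {
            $crc = ($crc & 1) ? ($crc >> 1) ^ 0xEDB88320 : $crc >> 1;
        }
    }
    return $crc;
}

# Made-up protein sequence; case does not matter
my $seq = 'acdefghiklmnpqrstvwy';
printf "%X\n", crc32_bitwise(\$seq);

# The usual CRC check string
my $check = '123456789';
printf "%08X\n", crc32_bitwise(\$check);               # 340BC6D9
printf "%08X\n", crc32_bitwise(\$check) ^ 0xFFFFFFFF;  # CBF43926
```

CBF43926 is the usual check value of a conventional CRC32 for "123456789", so at least in this sketch the result looks like a conventional CRC32 without the final XOR, rather than a bit-reversed one.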

On Mon, 31 Jul 2000, L.Pollak wrote:

> (Ewan: Sorry, i wanted to send this also to the list)
> 
> I have 2 questions about the SQ Line in swissprot:
> 
> does anyone know about CRC calculating used there?
> do i really have to add a CRC to the SQ line ??
> 
> does anyone know why the molecular weight in the "SQ" line
> from the "roa1.swiss" samplefile is so different from what i get 
> by using Bio::Tools::SeqStats ??
> (file says: 38715, from SeqStats: 45333)
> 
> kind regards,
> lorenz

James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge                        Tel: 01223 494906
CB10 1SA                         Fax: 01223 494919