[Bioperl-l] string comparision mismatches and matches

Mark A. Jensen maj at fortinbras.us
Thu Feb 11 08:43:37 EST 2010


Perfectly described, Torsten. Yes, I confess a certain pride in this hack....
Roopa reports that it sped up her script 3X. cheers MAJ
----- Original Message ----- 
From: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Roopa Raghuveer" <rtbio.2009 at gmail.com>; <bioperl-l at lists.open-bio.org>
Sent: Thursday, February 11, 2010 6:52 AM
Subject: Re: [Bioperl-l] string comparision mismatches and matches


>> $in = 'ACCTCCTCCTCGAGTATGTG';
>> $tgt = 'TATCTTGCGCCGGAGATAAT';
>> $mask = pack("A*",$in)^pack("A*",$tgt);
>> $matches = $mask =~ tr/\x0//;
> 
> Impressive! Not often you see pack() let alone exclusive-or with a
> scalar context tr// thrown in for good measure!
> 
> For those who don't follow what it is doing, here is my (possibly
> wrong) interpretation: The pack() converts each of the two (equal
> length) strings into a byte string. A bit-wise exclusive-or (XOR) is
> performed between these two byte strings. This creates bytes of value
> zero (0) where they were the same, and non-zero where they were
> different. The tr// then counts how many of the bytes were zero (\x0
> is the NUL byte, value 0, not the ASCII character '0').
> 
> I'll just assume it is more efficient than for/substr/eq :-)
> 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA
> 
>
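
For anyone who wants to try the trick quoted above, here is a minimal
self-contained sketch. The sub names count_matches and count_matches_loop
are hypothetical (they do not appear in the thread), and the loop version
is only my guess at what a plain for/substr/eq baseline would look like.
The sketch drops pack(), on the assumption that the inputs are already
plain strings, and stringifies both operands so that ^ performs a string
XOR rather than a numeric one.

```perl
use strict;
use warnings;

# Hypothetical helper: count the positions at which two equal-length
# strings hold the same byte, using string XOR plus a counting tr///.
sub count_matches {
    my ($in, $tgt) = @_;
    die "strings must be the same length\n"
        unless length($in) == length($tgt);

    # Quoting forces string context, so ^ XORs the strings byte by byte.
    # Identical bytes give "\0"; differing bytes give a non-zero byte.
    my $mask = "$in" ^ "$tgt";

    # With an empty replacement list and no /d, tr/// changes nothing
    # and simply returns how many NUL bytes it found.
    return ($mask =~ tr/\0//);
}

# Hypothetical baseline: the character-by-character comparison,
# kept here only to cross-check the XOR version.
sub count_matches_loop {
    my ($in, $tgt) = @_;
    my $n = 0;
    for my $i (0 .. length($in) - 1) {
        $n++ if substr($in, $i, 1) eq substr($tgt, $i, 1);
    }
    return $n;
}

my $in  = 'ACCTCCTCCTCGAGTATGTG';
my $tgt = 'TATCTTGCGCCGGAGATAAT';

my $matches    = count_matches($in, $tgt);
my $mismatches = length($in) - $matches;
print "matches=$matches mismatches=$mismatches\n";
```

The mismatch count falls out as the string length minus the number of
NUL bytes in the mask. If the inputs can differ in length, the length
check would need to be replaced with whatever alignment rule suits
the data.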

