[Bioperl-l] A question about the perl code

Wed Aug 17 16:30:39 EDT 2005

> -----Original Message-----
> From: Johan Viklund [mailto:johan.viklund at gmail.com] 
> On 8/16/05, Alex Zhang <mayagao1999 at yahoo.com> wrote:
> > Dear all,
> > 
> > I made a group A which includes 16 combinations of any
> > two nucleotides like: AA,AC,AG,AT,
> > CA,CC,CG,CT,
> > GA,GC,GG,GT,
> > TA,TC,TG,TT
> > 
> > If  I randomly got a pair like AC, I want to exclude
> > AC, AT, AG, AA, TC, CC, GC. In other words, I want to
> > exclude the pairs in group A which has the same
> > nucleotide with the pair randomly selected. Can
>
> Hi,
> 
> If you have all the pairs in an array, say @nucleotide_pairs, and the
> pair you randomly selected in the scalar $pair this will work:
> 
> @selected_pairs = grep { not /[$pair]/ } @nucleotide_pairs;

I don't think that's true.  The above excludes anything with an A or C in
either position. (Btw, I used @pairs, not @nucleotide_pairs, for brevity.)

>perl -le 'foreach $i (qw(A C G T)) {foreach $j (qw (A C G T)) { push
@pairs, "$i$j"}} print join " ", @pairs'
AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT
>perl -le 'foreach $i (qw(A C G T)) {foreach $j (qw (A C G T)) { push
@pairs, "$i$j"}} $pair = "AC"; @selected_pairs = grep { not /[$pair]/ }
@pairs; print join " ", @selected_pairs'
GG GT TG TT
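
The reason is that $pair is interpolated into the pattern, so /[$pair]/
becomes the character class /[AC]/, which matches any string containing an
A or a C in any position. A minimal sketch of that, assuming $pair holds
"AC" (the names $re and $candidate are only for illustration):

```perl
use strict;
use warnings;

my $pair = "AC";
my $re   = qr/[$pair]/;    # interpolates to the character class [AC]

for my $candidate (qw(AG GC GT)) {
    # AG matches on its A, GC matches on its C, GT contains neither letter
    printf "%s %s\n", $candidate, ($candidate =~ $re ? "excluded" : "kept");
}
```

So GC gets thrown out even though it has no A in position 0 and no C in
position 1.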

I believe the requirement is that it can't have an A in position 0 or a C in
position 1. One way to do it (not a particularly pretty way):

>perl -le 'foreach $i (qw(A C G T)) {foreach $j (qw (A C G T)) { push
@pairs, "$i$j"}} ($n1, $n2) = split //, "AC"; @selected_pairs = grep {
/[^$n1][^$n2]/ } @pairs; print join " ", @selected_pairs'
CA CG CT GA GG GT TA TG TT

The easiest way might really just be something like "grep {substr($_, 0, 1)
ne substr($pair, 0, 1) && substr($_, 1, 1) ne substr($pair, 1, 1)}
@nucleotide_pairs", using "ne" because these are string comparisons.
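
Written out as a small script, the substr approach might look like the
sketch below. It assumes string comparison with "ne" (Perl's "!=" compares
numerically, and every nucleotide letter is 0 as a number), and the helper
name filter_pairs is made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the pairs that differ from $pair
# at position 0 AND at position 1.
sub filter_pairs {
    my ($pair, @pairs) = @_;
    my $n1 = substr($pair, 0, 1);
    my $n2 = substr($pair, 1, 1);
    return grep {
        substr($_, 0, 1) ne $n1 && substr($_, 1, 1) ne $n2
    } @pairs;
}

# Build all 16 two-nucleotide combinations.
my @pairs;
for my $i (qw(A C G T)) {
    for my $j (qw(A C G T)) {
        push @pairs, "$i$j";
    }
}

print join(" ", filter_pairs("AC", @pairs)), "\n";
# prints: CA CG CT GA GG GT TA TG TT
```

That gives the same nine pairs as the regex version, without building a
pattern from the selected pair.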

-Amir Karger