[Bioperl-l] K-mer generating script

Chris Fields cjfields at illinois.edu
Mon Jan 5 21:45:33 UTC 2009


Perl 6 (20 random 20-mers):

use v6;

say [~] <A T G C>.pick(20, :repl) for 1..20;

-chris
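
For readers on Perl 5, here is a minimal sketch of the same idea: draw each base uniformly, with replacement, from the alphabet. The helper name `random_kmer` and its optional alphabet argument are assumptions made for illustration and do not come from the thread. Like the one-liner above, this samples random k-mers; it does not enumerate all 4^k of them, which is what the original question asked for. (As far as I know, later Raku releases spell sampling with replacement as `.roll(20)` rather than `.pick(20, :repl)`.)

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build one random k-mer by
# drawing k bases uniformly, with replacement, from the alphabet.
sub random_kmer {
    my ($k, @alphabet) = @_;
    @alphabet = qw(A T G C) unless @alphabet;   # default to DNA bases
    # rand(@alphabet) yields a float in [0, size); the array index truncates it.
    return join '', map { $alphabet[ rand @alphabet ] } 1 .. $k;
}

# 20 random 20-mers, one per line
print random_kmer(20), "\n" for 1 .. 20;
```

Passing a different alphabet, for example `random_kmer(10, qw(A U G C))`, would give random RNA 10-mers under the same assumptions.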

On Jan 5, 2009, at 2:43 PM, Smithies, Russell wrote:

> Yet another way with recursive use of map:
>
> print "[", join(", ", @$_), "]\n" for
> permute([ qw( A T G C ) ],[ qw( A T G C ) ],[ qw( A T G C ) ], 
> [ qw( A T G C ) ]);
>
> sub permute {
>  my $last = pop @_;
>  unless (@_) {
>    return map [$_], @$last;
>  }
>  return map { my $left = $_; map [@$left, $_], @$last } permute(@_);
> }
>
>
> (Modified from a PerlMonks example: http://perlmonks.org/?node_id=24270)
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Cook, Malcolm
>> Sent: Tuesday, 6 January 2009 8:15 a.m.
>> To: 'Chris Fields'; 'Jason Stajich'
>> Cc: 'bioperl list'; 'Mark A. Jensen'; Blanchette, Marco
>> Subject: Re: [Bioperl-l] K-mer generating script
>>
>> Gang,
>>
>> I couldn't resist adding the following non-perl solution...
>>
>> #!/bin/bash
>> k=$1
>> s=$( printf "%${k}s" ); # a string with $k blanks
>> s=${s// /{A,T,G,C\}};   # substitute '{A,T,G,C}' for each of the k blanks
>> echo 'kmers using bash to expand:' $s > /dev/stderr
>> bash -c "echo  $s";     # let brace expansion in a child bash compute the cross product
>>
>> -- Malcolm
>>
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>> Chris Fields
>>> Sent: Friday, December 19, 2008 11:54 PM
>>> To: Jason Stajich
>>> Cc: bioperl list; Mark A. Jensen; Blanchette, Marco
>>> Subject: Re: [Bioperl-l] K-mer generating script
>>>
>>> To add to the pile:
>>>
>>> Mark-Jason Dominus tackles this problem in Higher-Order Perl
>>> using iterators, which also allows other nifty bits like
>>> 'give variants of A(CTG)T(TGA)', where anything in
>>> parentheses is a wildcard.  The nice advantage of the
>>> iterator approach is that you don't tank memory for long strings.
>>> Furthermore, as a bonus, you can now download the book for
>>> free:
>>>
>>> http://hop.perl.plover.com/book/
>>>
>>> The relevant chapter is here (p. 135):
>>>
>>> http://hop.perl.plover.com/book/pdf/04Iterators.pdf
>>>
>>> chris
>>>
>>> On Dec 19, 2008, at 11:02 PM, Jason Stajich wrote:
>>>
>>>> Does someone want to put this on the wiki too?
>>>>
>>>> Maybe we could start a little collection of Perl snippets for
>>>> examples like these.
>>>>
>>>> -j
>>>> On Dec 19, 2008, at 7:45 PM, Mark A. Jensen wrote:
>>>>
>>>>> A little sloppy, but it recurses and is general---
>>>>>
>>>>> # ex...
>>>>> @combs = doit(3, [ qw( A T G C ) ]);
>>>>> 1;
>>>>> # code
>>>>>
>>>>> sub doit {
>>>>>   my ($n, $sym) = @_;
>>>>>   my $a = [];
>>>>>   doit_guts($n, $sym, $a, '');
>>>>>   return map {$_ || ()} @$a;
>>>>> }
>>>>>
>>>>> sub doit_guts {
>>>>>   my ($n, $sym, $store, $str)  = @_;
>>>>>   if (!$n) {
>>>>>     return $str;
>>>>>   }
>>>>>   else {
>>>>>     foreach my $s (@$sym) {
>>>>>       push @$store, doit_guts($n-1, $sym, $store, $str.$s);
>>>>>     }
>>>>>   }
>>>>> }
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Blanchette, Marco" <MAB at stowers-institute.org>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Friday, December 19, 2008 6:25 PM
>>>>> Subject: [Bioperl-l] K-mer generating script
>>>>>
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Does anyone have a little function that I could use to generate all
>>>>>> possible k-mer DNA sequences? For instance, all possible 3-mers (AAA,
>>>>>> AAT, AAC, AAG, etc.). I need something where I could input the
>>>>>> value of k and get all possible sequences...
>>>>>>
>>>>>> I know that it's a problem that needs recursive programming,
>>>>>> but I can't get my brain around the problem.
>>>>>>
>>>>>> Many thanks
>>>>>>
>>>>>> Marco
>>>>>> --
>>>>>> Marco Blanchette, Ph.D.
>>>>>> Assistant Investigator
>>>>>> Stowers Institute for Medical Research
>>>>>> 1000 East 50th St.
>>>>>> Kansas City, MO 64110
>>>>>>
>>>>>> Tel: 816-926-4071
>>>>>> Cell: 816-726-8419
>>>>>> Fax: 816-926-2018
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>>>
>>>>
>>>> Jason Stajich
>>>> jason at bioperl.org
>
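
The iterator idea credited to Higher-Order Perl earlier in the thread can be sketched in Perl 5 as a closure that advances like an odometer, so only one k-mer is held in memory at a time. This is not code from the book or from any poster; the name `kmer_iterator`, its optional alphabet argument, and the convention of returning undef when exhausted are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread or from Higher-Order Perl):
# returns a closure that yields one k-mer per call and undef when exhausted.
sub kmer_iterator {
    my ($k, @alphabet) = @_;
    @alphabet = qw(A T G C) unless @alphabet;   # default to DNA bases
    my @idx  = (0) x $k;     # one "odometer wheel" per position
    my $done = $k < 1;       # nothing to generate for k < 1

    return sub {
        return undef if $done;
        my $kmer = join '', @alphabet[@idx];

        # Advance the odometer: bump the rightmost wheel, carrying leftwards.
        my $pos = $k - 1;
        while ($pos >= 0 && ++$idx[$pos] == @alphabet) {
            $idx[$pos] = 0;
            $pos--;
        }
        $done = 1 if $pos < 0;   # every wheel rolled over: sequence finished
        return $kmer;
    };
}

# Print all 3-mers, one per line, without building the full list first.
my $next = kmer_iterator(3);
while (defined(my $kmer = $next->())) {
    print "$kmer\n";
}
```

Compared with the recursive versions quoted above, which return the complete list, this trades a little bookkeeping for constant memory use, which matters once k grows and 4^k strings no longer fit comfortably in RAM.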



