[Bioperl-l] Fasta Genome Splice

Lincoln Stein lstein at cshl.edu
Fri Feb 13 04:29:50 EST 2004


There is actually a one-liner for this.  You can find it in Jim
Tisdall's "Beginning Perl for Bioinformatics" book, which I strongly
recommend to anyone who wants to do basic bioinformatics tasks
without learning Bioperl.
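
For reference, a minimal sketch of the usual reverse-then-translate
idiom, wrapped in a sub.  This is not quoted from the book; the name
revcom is only illustrative, and the input is assumed to be plain
sequence text with any FASTA header lines and newlines already removed.

```perl
use strict;
use warnings;

# Reverse-complement a DNA string: reverse it, then swap each base
# for its partner with tr///.  Characters not listed in the tr///
# (for example N) are left unchanged.
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;          # scalar assignment, so the string is reversed
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # complement, preserving case
    return $rc;
}

print revcom('ATGCCn'), "\n";       # prints nGGCAT
```

Doing the work on one string with tr/// avoids building a
one-element-per-base array, which matters for a whole genome.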

Lincoln

On Thursday 12 February 2004 10:29 pm, Ryan Kuykendall wrote:
> I'm sure there is a Perl module for generating the reverse
> complement of a whole genome, but assuming you wanted to write the
> code from scratch:
>
> ## ...and assuming your genome file has been turned into an array
> ## of bases called @listOfBases;
>
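> my $baseComplementMap =
> {
>  'a' => 't',
>  'c' => 'g',
>  'g' => 'c',
>  't' => 'a',
> };
>
> my @baseComplementList = ();
>
> ## walk the bases in reverse order so the result is the reverse
> ## complement, not just the complement
> foreach my $base ( reverse @listOfBases )
> {
>     ## bases missing from the map (e.g. 'n') pass through unchanged
>     my $complementBase = $baseComplementMap->{$base} || $base;
>     push( @baseComplementList, $complementBase );
> }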
>
> That would do it...
>
> ============================================================
> Ryan Kuykendall
> ryank at drizzle.com
>
> http://undef.com/ryank/ryanAtBawa50percent.JPG
> ============================================================
>
> On Thu, 12 Feb 2004, David Clark wrote:
> > Good point.  What I need is two fasta files: one with the ORF
> > regions masked, and one with the non-ORF regions masked.  There
> > was another thing I wanted to do that I didn't mention before:
> > how can I generate the reverse complement of a whole genome file?
> >
> > On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote:
> > > You want these as a fasta file per orf and per non-orf region
> > > or just 2 datasets with the genome masked (all N's or
> > > lowercased)?
> > >
> > > -jason
> > >
> > > On Thu, 12 Feb 2004, David Clark wrote:
> > >> Hello,
> > >>
> > >> I'm a relative newcomer to bioperl, and would like a point in
> > >> the right direction.  I need to separate the yeast genome into
> > >> two partial genomes--one with all ORFs, and one with everything
> > >> else.  I have a tab-delimited list of the ORFs with the
> > >> coordinates, and can probably parse that myself, but I wanted
> > >> to see if anyone could point me to some example code, or give
> > >> me some place to start in separating genomes based on the
> > >> coordinates.
> > >>
> > >> Thanks,
> > >>
> > >> David Clark
> > >> dfclark at neo.tamu.edu
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
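
On the masking question quoted above (one copy with the ORFs masked,
one with everything else masked), here is a rough sketch rather than a
tested recipe.  The sub name mask_orfs and the coordinate convention
(1-based, inclusive, with start and end swapped for minus-strand
ORFs) are assumptions; check them against the actual ORF table before
relying on this.

```perl
use strict;
use warnings;

# Build two masked copies of one sequence.
#   $seq  : the chromosome as a single string (no header, no newlines)
#   $orfs : array ref of [start, end] pairs, assumed 1-based inclusive
# Returns (sequence with ORFs as N, sequence with non-ORF regions as N).
sub mask_orfs {
    my ($seq, $orfs) = @_;
    my $orf_masked    = $seq;                   # ORFs get overwritten with N
    my $nonorf_masked = 'N' x length($seq);     # start fully masked, copy ORFs in
    for my $orf (@$orfs) {
        my ($start, $end) = @$orf;
        ($start, $end) = ($end, $start) if $start > $end;   # minus-strand rows
        my $len = $end - $start + 1;
        substr($orf_masked,    $start - 1, $len) = 'N' x $len;
        substr($nonorf_masked, $start - 1, $len) = substr($seq, $start - 1, $len);
    }
    return ($orf_masked, $nonorf_masked);
}

my ($no_orfs, $only_orfs) = mask_orfs('AAACCCGGGTTT', [ [4, 6], [12, 10] ]);
print "$no_orfs\n";      # prints AAANNNGGGNNN
print "$only_orfs\n";    # prints NNNCCCNNNTTT
```

Coordinates that run past the end of the sequence will make substr
die, so validate the table first.  Lowercasing instead of writing N
works the same way with lc() on the extracted piece.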

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

