[Bioperl-l] Counting Homopolymer regions

Abhishek Pratap abhishek.vit at gmail.com
Mon Jan 12 14:06:13 EST 2009


Hi Heikki

Thanks for a quick reply.

Just wondering: what happens if there are multiple homopolymeric regions in a
sequence/contig?

Thanks,
-Abhi

On Mon, Jan 12, 2009 at 8:33 AM, Heikki Lehvaslaiho <
heikki.lehvaslaiho at gmail.com> wrote:

> If you can load the sequence strings into memory, I'd use a regular
> expression to detect the homopolymers and then use the pos function to
> find the location of hits:
>
>
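> use strict;
> use warnings;
>
> my $s   = "AGGGGGGGAAAAACGATCGGGGGGGTGTGGGGGCCCCCGTG";
> my $min = 4;    # minimum run length to report
>
> # With /g in a while loop, each pass resumes after the previous match
> while ( $s =~ /(A{$min,}|T{$min,}|G{$min,}|C{$min,})/g ) {
>    my $end   = pos($s);                 # offset just past the match = 1-based end
>    my $start = $end - length($1) + 1;   # 1-based start of the run
>    print "$start, $end, $1\n";
> }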
>
>
>   -Heikki
>
> 2009/1/9 Abhishek Pratap <abhishek.vit at gmail.com>:
> > Hello All
> >
> >
> > Is there a quick way to find the homopolymer stretches in the contigs and
> > also report their base start and end positions?
> >
> > Thanks,
> > -Abhi
> >
> > --
> > -----------------------------
> > Abhishek Pratap
> > Bioinformatics Software Engineer
> > Institute for Genome Sciences
> > School of Medicine, Univ of Maryland
> > 801, W. Baltimore Street, Baltimore, MD 21209
> > Ph: (+1)-410-706-2296
> > www.igs.umaryland.edu/
> >
> > Chair
> > RSG-Worldwide
> > ISCB-Student Council
> > http://iscbsc.org/rsg
> >
> > www.bioinfosolutions.com
> >
>
>
>
> --
>   -Heikki
> Heikki Lehvaslaiho - heikki lehvaslaiho gmail com
> http://kapkaupunki.blogspot.com/
>
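
On the multiple-regions question: because the match runs with /g inside a
while loop, each pass resumes where the previous match ended, so every run in
the string should be visited in turn. A minimal sketch that wraps the quoted
loop in a subroutine follows. The name find_homopolymers is hypothetical (it
is not a BioPerl function), and the sketch assumes upper-case A/C/G/T input
that fits in memory.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns one [start, end, run]
# triple per homopolymer run of at least $min bases (1-based, inclusive).
sub find_homopolymers {
    my ($seq, $min) = @_;
    my @hits;
    # /g in scalar context continues from the end of the previous match,
    # so the loop collects every run, not just the first one.
    while ( $seq =~ /(A{$min,}|T{$min,}|G{$min,}|C{$min,})/g ) {
        my $end   = pos($seq);               # offset just past the match
        my $start = $end - length($1) + 1;   # 1-based start of the run
        push @hits, [ $start, $end, $1 ];
    }
    return @hits;
}

# Usage with the sequence from the quoted message
my $s = "AGGGGGGGAAAAACGATCGGGGGGGTGTGGGGGCCCCCGTG";
for my $hit ( find_homopolymers($s, 4) ) {
    my ($start, $end, $run) = @$hit;
    print "$start, $end, $run\n";
}
```

Returning a list of triples rather than printing inside the loop makes it
easy to call the same routine once per contig and tag each hit with the
contig name.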




