[Bioperl-l] Pattern search with gap

Tamas Horvath hotafin at gmail.com
Wed Jun 22 10:13:31 EDT 2005


Here's some much simpler code:

#!/usr/bin/perl

#                        10        20        30        40   45   50        60        7072
#               123456789012345678901234567890123456789012345678901234567890123456789012345678901
my $seqstring ="--------------------CAAAATAAATAGGTTATACAGAAACA---------------------AGATAAAAATTACA";
my $qseq    = "CAAGATA";

my @qqq = split (//,$qseq);
my $pat = join('-*', @qqq);

my $pat_rege = qr/$pat/;

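# Stop here if there is no match; otherwise $` and $& below would be stale or undefined.
$seqstring =~ /$pat_rege/ or die "Pattern not found\n";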
my $before = $`;
my $match_seq = $&;
my $before_length = length $before;
my $mseq_length = length $match_seq;
my $start = 1 + $before_length;
my $end = $before_length + $mseq_length;
print "Start:$start End:$end\n";
#Start:45 End:72

It should be quite fast. Try it out and let me know if it works well for you!
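
If you need every occurrence rather than just the first, the same idea can be wrapped in a small function. This is only a sketch: the name find_gapped and its optional gap-character argument are made up for illustration, and it reads the offsets from Perl's built-in @- and @+ arrays instead of $` and $&. It also runs quotemeta over each query character, on the assumption that the query should be matched literally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): returns a list of
# [start, end] pairs, 1-based and inclusive, counted on the gapped sequence.
sub find_gapped {
    my ($seq, $query, $gap) = @_;
    $gap = '-' unless defined $gap;
    return unless length $query;    # nothing to search for

    # Allow any run of gap characters between consecutive query residues.
    my $sep = quotemeta($gap) . '*';
    my $pat = join $sep, map { quotemeta($_) } split //, $query;
    my $re  = qr/$pat/;

    my @hits;
    while ($seq =~ /$re/g) {
        # $-[0] is the 0-based offset of the match start,
        # $+[0] is the offset just past the match end.
        push @hits, [ $-[0] + 1, $+[0] ];
        # Resume one character after this start so overlapping hits are kept.
        pos($seq) = $-[0] + 1;
    }
    return @hits;
}

my $seqstring = "--------------------CAAAATAAATAGGTTATACAGAAACA---------------------AGATAAAAATTACA";
for my $hit (find_gapped($seqstring, "CAAGATA")) {
    print "Start:$hit->[0] End:$hit->[1]\n";
}
```

For the sequence above this prints the same single hit, Start:45 End:72, and it prints nothing when the pattern is absent. On Perl versions before 5.20, mentioning $` or $& anywhere in a program can slow down every regex match, so @- and @+ may be the safer choice for very large inputs.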

Hota

On 6/22/05, khoueiry <khoueiry at ibdm.univ-mrs.fr> wrote:
> Hello,
> 
> I want to parse a gapped sequence and search for a pattern in it. What
> is important for me is to get the start and end positions of the pattern,
> taking gaps into account:
> 
> i.e :
> my $seqstring =
> "--------------------CAAAATAAATAGGTTATACAGAAACA---------------------AGATAAAAATTACA";
> my $qseq    = "CAAGATA";
> 
> so the result should give me : start  = 61 and End  = 89
> 
> I wrote a program to do that. It works well, but with very
> large sequences (and I have a lot of them) it takes a lot of time.
> 
> In fact, my program parses the sequence with a sliding window equal to the
> length of the pattern.
> 
> the while loop is attached :
> 
> Any suggestion will be appreciated....
> 
> 
> Pierre
> 
> 
> 
> --
> ==========================
> Pierre Khoueiry
> LGPD/IBDM
> Campus de Luminy, Case 907
> 13288 Marseille cedex 9, France
> Tel : +33 (0)4 91 82 94 18
> Fax : +33 (0)4 91 82 06 82
> 
> ==========================