[Bioperl-l] Finding possible primers regex

Sat Aug 2 20:05:37 UTC 2008

Hi there,
I'm trying to write a Perl script to scan an aligned multi-entry FASTA
file and find possible primers. So far I've produced a string which contains
the base wherever all sequences match and a * where they don't, e.g.
1) TTAGCCTAA
2) TTAGCAGAA
3) TTACCCTAA

would give TTA*C**AA.
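
For concreteness, the masking step described above can be sketched as follows. This is only a minimal illustration, not the script actually used: the helper name consensus_mask is hypothetical, and it assumes at least one sequence is given, that all sequences are already aligned to the same length, and that alignment gap characters are compared like any other character.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): build a mask from
# equal-length aligned sequences, keeping a base where every sequence
# agrees and writing '*' where they differ.
sub consensus_mask {
    my @seqs = @_;
    my $len  = length $seqs[0];
    my $mask = '';
    for my $i (0 .. $len - 1) {
        my $base = substr($seqs[0], $i, 1);
        # grep in scalar context counts the sequences that differ
        # from the first one at this column
        my $mismatches = grep { substr($_, $i, 1) ne $base } @seqs;
        $mask .= $mismatches ? '*' : $base;
    }
    return $mask;
}

print consensus_mask(qw(TTAGCCTAA TTAGCAGAA TTACCCTAA)), "\n";   # TTA*C**AA
```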

I want to parse this string and pull out all substrings which are 18-21 bp in
length and contain no more than 4 * characters.

So far, I've got this:

while ($fragment_match =~ /([GTAC*]{18,21})/g) {
    print "$1\n";
}

I was hoping this would match every fragment 18-21 characters in length.
However, even that doesn't work: it essentially chunks the string into
21-character blocks, rather than the overlapping windows I hoped for:
0-18
0-19
0-20
0-21
1-19
1-20
1-21
1-22

etc.
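
The chunking is consistent with how a //g match works: each new match resumes where the previous one ended, and {18,21} greedily takes 21 characters when it can, so overlapping windows are never tried. One way to make the desired enumeration concrete is to skip the regex and step through the string with substr. The following is a minimal sketch under stated assumptions, not a BioPerl facility: the helper name candidate_windows and the demo string are hypothetical, and start/end offsets are 0-based as in the list above.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): return [start, length, fragment]
# for every window of $min..$max characters that contains at most
# $max_stars '*' characters.
sub candidate_windows {
    my ($mask, $min, $max, $max_stars) = @_;
    my @hits;
    for my $start (0 .. length($mask) - $min) {
        for my $len ($min .. $max) {
            # stop once the window would run off the end of the string
            last if $start + $len > length $mask;
            my $frag  = substr($mask, $start, $len);
            # tr with an empty replacement list counts without modifying
            my $stars = ($frag =~ tr/*//);
            # a longer window from the same start cannot have fewer stars
            last if $stars > $max_stars;
            push @hits, [$start, $len, $frag];
        }
    }
    return @hits;
}

# Made-up demo string, for illustration only
my $fragment_match = 'TTA*C**AAGGCTTACGTTAGCCA';
for my $hit (candidate_windows($fragment_match, 18, 21, 4)) {
    my ($start, $len, $frag) = @$hit;
    printf "%d-%d %s\n", $start, $start + $len, $frag;
}
```

The inner loop can stop at the first window that exceeds the star limit because every longer window from the same start contains it as a prefix.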

Can anyone let me know if this is already possible in BioPerl, or how one
would go about it with a regex? Sadly I'm fairly new to Perl and still getting
to grips with BioPerl, so please treat me gently :).

Many thanks,

Ben
