[Bioperl-l] alignable portion of a genome

Mon May 11 09:31:43 UTC 2009

Hi,

I am looking for a good, fast way to calculate the alignable portion of a
genome (not human), given a reference sequence.
By alignable portion I mean all the positions of the genome that can be
covered uniquely by 36 bp reads, allowing up to 2 mismatches.

Some people have advised me to do this in Perl using the strategy below, but
I am not a Perl user, so if someone already has a script for this, I would be
grateful:

"you could approach it by walking along the genome in a sliding window of
36 nt, and hash the frequency of each 36 nt sequence that you encounter.
Then count how many of the 36 nt sequences had a frequency of exactly
one. Divide this by the total number of 36nt windows visited. This
should be do-able in about 20 lines of Perl."
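
For reference, here is a minimal sketch of the strategy quoted above, written
in Python rather than Perl. The function name unique_window_fraction and its
parameters are made up for illustration. It assumes the sequence is already
loaded into memory as one string. It counts exact 36-mers on the forward
strand only, so it does not account for the 2 allowed mismatches or for the
reverse complement, and it is only a rough approximation of the uniquely
alignable fraction.

```python
from collections import Counter


def unique_window_fraction(seq, k=36):
    """Fraction of k-nt sliding windows whose sequence occurs exactly once.

    Exact matches on the forward strand only (an assumption of this sketch):
    mismatches and the reverse complement are not considered.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return 0.0
    # Hash the frequency of every k-mer seen while sliding along the sequence.
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    # A k-mer with frequency 1 corresponds to exactly one unique window.
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_unique / n_windows


# Toy example with k=4: windows are ACGT, CGTA, GTAC, TACG, ACGT,
# so 3 of the 5 windows are unique.
print(unique_window_fraction("ACGTACGT", k=4))  # 0.6
```

Note that hashing every 36-mer of a large genome can need a lot of memory, so
this is probably only practical as written for small genomes or per chromosome.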

Best regards and thanks in advance
