[BioRuby] restriction enzyme module
Toshiaki Katayama
ktym at hgc.jp
Fri Apr 6 04:55:14 UTC 2007
Trevor,
I changed the AA#to_re method to mirror the new NA#to_re:
  def to_re(seq)
    replace = {
      'B' => '[DNB]',
      'Z' => '[EQZ]',
      'J' => '[ILJ]',
      'X' => '[ACDEFGHIKLMNPQRSTVWYUOX]',
    }
    replace.default = '.'
    str = seq.to_s.upcase
    str.gsub!(/[^ACDEFGHIKLMNPQRSTVWYUO]/) { |aa|
      replace[aa]
    }
    Regexp.new(str)
  end
I also added a default value of '.' to catch unexpected characters in the sequence.
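As a quick check of the method above, the following sketch copies it into a standalone script and matches a few peptides against the resulting pattern. The sample sequences are made up for illustration, and calling to_re as a top-level method (rather than through BioRuby) is an assumption made only to keep the script self-contained.

```ruby
# Standalone copy of the to_re method from this message (not loaded from BioRuby).
def to_re(seq)
  replace = {
    'B' => '[DNB]',
    'Z' => '[EQZ]',
    'J' => '[ILJ]',
    'X' => '[ACDEFGHIKLMNPQRSTVWYUOX]',
  }
  replace.default = '.'   # any other unexpected character becomes a wildcard
  str = seq.to_s.upcase
  str.gsub!(/[^ACDEFGHIKLMNPQRSTVWYUO]/) { |aa| replace[aa] }
  Regexp.new(str)
end

re = to_re("mbzj")          # hypothetical sample sequence
puts re.source              # M[DNB][EQZ][ILJ]
puts re.match?("MDEL")      # true
puts re.match?("MAEL")      # false: A is not covered by B
puts to_re("A-C").source    # A.C
```

Each ambiguity code keeps itself in its character class, so a sequence that literally contains B, Z, J or X still matches its own pattern.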
There are two possibilities:
1. replace unrecognized characters with '.' (current implementation)
2. leave them as they are (previous implementation)
I think option 1 is better when the sequence contains gap symbols such as '-',
which may have a different meaning in a regexp.
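To illustrate the difference between the two options, here is a minimal sketch. The helper pattern_source and its keep_unknown flag are hypothetical names used only for this comparison, and the sequence is made up. Note that a bare '-' is literal outside a character class in a Ruby regexp, so the example uses '*', which acts as a quantifier.

```ruby
# Hypothetical helper for comparing the two options; not part of BioRuby.
# keep_unknown = false: option 1, unknown characters become '.'
# keep_unknown = true:  option 2, unknown characters are left as they are
def pattern_source(seq, keep_unknown)
  ambiguous = {
    'B' => '[DNB]',
    'Z' => '[EQZ]',
    'J' => '[ILJ]',
    'X' => '[ACDEFGHIKLMNPQRSTVWYUOX]',
  }
  seq.to_s.upcase.gsub(/[^ACDEFGHIKLMNPQRSTVWYUO]/) do |aa|
    ambiguous.fetch(aa) { keep_unknown ? aa : '.' }
  end
end

seq = "AC*D"  # made-up sequence; '*' stands in for a non-residue symbol

opt1 = Regexp.new(pattern_source(seq, false))  # /AC.D/
opt2 = Regexp.new(pattern_source(seq, true))   # /AC*D/, where '*' quantifies C

puts opt1.match?("ACWD")  # true
puts opt1.match?("AD")    # false
puts opt2.match?("AD")    # true: "C*" also matches zero C's
puts opt2.match?("AC*D")  # false: the original sequence no longer matches
```

With option 2 the pattern silently changes meaning, which is the kind of surprise the '.' default avoids.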
Toshiaki
On 2007/04/06, at 8:21, Trevor Wennblom wrote:
>
> On Apr 5, 2007, at 9:55 AM, Toshiaki Katayama wrote:
>
>>
>> I'll forward the patch I sent to you.
>> Do you think this is applicable?
>
>
> Looks good to me, patch applied. Now how about taking care of AA:
>
>   def to_re(seq)
>     str = seq.to_s.upcase
>     str.gsub!(/[^BZACDEFGHIKLMNPQRSTVWYU]/, ".")
>     str.gsub!("B", "[DN]")
>     str.gsub!("Z", "[EQ]")
>     Regexp.new(str)
>   end
>
> What if we changed it to:
>
>   def to_re(seq)
>     str = seq.to_s.upcase
>     str.gsub!(/[^BZACDEFGHIKLMNPQRSTVWYU]/, ".")
>     str.gsub!("B", "[BDN]")
>     str.gsub!("Z", "[ZEQ]")
>     str.gsub!("J", "[JIL]")
>     Regexp.new(str)
>   end
>
> Note I've added Xle (J) to the list.
>
>
> Trevor