[Bioperl-l] IUPAC code similarity

Fri Sep 17 15:04:28 UTC 2010

Hi Shalabh,

The expand method in Bio::Tools::SeqPattern may be useful to convert 
IUPAC codes to regular expressions:

$ perl -e 'use Bio::Tools::SeqPattern; print
Bio::Tools::SeqPattern->new(-seq=>"VGSRVBSSSSSNSC", -type=>"DNA")->expand'
[ACG]G[GC][AG][ACG][CGT][GC][GC][GC][GC][GC].[GC]C
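
Once you have the expanded pattern, you can anchor it and scan your
sequences for full-length matches. A minimal sketch, assuming the pattern is
copied by hand from the output above; the %db hash and its contents are made
up for illustration and stand in for however you load your database:

```perl
use strict;
use warnings;

# Pattern copied from the expand() output above.
my $pattern = '[ACG]G[GC][AG][ACG][CGT][GC][GC][GC][GC][GC].[GC]C';

# Hypothetical database: id => sequence (illustrative data only).
my %db = (
    seq1 => 'AGGAACGGGGGTGC',
    seq2 => 'TGGAACGGGGGTGC',
    seq3 => 'CGCGGTCCCCCACC',
);

# Anchor with \A and \z so the whole sequence must match, not a substring.
my $re = qr/\A$pattern\z/i;

# Collect the ids whose sequence matches over its full length.
my @hits = sort grep { $db{$_} =~ $re } keys %db;

print "$_\t$db{$_}\n" for @hits;
```

Drop the anchors if you want to find the pattern anywhere inside longer
sequences instead.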

However, that won't work if your database also contains ambiguity codes,
since a class such as [GC] will not match a literal S. For a non-BioPerl
solution you could try fuzznuc from EMBOSS.
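
If both the query and the database contain ambiguity codes, one option might
be to compare the sequences position by position. The sketch below is plain
Perl, not a BioPerl API: the function name iupac_compatible is made up, and it
assumes that "matching" means the two codes at each position share at least
one possible base, which may or may not be the definition you want:

```perl
use strict;
use warnings;

# Standard IUPAC nucleotide codes and the bases each one stands for.
my %IUPAC = (
    A => 'A',   C => 'C',   G => 'G',   T => 'T',   U => 'T',
    R => 'AG',  Y => 'CT',  S => 'CG',  W => 'AT',  K => 'GT',  M => 'AC',
    B => 'CGT', D => 'AGT', H => 'ACT', V => 'ACG', N => 'ACGT',
);

# Hypothetical helper: true if the two sequences have the same length and,
# at every position, the two codes share at least one possible base.
sub iupac_compatible {
    my ($x, $y) = map { uc $_ } @_;
    return 0 unless length($x) == length($y);
    for my $i (0 .. length($x) - 1) {
        my $set_x = $IUPAC{ substr($x, $i, 1) };
        my $set_y = $IUPAC{ substr($y, $i, 1) };
        # Unknown characters (gaps etc.) never match in this sketch.
        return 0 unless defined $set_x && defined $set_y;
        # Does any base allowed by y also appear among the bases allowed by x?
        return 0 unless $set_x =~ /[$set_y]/;
    }
    return 1;
}

print iupac_compatible('VGSRVBSSSSSNSC', 'AGSAACGGGGGTGC')
    ? "match\n" : "no match\n";
```

If you only want entries whose codes are literally identical to the query, a
plain string comparison with eq is enough.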

Cheers.
Roy.

On 17/09/2010 15:28, Aaron Mackey wrote:
> Convert the IUPAC code to a regular expression, and use regular expressions
> (in Perl or grep or similar) to find 100% identical matches.
>
> -Aaron
>
> On Thu, Sep 16, 2010 at 5:38 PM, shalabh sharma
> <shalabh.sharma7 at gmail.com> wrote:
>
>> Hi All,
>>       I have a few nucleotide sequences that are composed of IUPAC codes, like:
>> >test
>> VGSRVBSSSSSNSC
>>
>> Similarly, I have a database made up of these kinds of sequences. I want to
>> find sequences that are 100% similar to the query sequence.
>>
>> Is there any BioPerl module to deal with this? I tried normal BLAST but it
>> didn't work.
>> Do I have to convert these sequences to 4-base codes, or is there any other
>> way?
>>
>> Thanks
>> Shalabh
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l