[Bioperl-l] IUPAC support for DNA alignment

Alexie Papanicolaou apapanicolaou at ice.mpg.de
Wed Jul 2 20:11:31 UTC 2008


Hello

I agree if it is not too much trouble for you.

I think that if users put Xs in DNA sequences, they will be masking 
characters (I'm foolish enough to do it), so perhaps X = 0? Does anyone 
else use Xs?

I don't know about Ns either. They could be unknown characters (in which 
case the full score version could be 0 as well) or really mean that all 
four nucleotides are equally likely. Would it be too much trouble to allow 
users to set the X and N scores manually (since the score is the same 
whether they align with an A, T, C or G)?
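
To make the suggestion concrete, here is a rough sketch (in Python rather than 
Perl, and not BioPerl code) of the two scoring modes from the quoted message 
below, plus a per-symbol override for X and N. The function name pair_score, 
the overrides parameter and the rule for two ambiguous symbols (independent, 
uniform draws from each set) are my own assumptions for illustration; only 
the match/mismatch values and the example pairs come from this thread.

```python
from fractions import Fraction

# Standard IUPAC nucleotide codes mapped to the bases they may stand for.
# U is treated as T, as assumed in the quoted message. X is deliberately
# absent: it is not an IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pair_score(a, b, mode="full", match=3, mismatch=-1, overrides=None):
    """Score one aligned pair of symbols (hypothetical helper).

    mode="full": full match score if the two symbols share any base.
    mode="probabilistic": expected score, assuming each symbol is an
    independent, uniform draw from the bases it can stand for.
    overrides: optional dict such as {"X": 0, "N": 0}; any pair that
    contains an overridden symbol gets that fixed score.
    """
    a, b = a.upper(), b.upper()
    overrides = overrides or {}
    for sym in (a, b):
        if sym in overrides:
            return Fraction(overrides[sym])
    if a not in IUPAC or b not in IUPAC:
        # Unknown symbols (e.g. X) count as a mismatch unless overridden.
        return Fraction(mismatch)
    bases_a, bases_b = set(IUPAC[a]), set(IUPAC[b])
    shared = bases_a & bases_b
    if mode == "full":
        return Fraction(match if shared else mismatch)
    if mode == "probabilistic":
        # Probability that the two draws are the same base.
        p = Fraction(len(shared), len(bases_a) * len(bases_b))
        return match * p + mismatch * (1 - p)
    raise ValueError("mode must be 'full' or 'probabilistic'")
```

With these assumptions, pair_score("A", "D", mode="probabilistic") gives 1/3 
and pair_score("A", "N", mode="probabilistic") gives 0, as in the quoted 
list, while pair_score("A", "X", overrides={"X": 0}) gives the X = 0 
behaviour I suggested above.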

ta
a


Yee Man Chan wrote:
> Hi guys
>
> 	What about providing two switches; one for full score and one for
> probabilistic score?
>
> Assume match is +3 and mismatch -1
>
> Full score version:
> 1) T - U = +3 (I assume U is the same as T for alignment purposes, right?)
> 2) A - W = +3
> 3) A - D = +3
> 4) A - N = +3
> 5) A - X = -1 (not so sure about this one)
>
> Probabilistic score version:
> 1) T - U = +3
> 2) A - W = +3/2-1/2 = +1
> 3) A - D = +3/3-1*2/3 = +1/3
> 4) A - N = +3/4-1*3/4 = 0
> 5) A - X = -1
>
> What do you think?
>
> Yee Man
>
> On Fri, 27 Jun 2008 aaron.j.mackey at gsk.com wrote:
>
>   
>> You could replicate what they do here with EST_GENOME (re-engineered to
>> accept ambiguity codes):
>>
>>   http://www.genome.org/cgi/content/short/17/2/212
>>
>> But I think the answer is user-dependent -- some might want the "full
>> score" (as in the above case), others might want the "(probabilistically)
>> averaged score", etc.  So, let the scoring matrix be subclass-able (or
>> mix-able), so that users can specify the exact desired behavior via a
>> handful of predefined (and useful) behaviors.
>>
>> -Aaron
>>

-- 
"You can't find a hermit to teach you herming, because of course that rather spoils the whole thing."

    -- (Terry Pratchett, Small Gods)

Alexie Papanicolaou
Department of Entomology,
Max Planck Institute for Chemical Ecology,
Hans-Knoell-Strasse 8,
D-07745 Jena, Germany.




