[Bioperl-l] RE: SeqIO fails on masked sequences

Hilmar Lapp hlapp at gmx.net
Mon Jan 10 03:13:35 EST 2005


On Sunday, January 9, 2005, at 05:05  PM, Wes Barris wrote:

>>> Hilmar Lapp wrote:
>>>
>>>> You should not require by default that all sequences in one file be
>>>> of the same type (alphabet). We never have required this, nor
>>>> documented that it is a (not enforced) requirement, and so there may
>>>> be people out there relying on this 'feature'.
>>>
>>> Mixing both DNA and protein sequences in one file and then attempting
>>> to process it seems like kind of a bizarre thing to want to do.  If
>>> the alphabet is explicitly specified, isn't there a way to make that
>>> take precedence?
>> Why are you then able to set the alphabet of a SeqIO object if 
>> whenever you call next_seq() it tries to guess the alphabet of the
>> sequence anyway? It seems more logical to me, that the user can 
>> specify the alphabet without worrying about bioperl guessing it, and
>> getting it wrong, or not setting it at all.
>
> I am guessing that you meant to direct this question to Hilmar because
> I agree with you.  If one specifies the alphabet, bioperl should not
> subsequently try to guess it.

Right, I agree with that too. If an alphabet set for the stream gets 
reset to undef after every sequence, then I'd call that a bug.

My point was: if the user doesn't specify the alphabet, then don't make 
assumptions that you don't absolutely have to make. You had suggested 
guessing the alphabet from the first sequence in this case and then 
assuming that every subsequent sequence in that stream has the same 
alphabet. I think that is neither a good idea nor necessary. If the 
user doesn't preset the alphabet, just keep on guessing for every new 
sequence.
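
To make the two cases concrete, here is a minimal sketch of the policy 
in Python. It is not BioPerl code: guess_alphabet(), read_seqs(), and 
the 85% threshold are hypothetical and only illustrate the idea. An 
alphabet preset by the user is applied to every sequence and never 
reset; without a preset, each sequence is guessed on its own.

```python
from typing import Iterable, Iterator, Optional, Tuple


def guess_alphabet(seq: str) -> str:
    """Crude, hypothetical heuristic: mostly A/C/G/T/N means 'dna'."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        return "unknown"
    nucleotide_like = sum(c in "ACGTN" for c in letters)
    # The 0.85 cutoff is an arbitrary assumption for this sketch.
    return "dna" if nucleotide_like / len(letters) >= 0.85 else "protein"


def read_seqs(
    seqs: Iterable[str], alphabet: Optional[str] = None
) -> Iterator[Tuple[str, str]]:
    """Yield (sequence, alphabet) pairs.

    If the caller preset an alphabet, it wins for every sequence in the
    stream. Otherwise the alphabet is guessed anew for each sequence,
    with no carry-over from the first one.
    """
    for seq in seqs:
        if alphabet is not None:
            yield seq, alphabet
        else:
            yield seq, guess_alphabet(seq)


# No preset: a mixed file is handled sequence by sequence.
mixed = list(read_seqs(["ACGTACGT", "MKVLAAGIW"]))

# Preset: a heavily masked sequence is never re-guessed.
masked = list(read_seqs(["XXXXXXXXAC", "ACGT"], alphabet="dna"))
```

With no preset, the mixed stream comes out as one 'dna' and one 
'protein' record. With the preset, the heavily masked sequence stays 
'dna' even though the guesser alone would have called it 'protein'.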

Mixing alphabets is indeed bizarre but people who do bizarre things are 
everywhere.

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------
