[Bioperl-l] RE: SeqIO fails on masked sequences

Nathan Haigh nathanhaigh at ukonline.co.uk
Mon Jan 10 03:50:02 EST 2005


> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp at gmx.net]
> Sent: 10 January 2005 08:14
> To: Wes Barris
> Cc: nathanhaigh at ukonline.co.uk; 'Bioperl list'; 'Brian Osborne'
> Subject: Re: [Bioperl-l] RE: SeqIO fails on masked sequences
> 
> 
> On Sunday, January 9, 2005, at 05:05  PM, Wes Barris wrote:
> 
> >>> Hilmar Lapp wrote:
> >>>
> >>>> You should not require by default that all sequences in one file be of
> >>>> the same type (alphabet). We never have required this, nor documented
> >>>> that it is a (not enforced) requirement, and so there may be people out
> >>>> there relying on this 'feature'.
> >>>
> >>> Mixing both DNA and protein sequences in one file and then attempting
> >>> to process it seems like kind of a bizarre thing to want to do.  If
> >>> the alphabet is explicitly specified, isn't there a way to make that
> >>> take precedence?
> >> Why are you then able to set the alphabet of a SeqIO object if
> >> whenever you call next_seq() it tries to guess the alphabet of the
> >> sequence anyway? It seems more logical to me that the user can
> >> specify the alphabet without worrying about bioperl guessing it and
> >> getting it wrong, or not setting it at all.
> >
> > I am guessing that you meant to direct this question to Hilmar because
> > I agree with you.  If one specifies the alphabet, bioperl should not
> > subsequently try to guess it.
> 
> Right, that's what I agree with too. If an alphabet set for the stream
> gets reset to undef after every sequence then I'd call that a bug.
> 

agreed :o)

> My point was, if the user doesn't specify the alphabet, then don't make
> assumptions that you don't absolutely have to make. You had suggested
> to guess the alphabet from the first sequence in this case and then
> assume every subsequent sequence in that stream will have that same
> alphabet. That's what I think is not a good idea and not necessary
> either. If the user doesn't preset the alphabet, just keep on guessing
> for every new sequence.
> 

Hmm, yes, I think the former was what I had suggested, but I soon realised this wasn't a good idea and forgot to correct myself later.
I'll hopefully get this fix ready today.
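
To check we mean the same thing, here is a rough sketch of the behaviour as I understand it. It is written in Python purely for illustration and is not the Bio::SeqIO code; read_seqs, guess_alphabet and the 85% threshold are all made-up names and numbers, not anything in bioperl.

```python
from typing import Iterable, Iterator, Optional, Tuple

NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_alphabet(seq: str) -> Optional[str]:
    """Crude, hypothetical guess: mostly ACGTUN means nucleotide, else protein."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        return None  # nothing to guess from
    nucleotide_fraction = sum(c in NUCLEOTIDE_LETTERS for c in letters) / len(letters)
    if nucleotide_fraction >= 0.85:  # assumed threshold, for illustration only
        return "rna" if "U" in letters and "T" not in letters else "dna"
    return "protein"


def read_seqs(
    records: Iterable[Tuple[str, str]],
    alphabet: Optional[str] = None,
) -> Iterator[Tuple[str, str, Optional[str]]]:
    """Yield (id, sequence, alphabet) for each record in the stream."""
    for seq_id, seq in records:
        # A preset alphabet is never reset or overridden between records.
        # Without one, guess afresh for every record rather than reusing
        # the guess made for the first record.
        yield seq_id, seq, (alphabet if alphabet is not None else guess_alphabet(seq))
```

With no alphabet given, a DNA record followed by a protein record in the same stream comes out as 'dna' then 'protein'. With alphabet="dna" preset, every record is reported as 'dna', including a fully masked one, because no guessing happens at all.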

Nath

> Mixing alphabets is indeed bizarre but people who do bizarre things are
> everywhere.
> 
> 	-hilmar
> 
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
> 
> 
