[Biopython-dev] Determining if seq alphabet is protein/dna/rna

Frank fkauff at duke.edu
Mon Oct 30 00:48:39 UTC 2006


Hi all,

On Mon, 2006-10-30 at 00:13 +0000, Peter (BioPython Dev) wrote:
> Hello all,
> 
> I've been looking at writing multiple sequence alignments in Nexus 
> format for the new Bio.SeqIO code, and came up with the following little 
> problem:
> 
> Given one or more Seq objects, how can I reliably decide if they are 
> protein, DNA, or RNA?
> 
> (These are the relevant choices in a Nexus file's format datatype=... 
> header.)
> 
> I'm resigned to the fact that if the Seq object has the generic alphabet 
> this boils down to looking at the sequence strings and making an 
> educated guess (probably following an established algorithm from an 
> alignment program).  Does any such code already exist in BioPython?
> 
I'm not aware of any such code. An educated guess would be easy,
though: more or less ACGTNX only means DNA, ACGUNX only means RNA, and
everything else is protein(?). With NEXUS it becomes tricky, as a
dataset could potentially be partitioned into a mix of all types, and
there is no "official" way to indicate this in the datatype= option.
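
A minimal sketch of that guess in Python, not existing Biopython code.
The function name guess_datatype, the gap/missing characters it ignores,
and the strict (rather than "more or less") letter test are all
assumptions made for illustration:

```python
# Letter sets taken from the rule of thumb above.
DNA_LETTERS = frozenset("ACGTNX")
RNA_LETTERS = frozenset("ACGUNX")
# Assumption: these gap/missing-data symbols carry no type information.
GAP_CHARS = frozenset("-?.")


def guess_datatype(sequences):
    """Guess 'dna', 'rna' or 'protein' for a collection of sequences.

    Hypothetical helper: accepts anything whose str() is the sequence.
    """
    letters = set()
    for seq in sequences:
        letters.update(str(seq).upper())
    letters -= GAP_CHARS
    if not letters:
        raise ValueError("no residues to inspect")
    if letters <= DNA_LETTERS:
        return "dna"  # also the answer when neither T nor U occurs
    if letters <= RNA_LETTERS:
        return "rna"
    return "protein"  # "everything else"
```

Note that this makes one call for the whole dataset, so it does nothing
for the mixed-partition case, and a short peptide made up only of
A, C, G and T would be reported as DNA.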

Frank


> However - is there a nice/official way to ask an alphabet object what it 
> is (protein, DNA, RNA)?
> 
> Looking over the code in Bio.Alphabet the only thing I can think of is 
> to get the class name as a string and search it(!)  We can't look at the 
> letters property as this is None for the base classes like ProteinAlphabet.
> 
> If we are prepared to meddle with the alphabet system we might add 
> attributes like "isProtein", "isNucleotide", "isRNA", "isDNA" to these 
> base classes.  Or simply have a "sequence_type" method, which the 
> subclasses can re-define as required.
> 
> (I wasn't meaning to reopen the whole "do we need alphabets" 
> conversation last discussed in July 2006.  At least, not yet...)
> 
> Peter
> 
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
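
For what it's worth, the "sequence_type" idea quoted above could be
sketched roughly as follows. These are stand-in classes written for
illustration, not the actual Bio.Alphabet code; the return values and
the ExtendedProtein subclass are assumptions:

```python
class Alphabet:
    """Stand-in for a generic alphabet base class."""

    def sequence_type(self):
        # Generic alphabet: the type is unknown.
        return None


class ProteinAlphabet(Alphabet):
    def sequence_type(self):
        return "protein"


class NucleotideAlphabet(Alphabet):
    def sequence_type(self):
        return "nucleotide"


class DNAAlphabet(NucleotideAlphabet):
    def sequence_type(self):
        return "dna"


class RNAAlphabet(NucleotideAlphabet):
    def sequence_type(self):
        return "rna"


class ExtendedProtein(ProteinAlphabet):
    """Hypothetical subclass: inherits 'protein' without redefining it."""
```

A writer could then call seq_alphabet.sequence_type() and fall back to
guessing from the letters only when it returns None, with no need to
search the class name.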