[Biopython-dev] Determining if seq alphabet is protein/dna/rna
Frank
fkauff at duke.edu
Mon Oct 30 00:48:39 UTC 2006
Hi all,
On Mon, 2006-10-30 at 00:13 +0000, Peter (BioPython Dev) wrote:
> Hello all,
>
> I've been looking at writing multiple sequence alignments in Nexus
> format for the new Bio.SeqIO code, and came up with the following little
> problem:
>
> Given one or more Seq objects, how can I reliably decide if they are
> protein, DNA, or RNA?
>
> (These are the relevant choices in a Nexus file's format datatype=...
> header.)
>
> I'm resigned to the fact that if the Seq object has the generic alphabet
> this boils down to looking at the sequence strings and making an
> educated guess (probably following an established algorithm from an
> alignment program). Does any such code already exist in BioPython?
>
I'm not aware of any such code. An educated guess would be easy, though:
more or less, ACGTNX only means DNA, ACGUNX only means RNA, and
everything else is protein. With NEXUS it becomes tricky, as a dataset
could potentially be partitioned into a mix of all types, and there is
no "official" way to indicate this in the datatype= option.
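
A minimal sketch of that guess, in case it is useful as a starting point.
The function name guess_datatype is made up for illustration, and skipping
the characters "-", "?" and "." as gap/missing symbols is my own
assumption, not something taken from an existing alignment program:

```python
def guess_datatype(sequences):
    """Guess 'dna', 'rna' or 'protein' from the letters used in the sequences.

    Hypothetical helper, not part of Biopython. It only looks at the
    characters, so a short protein made up of A, C, G, T, N and X alone
    would be reported as DNA.
    """
    dna_letters = set("ACGTNX")
    rna_letters = set("ACGUNX")
    ignored = set("-?.")  # assumed gap / missing-data symbols

    # Collect every distinct letter used across all sequences.
    letters = set()
    for seq in sequences:
        letters.update(str(seq).upper())
    letters -= ignored

    if not letters:
        raise ValueError("no residues to inspect")
    if letters <= dna_letters:
        return "dna"
    if letters <= rna_letters:
        return "rna"
    # "Everything else": anything outside the two nucleotide sets.
    return "protein"
```

Note that sequences using only A, C, G, N and X fit both sets and come
out as DNA here, and that this returns a single answer for the whole
dataset, so it does nothing for the mixed-partition case.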
Frank
> However - is there a nice/official way to ask an alphabet object what it
> is (protein, DNA, RNA)?
>
> Looking over the code in Bio.Alphabet the only thing I can think of is
> to get the class name as a string and search it(!) We can't look at the
> letters property as this is None for the base classes like ProteinAlphabet.
>
> If we are prepared to meddle with the alphabet system we might add
> attributes like "isProtein", "isNucleotide", "isRNA", "isDNA" to these
> base classes. Or simply have a "sequence_type" method, which the
> subclasses can re-define as required.
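
To make the "sequence_type" idea concrete, here is a rough sketch using
stand-in classes. These are NOT the real Bio.Alphabet classes; the class
layout and the return values ("protein", "dna", ...) are assumptions
chosen only to show how subclasses could re-define the method:

```python
class Alphabet:
    """Stand-in for a generic alphabet base class (hypothetical)."""

    def sequence_type(self):
        # Generic alphabet: the type is unknown.
        return None


class ProteinAlphabet(Alphabet):
    def sequence_type(self):
        return "protein"


class NucleotideAlphabet(Alphabet):
    def sequence_type(self):
        return "nucleotide"


class DNAAlphabet(NucleotideAlphabet):
    def sequence_type(self):
        return "dna"


class RNAAlphabet(NucleotideAlphabet):
    def sequence_type(self):
        return "rna"


class ExtendedProteinAlphabet(ProteinAlphabet):
    """A more specific alphabet simply inherits the answer."""
```

With something like this, a writer could call
alphabet.sequence_type() and fall back to guessing from the sequence
strings only when it returns None.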
>
> (I wasn't meaning to reopen the whole "do we need alphabets"
> conversation last discussed in July 2006. At least, not yet...)
>
> Peter
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev