[Biopython-dev] Seq object join method

Peter biopython at maubp.freeserve.co.uk
Mon Nov 23 10:34:31 UTC 2009


On Fri, Nov 20, 2009 at 7:28 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
> Thoughts:
>
> 1. Why doesn't Alphabet._consensus_alphabet raise a
> TypeError("Incompatable alphabets") where _check_type_compatibility
> would fail, at least as an optional argument? Probably because it's a
> private function. Should it be a public function, with a friendlier
> interface?

It is a private function, and right now I don't recall my precise thinking.

The assorted private functions in Bio.Alphabet were added to factor
out commonly repeated actions for reuse (e.g. in the alignment code),
while preserving backwards compatibility where possible and fixing
bugs as needed.

I agree some of these are candidates for being made public, but this
is a lower priority for me. I am also not sure if functions are the best
way to do some of these tasks - Alphabet methods may be better.

> 2. This might cause massive compatibility problems now, but would it
> be better for Seq() to use an "unknown_alphabet" by default instead of
> "generic"? Then _consensus_alphabet could safely ignore those
> sequences with unspecified alphabets, and Seq.join wouldn't need that
> special case.

The base class generic alphabet *is* the "unknown alphabet".
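
As a rough illustration of that point, here is a toy sketch. The class
names and the is_unspecified helper are made up for the example and are
not the actual Bio.Alphabet code. The idea is that a base class can act
as the "unknown" alphabet because every specific alphabet derives from it:

```python
# Toy stand-ins for illustration only; these are NOT the Bio.Alphabet classes.
class Alphabet:
    """Base class: makes no claim about the letters, i.e. 'unknown'."""


class DNAAlphabet(Alphabet):
    """A specific alphabet."""


class ProteinAlphabet(Alphabet):
    """Another specific alphabet."""


def is_unspecified(alphabet):
    """Hypothetical helper: True only for an exact base-class instance."""
    # Subclasses carry real information, so they do not count as unknown.
    return type(alphabet) is Alphabet
```

Under this assumption there is no need for a separate "unknown" object:
anything that is exactly the base class already means "not specified".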

> 3. Alternately, how much code would break if _consensus_alphabet
> simply treated generic_alphabet as an unspecified sequence, and
> ignored it when calculating the consensus alphabet? This effect could
> be limited to just Seq.join by dropping the test that the sequence
> length is 0, but it might be useful to have the same behavior for
> addition.

I don't know specifically what would break, but that seems too
permissive to me. Seq("").join(...) is a special case because it
mirrors the Python "".join(...) idiom for concatenating a list of
strings.
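
To make the special case concrete, here is a minimal sketch. The Seq,
Alphabet and consensus names below are hypothetical stand-ins written
for this example, not the Biopython implementation. It assumes a strict
consensus check, and only skips the separator's alphabet when the
separator is empty:

```python
# Hypothetical minimal classes, for illustration only (not Biopython's API).
class Alphabet:
    """Generic (unspecified) alphabet."""


class DNAAlphabet(Alphabet):
    """A specific alphabet."""


def consensus(alphabets):
    """Return an alphabet shared by all inputs, or raise TypeError.

    Deliberately strict: a generic Alphabet only agrees with other
    generic Alphabets, it is not silently ignored.
    """
    kinds = {type(a) for a in alphabets}
    if not kinds:
        return Alphabet()
    if len(kinds) > 1:
        raise TypeError("Incompatible alphabets")
    return kinds.pop()()


class Seq:
    def __init__(self, data, alphabet=None):
        self.data = data
        self.alphabet = alphabet if alphabet is not None else Alphabet()

    def join(self, seqs):
        seqs = list(seqs)
        alphabets = [s.alphabet for s in seqs]
        # Special case: an empty separator adds no letters, so its
        # alphabet is left out of the consensus calculation.
        if len(self.data) > 0:
            alphabets.append(self.alphabet)
        joined = self.data.join(s.data for s in seqs)
        return Seq(joined, consensus(alphabets))
```

With these toy classes, Seq("").join(...) on a list of DNA sequences
keeps the DNA alphabet, while a non-empty generic separator such as
Seq("N") still raises TypeError, so the permissive behaviour is limited
to the "".join(...) style idiom.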

Peter


