[Biopython] Sequence object "find" is still case specific?

Peter Cock p.j.a.cock at googlemail.com
Mon Mar 4 10:35:12 UTC 2013


On Mon, Mar 4, 2013 at 3:01 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>> Hmm, well, lower case nucleotides have often represented
>> "masked regions" of sequences. It seems that Biopython
>> sequences were meant to be case-sensitive (e.g.,
>> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22).
>> From the documentation there, it seems like you've discovered
>> a bug in the API; I feel that Seq should raise a ValueError
>> when instantiating with lower-case nucleotides and unambiguous_dna.
>>
> I don't think that this is a bug. The difference between unambiguous
> and ambiguous DNA refers to the difference between ACGT and
> ACGTMRWSYKVHDBXN, where the nucleotides other than
> ACGT are ambiguous (for example, R = purine = either A or G).

That's part of the issue with the sequence objects not checking
the letters against the list specified in the alphabet object - and
arguably much more important than the case aspect.
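
To make the distinction concrete, here is a minimal sketch in plain
Python of the kind of check that is not being done. It does not use
Biopython at all: invalid_letters() and the two constants are
hypothetical names made up for this example, and the letter lists are
simply the ones quoted above.

```python
# Hypothetical helper, NOT part of Biopython: report which characters of a
# sequence string are missing from an explicit list of allowed letters.
UNAMBIGUOUS_DNA = "ACGT"
AMBIGUOUS_DNA = "ACGTMRWSYKVHDBXN"  # letters as quoted earlier in this thread


def invalid_letters(sequence, letters, ignore_case=False):
    """Return a sorted list of the characters in sequence not found in letters."""
    if ignore_case:
        # Fold both sides to upper case so "acgt" is treated like "ACGT".
        sequence = sequence.upper()
        letters = letters.upper()
    return sorted(set(sequence) - set(letters))


print(invalid_letters("ACGTRN", UNAMBIGUOUS_DNA))  # ['N', 'R']
print(invalid_letters("ACGTRN", AMBIGUOUS_DNA))    # []
print(invalid_letters("acgt", UNAMBIGUOUS_DNA))    # ['a', 'c', 'g', 't']
print(invalid_letters("acgt", UNAMBIGUOUS_DNA, ignore_case=True))  # []
```

The first two calls show the ambiguous versus unambiguous question, the
last two show the separate case question. Whether a real check should
raise an exception or just warn is a design decision this sketch does
not try to settle.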

Peter


