[BioPython] The count method of a Seq (or MutableSeq) object

Noel O'Boyle baoilleach at gmail.com
Thu Mar 5 15:23:42 UTC 2009


2009/3/5 Peter <biopython at maubp.freeserve.co.uk>:
> On Thu, Mar 5, 2009 at 2:49 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>>
>>
>> I vote (b).
>> Another option is to continue to use count() for a Python-style count,
>> and to add a new method that does an overlapping-type count. For this
>> new method we'd need a clear but short name, and I can't think of
>> anything now.
>>
>> --Michiel.
>
> Did you like plan (c), which preserves the Python string style
> (non-overlapping) count as the default but offers the overlapping
> count via an optional argument?
>
> i.e.
> >>> from Bio.Seq import Seq
> >>> nuc = Seq("AAAA")
> >>> nuc.count("AA") #default is non-overlapping
> 2
> >>> nuc.count("AA", overlap=True)
> 3
> >>> nuc.count("AA", overlap=False)
> 2
>
> Peter

I think we are arguing here over which should be the default value.
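
To make the two candidate behaviours concrete, here is a minimal sketch
using plain Python strings rather than Seq objects. Note that
count_overlapping is a hypothetical helper name made up for this
illustration; it is not an existing Biopython function, and the
overlap= argument quoted above is only a proposal.

```python
def count_overlapping(seq, sub):
    """Count occurrences of sub in seq, allowing matches to overlap.

    Hypothetical helper for illustration only; not part of Biopython.
    """
    if not sub:
        raise ValueError("sub must be a non-empty string")
    count = 0
    start = seq.find(sub)
    while start != -1:
        count += 1
        # Advance by one position (not len(sub)) so overlapping hits are found.
        start = seq.find(sub, start + 1)
    return count


seq = "AAAA"
non_overlapping = seq.count("AA")           # Python's built-in str.count
overlapping = count_overlapping(seq, "AA")  # matches at positions 0, 1 and 2
print(non_overlapping, overlapping)         # prints: 2 3
```

The only difference between the two is how far the search advances after
each hit: by the length of the match (str.count) or by a single position.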

Several people here believe that behaviour analogous to Python's
string.count will reduce bug reports and user confusion. However,
no-one except Leighton has been able to come up with a single use case
where the current behaviour is useful (and even that example, with
respect, was flimsy). So we end up with a method which adheres
magnificently to the principle of least surprise, but which is of no
use to users. Aren't you trying to provide methods which are useful
for biological analysis? Isn't that the purpose of wrapping the string
in the first place?

Noel (getting far too excited over painting this bikeshed)


