[Bioperl-l] Bug or special design of the 'length' method for Bio::Seq ?
cjfields at illinois.edu
Sun Jul 17 14:20:48 UTC 2011
length() is defined in BioPerl as 'Get the length of the sequence in number of symbols (bases or amino acids)'. We count '*' as a translated codon, and therefore as part of length(), for the reasons Peter mentions. You can also set the length of a 'virtual' sequence (one with no actual sequence data), but when sequence data is present length() is not supposed to lie: you can't just set it to an arbitrary value.
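As a rough illustration of the two ways of counting discussed in this thread, here is a minimal sketch using plain Python strings. It is not BioPerl or Biopython code: the helper names symbol_length and residue_length and the example sequences are made up for illustration, and the protein below is not the sequence from the original question.

```python
def symbol_length(seq: str) -> int:
    """Count every symbol, '*' included, the way a plain string length does."""
    return len(seq)


def residue_length(seq: str) -> int:
    """Count symbols but ignore a single trailing '*' (terminal stop), if any."""
    return len(seq) - 1 if seq.endswith("*") else len(seq)


# Hypothetical protein: 19 residues followed by a terminal stop symbol.
protein = "MKTAYIAKQRQISFVKSHF*"

# Hypothetical whole-sequence translation with an internal stop as well.
with_internal_stop = "MKT*AYI*"

if __name__ == "__main__":
    print(symbol_length(protein), residue_length(protein))
    print(symbol_length(with_internal_stop), residue_length(with_internal_stop))
```

The second sequence shows why "length without the stop" gets ambiguous once stops appear inside a translation: dropping only the trailing '*' still counts the internal one, whereas the symbol count needs no such rule.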
On Jul 17, 2011, at 8:40 AM, Peter Cock wrote:
> This is deliberately giving the length of the string (Biopython does the same).
> Have you considered what you would expect for this example sequence?
> i.e. Where you translate a whole sequence including all the stop
> It is a practical decision to give the length including the stop
> symbols, so that the sequence behaves like a Perl string.
> On 7/17/11, Tao Zhu <tzhu at mail.bnu.edu.cn> wrote:
>> Suppose a protein sequence like:
>> Do you think the length of such a sequence is 19 or 20? In my opinion, the
>> star "*" is only a terminal symbol of a protein sequence, so it
>> shouldn't be counted in the protein length. But in fact the "length"
>> method of Bio::Seq returns a length of 20.
>> Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing
>> 100875, China
>> Email: tzhu at mail.bnu.edu.cn
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org