[Bioperl-l] Bug in Contig.pm? How to compare two sequence objects?

Dan Bolser dan.bolser at gmail.com
Sun Jul 25 12:35:42 EDT 2010


Cheers for the clarification Robson.

How come the 'SYNOPSIS' code does produce warnings about replacing the
seq? (the workaround is easy enough, don't add_seq, but still...)

Since you said 'feel free to act', I have been faffing around here:
http://github.com/dbolser/bioperl-live

I'm not really sure if that is so useful, please advise.

Thanks again for help,
Dan.


P.S.
How come you are not in irc://irc.freenode.net/#bioperl ;-)


On 25 July 2010 15:42, Robson de Souza <robfsouza at gmail.com> wrote:
> Hi Dan,
>
> It has been a long time since I last looked at this but, if I remember
> correctly, the point is that Bio::
>
> On Sun, Jul 25, 2010 at 9:23 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
>> The following bug report boils down to this question:
>> How should two sequence objects be compared for identity? Does the
>> object override 'eq' or implement an 'identical' method?
>
> I think an 'identical' or 'equal' method would be the best alternative
> since having a full method call would allow passing arguments like
> '-mode => "complete"' to check all sequence features and annotations
> if they exist and '-mode => "basic"' to check id() and seq() values.
> Bio::Assembly::Contig depends mostly on the last one, although only
> id() is tracked most of the time (because of the internal hashes).
>
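A minimal sketch of the "basic" mode suggested above, comparing only id() and seq() values. The helper name seqs_identical and the My::Seq class are hypothetical stand-ins so the snippet runs without BioPerl installed; neither is part of the BioPerl API, and the "complete" mode (features and annotations) is deliberately not sketched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object; only id() and seq() matter here.
package My::Seq;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub id  { return $_[0]{id} }
sub seq { return $_[0]{seq} }

package main;

# Hypothetical helper (not a BioPerl method): in "basic" mode two sequence
# objects count as identical when both id() and seq() match as strings.
sub seqs_identical {
    my ($x, $y, %opt) = @_;
    my $mode = $opt{-mode} // 'basic';
    die "mode '$mode' is not sketched here\n" unless $mode eq 'basic';
    return ( $x->id eq $y->id && $x->seq eq $y->seq ) ? 1 : 0;
}

my $s1 = My::Seq->new( id => 'r1', seq => 'ACCG-T' );
my $s2 = My::Seq->new( id => 'r1', seq => 'ACCG-T' );   # same content, different object
my $s3 = My::Seq->new( id => 'r1', seq => 'ACCGTT' );   # same id, different sequence

print seqs_identical( $s1, $s2 ), "\n";                     # prints 1
print seqs_identical( $s1, $s3, -mode => 'basic' ), "\n";   # prints 0
```

A full method call like this leaves room for further named arguments later, which an overloaded 'eq' could not accept.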
>> I found the following apparent bug in Contig.pm while executing the
>> documented 'SYNOPSIS' code:
> [snip]
>> It seems to be a bug in the documented behaviour of set_seq_coord:
>>        "If the sequence was previously added using add_seq, its
>> coordinates are changed/set.  Otherwise, add_seq is called and the
>> sequence is added to the contig."
>
> In fact, it should not print warnings all the time....
>
>> The offending line in that function seems to be:
>>  if( ... &&
>>      ($seq ne $self->{'_elem'}{$seqID}{'_seq'}) ) {
>>          ... <spew warnings>
>>  }
>>  $self->add_seq($seq);
>> which compares the *passed* sequence object to the sequence string for
>> the *stored* sequence object of the same name. This comparison
>> always fails if I understood correctly, therefore set_seq_coord always
>> spews warnings if called after add_seq.
>
> Not the sequence string, but the objects themselves, i.e. the string
> perl uses to represent Bio::LocatableSeq objects... it is a memory
> based version of identical() :)
>
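To illustrate the point about 'ne' on objects: assuming the class overloads neither stringification nor the comparison operators, Perl compares the strings it generates for the two references, so the test is really "same object in memory?". The My::Seq class below is a hypothetical stand-in, not Bio::LocatableSeq itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in class with no operator overloading.
package My::Seq;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my $stored = My::Seq->new( id => 'r1', seq => 'ACCG-T' );
my $alias  = $stored;                                       # same object
my $copy   = My::Seq->new( id => 'r1', seq => 'ACCG-T' );   # equal content, new object

# An object in string context looks like "My::Seq=HASH(0x...)".
print "$stored\n";

# 'ne' compares those strings, i.e. the memory addresses, not id() or seq().
my $alias_differs = ( $stored ne $alias ) ? 1 : 0;
my $copy_differs  = ( $stored ne $copy )  ? 1 : 0;

print "alias: ", ( $alias_differs ? "differ" : "same" ), "\n";   # alias: same
print "copy:  ", ( $copy_differs  ? "differ" : "same" ), "\n";   # copy:  differ
```

Under that assumption, the check would only trigger for a different object that carries the same id, even when its content is equal.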
>> Out of curiosity, how come I can't just say:
>> my $ls = Bio::LocatableSeq->
>>  new( -seq      => 'ACCG-T',
>>       -id       => 'r1',
>>       -alphabet => 'dna',
>>       -start    => 3,
>>       -end      => 8,
>>       -strand   => 1
>>     );
>> $c->add_seq( $ls );
>
> Oh, I don't remember but it was either a bad design decision I made 8
> years ago to accommodate the Bio::Align::AlignI interface or a problem
> with Bio::SeqFeature::Collection at that time. Whatever the case, it
> would be nice to change it... you just need to create a
> Bio::SeqFeature::Generic when
> add_seq is called. I just won't have time to do it myself so feel free to act...
>
> Best,
> Robson
>
>> I hope the above report can be of some use.
>>
>> Sincerely,
>> Dan.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>



