[Bioperl-l] Annotation-DBLink- version numbers repeating

Hilmar Lapp hlapp at gmx.net
Thu Oct 19 17:11:27 UTC 2006


Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept  
file (dbsource)"?

It clearly looks wrong to add the version. You probably had a reason  
for doing this at the time, but if we (you :) can't recover it, I  
guess it's best to just fix it to do the right thing (in both places,  
obviously).

	-hilmar
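
A minimal sketch of what "the right thing in both places" might look like, assuming the parser keeps the bare accession in primary_id and the writer reattaches the version when printing. The helper names versioned_accession and dbsource_line are hypothetical and not part of BioPerl; the DBSOURCE prefix is copied from the writer code quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): join a bare accession and
# a separately stored version into "accession.version".
sub versioned_accession {
    my ($primary_id, $version) = @_;
    # No version stored: return the accession unchanged.
    return $primary_id unless defined $version && length $version;
    return "$primary_id.$version";
}

# Hypothetical writer-side helper: build the DBSOURCE line from the two
# separate fields instead of assuming the id already carries the version.
sub dbsource_line {
    my ($primary_id, $version) = @_;
    return 'DBSOURCE    accession '
         . versioned_accession($primary_id, $version) . "\n";
}

print dbsource_line('AL591065', 17);
```

With this split, primary_id never contains the version, so any code that combines the two fields appends it exactly once.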

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so  
> it isn't so much a parsing error as a deliberate decision to append  
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>     $annotation->add_Annotation
>         ('dblink',
>          Bio::Annotation::DBLink->new
>              (-primary_id => $id . "." . $version,
>               -version    => $version,
>               -database   => $db,
>               -tagname    => 'dblink'));
>
> and the code to print the dblink back out in the writer already  
> assumes the version number is appended...
>
>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
>             # if ($ref->comment eq 'DBSOURCE') {
>             $self->_print('DBSOURCE    accession ',
>                           $ref->primary_id, "\n");
>             # }
>         }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>> 	|| '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either  
>> does nothing or replaces an empty string with an empty string), I  
>> don't see the potential for duplicating the version number here -  
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the  
>> beginning, rather than an erroneous mangling of version into  
>> primary_id later.
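
To make the interaction concrete, here is a small self-contained sketch. MockDBLink is a hypothetical stand-in with only the three accessors the overload uses; it is not the real Bio::Annotation::DBLink. The overload body is copied from the code quoted above, and the accession and version are taken from the output reported at the start of the thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink: just enough to run
# the quoted stringification overload without BioPerl installed.
package MockDBLink;

use overload '""' => sub {
    (($_[0]->database ? $_[0]->database . ':' : '' )
    . ($_[0]->primary_id ? $_[0]->primary_id : '')
    . ($_[0]->version ? '.' . $_[0]->version : ''))
    || '' };

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

# primary_id already carries the version, as when the parser appends it.
my $as_parsed = MockDBLink->new(
    database => 'GenBank', primary_id => 'AL591065.17', version => 17);

# primary_id holds the bare accession only.
my $unversioned = MockDBLink->new(
    database => 'GenBank', primary_id => 'AL591065', version => 17);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$unversioned\n";  # GenBank:AL591065.17
```

The overload appends the version once in both cases; the repetition only appears when the stored primary_id already ends in the version.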
>>
>> Is someone in the position to confirm this?
>>
>> 	-hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem you report, which comes from relying
>>> on the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer.  At a minimum we can just fix the
>>> auto-string-converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the primary_id
>>> field.  This will require some rejiggering in several modules when it
>>> comes to printing DBLinks, and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone put the version information in the primary_id. (I hope it
>>> wasn't me, because I don't think I'm going to remember why.)
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries.
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi = 56205924;
>>>>   $db = Bio::DB::GenBank->new(-format => "genbank");
>>>>   my $seqio = $db->get_Stream_by_id($gi);
>>>>   my $seq = $seqio->next_seq;
>>>>   my $ac = $seq->annotation();
>>>>   my @annotations = $ac->get_Annotations("dblink");
>>>>   for (@annotations) { print $_, "\n"; }
>>>>   print $INC{"Bio/Annotation/DBLink.pm"}, "\n";'
>>>>
>>>> This yields:
>>>>
>>>>    GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to be the place where this happens: it has a concatenation
>>>> which leads to that repeated version number.
>>>>
>>>> Is this something that I should fix "client-side", so to speak, or
>>>> is it worthwhile to add some logic to that concatenation to prevent
>>>> this?
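
For the "client-side" option, a rough sketch of the kind of guard logic being asked about. The function format_dblink is hypothetical and takes plain strings rather than a DBLink object. It assumes an id ending in ".<version>" already includes the version, which could be wrong for an accession that legitimately ends that way.

```perl
use strict;
use warnings;

# Hypothetical client-side helper: format "database:accession.version"
# without repeating a version that is already part of the id.
sub format_dblink {
    my ($database, $primary_id, $version) = @_;
    my $id = defined $primary_id ? $primary_id : '';
    if (defined $version && length $version) {
        # Append the version only if the id does not already end in it.
        $id .= ".$version" unless $id =~ /\.\Q$version\E\z/;
    }
    my $prefix = (defined $database && length $database) ? "$database:" : '';
    return $prefix . $id;
}

print format_dblink('GenBank', 'AL591065.17', 17), "\n";
print format_dblink('GenBank', 'AL591065', 17), "\n";
```

Both calls print GenBank:AL591065.17, whether or not the stored id already carries the version.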
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>








