[Bioperl-l] IdentifiableI and LSIDs: towards a better future

Gudmundur Arni Thorisson mummi at cshl.org
Mon Mar 10 16:54:28 EST 2003


A comment on LSIDs and IdentifiableI: the Haplotype Map project
(http://hapmap.cshl.org/) that I'm working on will make extensive use
of LSIDs, for everything from individual genotypes to SNPs and
haplotypes. I just noticed that IdentifiableI provides precisely the
kind of behaviour I'd be looking for in LSID-enabled thingies
(namespace(), authority() and friends). I think that IdentifiableI,
with its support for LSIDs, is a good step towards globally unique and
resolvable identifiers for biological objects. I'll post some
(hopefully helpful!) comments on this subject once I've actually tried
to use IdentifiableI along with other BioPerl modules in the project,
within a month or two.
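
For anyone who hasn't looked at LSIDs yet: an LSID is a URN of the form
urn:lsid:<authority>:<namespace>:<object>[:<revision>]. Below is a rough
sketch, in Python rather than Perl and not taken from BioPerl, of how one
might split such a string into the parts that namespace(), authority() and
friends would hand back. The names LSID and parse_lsid and the example
identifier are made up purely for illustration.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of an LSID (hypothetical container, not a BioPerl class)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    # Expect the 'urn' and 'lsid' prefix plus three or four further fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # The 'urn:lsid' prefix is matched case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    # No field after the prefix may be empty.
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Made-up identifier, only to show the call.
lsid = parse_lsid("urn:lsid:example.org:snp:12345")
print(lsid.authority, lsid.namespace, lsid.object_id)
```

This sketch only splits and sanity-checks the string; it does not try to
resolve the identifier against its authority.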


     Mummi, CSHL

On Thursday, Feb 27, 2003, at 14:25 America/New_York, Paul Edlefsen  
wrote:

> Ewan Birney wrote:
>
>>  Am I right in thinking that one of your classes is:
>>
>>
>> Uniquely-Identifiable-Object-For-This-Implementation-but-not-exportable-ids
>>
>> and the other one is
>>
>> Uniquely-Identifiable-Object-For-Planet-Bioinformatics-and-so-exportable/queryable-ids
>>
>> If I am right, what are your object names? If I am wrong... can you
>> enlighten me...?
>>
> Yes, that's right.  They are called (and this could change if the will  
> of the people desires it) LocallyIdentifiableI and  
> GloballyIdentifiableI.  I would have called LocallyIdentifiableI just  
> 'IdentifiableI', but that's taken, so this will do.  It just has a  
> 'unique_id' method, which *must return undef if the object cannot provide  
> a _unique_ identifier*.  The goal is to have something that the  
> programmer can use instead of memory references to identify objects  
> that are presently in use in a program.  So a SeqFeature's (or a  
> RelRange's, etc) seq_id might be the unique_id of a sequence.  If  
> somebody is able to further guarantee that this unique_id is  
> For-Planet-Bioinformatics-unique, great.  That's where  
> GloballyIdentifiableI comes in.  My concern with the existing  
> IdentifiableI interface was that not all objects are globally  
> identifiable, but most are locally unique, so requiring that all  
> identifiable things be globally identifiable ensures that most things  
> won't implement IdentifiableI (or at least won't do so properly).
>
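
To make the proposed split concrete, here is a minimal sketch of the two
interfaces as described above. It is written in Python as a stand-in for
the Perl interfaces and is not BioPerl code: None plays the role of undef,
and the concrete classes LsidIdentified and ScratchSeq, their constructor
arguments, and the example ids are all assumptions made up for
illustration.

```python
from abc import ABC, abstractmethod
from typing import Optional


class LocallyIdentifiableI(ABC):
    """Has a unique_id that is unique at least within the running program."""

    @abstractmethod
    def unique_id(self) -> Optional[str]:
        """Return a unique id, or None if uniqueness cannot be guaranteed."""


class GloballyIdentifiableI(LocallyIdentifiableI):
    """Same single method; only the documented promise is stronger:
    the returned id can be used to look the object up anywhere."""


class LsidIdentified(GloballyIdentifiableI):
    """Hypothetical globally identifiable object whose id is an LSID string."""

    def __init__(self, authority: str, namespace: str, object_id: str):
        self.authority = authority
        self.namespace = namespace
        self.object_id = object_id

    def unique_id(self) -> Optional[str]:
        return f"urn:lsid:{self.authority}:{self.namespace}:{self.object_id}"


class ScratchSeq(LocallyIdentifiableI):
    """Hypothetical object that may or may not have a unique name."""

    def __init__(self, name: Optional[str] = None):
        self._name = name

    def unique_id(self) -> Optional[str]:
        return self._name  # None means "no unique id available"


# Index objects by unique_id instead of by memory reference,
# skipping any object that cannot provide one.
objects = [LsidIdentified("example.org", "seq", "42"), ScratchSeq("tmp1"), ScratchSeq()]
index = {o.unique_id(): o for o in objects if o.unique_id() is not None}
```

Code that only needs program-local identity can accept any
LocallyIdentifiableI; code that exports or queries ids can insist on
GloballyIdentifiableI.
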
> There are a couple of cans of worms that I don't want to open right now.
>
> One is what globally identifiable thing to use.  The existing  
> IdentifiableI uses LSIDs.  That's a fine thing and is one way in which  
> an object might be able to provide a global identifier.  The new  
> IdentifiableI ISA GloballyIdentifiableI and its unique_id method just  
> returns the LSID string.  GloballyIdentifiableI has only unique_id,  
> like LocallyIdentifiableI, but documents the assertion that *this*  
> unique_id will allow folks to look up the object in the  
> Planet-Bioinformatics realm.  One reason for keeping it that minimal  
> is that people might differ on their favorite type of global  
> identifier, or different objects might require different sorts.  
> Particulars stay out of these interfaces.
>
> Another is "shouldn't *everything* be globally identifiable?"  Yes, I  
> suppose everything should.  Again, I don't think that it really  
> matters to the point at hand, which is that many classes in BioPerl  
> presently have some field which is asserted by the documentation to be  
> unique. The name of this field is different in different classes, and  
> in some cases it gets confusing (I still don't understand the  
> situation in SeqFeatureI -- maybe I'm thick?).  One problem at a time,  
> I say; first we unify the existing notion of a unique identifier  
> (which is, most often, not-necessarily-global), then we allow people  
> to assert that some are indeed global, and then if the world community  
> unifies around some global identifiers then maybe one day all of our  
> objects will be GloballyIdentifiable.  That'd be awesome.
>
> :Paul
--
----------------------------------------------------
Gudmundur Arni Thorisson, B.Sc.
Haplotype Map project DCC group leader
Steinlab, Cold Spring Harbor Laboratory
w-phone#: 516-367-6904
w-fax#:   516-367-8389
1 Bungtown Road, Williams Bldg
Cold Spring Harbor
11724 New York
USA


