[Biocorba-l] Identifiers was Re: SeqFeature -> get_Primary_Seq (fwd)

Matthew Pocock mrp@sanger.ac.uk
Thu, 15 Feb 2001 10:38:06 +0000


> For example, a sequence with accession X12345 (sequence version 2) in EMBL
> release 37, would have the identifier:
> 
> 	Identifier = "EMBL.37/X12345.2";
> 
> Without versioning information on DB or sequence (which is assumed to
> imply latest versions):
> 
> 	Identifier = "EMBL/X12345";
> 
> You may also imply a local or default database for the sequence:
> 
> 	Identifier = "./X12345";
> 
> You may also specify just the database (with or without version):
> 
> 	Identifier = "EMBL.37";
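
To make the four quoted forms concrete, here is a minimal parsing sketch in Python. The names SeqId and parse_identifier are hypothetical, not part of any existing BioCorba interface, and the sketch assumes that neither a database name nor an accession contains a "." or "/" of its own.

```python
from typing import NamedTuple, Optional


class SeqId(NamedTuple):
    """Hypothetical container for the pieces of an identifier."""
    database: Optional[str]     # None means the local/default database (".")
    db_version: Optional[str]   # None means "latest release"
    accession: Optional[str]    # None when only a database is named
    seq_version: Optional[str]  # None means "latest sequence version"


def parse_identifier(text: str) -> SeqId:
    # Split "DB[.release]/ACCESSION[.version]" at the first slash.
    db_part, _, acc_part = text.partition("/")
    if db_part == ".":
        # "./X12345" refers to the local or default database.
        database, db_version = None, None
    else:
        database, _, db_version = db_part.partition(".")
    # Assumption: the accession itself never contains a dot.
    accession, _, seq_version = acc_part.partition(".")
    # Empty strings become None so that "missing" is unambiguous.
    return SeqId(database or None, db_version or None,
                 accession or None, seq_version or None)
```

Versions are kept as strings here because nothing in the proposal says they must be numeric.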

Looks like an improvement to me - especially if we tacked a prefix onto 
it, like:

urn://seqdb/EMBL.37/X12345.2

Then, I think it becomes relatively painless to write resolvers for 
these things - perhaps as a part of BioEnv? You can palm each layer in 
the urn off to a different resolver - seqdb is resolved by the master 
registry of resolvers, EMBL by the seqdb resolver and X12345 by the EMBL 
resolver. Also, a sequence doesn't have to carry a reference to its DB 
around.
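
A rough sketch of that layering, in Python. Everything here is an assumption made for illustration: the class names, the resolve(segments, version) signature, and the in-memory dictionary standing in for EMBL are hypothetical, and the "urn://" prefix is simply taken as written above.

```python
class DelegatingResolver:
    """Looks up the first path segment and hands the rest to a child."""

    def __init__(self):
        self._children = {}

    def register(self, name, child):
        self._children[name] = child

    def resolve(self, segments, version=None):
        # 'version' is whatever the parent parsed off our own segment;
        # a pure dispatcher has no use for it.
        if not segments:
            raise LookupError("identifier ended before reaching a record")
        name, _, child_version = segments[0].partition(".")
        if name not in self._children:
            raise LookupError(f"no resolver registered for {name!r}")
        return self._children[name].resolve(segments[1:], child_version or None)


class InMemoryDbResolver:
    """Stand-in for a real database: {accession: {version: sequence}}."""

    def __init__(self, records):
        self._records = records

    def resolve(self, segments, release=None):
        # 'release' (e.g. "37") is ignored by this toy store.
        if not segments:
            raise LookupError("no accession given")
        accession, _, version = segments[0].partition(".")
        versions = self._records[accession]
        # A missing version is taken to mean the latest one.
        key = int(version) if version else max(versions)
        return versions[key]


def resolve_urn(master, urn):
    prefix = "urn://"
    if not urn.startswith(prefix):
        raise ValueError(f"not a urn:// identifier: {urn!r}")
    return master.resolve(urn[len(prefix):].split("/"))


# Wiring: master registry -> seqdb resolver -> EMBL resolver.
embl = InMemoryDbResolver({"X12345": {1: "ACGT", 2: "ACGTT"}})
seqdb = DelegatingResolver()
seqdb.register("EMBL", embl)
master = DelegatingResolver()
master.register("seqdb", seqdb)
```

With this wiring, resolve_urn(master, "urn://seqdb/EMBL.37/X12345.2") walks all three layers, and the returned sequence holds no reference to the database it came from.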

Anyway, step 1 - do we all think that a formal ID containing enough info 
to re-fetch the resource is a *good thing*, or is it potentially a cause 
of great hassle and lots of work?

Matthew