[Bioperl-l] seq namespace method

Hilmar Lapp hlapp@gnf.org
Mon, 15 Jul 2002 14:38:31 -0700


Bio::IdentifiableI is certainly something that's worth introducing. However,

- it should be generic enough to serve as what the name implies, i.e., for any identifiable object and not only sequences. It should then probably implement some existing standard. I3C is an emerging one, and there are others (what about the BioMoby ID?).

- LSID is still evolving, and I wasn't planning on implementing a naming/identifier standard, let alone one that is neither finished nor widely accepted. And frankly, I don't even care a lot about it at this time.

- Hence, if someone wants to see I3C's LSID implemented in Bioperl I think that'd be really great and appreciated, but I'm definitely not the best person to take this on due to my aforementioned ignorance. Is anyone else volunteering to draft an interface and a simple implementation, so that I can take it from there?

- I need something that directly or easily maps to biodatabase in BioSQL. That sounds like $identifiable->authority() or $identifiable->namespace(). I think (= I want) that an authority should be allowed to have various other properties (think of an address, a URL, and whatnot), so returning a simple string doesn't necessarily suffice.

- Bio::IdentifiableI, according to the latest suggestions, is going to largely overlap with various methods in PrimarySeqI and DBLink. You don't want five different methods to check for a possible accession number. So, are we then going to collapse the accession(), display_id(), and primary_id() implementations into delegations of some sort to $seq->identifier()->identifier('accession') or whatever?
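
To make the last two points concrete, here is a minimal sketch of what such a delegation could look like. It is only an illustration under assumptions, not a proposed Bioperl API: the packages My::Authority, My::Identifier and My::Seq, the 'ExampleDB' name, and the example.org URL are all made up for this example. It only shows an authority that is an object rather than a plain string, and legacy accessors that forward to a single identifier object.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical authority object: more than a plain string, so it can
# carry additional properties such as a URL.
package My::Authority;
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{-name}, url => $args{-url} }, $class;
}
sub name { $_[0]->{name} }
sub url  { $_[0]->{url} }

# Hypothetical identifier object: one authority plus named identifiers.
package My::Identifier;
sub new {
    my ($class, %args) = @_;
    return bless { authority => $args{-authority}, ids => {} }, $class;
}
sub authority { $_[0]->{authority} }

# Get (or, with a defined second argument, set) one kind of identifier,
# e.g. identifier('accession').
sub identifier {
    my ($self, $type, $value) = @_;
    $self->{ids}{$type} = $value if defined $value;
    return $self->{ids}{$type};
}

# Stand-in for a sequence class. The legacy accessors hold no state of
# their own; each one delegates to the identifier object.
package My::Seq;
sub new {
    my ($class, %args) = @_;
    return bless { identifier => $args{-identifier} }, $class;
}
sub identifier { $_[0]->{identifier} }
sub accession  { $_[0]->identifier->identifier('accession',  $_[1]) }
sub display_id { $_[0]->identifier->identifier('display_id', $_[1]) }
sub primary_id { $_[0]->identifier->identifier('primary_id', $_[1]) }

package main;

my $auth = My::Authority->new(-name => 'ExampleDB',
                              -url  => 'http://example.org/');
my $id   = My::Identifier->new(-authority => $auth);
my $seq  = My::Seq->new(-identifier => $id);

$seq->accession('X00001');                              # set via legacy accessor
print $seq->accession(), "\n";                          # read via legacy accessor
print $seq->identifier->identifier('accession'), "\n";  # same value, direct route
print $seq->identifier->authority->name(), "\n";        # namespace/authority name
```

With this layout there is exactly one place where an accession lives, and the authority can grow further properties without changing the accessor signatures. Whether the real interface should look anything like this is of course open.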

Comments more than welcome.

	-hilmar

> -----Original Message-----
> From: Steve Chervitz [mailto:stevechervitz@yahoo.com]
> Sent: Monday, July 15, 2002 1:23 AM
> To: Hilmar Lapp; OBDA BioSQL (E-mail); BioPerl (E-mail)
> Subject: Re: [Bioperl-l] seq namespace method
> 
> 
> The namespace concept is useful, only I think that the correct place for it
> is at the level of the identifier, not the sequence object, because a
> namespace applies to a name, not a sequence.
> 
> Bioperl doesn't encapsulate sequence identifiers with objects, but doing so
> would help manage the sequence-identifier relationship and also make it
> easier to work with identifiers in general.
> 
> A Bio::Identifier object could have slots for: namespace, type, version, id,
> and perhaps, is_unique. It could have a method to stringify itself with a
> specified delimiter and with or without namespace/version/type info.
> 
> Identifiable objects such as sequences could have methods that return
> Bio::Identifiers, such as all_identifiers() and preferred_identifier().
> Perhaps there could be a Bio::IdentifiableI interface for this.
> 
> Identifiable objects would not have to store Bio::Identifiers internally.
> They could construct them on the fly (perhaps via an associated
> Bio::Factory:: object).
> 
> This would model the object-to-identifier relationship more generally than
> we do now. For example, PrimarySeqs can have display_id, primary_id, and
> accession_number, which I always find a bit confusing/limiting.
> 
> I know this is more of a substantial undertaking than just adding a
> namespace() method to PrimarySeqI, but it could be worth the effort. What do
> you think?
> 
> Steve
> 
> --- Hilmar Lapp <hlapp@gnf.org> wrote:
> > According to BioSQL, sequences (bioentries) live in a namespace, e.g., the
> > name of the databank that maintains and/or serves them.
> > 
> > None of the Bio:: seq objects/interfaces have a method for that.
> > 
> > I propose to add one, specifically to the lowest level Bio::PrimarySeqI
> > (bioentries are pretty general, and a namespace is needed for /any and all/
> > bioentries). To me, the namespace doesn't have to do much with whether this
> > seq is going to be stored in BioSQL or not. A sequence with an accession
> > number has (implicitly or explicitly) a namespace in which this accession
> > number is valid. PrimarySeqI has an accession.
> > 
> > Does anyone have other suggestions or objections?
> > 
> > 	-hilmar
> > -- 
> > -------------------------------------------------------------
> > Hilmar Lapp                            email: lapp at gnf.org
> > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > -------------------------------------------------------------
> 
> 
> =====
> Steve Chervitz
> sac@bioperl.org
> 