[BioSQL-l] synonyms for ontology terms

Hilmar Lapp hlapp at gnf.org
Mon Mar 17 12:21:14 EST 2003


I always thought this should be solvable by treating synonyms as 
first-class terms connected by another relationship type.

However, note first that we'd be the first doing it that way. The 
ontology natives don't treat synonyms as first-class terms (see the 
geneontology schema, the chado schema, or the DAG flat file format for 
that matter), and maybe there are reasons for this. ChrisM, if you get 
a chance, can you comment here?

So, in practice, probably as a consequence of no-one else doing it this 
way, synonyms do not have their own identifiers. Since for public 
ontologies you cannot simply make up identifiers, you inevitably end up 
with second-class citizens among the terms of an ontology, namely those 
you cannot identify.

Another problem that springs to mind is that since a synonym does not 
represent a related concept but the identical one, every node that is 
rooted at the term also needs to be fathered by the term's synonym(s). 
This is doable, but it requires special-case code when reading in 
ontologies (because those won't have the synonym-to-children 
relationships).
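
To make that special case concrete, here is a rough sketch. It uses 
Python's sqlite3 as a stand-in for the real database, and everything in 
it beyond the two term names is an assumption for illustration: the 
simplified term and term_relationship tables, the 'is_a' and 
'synonym_of' predicate names, and the child terms child_a and child_b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed tables; not the actual BioSQL schema.
conn.executescript("""
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE term_relationship (
    subject_id INTEGER NOT NULL REFERENCES term (term_id),
    predicate  TEXT NOT NULL,
    object_id  INTEGER NOT NULL REFERENCES term (term_id),
    UNIQUE (subject_id, predicate, object_id)
);
""")

# child_a and child_b are hypothetical terms, made up for this sketch.
names = ["five_prime_untranslated_region", "5'-UTR", "child_a", "child_b"]
conn.executemany("INSERT INTO term (name) VALUES (?)", [(n,) for n in names])
ids = {name: tid for tid, name in conn.execute("SELECT term_id, name FROM term")}

# What a loader sees in the ontology file: children point only at the
# primary term, and the synonym is just attached to that term.
primary = ids["five_prime_untranslated_region"]
conn.executemany(
    "INSERT INTO term_relationship VALUES (?, ?, ?)",
    [
        (ids["child_a"], "is_a", primary),
        (ids["child_b"], "is_a", primary),
        (ids["5'-UTR"], "synonym_of", primary),
    ],
)


def children_of(name):
    """Return the names of all terms that are 'is_a' children of `name`."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM term_relationship r
        JOIN term s ON s.term_id = r.subject_id
        JOIN term o ON o.term_id = r.object_id
        WHERE o.name = ? AND r.predicate = 'is_a'
        ORDER BY s.name
        """,
        (name,),
    )
    return [row[0] for row in rows]


before = children_of("5'-UTR")  # the synonym has no children yet

# The special-case step: copy every edge that points at a term so that
# it also points at each of that term's synonyms.
conn.execute("""
INSERT OR IGNORE INTO term_relationship (subject_id, predicate, object_id)
SELECT child.subject_id, child.predicate, syn.subject_id
FROM term_relationship AS syn
JOIN term_relationship AS child ON child.object_id = syn.object_id
WHERE syn.predicate = 'synonym_of'
  AND child.predicate <> 'synonym_of'
""")

after = children_of("5'-UTR")  # now mirrors the primary term's children
```

The copy step has to be re-run whenever an edge to the primary term is 
added or removed, which is the maintenance cost I am worried about.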

I guess the point about synonyms is that their being synonyms is a fact, 
not a relationship you may or may not believe in. 
'five_prime_untranslated_region' (SO:0000204) is not related to, but 
the very same thing as '5\'-UTR', and I guess if '5\'-UTR' is allowed 
to be a first-class citizen, you may root concepts off of it that you 
don't connect to 'five_prime_untranslated_region', which can't be right 
by definition.

So, my vote is for a synonym table unless

	- the public ontologies not having done this is due to oversight or 
laziness, and
	- there is a roadmap for how to resolve the second-class citizen issue 
and those issues that it gives rise to.
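
For comparison, a minimal sketch of the synonym table option, again 
with Python's sqlite3 standing in for the real database. The term table 
is cut down to an assumed name and identifier column, and resolve() is 
a hypothetical helper, not part of any existing API. The point is that 
a synonym only ever resolves to its one term, so nothing can be rooted 
off of the synonym itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, cut-down term table plus the proposed term_synonym table
# (SQLite types substituted for the MySQL ones).
conn.executescript("""
CREATE TABLE term (
    term_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    identifier TEXT UNIQUE
);
CREATE TABLE term_synonym (
    synonym TEXT NOT NULL,
    term_id INTEGER NOT NULL REFERENCES term (term_id),
    UNIQUE (term_id, synonym)
);
""")

conn.execute(
    "INSERT INTO term (term_id, name, identifier) VALUES (?, ?, ?)",
    (1, "five_prime_untranslated_region", "SO:0000204"),
)
conn.execute(
    "INSERT INTO term_synonym (synonym, term_id) VALUES (?, ?)",
    ("5'-UTR", 1),
)


def resolve(label):
    """Map a term name or one of its synonyms to the term's identifier.

    Hypothetical helper; returns None if the label is unknown.
    """
    row = conn.execute(
        """
        SELECT t.identifier
        FROM term t
        WHERE t.name = :label
        UNION
        SELECT t.identifier
        FROM term t
        JOIN term_synonym s ON s.term_id = t.term_id
        WHERE s.synonym = :label
        """,
        {"label": label},
    ).fetchone()
    return row[0] if row else None


by_name = resolve("five_prime_untranslated_region")
by_synonym = resolve("5'-UTR")
unknown = resolve("no_such_term")
```

Both the name and the synonym come back as the same identifier, and 
relationships are only ever stored against term_id.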

-hilmar

On Monday, March 17, 2003, at 04:41  AM, Aaron J Mackey wrote:

>
> I did suggest this after initially agreeing with Hilmar's proposal.
>
> The one problem I can see is that this "well-known" predicate needs to
> live in what we've been throwing around as the "CORE" ontology ... is
> there really such a beast?  Can we suck it up from somewhere and declare
> that when we suck up other ontologies that any synonymous relationships
> use the CORE::synonymous predicate?  Perhaps instead there should be a
> "biosql" namespace ontology to define some of these things (as I think
> we're already defining a handful of fuzzy location terms ... ?)
>
> -Aaron
>
> On Mon, 17 Mar 2003, Matthew Pocock wrote:
>
>> Hi Hilmar,
>>
>> Would it not be better to represent this inside the ontology itself? If
>> we had a well-known predicate 'synonym' then we can use the triples
>> table to associate a concept with its synonyms. This will become just
>> as efficient as a term_synonym table once the transitive closure table
>> is populated. It also removes one more block of special case code - we
>> can look up synonyms and antonyms and identities and isas and hasas with
>> the same code without extra tables. My rule of thumb is that any info
>> that relates terms to one another should be in the ontology itself, and
>> never in an extra table.
>>
>> Matthew
>>
>> Hilmar Lapp wrote:
>>> We need one more table for a full representation of ontology terms to
>>> record their synonyms. The table can be quite minimalist:
>>>
>>>     CREATE TABLE term_synonym (
>>>         synonym          VARCHAR(255) NOT NULL,
>>>         term_id          INT(10) UNSIGNED NOT NULL,
>>>         --
>>>         FOREIGN KEY (term_id) REFERENCES term (term_id),
>>>         UNIQUE (term_id,synonym)
>>>     );
>>>
>>> Has anyone any objections to this or better ideas how to capture
>>> synonyms of terms?
>>>
>>>     -hilmar
>>
>>
>>
>
> -- 
>  Aaron J Mackey
>  Pearson Laboratory
>  University of Virginia
>  (434) 924-2821
>  amackey at virginia.edu
>
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


