[BioSQL-l] Re: gene ontology questions (bug)

Hilmar Lapp hlapp at gnf.org
Thu Sep 18 23:08:58 EDT 2003


On 9/18/03 5:36 AM, "Raphael A. Bauer"
<Raphael.Bauer at informatik.hu-berlin.de> wrote:

> with two times "elastin".. (it seems that there are many terms that have
> the same term name.. also seen in term collagen and so on...)
> 
> and the definition of table term that forbids 2 times the same name(unique):

Correct. There is a unique key (UK) constraint on the term table: a name
must be unique within an ontology. Terms are also looked up using this
constraint.
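
To illustrate the constraint, here is a minimal sketch using Python's sqlite3
module. The table is a simplified stand-in rather than the actual BioSQL DDL,
and the identifiers and the 'X' obsolete flag are made-up placeholders.

```python
import sqlite3


def make_db():
    """Create an in-memory term table with a unique key on (name, ontology_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE term (
            term_id     INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            identifier  TEXT,
            is_obsolete TEXT,
            ontology_id INTEGER NOT NULL,
            UNIQUE (name, ontology_id)
        )
        """
    )
    return conn


def try_insert(conn, name, identifier, ontology_id, is_obsolete=None):
    """Insert a term; return True on success, False on a unique key violation."""
    try:
        conn.execute(
            "INSERT INTO term (name, identifier, is_obsolete, ontology_id) "
            "VALUES (?, ?, ?, ?)",
            (name, identifier, is_obsolete, ontology_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
# Placeholder identifiers, not real GO accessions.
first = try_insert(conn, "elastin", "GO:0000001", 1)
# Same name, same ontology: rejected even though this row is flagged obsolete.
second = try_insert(conn, "elastin", "GO:0000002", 1, is_obsolete="X")
# Same name in a different ontology is accepted.
other_ontology = try_insert(conn, "elastin", "XX:0000001", 2)
```

The second insert fails because the obsolete flag is not part of the key, which
is the situation two terms named "elastin" run into.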

When you first load an ontology, the best strategy is to ignore obsoleted
terms, using the option --noobsolete (check the POD of load_ontology.pl, or
use --help).

The following probably doesn't apply to your use case, but for completeness
let me note that the real problem is when you update an ontology and a term
has been obsoleted because it was merged with another term that then gets
the same name. If you use the otherwise recommendable --updobsolete switch,
the obsoleted term would be properly flagged as obsolete in the database, but
inserting the successor then fails with a UK violation. Using --delobsolete
would take care of the problem, but you'd lose annotations to the obsoleted
term. Like it or not, LocusLink (LL) and other DBs do contain GO associations
to obsoleted terms, so just aggressively deleting them yields undesirable
effects.
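
The trade-off between the two strategies can be sketched as follows, again
with sqlite3. This is not what load_ontology.pl does internally. The
annotation table, the gene name, and the ON DELETE CASCADE behaviour are
assumptions made only to model "deleting the term loses its annotations".

```python
import sqlite3


def make_db():
    """One term named 'elastin' with a single annotation pointing at it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade in SQLite
    conn.execute(
        "CREATE TABLE term ("
        " term_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " ontology_id INTEGER NOT NULL,"
        " is_obsolete TEXT,"
        " UNIQUE (name, ontology_id))"
    )
    # Hypothetical annotation table; the cascade is an assumption.
    conn.execute(
        "CREATE TABLE annotation ("
        " gene TEXT NOT NULL,"
        " term_id INTEGER NOT NULL REFERENCES term (term_id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO term (term_id, name, ontology_id) VALUES (1, 'elastin', 1)")
    conn.execute("INSERT INTO annotation (gene, term_id) VALUES ('geneA', 1)")
    return conn


def insert_successor(conn):
    """Insert the merged successor term that reuses the obsoleted name."""
    try:
        conn.execute("INSERT INTO term (term_id, name, ontology_id) VALUES (2, 'elastin', 1)")
        return True
    except sqlite3.IntegrityError:
        return False


def count_annotations(conn):
    return conn.execute("SELECT COUNT(*) FROM annotation").fetchone()[0]


# Strategy 1: flag the old term as obsolete, then insert the successor.
conn = make_db()
conn.execute("UPDATE term SET is_obsolete = 'X' WHERE term_id = 1")
flag_ok = insert_successor(conn)            # blocked by the unique key
flag_annotations = count_annotations(conn)  # annotation survives

# Strategy 2: delete the old term, then insert the successor.
conn = make_db()
conn.execute("DELETE FROM term WHERE term_id = 1")
delete_ok = insert_successor(conn)            # succeeds
delete_annotations = count_annotations(conn)  # annotation is gone
```

Flagging keeps the annotation but blocks the successor; deleting admits the
successor but drops the annotation.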

To solve this, I actually resorted to extending the constraint to
(name,ontology_id,is_obsolete) in my Oracle version of biosql. Extending
the constraint alone isn't really advisable though, because then the
lookup mechanism in the TermAdaptor needs to be adjusted too. I'll probably
end up doing that.
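
A rough sketch of why the lookup has to change along with the constraint,
using sqlite3 rather than Oracle. The table is simplified, find_terms is a
hypothetical helper (not the TermAdaptor code), and storing is_obsolete as a
non-null 'Y'/'N' flag is an assumption made to keep NULL handling out of the
picture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE term ("
    " term_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " ontology_id INTEGER NOT NULL,"
    " is_obsolete TEXT NOT NULL DEFAULT 'N',"
    " UNIQUE (name, ontology_id, is_obsolete))"
)

# With the extended key, the obsoleted term and its successor can coexist.
conn.execute("INSERT INTO term (name, ontology_id, is_obsolete) VALUES ('elastin', 1, 'Y')")
conn.execute("INSERT INTO term (name, ontology_id, is_obsolete) VALUES ('elastin', 1, 'N')")


def find_terms(conn, name, ontology_id, is_obsolete=None):
    """Hypothetical lookup; optionally restricts on the obsolete flag."""
    sql = "SELECT term_id, is_obsolete FROM term WHERE name = ? AND ontology_id = ?"
    params = [name, ontology_id]
    if is_obsolete is not None:
        sql += " AND is_obsolete = ?"
        params.append(is_obsolete)
    return conn.execute(sql, params).fetchall()


# Lookup by the old key is no longer guaranteed to return a single row.
ambiguous = find_terms(conn, "elastin", 1)
# Including the flag in the lookup makes it unique again.
live_only = find_terms(conn, "elastin", 1, is_obsolete="N")
```

A lookup by (name, ontology_id) alone now matches both rows, so any code that
assumed a single hit has to add the flag to its query.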

To get back to your concrete problem though, --noobsolete probably does what
you want. 


    -hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------




