[Bioperl-l] Interpro

Elia Stupka elia@ebi.ac.uk
Thu, 21 Jun 2001 11:51:35 +0100 (BST)


> We are faced with an annotation problem here.  We would like to take blast
> hits from an EST pipeline, and annotate them based on GO via Interpro.
> 
> Are there hooks to Interpro from bioperl?  It doesn't look like it.  What
> does Ensembl do?

As far as I know there are no such hooks in bioperl, but I can tell you
what we have been playing with within EnsEMBL.

At the moment the only reliable and complete GO mapping there is
interpro-GO. We predict interpro domains on ensembl peptides, and store
them in our database as protein features. We get a file from the interpro
people, which gives us interpro-GO mapping, and we can deduce GO terms for
ensembl peptides through this route. 
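
To make the route above concrete, here is a minimal Python sketch, not our actual pipeline code. It assumes the mapping file has one line per pair in the style "InterPro:IPR000003 description > GO:term name ; GO:0003677", with comment lines starting with "!"; check that against the file you actually receive. The function names and the peptide identifiers are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line layout (verify against your copy of the mapping file):
#   InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677
_LINE = re.compile(r"^InterPro:(IPR\d+)\s.*;\s*(GO:\d+)\s*$")


def parse_interpro2go(lines):
    """Return {interpro_accession: set of GO ids} from mapping-file lines."""
    mapping = defaultdict(set)
    for line in lines:
        if line.startswith("!"):
            continue  # comment / header line
        match = _LINE.match(line.strip())
        if match:
            mapping[match.group(1)].add(match.group(2))
    return mapping


def go_terms_for_peptides(peptide_domains, interpro2go):
    """Deduce GO ids per peptide from its predicted InterPro domains.

    peptide_domains: {peptide_id: iterable of InterPro accessions}
    (hypothetical stand-in for protein features read from a database).
    """
    result = {}
    for peptide_id, domains in peptide_domains.items():
        terms = set()
        for accession in domains:
            terms |= interpro2go.get(accession, set())
        result[peptide_id] = terms
    return result
```

A peptide with no domains, or with domains that carry no GO term, simply ends up with an empty set.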

However, after doing this I discovered that we can only map about 1/3 of
our proteins this way, which is far too few. This is partly because not
all Interpro entries have a GO term, and partly because not all our
proteins have Interpro domains.
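
If you want to see how the unmapped proteins split between those two causes on your own data, a small sketch like the following would do it. The function name and the input shapes (a dict of peptide to Interpro accessions, and a dict of Interpro accession to a set of GO ids) are assumptions for illustration only.

```python
def coverage_report(peptide_domains, interpro2go):
    """Count peptides by why they do or do not receive a GO term.

    peptide_domains: {peptide_id: list of InterPro accessions}
    interpro2go:     {InterPro accession: set of GO ids}
    """
    counts = {"mapped": 0, "no_domain": 0, "domain_without_go": 0}
    for domains in peptide_domains.values():
        if not domains:
            counts["no_domain"] += 1          # nothing predicted at all
        elif any(interpro2go.get(acc) for acc in domains):
            counts["mapped"] += 1             # at least one domain has GO
        else:
            counts["domain_without_go"] += 1  # domains exist, none has GO
    counts["total"] = len(peptide_domains)
    return counts
```

Dividing "mapped" by "total" gives the coverage fraction for your own set.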

Swissprot has just signed up for doing "proper" Swissprot to GO mapping,
with curators sitting down and mapping each protein, and also interacting
with GO to create terms when they are missing. So, in EnsEMBL we are going
to wait for their mapping to go ahead (which should be coming in about 8
weeks).

In terms of design, once the mapping is done all it will involve is using
our generic objectXref tables and objects, which link any ensembl object
to any external database and accession.
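
As a rough illustration of the idea of a generic cross-reference, here is a Python sketch. It is not the EnsEMBL schema or API: the class names, field names and identifiers are hypothetical, and only show how one record type can link any kind of object to any external database and accession.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectXref:
    """One link: (object type, object id) -> (external db, accession)."""
    object_type: str   # e.g. "peptide", "gene" (illustrative values)
    object_id: str
    external_db: str   # e.g. "GO", "SWISSPROT" (illustrative values)
    accession: str


class XrefStore:
    """In-memory stand-in for a cross-reference table."""

    def __init__(self):
        self._rows = []

    def add(self, xref):
        self._rows.append(xref)

    def xrefs_for(self, object_type, object_id, external_db=None):
        """Return all xrefs for an object, optionally for one database."""
        return [
            row for row in self._rows
            if row.object_type == object_type
            and row.object_id == object_id
            and (external_db is None or row.external_db == external_db)
        ]
```

With this shape, adding GO terms later needs no new table or class, only new rows whose external database is "GO".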

Hope this was helpful, mail me if you want to know more,

Elia



**************************
tel:    +44 1223 49 44 31
mobile: +44 7971 59 03 69
fax:    +44 1223 49 44 68
**************************