[BioPython] Loading keywords into database

Brad Chapman chapmanb@arches.uga.edu
Tue, 4 Jun 2002 11:10:02 -0400


Hi Andreas;

> Here is how i tried to load the keywords into the biosql Database. Is this 
> correct?
> 
> (From my Version of BioSQL.Loader)
> ------------
>     def _load_bioentry_keyword(self, record):
>         """Add keywords into the database"""
>         try:
>             bioentry_id = self.adaptor.fetch_seqid_by_display_id(self.dbid, record.name)
>             keywords = record.annotations["keywords"]
>             keyword_ont_id = self._get_ontology_id("keyword")
>             sql = r"INSERT INTO bioentry_qualifier_value VALUES" \
>                   r" (%s, %s, %s)"
>             for k in keywords:
>                 self.adaptor.execute_one(sql, (bioentry_id, keyword_ont_id, k))
>         except KeyError:
>             pass
> ------------

This looks right in principle. My only suggestion is that you can
save the call to "fetch_seqid_by_display_id" by just making the
function look like:

def _load_bioentry_keyword(self, bioentry_id, keywords):

Then you can call it from load_seqrecord with:

keywords = record.annotations.get("keywords", [])
self._load_bioentry_keyword(bioentry_id, keywords)
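
To make the idea concrete, here is a rough, self-contained sketch of the
refactored helper. FakeAdaptor and the fixed ontology id are stand-ins
invented for illustration (they are not part of BioSQL.Loader); the point
is only that the helper takes the bioentry id directly and that
.get("keywords", []) removes the need for the try/except KeyError:

```python
class FakeAdaptor:
    """Hypothetical stand-in for the BioSQL adaptor; it only records calls."""

    def __init__(self):
        self.executed = []

    def execute_one(self, sql, args):
        # Record the statement instead of sending it to a database.
        self.executed.append((sql, args))


class Loader:
    def __init__(self, adaptor):
        self.adaptor = adaptor

    def _get_ontology_id(self, term):
        # Assumption: a fixed id for the sketch; the real method would
        # look the term up in the database.
        return 1

    def _load_bioentry_keyword(self, bioentry_id, keywords):
        """Add keywords for a bioentry whose id is already known."""
        keyword_ont_id = self._get_ontology_id("keyword")
        sql = "INSERT INTO bioentry_qualifier_value VALUES (%s, %s, %s)"
        for keyword in keywords:
            self.adaptor.execute_one(sql, (bioentry_id, keyword_ont_id, keyword))


# Made-up annotations, as they might appear on a record.
annotations = {"keywords": ["kinase", "membrane"]}
loader = Loader(FakeAdaptor())
loader._load_bioentry_keyword(42, annotations.get("keywords", []))

# A record without keywords simply results in no inserts.
empty_loader = Loader(FakeAdaptor())
empty_loader._load_bioentry_keyword(43, {}.get("keywords", []))
```

With this shape the caller decides what to do about missing keywords, and
the helper never has to look the entry up again.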

This is all just semantics, but you are spot on with your use of the
ontology stuff (based on my limited knowledge, of course :-).

> I also noticed that the Loader could be more robust. It chokes if an entry 
> is already in the Database. Not good, if i bulk-load from a file, it crashes 
> somehow in the middle and i want to start again.

Yup, it pretty much assumes you are loading a new record in. I know this
isn't very good and it could definitely use a lot of work to make it
robust. The stuff that's there is pretty basic.
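
One possible direction, sketched under heavy assumptions: check whether
an entry is already present before inserting it, so that a bulk load can
be restarted after a crash. The example below uses an in-memory sqlite3
database and a deliberately simplified, made-up "bioentry" table (not the
real BioSQL schema); load_if_new is a hypothetical helper, not something
the Loader currently has:

```python
import sqlite3

# Simplified, hypothetical table: just an id and a unique display id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bioentry ("
    "bioentry_id INTEGER PRIMARY KEY, display_id TEXT UNIQUE)"
)


def load_if_new(conn, display_id):
    """Insert display_id unless it already exists; return True if inserted."""
    row = conn.execute(
        "SELECT bioentry_id FROM bioentry WHERE display_id = ?",
        (display_id,),
    ).fetchone()
    if row is not None:
        # Already loaded (e.g. by an earlier, interrupted run): skip it.
        return False
    conn.execute("INSERT INTO bioentry (display_id) VALUES (?)", (display_id,))
    return True


# Made-up ids; the repeated one simulates restarting a partial bulk load.
names = ["entry_a", "entry_b", "entry_a"]
results = [load_if_new(conn, name) for name in names]
stored = conn.execute("SELECT COUNT(*) FROM bioentry").fetchone()[0]
```

Whether skipping or updating an existing entry is the right behavior would
need to be decided for the real Loader; this only shows the skip variant.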

> PS: What is the best place to send this code? The developer list? Or would 
> it be better (if possible) to check this into cvs?

The dev list is the best place to send code (and probably to talk about
more code-specific things like this). I'm happy to check in diffs that
you send me on BioSQL (especially those with corresponding tests :-).

Jeff doles out CVS write accounts, so the other option is to beg him for
one of those :-).

Thanks again for the interest.
Brad
-- 
PGP public key available from http://pgp.mit.edu/