[Bioperl-l] bioperl-db issues

Chris Fields cjfields at uiuc.edu
Wed Feb 22 04:09:18 UTC 2006


I got it worked out.  The Windows installer had picked out lower memory
settings (key buffer 10M, for instance) when I reinstalled, which
drastically slowed everything down.  I reset the settings for a server
environment and it's fine now.  Well, as fine as it will likely get since
I'm running this on a 1.8 GHz P4 with 756 MB RAM, so I'm not expecting it to
actually fly.  It's loading at about two sequences/second.  I'll have to see
if I get a speed improvement when optimizing tables.  I'll add this to the
wiki for installing bioperl-db under Windows.  
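
As a rough sketch of what the table optimization step could look like
(assuming the taxon and taxon_name tables shown in the quoted session below
are the ones of interest; any other table in the schema could be listed the
same way):

```sql
-- Rebuild the tables and their indexes to reclaim space and defragment.
OPTIMIZE TABLE taxon, taxon_name;

-- Refresh the index statistics that the query optimizer relies on.
ANALYZE TABLE taxon, taxon_name;
```

Whether this changes the loading speed would have to be measured before and
after; nothing here has been timed.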

Are there optimal settings for using bioperl-db, such as key buffer and sort
buffer size, buffer pool size, etc.?  Or do you think I'm likely to run into
a processor speed limit?  I'm just trying to get a fix on how much memory I
could devote to loading a smaller sequence database, nothing like
Swiss-Prot.  I saw something on the mailing list about setting
max_allowed_packet and a few other settings, but that was about four years
ago.
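
For reference, a minimal sketch of how the settings named above can be
inspected from the mysql shell, and how one of them can be changed at
runtime.  The 128 MB figure is purely illustrative and an assumption, not a
tested recommendation for this machine:

```sql
-- Show the current values of the settings in question.
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Illustrative only: set the MyISAM key buffer to 128 MB (128 * 1024 * 1024
-- bytes).  Requires administrative privileges, and the value is an assumed
-- example rather than a known-good setting.
SET GLOBAL key_buffer_size = 134217728;
```

A value set with SET GLOBAL is lost when the server restarts; to make it
permanent it would also have to go into the server's option file.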

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: drycafe at gmail.com [mailto:drycafe at gmail.com] On Behalf Of Hilmar
> Lapp
> Sent: Tuesday, February 21, 2006 6:44 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: bioperl-db issues
> 
> On 2/21/06, Chris Fields <cjfields at uiuc.edu> wrote:
> > [...]
> > I find it odd that it worked well back in December and doesn't work now. I
> > updated bioperl and bioperl-db from CVS since then, so have there been any
> > changes that may have caused this?  I noticed a few changes here and there.
> 
> The changes were fixes to retrieve the rank on persistent annotation
> objects (it was only stored before, but never retrieved). Neither the
> SpeciesAdaptor nor any of the taxonomy queries was affected by this.
> 
> >
> > Here's what I have tried thus far:
> >
> > 1) I reinstalled MySQL.  I thought it might be that I had my database on a
> > partitioned drive, so I reinstalled on the main drive.
> >
> > 2) I rebuilt the database from scratch, loading taxonomy fresh, loaded the
> > schema, and got the same error when loading (hanging on SpeciesAdaptor).
> > Tried ANALYZE:
> > ------------------------------------
> > mysql> ANALYZE TABLE taxon;
> > +----------------+---------+----------+----------+
> > | Table          | Op      | Msg_type | Msg_text |
> > +----------------+---------+----------+----------+
> > | bioseqdb.taxon | analyze | status   | OK       |
> > +----------------+---------+----------+----------+
> > 1 row in set (0.42 sec)
> >
> > mysql> ANALYZE TABLE taxon_name;
> > +---------------------+---------+----------+----------+
> > | Table               | Op      | Msg_type | Msg_text |
> > +---------------------+---------+----------+----------+
> > | bioseqdb.taxon_name | analyze | status   | OK       |
> > +---------------------+---------+----------+----------+
> > 1 row in set (0.36 sec)
> 
> I'm not sure but you may have to analyze all tables.
> 
> >
> > mysql>
> > ------------------------------------
> > so that's fine.
> >
> > 3) Using EXPLAIN table:
> > ------------------------------------
> > mysql> EXPLAIN taxon;
> 
> Note that you wouldn't use EXPLAIN on a table but on a query instead.
> I.e., copy&paste the offending query into the mysql editor, prefix it
> with EXPLAIN and then see what the results are. It should show whether
> the indexes are being used properly.
> 
> Most likely it doesn't use one of the indexes that it should be using
> but does a full table scan instead. The explain plan should pinpoint
> that.
> 
> BTW you can also use this to reconfirm the command line observation
> about the query being slow - it should 'hang' in the mysql shell as
> well. If it doesn't then there is something else going on. (if the
> placeholders pose a problem replace them with the actual values as
> given in the log)
> 
> > [..]
> > SpeciesAdaptor: binding UK column 1 to "scientific name" (name_class)
> > SpeciesAdaptor: binding UK column 2 to "208963" (ncbi_taxid)
> > ------------------------------------
> > Which is where it hangs, as before, usually about 2 minutes for each
> > sequence.
> 
> Do you also see a SELECT CLASSIFICATION query succeeding the one above
> (e.g., if you wait)? I'm asking because I'm surprised that that isn't
> the one you're seeing as taking too long, because it has been reported
> earlier to cause such problems with mysql. Alex Zelensky posted what
> he found worked as a fix.
> 
>   -hilmar
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------



