[BioPython] problem loading NCBI_taxonomy database into bioseqdb

Nick Matzke matzke at berkeley.edu
Wed Sep 3 22:45:53 UTC 2008


(Resolved this on the BioSQL list but I figured I would follow up to 
biopython also -- thanks! & sorry for the confusion)



Well, I'm not sure what I did, but some combination of these things 
seems to have worked.

1. moved the site/lib directory (which contains DBI.pm) to the front of 
my PERL5LIB (which goes into @INC)

export PERL5LIB=$PERL5LIB:/usr/local/ActivePerl-5.10/site/lib:/usr/local/ActivePerl-5.10/man/man3:/usr/local/ActivePerl-5.10/site/lib/Bundle
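A note on step 1: the export as written puts the ActivePerl directories after whatever was already in $PERL5LIB, so they are searched last rather than first. Below is a minimal sketch of placing a directory at the front instead. The helper name prepend_dir is made up for illustration and is not part of BioSQL or ActivePerl; it also assumes the paths contain no glob characters.

```shell
#!/bin/sh
# prepend_dir DIR LIST: print LIST with DIR placed first (hypothetical helper).
# Any existing copy of DIR in LIST is dropped so it is not searched twice.
prepend_dir() {
    dir=$1
    rest=
    old_ifs=$IFS
    IFS=:
    for entry in $2; do
        [ -z "$entry" ] && continue
        [ "$entry" = "$dir" ] && continue
        rest="${rest:+$rest:}$entry"
    done
    IFS=$old_ifs
    printf '%s\n' "$dir${rest:+:$rest}"
}

PERL5LIB=$(prepend_dir /usr/local/ActivePerl-5.10/site/lib "$PERL5LIB")
export PERL5LIB

# Optional check: list the directories perl will search, in order.
# perl -e 'print join("\n", @INC), "\n"'
```

Directories from PERL5LIB are searched before perl's built-in library paths, so the order within PERL5LIB decides which copy of DBI.pm is found first.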


2. Checked to make sure DBI & DBD::mysql were installed

==========================================
mws2:/usr/local/ActivePerl-5.10/bin nick$ sudo perl -MCPAN -e 'install DBI'
CPAN: Storable loaded ok (v2.18)
Going to read /usr/local/Metadata
   Database was generated on Mon, 01 Sep 2008 10:02:51 GMT
DBI is up to date (1.607).

mws2:/usr/local/ActivePerl-5.10/bin nick$ sudo perl -MCPAN -e 'install DBD::mysql'
CPAN: Storable loaded ok (v2.18)
Going to read /usr/local/Metadata
   Database was generated on Mon, 01 Sep 2008 10:02:51 GMT
DBD::mysql is up to date (4.008).
==========================================
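CPAN reporting a module as up to date says it is installed for the perl that ran CPAN, but not which file a script actually picks up. A small sketch of a more direct check follows: it asks perl to load each module and prints the path it was loaded from. The helper name perl_module_path is hypothetical, and the sketch assumes the same perl is first on the PATH as the one used to run the loader script.

```shell
#!/bin/sh
# perl_module_path MODULE: print the file perl loads for MODULE, or fail
# if perl cannot load it (hypothetical helper).
perl_module_path() {
    perl -e '
        my $mod = shift;
        (my $file = "$mod.pm") =~ s{::}{/}g;
        eval { require $file; 1 } or exit 1;
        print "$INC{$file}\n";
    ' "$1"
}

for mod in DBI DBD::mysql; do
    if path=$(perl_module_path "$mod" 2>/dev/null); then
        echo "$mod loaded from $path"
    else
        echo "$mod could not be loaded"
    fi
done
```

If the path printed for DBI is not under the site/lib directory from step 1, a different copy of DBI.pm is being found first.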


3. Make sure you start with an empty version of the db. At least for me, 
I got errors if I had already loaded sequences etc. into it, like this:

==========================================
note: node (28;331111;27;species;;) is retired; failed to delete: Cannot 
delete or update a parent row: a foreign key constraint fails 
(`bioseqdb/bioentry`, CONSTRAINT `FKtaxon_bioentry` FOREIGN KEY 
(`taxon_id`) REFERENCES `taxon` (`taxon_id`))
note: node (70;300268;69;species;;) is retired; failed to delete: Cannot 
delete or update a parent row: a foreign key constraint fails 
(`bioseqdb/bioentry`, CONSTRAINT `FKtaxon_bioentry` FOREIGN KEY 
(`taxon_id`) REFERENCES `taxon` (`taxon_id`))
note: node (77;3002
==========================================
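The foreign key in these messages ties bioentry.taxon_id to taxon.taxon_id, so one way to see in advance whether the database is "empty" in the sense that matters here is to count bioentry rows that reference a taxon. This is only a sketch: the connection settings are assumptions copied from the load command in this message, the helper name taxon_check_sql is made up, and a root account with a password would also need -p.

```shell
#!/bin/sh
# Assumed connection settings, taken from the load command in this message.
DB_NAME=bioseqdb
DB_USER=root
DB_HOST=localhost

# taxon_check_sql: print a query counting bioentry rows that point at a taxon.
taxon_check_sql() {
    printf '%s\n' 'SELECT COUNT(*) FROM bioentry WHERE taxon_id IS NOT NULL;'
}

if command -v mysql >/dev/null 2>&1; then
    # -N drops the column header so only the count is printed.
    taxon_check_sql | mysql -N -u "$DB_USER" -h "$DB_HOST" "$DB_NAME" ||
        echo "query failed (is the server running, is a password needed?)"
else
    echo "mysql client not found; query would be:"
    taxon_check_sql
fi
```

A non-zero count would suggest that sequences already reference taxon rows, which is the situation that produced the "failed to delete" notes above.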


4. Ran it again (I also added '--host localhost'):

mws2:/usr/local/ActivePerl-5.10/bin nick$ sudo perl \
    /bioinformatics/pythonstuff/biosql-1.0.0/scripts/load_ncbi_taxonomy.pl \
    --dbname bioseqdb --driver mysql --dbuser root --download true \
    --host localhost

Loading NCBI taxon database in taxdata:
         ... retrieving all taxon nodes in the database
         ... reading in taxon nodes from nodes.dmp
         ... insert / update / delete taxon nodes
         ... (committing nodes)
         ... rebuilding nested set left/right values
         ... reading in taxon names from names.dmp
         ... deleting old taxon names
         ... inserting new taxon names
         ... cleaning up
Done.



So thanks for the help, something or other worked!
Cheers,
Nick


Peter wrote:
> On Wed, Sep 3, 2008 at 12:19 AM, Nick Matzke <matzke at berkeley.edu> wrote:
>> Hi all,
>>
>> I'm following the BioSQL tutorial at the biopython website
>> (http://www.biopython.org/wiki/BioSQL#NCBI_Taxonomy ).  I can get bioseqdb
>> to work, and the biosql python scripts etc.
>>
>> However I can't get these directions to work in loading the taxonomy
>> database into bioseqdb.  I get: "Can't locate object method "connect" via
>> package "DBI" "
>>
>> I double-checked to make sure I've got DBI in perl (see error message below)
>> but that doesn't seem to help.
> 
> This does sound like a question for the BioSQL mailing list (which I
> see you've now asked on).  I'm no perl expert - so if you can resolve
> this via the BioSQL mailing list, and we can improve the Biopython
> BioSQL wiki page, that would be great.
> 
> Peter
> 

-- 
====================================================
Nicholas J. Matzke
Ph.D. student, Graduate Student Researcher
Huelsenbeck Lab
Center for Theoretical Evolutionary Genomics
4151 VLSB (Valley Life Sciences Building)
Department of Integrative Biology
University of California, Berkeley

Lab websites:
http://ib.berkeley.edu/people/lab_detail.php?lab=54
http://fisher.berkeley.edu/cteg/hlab.html
Dept. personal page: 
http://ib.berkeley.edu/people/students/person_detail.php?person=370
Lab personal page: http://fisher.berkeley.edu/cteg/members/matzke.html
Lab phone: 510-643-6299
Dept. fax: 510-643-6264
Cell phone: 510-301-0179
Email: matzke at berkeley.edu

Office hours for Bio1B, Spring 2008: Biology: Plants, Evolution, Ecology
VLSB 2013, Monday 1-1:30 (some TA there for all hours during work week)

Mailing address:
Department of Integrative Biology
3060 VLSB #3140
Berkeley, CA 94720-3140
====================================================


