From hlapp at gmx.net Tue Jan 1 18:25:39 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Jan 2008 18:25:39 -0500
Subject: [BioSQL-l] Authority in biodatabase table
In-Reply-To: <320fb6e00711261110g63c156a1w8b76a797fe12e2b1@mail.gmail.com>
References: <320fb6e00711261110g63c156a1w8b76a797fe12e2b1@mail.gmail.com>
Message-ID:

(Sorry for this long-too-late reply. Going through old email that got
left unread or unresponded.)

Peter - you probably implemented something meanwhile that suits your
needs. Just FYI, BioPerl leaves this empty too. The general notion
for authority is that of the LSID authority field, but of course you
won't be able to parse this out of any input file. The value for
SwissProt would be uniprot.org, for example. For NCBI, I'm not sure -
NCBI hasn't ever issued any LSIDs, but presumably it would be
something like ncbi.nlm.nih.gov.

-hilmar

On Nov 26, 2007, at 2:10 PM, Peter wrote:

> Thanks for all the replies on the db_xref issue.
>
> Today I'd like to ask if there are any established guidelines for the
> biodatabase table - in particular for how to use the "authority" field
> in the biodatabase table, and if there is any agreed terminology for
> the named "sub databases" defined therein, i.e. what should I call them
> in our documentation.
>
> By default, unless the user specifies an authority, we end up with a
> NULL when creating entries in the biodatabase table using Biopython.
> For example:
>
> from BioSQL import BioSeqDatabase
> server = BioSeqDatabase.open_database(driver="MySQLdb", user="root",
>     passwd = "", host = "localhost", db="bioseqdb")
> db = server.new_database("orchids", description="Just for testing")
> server.adaptor.commit()
>
> I'd like to give some sensible defaults in any worked examples.
> Apart from simple test cases (like above), sensible examples that came to
> mind would be creating a "sub database" to contain:
> (*) an entire GenBank release
> (*) the latest SwissProt release
>
> What would you use in these cases? In fact, what does your
> biodatabase table contain right now?
>
> Thank you all,
>
> Peter
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From biopython at maubp.freeserve.co.uk Wed Jan 2 06:57:46 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 2 Jan 2008 11:57:46 +0000
Subject: [BioSQL-l] [BioPython] Authority in biodatabase table
In-Reply-To:
References: <320fb6e00711261110g63c156a1w8b76a797fe12e2b1@mail.gmail.com>
Message-ID: <320fb6e00801020357g724917b5s853d99f2f953753a@mail.gmail.com>

On 1/1/08, Hilmar Lapp wrote:
> (Sorry for this long-too-late reply. Going through old email that got
> left unread or unresponded.)
>
> Peter - you probably implemented something meanwhile that suits your
> needs. Just FYI, BioPerl leaves this empty too. The general notion
> for authority is that of the LSID authority field, but of course you
> won't be able to parse this out of any input file. The value for
> SwissProt would be uniprot.org, for example. For NCBI, I'm not sure -
> NCBI hasn't ever issued any LSIDs, but presumably it would be
> something like ncbi.nlm.nih.gov.
>
> -hilmar

Thank you Hilmar. It seems that the current code in Biopython is fine
(the authority field is left blank by default, unless the user supplies
their own value), and consistent with both BioPerl and BioJava in this
regard (thanks Richard).
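
As a minimal sketch of the default behaviour described in this thread
(authority stays NULL unless the caller supplies one), the following uses
Python's sqlite3 module with a simplified stand-in for the biodatabase
table. The helper new_database below is hypothetical and only mimics the
idea; it is not Biopython's implementation. The "swissprot" name and its
description are illustrative choices, and uniprot.org is the authority
value Hilmar suggests above.

```python
import sqlite3

# Simplified stand-in for the BioSQL biodatabase table (assumption: only the
# columns discussed in this thread are modelled; this is not the full schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE biodatabase (
        biodatabase_id INTEGER PRIMARY KEY,
        name           TEXT NOT NULL UNIQUE,
        authority      TEXT,            -- nullable: left empty by default
        description    TEXT
    )
    """
)


def new_database(conn, name, authority=None, description=None):
    """Hypothetical helper: register a named sub database.

    If no authority is given, the column is stored as NULL.
    """
    conn.execute(
        "INSERT INTO biodatabase (name, authority, description) VALUES (?, ?, ?)",
        (name, authority, description),
    )


# Test case from the thread: no authority supplied, so it ends up NULL.
new_database(conn, "orchids", description="Just for testing")
# Illustrative entry using the LSID-style authority suggested for SwissProt.
new_database(conn, "swissprot", authority="uniprot.org",
             description="Latest SwissProt release")
conn.commit()

rows = conn.execute(
    "SELECT name, authority FROM biodatabase ORDER BY biodatabase_id"
).fetchall()
print(rows)
```

The first row comes back with authority None (SQL NULL), the second with
the value that was passed in explicitly.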
Peter

From raoul.bonnal at itb.cnr.it Fri Jan 18 05:11:28 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Fri, 18 Jan 2008 11:11:28 +0100
Subject: [BioSQL-l] Gene Cluster
Message-ID: <1200651088.7462.20.camel@Graco>

Hi all,
I need to represent gene clusters using the BioSQL schema.

Actually, I don't know whether it's better to add a feature which contains
all the genes involved, or to add a qualifier to the genes and CDSs.

The eligible feature keys are misc_feature or source. In this case I need
to populate Seqfeature_Relationship to keep track of which gene/CDS is
part of which cluster.
The second option is adding a qualifier (/note?) to every gene/CDS.

--
Ra

From hlapp at gmx.net Fri Jan 18 20:59:04 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 18 Jan 2008 20:59:04 -0500
Subject: [BioSQL-l] Gene Cluster
In-Reply-To: <1200651088.7462.20.camel@Graco>
References: <1200651088.7462.20.camel@Graco>
Message-ID: <11F1B68C-7350-4382-B596-9348D5DC0D3F@gmx.net>

Hi Raoul,

I have done this in the past using the bioentry_relationship table.
This assumes that the "genes" (or, more generally, the members of the
cluster) are bioentries, not seqfeatures. If you use BioPerl to
read/process/represent the clusters, and you are using the Bio::ClusterI
interface for that (any of the Bio::Cluster::* modules, for example),
then Bioperl-db should support mapping those to the schema.

Obviously, you can do the same for seqfeatures using
feature_relationship.

-hilmar

On Jan 18, 2008, at 5:11 AM, Raoul Jean Pierre Bonnal wrote:

> Hi all,
> I need to represent gene clusters using the BioSQL schema.
>
> Actually, I don't know whether it's better to add a feature which contains
> all the genes involved, or to add a qualifier to the genes and CDSs.
>
> The eligible feature keys are misc_feature or source. In this case I
> need to populate Seqfeature_Relationship to keep track of which gene/CDS
> is part of which cluster.
> The second option is adding a qualifier (/note?) to every gene/CDS.
>
> --
> Ra

From raoul.bonnal at itb.cnr.it Mon Jan 21 07:23:59 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Mon, 21 Jan 2008 13:23:59 +0100
Subject: [BioSQL-l] Gene Cluster
In-Reply-To: <11F1B68C-7350-4382-B596-9348D5DC0D3F@gmx.net>
References: <1200651088.7462.20.camel@Graco> <11F1B68C-7350-4382-B596-9348D5DC0D3F@gmx.net>
Message-ID: <1200918239.7705.46.camel@Graco>

Dear Hilmar,

On Fri, 18/01/2008 at 20:59 -0500, Hilmar Lapp wrote:
> Hi Raoul,
>
> I have done this in the past using the bioentry_relationship table.
> This assumes that the "genes" (or, more generally, the members of the
> cluster) are bioentries, not seqfeatures. If you use BioPerl to
> read/process/represent the clusters, and you are using the Bio::ClusterI
> interface for that (any of the Bio::Cluster::* modules, for example),
> then Bioperl-db should support mapping those to the schema.

Using bioentries is right if you have an entry which represents a single
cluster. Unfortunately I have to represent clusters inside a "complete"
genome, so a bioentry can have multiple clusters.

> Obviously, you can do the same for seqfeatures using
> feature_relationship.

Looking at the BioSQL schema, this seems to be the only solution for
representing clusters in a general way.
Now, which do you think is the most correct feature key: misc_feature or
source?
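
To make the seqfeature-based option concrete, here is a rough sketch using
Python's sqlite3 module. It assumes a heavily simplified version of the
seqfeature and seqfeature_relationship tables: plain text columns
(type_name, term_name) stand in for the ontology term foreign keys of the
real schema, and the cluster names, gene names, and the "contains"
relationship type are made up for illustration. Each cluster is one
misc_feature on the genome bioentry, and each member gene is linked to it
by one relationship row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the BioSQL tables (assumption: text columns
    -- replace the term_id foreign keys used by the real schema).
    CREATE TABLE seqfeature (
        seqfeature_id INTEGER PRIMARY KEY,
        bioentry_id   INTEGER NOT NULL,
        type_name     TEXT NOT NULL,
        display_name  TEXT
    );
    CREATE TABLE seqfeature_relationship (
        object_seqfeature_id  INTEGER NOT NULL REFERENCES seqfeature (seqfeature_id),
        subject_seqfeature_id INTEGER NOT NULL REFERENCES seqfeature (seqfeature_id),
        term_name             TEXT NOT NULL,
        UNIQUE (object_seqfeature_id, subject_seqfeature_id, term_name)
    );
    """
)

# One genome (bioentry 1) carrying two clusters and three genes (made-up data).
conn.executemany(
    "INSERT INTO seqfeature VALUES (?, ?, ?, ?)",
    [
        (1, 1, "misc_feature", "cluster A"),
        (2, 1, "misc_feature", "cluster B"),
        (3, 1, "gene", "gene1"),
        (4, 1, "gene", "gene2"),
        (5, 1, "gene", "gene3"),
    ],
)
# The cluster feature is the object, the member gene is the subject.
conn.executemany(
    "INSERT INTO seqfeature_relationship VALUES (?, ?, ?)",
    [(1, 3, "contains"), (1, 4, "contains"), (2, 5, "contains")],
)


def genes_in_cluster(conn, cluster_name):
    """Return the display names of all genes linked to the named cluster."""
    rows = conn.execute(
        """
        SELECT g.display_name
        FROM seqfeature AS c
        JOIN seqfeature_relationship AS r
          ON r.object_seqfeature_id = c.seqfeature_id
        JOIN seqfeature AS g
          ON g.seqfeature_id = r.subject_seqfeature_id
        WHERE c.display_name = ?
        ORDER BY g.seqfeature_id
        """,
        (cluster_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(genes_in_cluster(conn, "cluster A"))
print(genes_in_cluster(conn, "cluster B"))

# A second row for the same (object, subject, type) triple is rejected.
try:
    conn.execute("INSERT INTO seqfeature_relationship VALUES (1, 3, 'contains')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

With this layout a single genome entry can carry any number of clusters,
since membership lives in the relationship rows rather than in the bioentry.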
I'm using BioRuby and working on interfacing it with BioSQL.

Skype: ilpuccio

--
Ra

From hlapp at gmx.net Sat Jan 26 20:08:16 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 26 Jan 2008 20:08:16 -0500
Subject: [BioSQL-l] bioseqDB error
In-Reply-To: <10f848910801251549h546bea04p4c71cbb7a48aab5c@mail.gmail.com>
References: <10f848910801251549h546bea04p4c71cbb7a48aab5c@mail.gmail.com>
Message-ID: <1F23172C-4189-40FD-82BD-09395CE5C143@gmx.net>

Is this upon loading into a fresh database, or updating an existing one
(into which the NCBI taxonomy had been loaded before)?

If it is the former, could you try this again with a fresh download from
NCBI? The most recent update, which dates from 7:20pm this evening, has
only a single row for this taxon (which is Mus musculus, BTW), and maybe
the one you happened to grab had an error.

-hilmar

On Jan 25, 2008, at 6:49 PM, snoze pa wrote:

> Hi, does anyone know why I am getting this error message?
>
> Loading NCBI taxon database in taxdata:
> ... retrieving all taxon nodes in the database
> ... reading in taxon nodes from nodes.dmp
> ... insert / update / delete taxon nodes
> failed to insert node (10090;10090;10088;species;1;2): Duplicate entry
> '10090' for key 2 at load_ncbi_taxonomy.pl line 568
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From hlapp at gmx.net Sun Jan 27 21:48:48 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 27 Jan 2008 21:48:48 -0500
Subject: [BioSQL-l] Gene Cluster
In-Reply-To: <1200918239.7705.46.camel@Graco>
References: <1200651088.7462.20.camel@Graco> <11F1B68C-7350-4382-B596-9348D5DC0D3F@gmx.net> <1200918239.7705.46.camel@Graco>
Message-ID: <3094F9D7-2494-42FC-B820-5A006A146B8D@gmx.net>

Hi Raoul,

On Jan 21, 2008, at 7:23 AM, Raoul Jean Pierre Bonnal wrote:

> [...] Using bioentries is right if you have an entry which
> represents a single cluster. Unfortunately I have to represent
> clusters inside a "complete" genome, so a bioentry can have
> multiple clusters.

Note that relationships between bioentries are no different from those
between seqfeatures. A bioentry, or seqfeature, can have relationships
with any number of other bioentries or seqfeatures, respectively. The
uniqueness constraint only says that between any two bioentries, or
seqfeatures, there can only be one relationship of one type (but
multiple ones if the types are different).

>> Obviously, you can do the same for seqfeatures using
>> feature_relationship.
> Looking at the BioSQL schema, this seems to be the only solution for
> representing clusters in a general way.

See above as to whether this is the only way or not. The choice depends
on what kind of objects you are trying to relate to each other.

> Now, which do you think is the most correct feature key: misc_feature or
> source?
Do you mean for the type of the feature or the type of the relationship?
I'm a bit confused. For relating two features to each other it doesn't
matter what type they are of; however, if they are sequence annotation
features, you may want to make sure that the types and the hierarchy are
compliant with the Sequence Ontology.

> I'm using BioRuby and working on interfacing it with BioSQL.

Cool!

-hilmar

From hlapp at gmx.net Mon Jan 28 18:23:46 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 28 Jan 2008 18:23:46 -0500
Subject: [BioSQL-l] bioseqDB error
In-Reply-To: <10f848910801281307i125ce285k5b73fd4ae5d4af28@mail.gmail.com>
References: <10f848910801251549h546bea04p4c71cbb7a48aab5c@mail.gmail.com> <10f848910801281307i125ce285k5b73fd4ae5d4af28@mail.gmail.com>
Message-ID: <557D19AE-CD1D-4558-8804-830B24F294AE@gmx.net>

Hi - you will only need to install Bioperl-db if you intend to use
BioPerl to retrieve and store objects from/to the database. Also,
Bioperl-db comes with scripts for loading sequences and ontologies
(through BioPerl's SeqIO and OntologyIO capabilities).

load_ncbi_taxonomy.pl is independent of BioPerl or Bioperl-db (which is
why I am pulling the thread back to BioSQL - problems with
load_ncbi_taxonomy are unrelated to BioPerl).

If you see the error below only after 'installing' Bioperl-db, I assume
that you also ran the Bioperl-db test suite (as otherwise it is not
explainable). Did this not result in any errors? The Bioperl-db tests
normally clean up after themselves, and what you report seems to
indicate that they did not. Are you using MySQL, and if so, did you make
sure to enable InnoDB?

-hilmar

On Jan 28, 2008, at 4:07 PM, snoze pa wrote:

> Still I am getting the same error message.
>
> My question is:
>
> Do I need to install bioperl-DB for biosql?
>
> When I am using biosql and trying to load NCBI taxonomy then it is
> working fine. but when I am trying to install bioperl-DB then it is
> giving me following error message when loading NCBI taxonomy.
>
> Any help?
>
> Loading NCBI taxon database in taxdata:
> ... retrieving all taxon nodes in the database
> ... reading in taxon nodes from nodes.dmp
> ... insert / update / delete taxon nodes
> failed to insert node (10090;10090;10088;species;1;2): Duplicate entry
> '10090' for key 2 at load_ncbi_taxonomy.pl line 568
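
One way to check the suggestion that the downloaded dump itself might
contain a repeated row is to scan nodes.dmp for tax IDs that occur more
than once. The sketch below is an illustration only and is not part of
load_ncbi_taxonomy.pl. It assumes the NCBI dump layout in which fields are
separated by a tab, a vertical bar, and a tab, with the tax ID in the
first column. The function name duplicate_tax_ids and the sample rows are
made up, and the sample rows are truncated to three columns.

```python
import io
from collections import Counter


def duplicate_tax_ids(lines):
    """Return the tax IDs that appear on more than one row, sorted."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Fields are separated by "\t|\t"; the first field is the tax ID.
        tax_id = line.split("|", 1)[0].strip()
        counts[int(tax_id)] += 1
    return sorted(tax_id for tax_id, n in counts.items() if n > 1)


# Made-up, truncated sample in nodes.dmp style with one repeated row.
sample = io.StringIO(
    "10088\t|\t39107\t|\tgenus\t|\n"
    "10090\t|\t10088\t|\tspecies\t|\n"
    "10090\t|\t10088\t|\tspecies\t|\n"
)
duplicates = duplicate_tax_ids(sample)
print(duplicates)

# Usage against a real download (path is an assumption):
# with open("nodes.dmp") as handle:
#     print(duplicate_tax_ids(handle))
```

An empty result for the real file would point away from the download and
towards rows already present in the database, for example leftovers from a
test run.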