From jayunit100 at gmail.com Thu Jul 17 14:08:20 2008
From: jayunit100 at gmail.com (Jay Vyas)
Date: Thu, 17 Jul 2008 14:08:20 -0400
Subject: [BioSQL-l] (no subject)
Message-ID: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>

Hi Hilmar: Where is the data for BioSQL? Is there a single
non-redundant PDB database which is in BioSQL format? Thanks! Jay

From jimp at compbio.dundee.ac.uk Mon Jul 21 11:00:59 2008
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Mon, 21 Jul 2008 16:00:59 +0100
Subject: [BioSQL-l] (no subject)
In-Reply-To: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
Message-ID: <4884A4AB.3050907@compbio.dundee.ac.uk>

Hi Jay.

Jay Vyas wrote:
> Hi Hilmar: Where is the data for BioSQL? Is there a single
> non-redundant PDB database which is in BioSQL format? Thanks! Jay
...
BioSQL is a standardised database schema for storing sequence and
metadata, rather than a format for distributing sequence and annotation
data.

I'm not sure if any of the public bioinformatic database repositories
serve BioSQL dumps of their data, but it sounds like it might be a good
idea :) However, both BioJava and BioPerl have routines for
transparently mapping between Java or Perl objects and the sequences
and annotation in any BioSQL database, so it's pretty easy for you to
make a version of the PDB NR yourself.

hope it helps
Jim

-- 
-------------------------------------------------------------------
J. B. Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.
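
[Editorial sketch] Jim's point is that BioSQL is a schema you load data
into, not a file you download. The Python example below illustrates that
idea using only the standard-library sqlite3 module. Everything in it is
an assumption made for illustration: the three tables are a heavily
simplified subset of the real BioSQL tables of the same names (the real
ones carry more columns and constraints), the FASTA records are made up,
and "pdb_nr" is a hypothetical namespace name. In practice BioPerl or
BioJava would do this mapping for you.

```python
import sqlite3

# Simplified, hypothetical subset of the BioSQL schema (assumption: the
# real tables have more columns and constraints than shown here).
SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase(biodatabase_id),
    accession TEXT NOT NULL,
    description TEXT,
    UNIQUE (biodatabase_id, accession)
);
CREATE TABLE biosequence (
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry(bioentry_id),
    length INTEGER,
    seq TEXT
);
"""


def _record(header, chunks):
    """Split a FASTA header into accession and description; join the sequence."""
    accession, _, description = header.partition(" ")
    return accession, description, "".join(chunks)


def parse_fasta(text):
    """Yield (accession, description, sequence) tuples from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield _record(header, chunks)


def load_fasta(conn, namespace, text):
    """Store every FASTA record under one namespace; return how many were stored."""
    cur = conn.cursor()
    cur.execute("INSERT OR IGNORE INTO biodatabase (name) VALUES (?)", (namespace,))
    cur.execute("SELECT biodatabase_id FROM biodatabase WHERE name = ?", (namespace,))
    db_id = cur.fetchone()[0]
    count = 0
    for accession, description, seq in parse_fasta(text):
        cur.execute(
            "INSERT INTO bioentry (biodatabase_id, accession, description) "
            "VALUES (?, ?, ?)",
            (db_id, accession, description),
        )
        entry_id = cur.lastrowid  # key of the bioentry row just inserted
        cur.execute(
            "INSERT INTO biosequence (bioentry_id, length, seq) VALUES (?, ?, ?)",
            (entry_id, len(seq), seq),
        )
        count += 1
    conn.commit()
    return count


# Made-up records standing in for a non-redundant PDB sequence set.
EXAMPLE_FASTA = """\
>1ABC_A hypothetical chain A
MKTAYIAKQR
QISFVKSHFS
>2XYZ_B hypothetical chain B
GSHMLEDPAA
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
loaded = load_fasta(conn, "pdb_nr", EXAMPLE_FASTA)
```

Keeping the namespace in its own table, as BioSQL does with biodatabase,
lets several source databases share one set of bioentry rows without
accession clashes.
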
From hlapp at gmx.net Wed Jul 23 20:45:54 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 23 Jul 2008 20:45:54 -0400
Subject: [BioSQL-l] (no subject)
In-Reply-To: <4884A4AB.3050907@compbio.dundee.ac.uk>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
Message-ID: <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>

Yes indeed. The only thing I would add is that BioSQL at present
doesn't encompass structures, so if you also want to store the PDB
structures you will have to have your own way of doing this.

(That is not to say that BioSQL couldn't start storing structures as
well. I haven't really worked with protein structures though, so at a
minimum someone else would have to drive that.)

-hilmar

On Jul 21, 2008, at 11:00 AM, James Procter wrote:

> Hi Jay.
>
> Jay Vyas wrote:
>> Hi Hilmar: Where is the data for BioSQL? Is there a single
>> non-redundant PDB database which is in BioSQL format? Thanks! Jay
> ...
> BioSQL is a standardised database schema for storing sequence and
> metadata, rather than a format for distributing sequence and
> annotation data.
>
> I'm not sure if any of the public bioinformatic database repositories
> serve BioSQL dumps of their data, but it sounds like it might be a
> good idea :) However, both BioJava and BioPerl have routines for
> transparently mapping between Java or Perl objects and the sequences
> and annotation in any BioSQL database, so it's pretty easy for you to
> make a version of the PDB NR yourself.
>
> hope it helps
> Jim
>
> --
> -------------------------------------------------------------------
> J. B. Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group
> Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk
> The University of Dundee is a Scottish Registered Charity, No. SC015096.
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jimp at compbio.dundee.ac.uk Thu Jul 24 11:32:55 2008
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Thu, 24 Jul 2008 16:32:55 +0100
Subject: [BioSQL-l] BioSQL at BOSC08 - Was Re: (no subject)
In-Reply-To: <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
 <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
Message-ID: <4888A0A7.7090001@compbio.dundee.ac.uk>

Hi Hilmar!

Hilmar Lapp wrote:
> Yes indeed. The only thing I would add is that BioSQL at present doesn't
> encompass structures, so if you also want to store the PDB structures
> you will have to have your own way of doing this.

very true. It's quite tempting to have a go at that, but I have other
priorities first (alignments, perhaps ?).

As a general question to the list - Were there any issues like this
raised or discussed during or after the BioSQL presentation at BOSC 08 ?

Jim

ps. I've only recently started to use BioSQL (after being entirely
ignorant of its existence until a few months ago) and was very glad to
hear that it is still a going concern - so cheers to you, Hilmar, and
the rest of the BioSQL people !

-- 
-------------------------------------------------------------------
J. B. Procter (ENFIN/VAMSAS) Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764 http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.
From hlapp at gmx.net Thu Jul 24 11:59:03 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 24 Jul 2008 11:59:03 -0400
Subject: [BioSQL-l] BioSQL at BOSC08 - Was Re: (no subject)
In-Reply-To: <4888A0A7.7090001@compbio.dundee.ac.uk>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
 <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
 <4888A0A7.7090001@compbio.dundee.ac.uk>
Message-ID: <342088F2-B4DE-4D57-ABE8-6431DA535370@gmx.net>

On Jul 24, 2008, at 11:32 AM, James Procter wrote:

> As a general question to the list - Were there any issues like this
> raised or discussed during or after the BioSQL presentation at
> BOSC 08 ?

If you mean whether there were any questions about storing structure in
BioSQL, no.

-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From kaustubhp_in at yahoo.com Fri Jul 25 10:21:54 2008
From: kaustubhp_in at yahoo.com (Kaustubh Patil)
Date: Fri, 25 Jul 2008 07:21:54 -0700 (PDT)
Subject: [BioSQL-l] problem loading sequences
Message-ID: <829881.20266.qm@web65608.mail.ac4.yahoo.com>

Hi,

I am trying to load NCBI bacterial genomes into a BioSQL database,
using the script from the bioperl-db package. I get an error about a
duplicate entry. It would be very nice if you could point out the
problem. The details are as follows. Any help is appreciated.

Cheers,
kaustubh

The command I run:

bp_load_seqdatabase.pl --host XXX --dbname XXX --dbuser XXXl --dbpass XXX --format genbank --namespace genbank_bacteria */*.gbk

I get the following output:

Loading Acaryochloris_marina_MBIC11017/NC_009925.gbk ...
Loading Acaryochloris_marina_MBIC11017/NC_009926.gbk ...
-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::ReferenceAdaptor (driver) failed, values were ("","Direct Submission","Submitted (17-OCT-2007) National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA","CRC-459492F4E0CAB94B","1","374161","") FKs ()
Duplicate entry 'CRC-459492F4E0CAB94B' for key 3
---------------------------------------------------
Could not store NC_009926:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: create: object (Bio::Annotation::Reference) failed to insert or to be found by unique key
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK: Bio::DB::Persistent::PersistentObject::store /usr/local/share/perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271
STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:217
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK: Bio::DB::Persistent::PersistentObject::store /usr/local/share/perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271
STACK: Bio::DB::BioSQL::SeqAdaptor::store_children /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/SeqAdaptor.pm:224
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/local/share/perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK: Bio::DB::Persistent::PersistentObject::store /usr/local/share/perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271
STACK: /usr/local/bin/bp_load_seqdatabase.pl:623
-----------------------------------------------------------
at /usr/local/bin/bp_load_seqdatabase.pl line 636

From hlapp at gmx.net Fri Jul 25 10:42:19 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 25 Jul 2008 10:42:19 -0400
Subject: [BioSQL-l] BioSQL at BOSC08 - Was Re: (no subject)
In-Reply-To: <4888AEC2.8060008@compbio.dundee.ac.uk>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
 <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
 <4888A0A7.7090001@compbio.dundee.ac.uk>
 <342088F2-B4DE-4D57-ABE8-6431DA535370@gmx.net>
 <4888AEC2.8060008@compbio.dundee.ac.uk>
Message-ID:

On Jul 24, 2008, at 12:33 PM, James Procter wrote:

> Hilmar Lapp wrote:
>>
>> On Jul 24, 2008, at 11:32 AM, James Procter wrote:
>>
>>> As a general question to the list - Were there any issues like this
>>> raised or discussed during or after the BioSQL presentation at
>>> BOSC 08 ?
>>
>>
>> If you mean whether there were any questions about storing
>> structure in BioSQL, no.
> well - I was enquiring more generally - was there any discussion about
> extending the BioSQL model for other kinds of bioinformatic objects ?

This was among the examples I showed for actual usage. In fact, I
suspect that that's a fairly common usage pattern of BioSQL. Many of
the papers citing BioSQL are by groups who have data persistence needs
and work on sequences and annotation plus something custom they
generate themselves. They use BioSQL as their sequence and annotation
"module" that instantly solves that piece and how to get data in, and
they add custom tables to model whatever types of data or results they
have that are associated with that.

One person wanted to accommodate pathway data (if I recall correctly?).
I have so far been rather reluctant to add data types to the model that
aren't really supported by at least one of the Bio* toolkits, as the
main goal of BioSQL is to provide interoperable persistence for the
Bio* toolkits rather than being a generic schema for all kinds of data.
So I said I would look at it as soon as at least one toolkit has an
object model for that kind of data.

I believe there are projects that aim to fill the latter need already
and do so pretty well (Chado, as an example), and so rather than
duplicating these efforts I thought that supporting Bio* persistence
should take higher priority.

That said, I'm open to any kind of feedback or thoughts; please post if
you have a different opinion, and I'd be interested to hear what people
have to say on this.

-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jimp at compbio.dundee.ac.uk Fri Jul 25 11:04:06 2008
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Fri, 25 Jul 2008 16:04:06 +0100
Subject: [BioSQL-l] BioSQL at BOSC08 - Was Re: (no subject)
In-Reply-To:
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
 <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
 <4888A0A7.7090001@compbio.dundee.ac.uk>
 <342088F2-B4DE-4D57-ABE8-6431DA535370@gmx.net>
 <4888AEC2.8060008@compbio.dundee.ac.uk>
Message-ID: <4889EB66.2010700@compbio.dundee.ac.uk>

Hilmar Lapp wrote:
>>> If you mean whether there were any questions about storing structure in
>>> BioSQL, no.
>> well - I was enquiring more generally - was there any discussion about
>> extending the BioSQL model for other kinds of bioinformatic objects ?
>
> This was among the examples I showed for actual usage. In fact, I
> suspect that that's a fairly common usage pattern of BioSQL. Many of the
> papers citing BioSQL are by groups who have data persistence needs and
> work on sequences and annotation plus something custom they generate
> themselves. They use BioSQL as their sequence and annotation "module"
> that instantly solves that piece and how to get data in, and they add
> custom tables to model whatever types of data or results they have that
> are associated with that.

This is exactly the way I came to start using BioSQL (there are only so
many times the wheel should be re-invented, IMHO).

> One person wanted to accommodate pathway data (if I recall correctly?).
> I have so far been rather reluctant to add data types to the model that
> aren't really supported by at least one of the Bio* toolkits, as the
> main goal of BioSQL is to provide interoperable persistence for the Bio*
> toolkits rather than being a generic schema for all kinds of data. So I
> said I would look at it as soon as at least one toolkit has an object
> model for that kind of data.
>
> I believe there are projects that aim to fill the latter need already
> and do so pretty well (Chado, as an example), and so rather than
> duplicating these efforts I thought that supporting Bio* persistence
> should take higher priority.

I have a rather similar opinion to yours, actually. Part of BioSQL's
attractiveness is its flexibility - which may be reduced if additional
tables are included, since they will place more constraints on the
schema, which in turn have a knock-on effect on the interoperation
between BioSQL and the other Bio* object models. I've recently been
involved in a data-model based interoperability project, and I can't
say our experiences with trying to develop a unified persistence model
between heterogeneous data models were joyful.

I'd suggest that a wiki page is set up to describe any ad-hoc
'extensions' that BioSQL users think might be useful to the community.
If/When I get round to making any extensions myself then I'll add them
to that page, too.

Jim.
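
[Editorial sketch] The usage pattern discussed in this exchange (BioSQL
as the sequence-and-annotation "module", plus custom tables for a
group's own results) might look roughly like the Python example below.
It uses only the standard-library sqlite3 module and is an illustration
under stated assumptions: bioentry is reduced to two columns, and the
table my_domain_hit with all of its columns is a hypothetical extension
invented here, not part of BioSQL or of any published extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is requested.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Stand-in for the core BioSQL bioentry table (assumption: simplified).
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL UNIQUE
);

-- Hypothetical custom table: it only points at bioentry and leaves the
-- core table untouched, so toolkit code that reads bioentry is unaffected.
CREATE TABLE my_domain_hit (
    my_domain_hit_id INTEGER PRIMARY KEY,
    bioentry_id INTEGER NOT NULL
        REFERENCES bioentry(bioentry_id) ON DELETE CASCADE,
    domain_name TEXT NOT NULL,
    start_pos   INTEGER NOT NULL,
    end_pos     INTEGER NOT NULL,
    score       REAL
);
""")

# One made-up sequence entry and one made-up result attached to it.
conn.execute(
    "INSERT INTO bioentry (bioentry_id, accession) VALUES (1, 'EXAMPLE_1')"
)
conn.execute(
    "INSERT INTO my_domain_hit "
    "(bioentry_id, domain_name, start_pos, end_pos, score) "
    "VALUES (1, 'example_domain', 10, 120, 42.5)"
)
conn.commit()

# Custom results are joined back to the core entry through the key.
hits = conn.execute(
    "SELECT b.accession, h.domain_name, h.start_pos, h.end_pos "
    "FROM my_domain_hit AS h JOIN bioentry AS b USING (bioentry_id)"
).fetchall()

# A result that points at a non-existent entry is refused by the database.
try:
    conn.execute(
        "INSERT INTO my_domain_hit "
        "(bioentry_id, domain_name, start_pos, end_pos) "
        "VALUES (999, 'orphan', 1, 2)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
    conn.rollback()
```

Keeping the extension in its own table, linked only by bioentry_id, is
one way to avoid the concern raised above about extra constraints on
the core schema: the custom table can be dropped or changed without
touching what the Bio* toolkits read and write.
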
From hlapp at gmx.net Thu Jul 31 23:13:34 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 31 Jul 2008 23:13:34 -0400
Subject: [BioSQL-l] BioSQL at BOSC08 - Was Re: (no subject)
In-Reply-To: <4889EB66.2010700@compbio.dundee.ac.uk>
References: <79ceddbc0807171108qe17f5a4g13730eeca90c1f2f@mail.gmail.com>
 <4884A4AB.3050907@compbio.dundee.ac.uk>
 <7B81518B-1F70-4382-BAF5-E04B6B062CBC@gmx.net>
 <4888A0A7.7090001@compbio.dundee.ac.uk>
 <342088F2-B4DE-4D57-ABE8-6431DA535370@gmx.net>
 <4888AEC2.8060008@compbio.dundee.ac.uk>
 <4889EB66.2010700@compbio.dundee.ac.uk>
Message-ID: <02C35A12-7F3A-4C2B-9266-B5A863FF328B@gmx.net>

Hi James,

On Jul 25, 2008, at 11:04 AM, James Procter wrote:

> I'd suggest that a wiki page is set up to describe any ad-hoc
> 'extensions' that BioSQL users think might be useful to the
> community. If/When I get round to making any extensions myself then
> I'll add them to that page, too.

actually that page exists already:

http://www.biosql.org/wiki/Extensions

Right now all that's there is the fledgling PhyloDB module that's part
of the svn repository (though not yet of a release).

-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================