From p.j.a.cock at googlemail.com Thu Apr 24 04:31:35 2014
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 24 Apr 2014 09:31:35 +0100
Subject: [BioSQL-l] [Bioperl-l] Writing and retrieving Genbank files from BioSQL
In-Reply-To: <035D9884-D673-4B54-97FB-E9404C40E2A0@illinois.edu>
References: <948ace19-ea00-493b-821f-92836ac3b02a@googlegroups.com>
 <5357E7DD.8060609@gmail.com>
 <5357F141.4010505@gmail.com>
 <035D9884-D673-4B54-97FB-E9404C40E2A0@illinois.edu>
Message-ID:

CC'ing Hilmar and the BioSQL list.

Yes, it might be nice to extend the BioSQL schema for these
fields (circular, molecule type, etc).

Right now, whatever bioperl-db does is the effective standard,
so any assistance specifying what exactly it does would be
enough to make Biopython do the same.

Given we didn't get a GSoC project student to work on BioSQL
and SQLite, perhaps a plan B is needed?

Thanks,

Peter.

On Wednesday, April 23, 2014, Fields, Christopher J wrote:

> I do think Roy is correct in saying this could be set as annotation. In
> my opinion I would like Seq-pertinent information (alphabet, circular, mol
> type) at the level of the sequence. At least, to me that makes more sense,
> as it describes information directly relevant about the sequence itself,
> whereas to me annotation are more about the sequence record (pubs,
> taxonomy, etc). But I'm probably splitting hairs.
>
> BTW I agree re: SQLite support in bioperl-db. However, bioperl-db uses a
> home-grown ORM, so this would entail creating a SQLite-specific DB loader.
> The intent has been to move bioperl-db over to a consistent ORM
> (DBIx::Class) that would easily allow this, but that GSoC project didn't
> have takers :P
>
> chris
>
> On Apr 23, 2014, at 11:58 AM, Roy Chaudhuri wrote:
>
> > Hi Peter,
> >
> > Just found your message on the Google group.
> >
> > I think the correct way to deal with this would be to modify the BioSQL
> > schema to include columns for is_circular and molecule type in the
> > bioentry table.
> >
> > However, if the BioSQL schema is considered immutable, a workaround
> > would be for BioPerl, BioPython etc. to agree on a standard way of
> > storing the information in the existing BioSQL schema.
> >
> > I'd suggest we do this with two annotation tags:
> > "is_circular", with a value of BioPerl $seq->is_circular (1 or NULL)
> > "molecule", with a value of BioPerl $seq->molecule (DNA, RNA etc.)
> >
> > Once a sequence is removed from the database, these annotation tags
> > could be removed and put in the correct place in the BioPerl/BioPython
> > object model.
> >
> > Cheers,
> > Roy.
> >
> > > Thank you for chasing this issue Rik :)
> > >
> > > From the Biopython point of view, all I really need to know
> > > is where the linear/circular and molecule type information
> > > from the GenBank LOCUS line end up in the BioSQL tables
> > > (to make Biopython put it in the same place).
> > >
> > > https://redmine.open-bio.org/issues/2578
> > >
> > > Sadly I don't currently have a working BioSQL + BioPerl test
> > > setup (it would be great if BioPerl could add SQLite support -
> > > which would make it easy to do cross project testing).
> > >
> > > Thanks,
> > >
> > > Peter
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
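
For illustration only, here is a rough sketch of how Roy's two-tag
workaround might look against SQLite. This is not what bioperl-db or
Biopython currently do. The tables are a cut-down stand-in for the
BioSQL bioentry, ontology, term and bioentry_qualifier_value tables
(the real schema has more columns and constraints), the ontology name
"Annotation Tags" is an assumption, and the helper names get_term_id,
store_locus_info and load_locus_info are hypothetical.

```python
import sqlite3

# Simplified stand-in for four BioSQL tables (assumption: the real
# schema defines more columns and constraints than shown here).
SCHEMA = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    ontology_id INTEGER NOT NULL REFERENCES ontology(ontology_id),
    UNIQUE (name, ontology_id)
);
CREATE TABLE bioentry_qualifier_value (
    bioentry_id INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    term_id     INTEGER NOT NULL REFERENCES term(term_id),
    value       TEXT,
    rank        INTEGER NOT NULL DEFAULT 0,
    UNIQUE (bioentry_id, term_id, rank)
);
"""

TAG_ONTOLOGY = "Annotation Tags"  # assumed ontology name for the tags


def get_term_id(conn, name):
    """Return the term_id for an annotation tag, creating it if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO ontology (name) VALUES (?)", (TAG_ONTOLOGY,)
    )
    (ontology_id,) = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (TAG_ONTOLOGY,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO term (name, ontology_id) VALUES (?, ?)",
        (name, ontology_id),
    )
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = ? AND ontology_id = ?",
        (name, ontology_id),
    ).fetchone()
    return term_id


def store_locus_info(conn, bioentry_id, is_circular, molecule):
    """Store the two proposed tags: is_circular (1 or NULL) and molecule."""
    values = {
        "is_circular": "1" if is_circular else None,
        "molecule": molecule,
    }
    for tag, value in values.items():
        conn.execute(
            "INSERT INTO bioentry_qualifier_value (bioentry_id, term_id, value) "
            "VALUES (?, ?, ?)",
            (bioentry_id, get_term_id(conn, tag), value),
        )


def load_locus_info(conn, bioentry_id):
    """Read the two tags back as an (is_circular, molecule) tuple."""
    rows = conn.execute(
        "SELECT t.name, q.value "
        "FROM bioentry_qualifier_value q "
        "JOIN term t ON t.term_id = q.term_id "
        "WHERE q.bioentry_id = ? AND t.name IN ('is_circular', 'molecule')",
        (bioentry_id,),
    ).fetchall()
    tags = dict(rows)
    return tags.get("is_circular") == "1", tags.get("molecule")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO bioentry (bioentry_id, accession) VALUES (1, 'EXAMPLE1')"
    )
    store_locus_info(conn, 1, is_circular=True, molecule="DNA")
    print(load_locus_info(conn, 1))
```

On retrieval, a loader would then move these two values out of the
annotation tags and into the proper attributes of its own sequence
object, as Roy suggests.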