From darin.london at duke.edu Tue Mar 6 11:03:59 2007
From: darin.london at duke.edu (Darin London)
Date: Tue, 06 Mar 2007 11:03:59 -0500
Subject: [BioSQL-l] Announcing BOSC 2007
Message-ID: <45ED90EF.7030000@duke.edu>

The BOSC Organizing Committee is proud to announce BOSC 2007, taking
place in Vienna, Austria, on July 19th and 20th. The conference this
year promises to be exciting, as the BOSC developers attempt to define
and solve currently intractable problems in Bioinformatics. Please
refer to the following website for complete information and requests
for submissions. Thank you, and we hope to see you in Vienna.

http://open-bio.org/wiki/BOSC_2007

The BOSC Organizing Committee

Please pass this email on to anyone who would be interested.

From samborsky_d at yahoo.com Thu Mar 8 17:19:20 2007
From: samborsky_d at yahoo.com (Dmitry Samborskiy)
Date: Fri, 9 Mar 2007 01:19:20 +0300 (MSK)
Subject: [BioSQL-l] BioSQL seqfeature_qualifier_value optimization
Message-ID: <992313.41554.qm@web30609.mail.mud.yahoo.com>

Hello Everybody,

I use BioPerl-DB and the BioSQL project.
I suggest adding one extra index to optimize searches by FT tag
values.

On MySQL it could be done with

CREATE INDEX value_ind ON seqfeature_qualifier_value (term_id, value(64));

Best regards,
Dmitry Samborskiy

From hlapp at gmx.net Sat Mar 10 12:54:17 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 10 Mar 2007 12:54:17 -0500
Subject: [BioSQL-l] BioSQL seqfeature_qualifier_value optimization
In-Reply-To: <992313.41554.qm@web30609.mail.mud.yahoo.com>
References: <992313.41554.qm@web30609.mail.mud.yahoo.com>
Message-ID:

Is this a new MySQL 5.0 feature? Does it mean take the first 64 chars
and discard the rest for indexing?

Sorry, I'm no longer up-to-date w/ MySQL's feature list since 4.1.x.

-hilmar

On Mar 8, 2007, at 5:19 PM, Dmitry Samborskiy wrote:

> Hello Everybody,
>
> I use BioPerl-DB and BioSQL project.
> I suggest to make one extra index to optimize the search by FT tag
> values.
>
> On MySQL it could be done with
>
> CREATE INDEX value_ind ON seqfeature_qualifier_value (term_id, value(64));
>
> Best regards,
> Dmitry Samborskiy
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Sat Mar 10 13:28:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 10 Mar 2007 13:28:47 -0500
Subject: [BioSQL-l] BioSQL seqfeature_qualifier_value optimization
In-Reply-To: <248017.48917.qm@web30613.mail.mud.yahoo.com>
References: <248017.48917.qm@web30613.mail.mud.yahoo.com>
Message-ID: <5F3DC5C7-B6A5-409F-B098-39CAB63B2DA4@gmx.net>

On Mar 10, 2007, at 1:15 PM, Dmitry Samborskiy wrote:

> Hilmar Lapp wrote:
>
>> Is this a new MySQL 5.0 feature?
>
> No, it seems that it was even before v.4.1, see:
>
> http://dev.mysql.com/doc/refman/4.1/en/create-index.html
>
> And it works with all base DB types: MyISAM, InnoDB, or BDB.

Interesting. I missed that. That's what happens if you don't use
MySQL :-)

>
>> Does it mean take the first 64 chars
>> and discard the rest for indexing?
>
> Exactly. Below is the citation from the page mentioned above:
>
>> For CHAR, VARCHAR, BINARY, and VARBINARY columns, indexes can be
>> created that use only the leading part of column values, using
>> col_name(length) syntax to specify an index prefix length.
>> BLOB and TEXT columns also can be indexed, but a prefix length
>> *must* be given.
>
> I used 64 because it's quite enough in almost all cases (since
> index is just an optimization tool for making search faster).
>
>> Sorry, I'm no longer up-to-date w/ MySQL's feature list since 4.1.x.
>
> Me too. I use mysql-4.1.16-1.FC4.1 (Fedora Core 4).
>
> I don't know how to figure the same index in Oracle/PostgreSQL.
> But I hope it's possible...

You cannot index LOB columns in Oracle. You can index an initial
substring of a text field in both Oracle and PostgreSQL because you
can have function indexes. They will only be used, though, if the
query uses the exact same function.

-hilmar

>
> Thanks for your attention.
>
> Best wishes,
> Dmitry Samborskiy
>
> --- Hilmar Lapp wrote:
>
>> Is this a new MySQL 5.0 feature? Does it mean take the first 64 chars
>> and discard the rest for indexing?
>>
>> Sorry, I'm no longer up-to-date w/ MySQL's feature list since 4.1.x.
>>
>> -hilmar
>>
>> On Mar 8, 2007, at 5:19 PM, Dmitry Samborskiy wrote:
>>
>>> Hello Everybody,
>>>
>>> I use BioPerl-DB and BioSQL project.
>>> I suggest to make one extra index to optimize the search by FT tag
>>> values.
>>>
>>> On MySQL it could be done with
>>>
>>> CREATE INDEX value_ind ON seqfeature_qualifier_value (term_id, value(64));
>>>
>>> Best regards,
>>> Dmitry Samborskiy
>>
>> --
>> ===========================================================
>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>> ===========================================================

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Fri Mar 16 15:44:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 16 Mar 2007 14:44:54 -0500
Subject: [BioSQL-l] Bio::Annotation::StructuredValue
Message-ID: <8C776F12-19CF-4511-91F2-ED9640FB995C@uiuc.edu>

Does bioperl-db store Bio::Annotation::StructuredValue (i.e. in
SwissProt)? I am thinking of using StructuredValue, Data::Stag, or
Class::Meta for some of my RNA structural data work but didn't know
if StructuredValues would persist via bioperl-db.

I also noticed there is an outstanding BioPerl bug
(http://bugzilla.open-bio.org/show_bug.cgi?id=1825) where Hilmar
suggested reimplementing StructuredValue to use Data::Stag, so I
thought I might give it a try.

chris

From hlapp at gmx.net Sat Mar 17 10:26:04 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 17 Mar 2007 10:26:04 -0400
Subject: [BioSQL-l] Bio::Annotation::StructuredValue
In-Reply-To: <8C776F12-19CF-4511-91F2-ED9640FB995C@uiuc.edu>
References: <8C776F12-19CF-4511-91F2-ED9640FB995C@uiuc.edu>
Message-ID: <1F43323F-22D8-4A1D-A62E-46E60A59D97C@gmx.net>

On Mar 16, 2007, at 3:44 PM, Chris Fields wrote:

> Does bioperl-db store Bio::Annotation::StructuredValue (i.e. in
> SwissProt)?

It does b/c B::A::StructuredValue ISA B::A::SimpleValue and it
handles the latter.

This isn't ideal because, if you're like me, you'd want each of the
individual values to translate to its own row. I was using a
SeqProcessor to convert the StructuredValue objects into arrays of
SimpleValue objects.

Obviously, this will lose the structure between them (i.e., in
reality it's not just a flat array), but for enabling indexed
searches it works well.

With Uniprot no longer collapsing per sequence, the thing that gets
lost is the semantic context of each token, but as you correctly
found out, it gets lost at the bioperl level already.

> I am thinking of using StructuredValue, Data::Stag, or
> Class::Meta for some of my RNA structural data work but didn't know
> if StructuredValues would persist via bioperl-db.

At this point they are either flattened out (through the overridden
value() method), or you convert them upfront into an array, using a
SeqProcessor.

BioSQL has no provision for storing the fact that a number of
tag/value associations (which is what B::A::SimpleValues are)
comprise a "bag" of annotation that belongs together.

You could, however, persist that by embedding the tags in an
ontology (tags are ontology terms) that captures it (through
relationships).

>
> I also noticed there is an outstanding BioPerl bug
> (http://bugzilla.open-bio.org/show_bug.cgi?id=1825) where Hilmar
> suggested reimplementing StructuredValue to use Data::Stag, so I
> thought I might give it a try.

Sounds good :-)

I hope the above makes some sense. Let me know if not.

-hilmar

>
> chris

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Thu Mar 22 18:29:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 22 Mar 2007 17:29:15 -0500
Subject: [BioSQL-l] [Bioperl-l] Bio::Annotation::StructuredValue
In-Reply-To: <1F43323F-22D8-4A1D-A62E-46E60A59D97C@gmx.net>
References: <8C776F12-19CF-4511-91F2-ED9640FB995C@uiuc.edu>
 <1F43323F-22D8-4A1D-A62E-46E60A59D97C@gmx.net>
Message-ID: <70781A22-76D3-4B71-8364-1FAED30179CE@uiuc.edu>

On Mar 17, 2007, at 9:26 AM, Hilmar Lapp wrote:

> On Mar 16, 2007, at 3:44 PM, Chris Fields wrote:
>
>> Does bioperl-db store Bio::Annotation::StructuredValue (i.e. in
>> SwissProt)?
>
> It does b/c B::A::StructuredValue ISA B::A::SimpleValue and it
> handles the latter.
>
> This isn't ideal because if you're like me you'd want all the
> individual values to each translate to its own row. I was using a
> SeqProcessor to convert the StructuredValue objects into arrays of
> SimpleValue objects.
>
> Obviously, this will lose the structure between them (i.e., in
> reality it's not just a flat array), but for enabling indexed
> searches it works well.
>
> With Uniprot no longer collapsing per sequence, the thing that gets
> lost is the semantic context of each token, but as you found out
> correctly it gets lost at the bioperl level already.

Yes, unfortunately, though the use of an ontology would help, as you
suggest below.

>> I am thinking of using StructuredValue, Data::Stag, or
>> Class::Meta for some of my RNA structural data work but didn't know
>> if StructuredValues would persist via bioperl-db.
>
> At this point they are either flattened out (through the overridden
> value() method), or you convert them upfront into an array, using a
> SeqProcessor.
>
> BioSQL has no provision for storing the fact that a number of
> tag/value associations (which is what B::A::SimpleValues are)
> comprise a "bag" of annotation that belongs together.
>
> You could, however, persist that through embedding the tags in an
> ontology (tags are ontology terms) that captures that (through
> relationships).

I will likely use this approach, though there are no applicable SO/GO
terms that I can use, so I'll have to roll my own for now. I may use
something similar to the RNAML tags for secondary structure.

>> I also noticed there is an outstanding BioPerl bug
>> (http://bugzilla.open-bio.org/show_bug.cgi?id=1825) where Hilmar
>> suggested reimplementing StructuredValue to use Data::Stag, so I
>> thought I might give it a try.
>
> Sounds good :-)
>
> I hope the above makes some sense. Let me know if not.
>
> -hilmar

Makes perfect sense! Just needed to run it by someone on the BioSQL
end since I'll want to make my data a bit more persistent. I think I
will go with Bio::StructuredValue implementing Data::Stag since it
has pretty much everything I need.

chris

From cjfields at uiuc.edu Tue Mar 27 17:17:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 27 Mar 2007 16:17:51 -0500
Subject: [BioSQL-l] Question about BioPerl Bugzilla Bug 2213
In-Reply-To: <7600DB635DB2234CBBED9CDD990C4DB0023A28@exchmail.CMH.Internal>
References: <7600DB635DB2234CBBED9CDD990C4DB0023A28@exchmail.CMH.Internal>
Message-ID: <4E406CFC-A7B8-4F15-AE90-192791AA507F@uiuc.edu>

First, you should direct this to the mail list in case anyone else
can add to this. I may not be able to get to this anytime real soon.

From the bug report:

"The postprocessing in SpeciesAdaptor does mess things up in some
cases.
The issue is directly related to recent changes in Bio::Species and
could be taken care of by simply not running any postprocessing and
foregoing the lineage checking altogether in
Bio::Species::classification(), where the exception occurs. However,
I believe doing so may break functionality with older
bioperl-db/BioSQL installations since data is stored based on the
older Bio::Species system (single-name genus and species). Maybe
Hilmar can comment?"

As noted in the bug report, this is still considered a developer
series; even though most of the core modules work well together,
there are still some interoperability issues present (as this bug
demonstrates).

Maybe having a BioSQL TaxonAdaptor module would be a workaround;
Bio::Species is-now-a Bio::Taxon (whereas pre-1.5.2 versions aren't),
so if we had a module that stored data in the newer context it might
work around this. Hilmar?

chris

On Mar 27, 2007, at 3:42 PM, Carrel, Michael, G wrote:

> Chris,
>
> I am trying to apply this patch to my BioPerl-DB 1.5.2 code and
> don't understand what the changes are in the
> Bio/DB/BioSQL/SpeciesAdaptor.pm code. What does the "+=pod" text
> mean? Same for "+=cut"? Are we commenting out lines 256 through 280?
>
> The text says that "massaging" code was commented out, but I don't
> understand exactly what lines are commented out. Please explain in
> more detail what the changes are in the SpeciesAdaptor.pm file.
>
> I believe I understand the changes in the
> Bio/DB/BioSQL/mysql/SpeciesAdaptorDriver.pm code file...commenting
> out the one line
>
> ( #$clf[0]->[0] = $obj->binomial(); ).
>
> Thank you,
>
> Mike Carrel
> Network Analyst
> 816-234-1571
> mgcarrel at cmh.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
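
Editorial sketch on the index thread above: the point that a function index on a leading substring "will only be used though if the query uses the exact same function" can be tried out locally. The Python snippet below is only an illustration under stated assumptions. It uses SQLite (through the standard sqlite3 module) as a stand-in for Oracle or PostgreSQL, a cut-down seqfeature_qualifier_value table with just three columns, and a hypothetical index name value_prefix_ind and helper plan(); none of these come from the thread or from the BioSQL schema scripts.

```python
import sqlite3

# In-memory stand-in for BioSQL's seqfeature_qualifier_value table.
# Only the columns needed for the illustration are kept (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seqfeature_qualifier_value ("
    " seqfeature_id INTEGER, term_id INTEGER, value TEXT)"
)

# Expression index on the first 64 characters of value. This plays the
# role of a function index; the name value_prefix_ind is hypothetical.
conn.execute(
    "CREATE INDEX value_prefix_ind"
    " ON seqfeature_qualifier_value (substr(value, 1, 64))"
)


def plan(sql):
    """Return the query-plan detail text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("kinase",)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Query written with the same expression as the index definition.
plan_same = plan(
    "SELECT seqfeature_id FROM seqfeature_qualifier_value"
    " WHERE substr(value, 1, 64) = ?"
)

# Query on the bare column: the expression index does not apply here.
plan_bare = plan(
    "SELECT seqfeature_id FROM seqfeature_qualifier_value"
    " WHERE value = ?"
)

print("same expression:", plan_same)
print("bare column:    ", plan_bare)
```

The first plan should mention value_prefix_ind and the second should not, which mirrors the caveat in the thread. Exact plan wording differs between SQLite versions, and Oracle and PostgreSQL have their own planners, so treat this as a way to see the behaviour rather than as a statement about those systems.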