From biopython at maubp.freeserve.co.uk Fri Oct 3 12:18:26 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 3 Oct 2008 17:18:26 +0100
Subject: [BioSQL-l] parent_taxon_id of a root node
Message-ID: <320fb6e00810030918u7dac6493wc017b4cc69ba2bc2@mail.gmail.com>

Hello all,

I was puzzled to find the BioSQL script load_ncbi_taxonomy.pl will set
the parent_taxon_id of the NCBI root node in the taxon table to point
to itself. I would have expected this to be NULL, indicating no parent.
If someone is using the database directly, extracting a lineage could
trigger an infinite loop. Can anyone explain the rationale here?

Note that when Biopython adds entries to the taxon table, it uses NULL
for a root node. When retrieving sequences from a BioSQL database,
Biopython does cope with a root node with a NULL parent or a
self-parent - would it be safe to assume BioPerl and Java can also
cope with both situations?

Thanks,

Peter

From raoul.bonnal at itb.cnr.it Fri Oct 3 06:50:24 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Fri, 03 Oct 2008 12:50:24 +0200
Subject: [BioSQL-l] class taxon_name and genbank
Message-ID: <1223031024.8716.32.camel@454-2>

Dear all,
accession AB030700

SOURCE Rattus norvegicus (Norway rat)
ORGANISM Rattus norvegicus (Norway rat)

looking into taxon_name I can find only:

Rattus norvegicus

where did you store "(Norway rat)" ?
NCBI identifies "(Norway rat)" as

"Genbank common name: Norway rat"

--
Ra

From biopython at maubp.freeserve.co.uk Sat Oct 4 09:24:45 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sat, 4 Oct 2008 14:24:45 +0100
Subject: [BioSQL-l] class taxon_name and genbank
In-Reply-To: <1223031024.8716.32.camel@454-2>
References: <1223031024.8716.32.camel@454-2>
Message-ID: <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com>

On Fri, Oct 3, 2008 at 11:50 AM, Raoul Jean Pierre Bonnal wrote:
> Dear all,
> accession AB030700
>
> SOURCE Rattus norvegicus (Norway rat)
> ORGANISM Rattus norvegicus (Norway rat)
>
> looking into taxon_name I can find only:
>
> Rattus norvegicus
>
> where did you store "(Norway rat)" ?
> NCBI identifies "(Norway rat)" as
>
> "Genbank common name: Norway rat"

Hi Ra,

How did you load this GenBank file into BioSQL? My guess is with
BioPerl. Also, did you pre-load the NCBI taxonomy using the BioSQL
script load_ncbi_taxonomy.pl? This will probably have an impact too.

In any case, I suggest you look at the taxon and taxon_name tables in
BioSQL. These can store the scientific name plus the common name. I
would expect both "Rattus norvegicus" and "Norway rat" to be there for
your entry, but without knowing more of the specifics it's hard to
guess.

Peter

From raoul.bonnal at itb.cnr.it Sat Oct 4 10:56:00 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Sat, 04 Oct 2008 16:56:00 +0200
Subject: [BioSQL-l] class taxon_name and genbank
In-Reply-To: <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com>
References: <1223031024.8716.32.camel@454-2> <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com>
Message-ID: <1223132160.6832.6.camel@454-2>

Dear Peter,

On Sat, 04/10/2008 at 14.24 +0100, Peter wrote:
> How did you load this GenBank file into BioSQL? My guess is with
> BioPerl.

No, I'm following the development of the BioRuby interface. I'm doing
import/export tests.
My mistake, I found it as you suggested.

--
Ra

From hlapp at gmx.net Sat Oct 4 12:37:41 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 4 Oct 2008 12:37:41 -0400
Subject: [BioSQL-l] class taxon_name and genbank
In-Reply-To: <1223031024.8716.32.camel@454-2>
References: <1223031024.8716.32.camel@454-2>
Message-ID: <47D6D8CD-8276-4AE4-8B92-45689D5AFC3A@gmx.net>

Hi Raoul,

the BioSQL taxonomy tables are modeled directly on the NCBI taxonomy
model, which has nodes, each of which represents a taxon (whether a
tip, e.g., species, or an internal node, e.g., family), and one or more
names for each node. The name_class column in taxon_name gives the
type of the name. NCBI uses 'scientific name' as the name_class for
the currently valid Latin name (binomial). 'Norway rat' would have
name_class 'common name'. There are others, such as misspellings etc.

If you use the load_ncbi_taxonomy.pl script that comes with BioSQL,
that's how the taxonomy will be loaded.

Does this answer your question?

-hilmar

On Oct 3, 2008, at 6:50 AM, Raoul Jean Pierre Bonnal wrote:

> Dear all,
> accession AB030700
>
> SOURCE Rattus norvegicus (Norway rat)
> ORGANISM Rattus norvegicus (Norway rat)
>
> looking into taxon_name I can find only:
>
> Rattus norvegicus
>
>
> where did you store "(Norway rat)" ?
> NCBI identify "(Norway rat)" as
>
> "Genbank common name: Norway rat"
>
>
> --
> Ra
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From raoul.bonnal at itb.cnr.it Fri Oct 10 01:51:33 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Fri, 10 Oct 2008 07:51:33 +0200
Subject: [BioSQL-l] Location with lt, gt
Message-ID: <1223617893.8716.4.camel@454-2>

Dear all,
how do you store this kind of location ?

------------------------------------
CDS <1..>314
------------------------------------

/codon_start=1
/db_xref="SPTREMBL:Q9QYG8"
/transl_table=1
/gene="uk"
/product="uridine kinase"
/protein_id="BAA83085.1"
/translation="KLFVDTDADTRLSRRVLRDISERGRDLEQILSQYITFVKPAFEE
FCLPTKKYADVIIPRGADNLVAINLIVQHIQDILNGGLSKRQTNGYLNGYTPSRKRQA
SES"

--
Ra

From holland at eaglegenomics.com Fri Oct 10 05:46:12 2008
From: holland at eaglegenomics.com (Richard Holland)
Date: Fri, 10 Oct 2008 10:46:12 +0100
Subject: [BioSQL-l] Location with lt, gt
In-Reply-To: <1223617893.8716.4.camel@454-2>
References: <1223617893.8716.4.camel@454-2>
Message-ID:

BioJava refers to the < and > notation as fuzzy locations. See:

http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects.

cheers,
Richard

2008/10/10 Raoul Jean Pierre Bonnal :
> Dear all,
> how do you store this kind of location ?
> ------------------------------------
> CDS <1..>314
> ------------------------------------
>
> /codon_start=1
> /db_xref="SPTREMBL:Q9QYG8"
> /transl_table=1
> /gene="uk"
> /product="uridine kinase"
> /protein_id="BAA83085.1"
> /translation="KLFVDTDADTRLSRRVLRDISERGRDLEQILSQYITFVKPAFEE
>
> FCLPTKKYADVIIPRGADNLVAINLIVQHIQDILNGGLSKRQTNGYLNGYTPSRKRQA
> SES"
>
>
> --
> Ra
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From biopython at maubp.freeserve.co.uk Fri Oct 10 05:58:44 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 10 Oct 2008 10:58:44 +0100
Subject: [BioSQL-l] Location with lt, gt
In-Reply-To:
References: <1223617893.8716.4.camel@454-2>
Message-ID: <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com>

On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland wrote:
>
> BioJava refers to the < and > notation as fuzzy locations. See:
>
> http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects.
>

Biopython also calls them fuzzy locations and has special location
objects to hold this in memory.

Raoul, were you asking how these are stored in the tables in BioSQL
itself?
Peter

From raoul.bonnal at itb.cnr.it Fri Oct 10 06:40:04 2008
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Fri, 10 Oct 2008 12:40:04 +0200
Subject: [BioSQL-l] Location with lt, gt
In-Reply-To: <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com>
References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com>
Message-ID: <1223635204.6290.7.camel@454-2>

On Fri, 10/10/2008 at 10.58 +0100, Peter wrote:
> On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland
> wrote:
> >
> > BioJava refers to the < and > notation as fuzzy locations. See:
> >
> > http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects.
> >
>
> Biopython also calls them fuzzy locations and has special location
> objects to hold this in memory.

Ok, BioRuby too, but ...

> Raoul, were you asking how these are stored in the tables in BioSQL itself?

YES.

--
Ra

From biopython at maubp.freeserve.co.uk Fri Oct 10 07:11:09 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 10 Oct 2008 12:11:09 +0100
Subject: [BioSQL-l] Location with lt, gt
In-Reply-To: <1223635204.6290.7.camel@454-2>
References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> <1223635204.6290.7.camel@454-2>
Message-ID: <320fb6e00810100411j210f7f52qabb5a8ab6c1f3a34@mail.gmail.com>

>> Raoul, were you asking how these are stored in the tables in BioSQL itself?
> YES.

Looking at the BioSQL schema, the location_qualifier_value table is
used to record if a location table entry is fuzzy - but this is not
something I am personally familiar with.
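
To illustrate what "fuzzy" means for the <1..>314 location asked about in this thread, here is a minimal Python sketch that splits such a string into its coordinates and a fuzziness flag for each end. The function name parse_simple_location and the dictionary keys are made up for this example. It does not show how BioSQL or any Bio* toolkit stores these values; it only shows the information that would have to be recorded somewhere (for instance as rows in location_qualifier_value).

```python
import re

# Hypothetical helper for illustration; not part of BioSQL or any Bio* toolkit.
_SIMPLE_LOCATION = re.compile(r"^(<?)(\d+)\.\.(>?)(\d+)$")


def parse_simple_location(text):
    """Parse a simple GenBank-style location such as '<1..>314'.

    Only the 'start..end' form with optional '<' and '>' markers is
    handled; joins, complements and other operators are out of scope.
    """
    match = _SIMPLE_LOCATION.match(text.strip())
    if match is None:
        raise ValueError("unsupported location: %r" % text)
    start_marker, start, end_marker, end = match.groups()
    return {
        "start": int(start),
        "end": int(end),
        "start_is_fuzzy": start_marker == "<",  # feature begins before 'start'
        "end_is_fuzzy": end_marker == ">",      # feature continues past 'end'
    }


location = parse_simple_location("<1..>314")
print(location)
```

The two boolean flags are the part that a plain start/end pair cannot carry, which is why an extra qualifier table comes up in the discussion.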
Peter

From hlapp at gmx.net Sun Oct 12 21:55:04 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 12 Oct 2008 21:55:04 -0400
Subject: [BioSQL-l] Location with lt, gt
In-Reply-To: <1223635204.6290.7.camel@454-2>
References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> <1223635204.6290.7.camel@454-2>
Message-ID: <6710FC24-2D32-493C-954C-E87EBC22041D@gmx.net>

Sorry for the delay in responding. Peter's suggestion of
location_qualifier_value is good but only half of the truth.

More specifically, location_qualifier_value can be used to store the
minimum and maximum coordinates for the start and end positions of a
location. The model that BioPerl uses for fuzzy locations is in
Bio::Location::FuzzyLocationI. Basically, on top of the canonical
start and end, a fuzzy location has a min_start, max_start,
start_pos_type ('BEFORE', 'AFTER', 'EXACT', 'WITHIN', 'BETWEEN',
'UNCERTAIN') and similarly for the end position. How to determine the
canonical start and end positions from these is left to a
CoordinatePolicy object, i.e., subject to different possible
interpretations (e.g., conservative, maximum range, etc).

In addition, locations have a type: 'EXACT', 'WITHIN', or
'IN-BETWEEN'. This would be set using the term_id foreign key from the
location to the term table.

Does this make sense? I have to admit that Bioperl-db doesn't
implement round-tripping of fuzzy locations yet, so there actually
isn't a reference implementation yet that you could look at (but
Biojava sounded like it roundtrips those?).

-hilmar

On Oct 10, 2008, at 6:40 AM, Raoul Jean Pierre Bonnal wrote:

> On Fri, 10/10/2008 at 10.58 +0100, Peter wrote:
>> On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland
>> wrote:
>>>
>>> BioJava refers to the < and > notation as fuzzy locations. See:
>>>
>>> http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects
>>> .
>>>
>>
>> Biopython also calls them fuzzy locations and has special location
>> objects to hold this in memory.
> Ok BioRuby too but ...
>
>> Raoul, were you asking how these are stored in the tables in BioSQL
>> itself?
> YES.
>
> --
> Ra
>
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From johnsonm at gmail.com Mon Oct 13 14:29:18 2008
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 13 Oct 2008 13:29:18 -0500
Subject: [BioSQL-l] SeqFeature scores
Message-ID:

I just noticed that instances of Bio::SeqFeature::Generic
round-tripped through BioSQL (via bioperl-db) seem to be losing their
scores.

Stored thusly:

    # $user, $pass and $oracle_instance are defined elsewhere.
    use Bio::DB::BioDB;
    use Bio::Seq;
    use Bio::SeqFeature::Generic;

    my $dbadp = Bio::DB::BioDB->new(
        -database => 'biosql',
        -user     => $user,
        -pass     => $pass,
        -dbname   => $oracle_instance,
        -driver   => 'Oracle'
    );

    my $adp = $dbadp->get_object_adaptor("Bio::SeqI");

    my $seq = Bio::Seq->new(
        -id               => 'DEBUG001',
        -accession_number => 'DBG001',
        -desc             => 'Debug Sequence',
        -seq              => 'GATTACA',
        -namespace        => 'DEBUG',
    );

    # The feature carries a score of 100, which should survive the round trip.
    my $feature = Bio::SeqFeature::Generic->new(
        -seq_id       => 'DEBUG001',
        -display_name => 'FEAT0001',
        -primary      => 'debug',
        -source       => 'mjohnson',
        -start        => 1,
        -end          => 1000,
        -strand       => 1,
        -score        => 100
    );

    $seq->add_SeqFeature($feature);

    my $pseq = $dbadp->create_persistent($seq);

    $pseq->store();
    $adp->commit();

Fetched thusly:

    use Bio::DB::BioDB;
    use Bio::DB::Query::BioQuery;

    my $dbadp = Bio::DB::BioDB->new(
        -database => 'biosql',
        -user     => $user,
        -pass     => $pass,
        -dbname   => $oracle_instance,
        -driver   => 'Oracle'
    );

    my $adp = $dbadp->get_object_adaptor("Bio::SeqI");

    my $query = Bio::DB::Query::BioQuery->new();

    $query->datacollections([
        "Bio::PrimarySeqI s",
    ]);

    $query->where(["s.display_id = 'DEBUG001'"]);

    my $result = $adp->find_by_query($query);

    while (my $seq = $result->next_object()) {

        my @features = $seq->get_SeqFeatures();

        foreach my $feature (@features) {

            # Print the feature name and its score, tab-separated.
            print join(
                "\t",
                $feature->display_name(),
                $feature->score()
            ), "\n";

        }

    }

It seems that $feature->score() is returning undef.

I foresee three possibilities, in order of decreasing likelihood:

1) I'm doing something dumb
2) I'm expecting the wrong behaviour
3) I'm seeing some kind of odd bug

I'm not opposed to spending some time in the debugger if it's 3, but I
think 1 or 2 is far more likely.

Comments/advice/rotten vegetables?

From hlapp at gmx.net Mon Oct 13 16:03:32 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 13 Oct 2008 16:03:32 -0400
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To:
References:
Message-ID: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net>

Hi Mark,

this was a known bug at some point. In fact, I can't swear that it was
solved, so if you are running the latest bioperl-db version then I
think it hasn't been. Can you confirm?

-hilmar

On Oct 13, 2008, at 2:29 PM, Mark Johnson wrote:

> I just noticed that instances of Bio::SeqFeature::Generic
> round-tripped through BioSQL (via bioperl-db) seem to be losing their
> scores.
>
> Stored thusly:
>
> my $dbadp = Bio::DB::BioDB->new(
>     -database => 'biosql',
>     -user     => $user,
>     -pass     => $pass,
>     -dbname   => $oracle_instance,
>     -driver   => 'Oracle'
> );
>
> my $adp = $dbadp->get_object_adaptor("Bio::SeqI");
>
> my $seq = Bio::Seq->new(
>     -id               => 'DEBUG001',
>     -accession_number => 'DBG001',
>     -desc             => 'Debug Sequence',
>     -seq              => 'GATTACA',
>     -namespace        => 'DEBUG',
> );
>
> my $feature = Bio::SeqFeature::Generic->new(
>     -seq_id       => 'DEBUG001',
>     -display_name => 'FEAT0001',
>     -primary      => 'debug',
>     -source       => 'mjohnson',
>     -start        => 1,
>     -end          => 1000,
>     -strand       => 1,
>     -score        => 100
> );
>
> $seq->add_SeqFeature($feature);
>
> my $pseq = $dbadp->create_persistent($seq);
>
> $pseq->store();
> $adp->commit();
>
> Fetched thusly:
>
> my $dbadp = Bio::DB::BioDB->new(
>     -database => 'biosql',
>     -user     => $user,
>     -pass     => $pass,
>     -dbname   => $oracle_instance,
>     -driver   => 'Oracle'
> );
>
> my $adp = $dbadp->get_object_adaptor("Bio::SeqI");
>
> my $query = Bio::DB::Query::BioQuery->new();
>
> $query->datacollections([
>     "Bio::PrimarySeqI s",
> ]);
>
> $query->where(["s.display_id = 'DEBUG001'"]);
>
> my $result = $adp->find_by_query($query);
>
> while (my $seq = $result->next_object()) {
>
>     my @features = $seq->get_SeqFeatures();
>
>     foreach my $feature (@features) {
>
>         print join(
>             "\t",
>             $feature->display_name(),
>             $feature->score()
>         ), "\n";
>
>     }
>
> }
>
> It seems that $feature->score() is returning undef.
>
> I foresee three possibilities, in order of decreasing likelihood:
>
> 1) I'm doing something dumb
> 2) I'm expecting the wrong behaviour
> 3) I'm seeing some kind of odd bug
>
> I'm not opposed to spending some time in the debugger if it's 3, but I
> think 1 or 2 is far more likely.
>
> Comments/advice/rotten vegetables?
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From johnsonm at gmail.com Tue Oct 14 12:03:03 2008
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 14 Oct 2008 11:03:03 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net>
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net>
Message-ID:

On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote:
> Hi Mark, this was a known bug at some point. In fact, I can't swear that it
> was solved, so if you are running the latest bioperl-db version then I think
> it hasn't been. Can you confirm?
>
> -hilmar

I just pointed my scriptage to a freshly updated bioperl-live and
bioperl-db (updated via svn a few minutes ago). I'm seeing the same
problem. So, unless I'm hallucinating, this is still an issue. Shall
I open a new bug in bugzilla, or is there an old one you'd like to
reopen?

From cjfields at illinois.edu Tue Oct 14 13:28:02 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 14 Oct 2008 12:28:02 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To:
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net>
Message-ID: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>

On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote:

> On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote:
>> Hi Mark, this was a known bug at some point. In fact, I can't swear
>> that it was solved, so if you are running the latest bioperl-db
>> version then I think it hasn't been. Can you confirm?
>>
>> -hilmar
>
> I just pointed my scriptage to a freshly updated bioperl-live and
> bioperl-db (updated via svn a few minutes ago). I'm seeing the same
> problem.
> So, unless I'm hallucinating, this is still an issue. Shall
> I open a new bug in bugzilla, or is there an old one you'd like to
> reopen?

I think open a new bug so we can track it. Personally I'm not sure how
we'd store score data in BioSQL. Is 'score' within the schema? I
suppose we could add it as a specific tag value but that seems
potentially hackish and prone to naming conflicts.

chris

From johnsonm at gmail.com Tue Oct 14 15:14:04 2008
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 14 Oct 2008 14:14:04 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
Message-ID:

On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields wrote:
> On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote:
>
>> On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote:
>>>
>>> Hi Mark, this was a known bug at some point. In fact, I can't swear
>>> that it was solved, so if you are running the latest bioperl-db
>>> version then I think it hasn't been. Can you confirm?
>>>
>>> -hilmar
>>
>> I just pointed my scriptage to a freshly updated bioperl-live and
>> bioperl-db (updated via svn a few minutes ago). I'm seeing the same
>> problem. So, unless I'm hallucinating, this is still an issue. Shall
>> I open a new bug in bugzilla, or is there an old one you'd like to
>> reopen?
>
> I think open a new bug so we can track it. Personally I'm not sure how we'd
> store score data in BioSQL. Is 'score' within the schema? I suppose we
> could add it as a specific tag value but that seems potentially hackish and
> prone to naming conflicts.
>
> chris

Well, if there is no provision for storing score in the schema, it's
not a bug, it's an enhancement request. If it worked in the past or
is supposed to work, it's a bug. As to where it ends up in the
schema, I had been wondering about that.
I presumed it ended up with the rest of the tags... and I wonder if
that's not the most appropriate place for it. It's an attribute that
not all features have. In fact, I don't see score as part of
SeqFeatureI. Bio::SeqFeature::Generic has its own accessor and
storage.

Speaking of which, I just tried to go browsing the deobfuscator and it
seems to be horked (internal server error). Is there a better contact
other than dag at sonsorol.org (server administrator)? Is there a
bugzilla queue for website issues?

From cjfields at illinois.edu Tue Oct 14 15:38:05 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 14 Oct 2008 14:38:05 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To:
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
Message-ID: <4C9FB143-1BE7-441B-B904-B41C2815447A@illinois.edu>

On Oct 14, 2008, at 2:14 PM, Mark Johnson wrote:

> On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields
> wrote:
>> On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote:
>>
>>> On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote:
>>>>
>>>> Hi Mark, this was a known bug at some point. In fact, I can't
>>>> swear that it was solved, so if you are running the latest
>>>> bioperl-db version then I think it hasn't been. Can you confirm?
>>>>
>>>> -hilmar
>>>
>>> I just pointed my scriptage to a freshly updated bioperl-live and
>>> bioperl-db (updated via svn a few minutes ago). I'm seeing the same
>>> problem. So, unless I'm hallucinating, this is still an issue.
>>> Shall I open a new bug in bugzilla, or is there an old one you'd
>>> like to reopen?
>>
>> I think open a new bug so we can track it. Personally I'm not sure
>> how we'd store score data in BioSQL. Is 'score' within the schema? I
>> suppose we could add it as a specific tag value but that seems
>> potentially hackish and prone to naming conflicts.
>>
>> chris
>
> Well, if there is no provision for storing score in the schema, it's
> not a bug, it's an enhancement request. If it worked in the past or
> is supposed to work, it's a bug. As to where it ends up in the
> schema, I had been wondering about that. I presumed it ended up with
> the rest of the tags...and I wonder if that's not the most appropriate
> place for it. It's an attribute that not all features have. In fact,
> I don't see score as part of SeqFeatureI. Bio::SeqFeature::Generic
> has its own accessor and storage. Speaking of which, I just tried to
> go browsing the deobfuscator and it seems to be horked (internal
> server error). Is there a better contact other than dag at sonsorol.org
> (server administrator)? Is there a bugzilla queue for website
> issues?

I would support adding score in as a tag (particularly seeing as it's
not part of SeqFeatureI) but I think it requires a bit more
discussion. I wouldn't be surprised if it was added as a method
primarily to match GFF-like feature data with score information.
Hilmar, thoughts?

For the website issues you can submit support issues directly to
support at helpdesk.open-bio.org (it will probably be faster).

chris

From johnsonm at gmail.com Wed Oct 15 17:50:53 2008
From: johnsonm at gmail.com (Mark Johnson)
Date: Wed, 15 Oct 2008 16:50:53 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
Message-ID:

On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields wrote:
> I think open a new bug so we can track it. Personally I'm not sure how we'd
> store score data in BioSQL. Is 'score' within the schema? I suppose we
> could add it as a specific tag value but that seems potentially hackish and
> prone to naming conflicts.
>
> chris

Should I open it against BioSQL or bioperl-db? I'd guess the latter
unless we need schema changes?
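
To make the "score as a tag value" idea from this thread concrete at the schema level, here is a minimal Python sketch using an in-memory SQLite database. The two tables are cut-down stand-ins for BioSQL's term and seqfeature_qualifier_value tables (the real ones have more columns and constraints). The term name 'score', the rank value, and the helper functions store_score and fetch_score are assumptions made for this example, not what bioperl-db does.

```python
import sqlite3

# Simplified stand-ins for two BioSQL tables; the real schema has more
# columns (ontology_id, identifier, ...) and foreign keys to seqfeature.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER NOT NULL,
    term_id       INTEGER NOT NULL REFERENCES term(term_id),
    rank          INTEGER NOT NULL DEFAULT 0,
    value         TEXT NOT NULL,
    PRIMARY KEY (seqfeature_id, term_id, rank)
);
""")


def store_score(conn, seqfeature_id, score):
    """Store a feature's score as a qualifier row named 'score' (assumed name)."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES ('score')")
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = 'score'"
    ).fetchone()
    conn.execute(
        "INSERT INTO seqfeature_qualifier_value "
        "(seqfeature_id, term_id, rank, value) VALUES (?, ?, 1, ?)",
        (seqfeature_id, term_id, str(score)),
    )


def fetch_score(conn, seqfeature_id):
    """Return the stored score as a float, or None if the feature has none."""
    row = conn.execute(
        "SELECT v.value FROM seqfeature_qualifier_value v "
        "JOIN term t ON t.term_id = v.term_id "
        "WHERE v.seqfeature_id = ? AND t.name = 'score'",
        (seqfeature_id,),
    ).fetchone()
    return None if row is None else float(row[0])


store_score(conn, 42, 100)
print(fetch_score(conn, 42))  # feature with a stored score
print(fetch_score(conn, 43))  # feature without one
```

No schema change is needed for this layout, which is the point being weighed in the thread; the cost is that 'score' becomes a reserved qualifier name.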
From hlapp at gmx.net Thu Oct 16 18:03:26 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 16 Oct 2008 18:03:26 -0400
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To:
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
Message-ID:

Sorry for being slow to respond to this. Please file it against
Bioperl-db. It needs to be stored in seqfeature_qualifier_value as the
score really only applies as an attribute to computed features. The
reason it doesn't get stored right now is that BioPerl treats it as a
first-class property rather than an attribute. So Bioperl-db needs to
map between the two models.

-hilmar

On Oct 15, 2008, at 5:50 PM, Mark Johnson wrote:

> On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields
> wrote:
>
>> I think open a new bug so we can track it. Personally I'm not sure
>> how we'd store score data in BioSQL. Is 'score' within the schema? I
>> suppose we could add it as a specific tag value but that seems
>> potentially hackish and prone to naming conflicts.
>>
>> chris
>
> Should I open it against BioSQL or bioperl-db? I'd guess the latter
> unless we need schema changes?
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Thu Oct 16 18:12:13 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 16 Oct 2008 18:12:13 -0400
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu>
Message-ID: <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net>

On Oct 14, 2008, at 1:28 PM, Chris Fields wrote:

> Personally I'm not sure how we'd store score data in BioSQL. Is
> 'score' within the schema? I suppose we could add it as a specific
> tag value but that seems potentially hackish and prone to naming
> conflicts.

Well, yes there might be naming conflicts, if someone wanted to add a
tag to a seqfeature called 'score', for example.

But would there be cases where it would make sense to have a value
stored in the feature's $feat->score() method *and* a (semantically)
different one as the 'score' tag's value? Quite frankly I'd be
hard-pressed to come up with a scenario where that might make sense.

So as far as I am concerned I wouldn't actually have a problem with
changing the implementation of score() to store/pull the value to/from
the tag/value hash. In fact, that's what B::SF::Similarity does for
the attributes it adds methods for (such as bits, significance, etc).

Thoughts? I'm copying this to the Bioperl list as really it is a
BioPerl/Bioperl-db issue.
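
The idea of keeping the score() accessor but having it read from and write to the tag/value hash can be sketched in a few lines. This is a Python illustration with a made-up Feature class; it is not BioPerl's Bio::SeqFeature::Generic or Bio::SeqFeature::Similarity code, and the tag name 'score' is simply the one discussed in this thread.

```python
class Feature:
    """Hypothetical stand-in for a sequence feature with generic tags."""

    def __init__(self, score=None):
        # Tag name -> list of values, mirroring a tag/value hash.
        self.tags = {}
        if score is not None:
            self.score = score

    @property
    def score(self):
        # Read the score from the tag hash instead of a dedicated slot.
        values = self.tags.get("score")
        return values[0] if values else None

    @score.setter
    def score(self, value):
        # Writing the score replaces any existing 'score' tag values.
        self.tags["score"] = [value]


feature = Feature(score=100)
print(feature.tags)
print(feature.score)
```

With this layout, a persistence layer that already stores every tag would store the score too, with no separate mapping step. The naming conflict raised earlier shows up here as the setter overwriting any user-supplied 'score' tag.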
-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at illinois.edu Thu Oct 16 20:32:07 2008
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 16 Oct 2008 19:32:07 -0500
Subject: [BioSQL-l] SeqFeature scores
In-Reply-To: <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net>
References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net>
Message-ID: <126B4381-F15B-44FF-B91E-3A19523224B5@illinois.edu>

On Oct 16, 2008, at 5:12 PM, Hilmar Lapp wrote:

>
> On Oct 14, 2008, at 1:28 PM, Chris Fields wrote:
>
>> Personally I'm not sure how we'd store score data in BioSQL. Is
>> 'score' within the schema? I suppose we could add it as a specific
>> tag value but that seems potentially hackish and prone to naming
>> conflicts.
>
> Well, yes there might be naming conflicts, if someone wanted to add
> a tag to a seqfeature called 'score', for example.
>
> But would there be cases where it would make sense to have value
> stored in the feature's $feat->score() method *and* a (semantically)
> different one as the 'score' tag's value? Quite frankly I'd be
> hard-pressed to come up with a scenario where that might make sense.

Agreed. I think it's a very unlikely scenario, frankly, but it never
hurts to bring it up.

> So as far as I am concerned I wouldn't actually have a problem with
> changing the implementation of score() to store/pull the value
> to/from the tag/value hash. In fact, that's what B::SF::Similarity
> does for the attributes it adds methods for (such as bits,
> significance, etc).
>
> Thoughts? I'm copying this to the Bioperl list as really it is a
> BioPerl/Bioperl-db issue.
>
> -hilmar

Makes sense to me.
I think anything else not supported in BioSQL/ bioperl-db should likewise be stored in the tag/value hash, but score() is the only one that comes to mind at the moment. I'll make the change once the bug report is filed so we can track any problems I encounter. chris From biopython at maubp.freeserve.co.uk Fri Oct 3 16:18:26 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 3 Oct 2008 17:18:26 +0100 Subject: [BioSQL-l] parent_taxon_id of a root node Message-ID: <320fb6e00810030918u7dac6493wc017b4cc69ba2bc2@mail.gmail.com> Hello all, I was puzzled to find the BioSQL script load_ncbi_taxonomy.pl will set the parent_taxon_id of the NCBI root node in the taxon table to point to itself. I would have expected this to be NULL indicating no parent. If someone is using the database directly, extracting a lineage could trigger an infinite loop. Can anyone explain the rational here? Note that when Biopython adds entries to the taxon table, it uses NULL for a root node. When retrieving sequences from a BioSQL database, Biopython does cope with a root node with a NULL parent or a self-parent - would it safe to assume BioPerl and Java can also cope with both situations? Thanks, Peter From raoul.bonnal at itb.cnr.it Fri Oct 3 10:50:24 2008 From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal) Date: Fri, 03 Oct 2008 12:50:24 +0200 Subject: [BioSQL-l] class taxon_name and genbank Message-ID: <1223031024.8716.32.camel@454-2> Dear all, accession AB030700 SOURCE Rattus norvegicus (Norway rat) ORGANISM Rattus norvegicus (Norway rat) looking into taxon_name I can find only: Rattus norvegicus where did you store "(Norway rat)" ? 
NCBI identify ?"(Norway rat)" as "Genbank common name: Norway rat" -- Ra From biopython at maubp.freeserve.co.uk Sat Oct 4 13:24:45 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 4 Oct 2008 14:24:45 +0100 Subject: [BioSQL-l] class taxon_name and genbank In-Reply-To: <1223031024.8716.32.camel@454-2> References: <1223031024.8716.32.camel@454-2> Message-ID: <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com> On Fri, Oct 3, 2008 at 11:50 AM, Raoul Jean Pierre Bonnal wrote: > Dear all, > accession AB030700 > > SOURCE Rattus norvegicus (Norway rat) > ORGANISM Rattus norvegicus (Norway rat) > > looking into taxon_name I can find only: > > Rattus norvegicus > > where did you store "(Norway rat)" ? > NCBI identify ?"(Norway rat)" as > > "Genbank common name: Norway rat" Hi Ra, How did you load this GenBank file into BioSQL? My guess is with BioPerl. Also, did you pre-load the NCBI taxonomy using the BioSQL script load_ncbi_taxonomy.pl? This will probably have an impact too. In any case, I suggest you look at the taxon and taxon_name tables in BioSQL. This can store the scientific name plus the common name. I would expect both "Rattus norvegicus" and "Norway rat" to be there for your entry, but without known more of the specifics its hard to guess. Peter From raoul.bonnal at itb.cnr.it Sat Oct 4 14:56:00 2008 From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal) Date: Sat, 04 Oct 2008 16:56:00 +0200 Subject: [BioSQL-l] class taxon_name and genbank In-Reply-To: <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com> References: <1223031024.8716.32.camel@454-2> <320fb6e00810040624t3278386bmd05063ded122223a@mail.gmail.com> Message-ID: <1223132160.6832.6.camel@454-2> Dear Peter, Il giorno sab, 04/10/2008 alle 14.24 +0100, Peter ha scritto: > How did you load this GenBank file into BioSQL? My guess is with > BioPerl. No, I'm following the development of the bioruby interface. I'm doing import/export test. 
My mistake; I found it as you suggested. -- Ra From hlapp at gmx.net Sat Oct 4 16:37:41 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Sat, 4 Oct 2008 12:37:41 -0400 Subject: [BioSQL-l] class taxon_name and genbank In-Reply-To: <1223031024.8716.32.camel@454-2> References: <1223031024.8716.32.camel@454-2> Message-ID: <47D6D8CD-8276-4AE4-8B92-45689D5AFC3A@gmx.net> Hi Raoul, the BioSQL taxonomy tables are modeled directly on the NCBI taxonomy model, which has nodes, each of which represents a taxon (whether a tip, e.g., species, or an internal node, e.g., family), and one or more names for each node. The name_class column in taxon_name gives the type of the name. NCBI uses 'scientific name' as the name_class for the currently valid Latin name (binomial). 'Norway rat' would have name_class 'common name'. There are others, such as misspellings etc. If you use the load_ncbi_taxonomy.pl script that comes with BioSQL, that's how the taxonomy will be loaded. Does this answer your question? -hilmar On Oct 3, 2008, at 6:50 AM, Raoul Jean Pierre Bonnal wrote: > Dear all, > accession AB030700 > > SOURCE Rattus norvegicus (Norway rat) > ORGANISM Rattus norvegicus (Norway rat) > > looking into taxon_name I can find only: > > Rattus norvegicus > > > where did you store "(Norway rat)" ?
> NCBI identify "(Norway rat)" as > > "Genbank common name: Norway rat" > > > -- > Ra > > _______________________________________________ > BioSQL-l mailing list > BioSQL-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biosql-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From raoul.bonnal at itb.cnr.it Fri Oct 10 05:51:33 2008 From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal) Date: Fri, 10 Oct 2008 07:51:33 +0200 Subject: [BioSQL-l] Location with lt, gt Message-ID: <1223617893.8716.4.camel@454-2> Dear all, how do you store this kind of location ? ------------------------------------ CDS <1..>314 ------------------------------------ /codon_start=1 /db_xref="SPTREMBL:Q9QYG8" /transl_table=1 /gene="uk" /product="uridine kinase" /protein_id="BAA83085.1" /translation="KLFVDTDADTRLSRRVLRDISERGRDLEQILSQYITFVKPAFEE FCLPTKKYADVIIPRGADNLVAINLIVQHIQDILNGGLSKRQTNGYLNGYTPSRKRQA SES" -- Ra From holland at eaglegenomics.com Fri Oct 10 09:46:12 2008 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 10 Oct 2008 10:46:12 +0100 Subject: [BioSQL-l] Location with lt, gt In-Reply-To: <1223617893.8716.4.camel@454-2> References: <1223617893.8716.4.camel@454-2> Message-ID: BioJava refers to the < and > notation as fuzzy locations. See: http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects. cheers, Richard 2008/10/10 Raoul Jean Pierre Bonnal : > Dear all, > how do you store this kind of location ? 
> ------------------------------------ > CDS <1..>314 > ------------------------------------ > > /codon_start=1 > /db_xref="SPTREMBL:Q9QYG8" > /transl_table=1 > /gene="uk" > /product="uridine kinase" > /protein_id="BAA83085.1" > /translation="KLFVDTDADTRLSRRVLRDISERGRDLEQILSQYITFVKPAFEE > > FCLPTKKYADVIIPRGADNLVAINLIVQHIQDILNGGLSKRQTNGYLNGYTPSRKRQA > SES" > > > -- > Ra > > _______________________________________________ > BioSQL-l mailing list > BioSQL-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biosql-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From biopython at maubp.freeserve.co.uk Fri Oct 10 09:58:44 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 10 Oct 2008 10:58:44 +0100 Subject: [BioSQL-l] Location with lt, gt In-Reply-To: References: <1223617893.8716.4.camel@454-2> Message-ID: <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland wrote: > > BioJava refers to the < and > notation as fuzzy locations. See: > > http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects. > Biopython also calls them fuzzy locations and has special location objects to hold this in memory. Raoul, were you asking how these are stored in the tables in BioSQL itself? 
Peter From raoul.bonnal at itb.cnr.it Fri Oct 10 10:40:04 2008 From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal) Date: Fri, 10 Oct 2008 12:40:04 +0200 Subject: [BioSQL-l] Location with lt, gt In-Reply-To: <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> Message-ID: <1223635204.6290.7.camel@454-2> On Fri, 10/10/2008 at 10.58 +0100, Peter wrote: > On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland > wrote: > > > > BioJava refers to the < and > notation as fuzzy locations. See: > > > > http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects. > > > > Biopython also calls them fuzzy locations and has special location > objects to hold this in memory. OK, BioRuby too, but ... > Raoul, were you asking how these are stored in the tables in BioSQL itself? YES. -- Ra From biopython at maubp.freeserve.co.uk Fri Oct 10 11:11:09 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 10 Oct 2008 12:11:09 +0100 Subject: [BioSQL-l] Location with lt, gt In-Reply-To: <1223635204.6290.7.camel@454-2> References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> <1223635204.6290.7.camel@454-2> Message-ID: <320fb6e00810100411j210f7f52qabb5a8ab6c1f3a34@mail.gmail.com> >> Raoul, were you asking how these are stored in the tables in BioSQL itself? > YES. Looking at the BioSQL schema, the location_qualifier_value table is used to record if a location table entry is fuzzy - but this is not something I am personally familiar with.
Peter From hlapp at gmx.net Mon Oct 13 01:55:04 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Sun, 12 Oct 2008 21:55:04 -0400 Subject: [BioSQL-l] Location with lt, gt In-Reply-To: <1223635204.6290.7.camel@454-2> References: <1223617893.8716.4.camel@454-2> <320fb6e00810100258l4f6e2ecbh814d36142992c0a@mail.gmail.com> <1223635204.6290.7.camel@454-2> Message-ID: <6710FC24-2D32-493C-954C-E87EBC22041D@gmx.net> Sorry for the delay in responding. Peter's suggestion of location_qualifier_value is good but only half of the truth. More specifically, location_qualifier_value can be used to store the minimum and maximum coordinates for the start and end positions of a location. The model that BioPerl uses for fuzzy locations is in Bio::Location::FuzzyLocationI. Basically, on top of the canonical start and end, a fuzzy location has a min_start, max_start, start_pos_type ('BEFORE', 'AFTER', 'EXACT', 'WITHIN', 'BETWEEN', 'UNCERTAIN') and similarly for the end position. How to determine the canonical start and end positions from these is left to a CoordinatePolicy object, i.e., subject to different possible interpretations (e.g., conservative, maximum range, etc). In addition, locations have a type: 'EXACT', 'WITHIN', or 'IN-BETWEEN'. This would be set using the term_id foreign key from the location to the term table. Does this make sense? I have to admit that Bioperl-db doesn't implement round-tripping of fuzzy locations yet, so there actually isn't a reference implementation yet that you could look at (but Biojava sounded like it roundtrips those?). -hilmar On Oct 10, 2008, at 6:40 AM, Raoul Jean Pierre Bonnal wrote: > On Fri, 10/10/2008 at 10.58 +0100, Peter wrote: >> On Fri, Oct 10, 2008 at 10:46 AM, Richard Holland >> wrote: >>> >>> BioJava refers to the < and > notation as fuzzy locations. See: >>> >>> http://www.biojava.org/wiki/BioJava:BioJavaXDocs#Working_with_RichLocation_objects >>> .
>>> >> >> Biopython also calls them fuzzy locations and has special location >> objects to hold this in memory. > Ok BioRuby too but ... > >> Raoul, were you asking how these are stored in the tables in BioSQL >> itself? > YES. > > -- > Ra > > > _______________________________________________ > BioSQL-l mailing list > BioSQL-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biosql-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From johnsonm at gmail.com Mon Oct 13 18:29:18 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Mon, 13 Oct 2008 13:29:18 -0500 Subject: [BioSQL-l] SeqFeature scores Message-ID: I just noticed that instances of Bio::SeqFeature::Generic round-tripped through BioSQL (via bioperl-db) seem to be losing their scores.

Stored thusly:

my $dbadp = Bio::DB::BioDB->new(
    -database => 'biosql',
    -user     => $user,
    -pass     => $pass,
    -dbname   => $oracle_instance,
    -driver   => 'Oracle'
);

my $adp = $dbadp->get_object_adaptor("Bio::SeqI");

my $seq = Bio::Seq->new(
    -id               => 'DEBUG001',
    -accession_number => 'DBG001',
    -desc             => 'Debug Sequence',
    -seq              => 'GATTACA',
    -namespace        => 'DEBUG',
);

my $feature = Bio::SeqFeature::Generic->new(
    -seq_id       => 'DEBUG001',
    -display_name => 'FEAT0001',
    -primary      => 'debug',
    -source       => 'mjohnson',
    -start        => 1,
    -end          => 1000,
    -strand       => 1,
    -score        => 100
);

$seq->add_SeqFeature($feature);

my $pseq = $dbadp->create_persistent($seq);

$pseq->store();
$adp->commit();

Fetched thusly:

my $dbadp = Bio::DB::BioDB->new(
    -database => 'biosql',
    -user     => $user,
    -pass     => $pass,
    -dbname   => $oracle_instance,
    -driver   => 'Oracle'
);

my $adp = $dbadp->get_object_adaptor("Bio::SeqI");

my $query = Bio::DB::Query::BioQuery->new();

$query->datacollections([
    "Bio::PrimarySeqI s",
]);

$query->where(["s.display_id = 'DEBUG001'"]);

my $result = $adp->find_by_query($query);

while (my $seq = $result->next_object()) {

    my @features = $seq->get_SeqFeatures();

    foreach my $feature (@features) {

        print join(
            "\t",
            $feature->display_name(),
            $feature->score()
        ), "\n";

    }

}

It seems that $feature->score() is returning undef. I foresee three possibilities, in order of decreasing likelihood: 1) I'm doing something dumb 2) I'm expecting the wrong behaviour 3) I'm seeing some kind of odd bug I'm not opposed to spending some time in the debugger if it's 3, but I think 1 or 2 is far more likely. Comments/advice/rotten vegetables? From hlapp at gmx.net Mon Oct 13 20:03:32 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 13 Oct 2008 16:03:32 -0400 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: References: Message-ID: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> Hi Mark, this was a known bug at some point. In fact, I can't swear that it was solved, so if you are running the latest bioperl-db version then I think it hasn't been. Can you confirm? -hilmar On Oct 13, 2008, at 2:29 PM, Mark Johnson wrote: > I just noticed that instances of Bio::SeqFeature::Generic > round-tripped through BioSQL (via bioperl-db) seem to be losing their > scores.
> > Stored thusly: > > my $dbadp = Bio::DB::BioDB->new( > -database => 'biosql', > -user => $user, > -pass => $pass, > -dbname => $oracle_instance, > -driver => 'Oracle' > ); > > my $adp = $dbadp->get_object_adaptor("Bio::SeqI"); > > my $seq = Bio::Seq->new( > -id => 'DEBUG001', > -accession_number => 'DBG001', > -desc => 'Debug Sequence', > -seq => 'GATTACA', > -namespace => 'DEBUG', > ); > > my $feature = Bio::SeqFeature::Generic->new( > -seq_id => > 'DEBUG001', > -display_name => > 'FEAT0001', > -primary => > 'debug', > -source => > 'mjohnson', > -start => 1, > -end => 1000, > -strand => 1, > -score => 100 > ); > > $seq->add_SeqFeature($feature); > > my $pseq = $dbadp->create_persistent($seq); > > $pseq->store(); > $adp->commit(); > > Fetched thusly: > > my $dbadp = Bio::DB::BioDB->new( > -database => 'biosql', > -user => $user, > -pass => $pass, > -dbname => $oracle_instance, > -driver => 'Oracle' > ); > > my $adp = $dbadp->get_object_adaptor("Bio::SeqI"); > > my $query = Bio::DB::Query::BioQuery->new(); > > $query->datacollections([ > "Bio::PrimarySeqI s", > ]); > > $query->where(["s.display_id = 'DEBUG001'"]); > > my $result = $adp->find_by_query($query); > > while (my $seq = $result->next_object()) { > > my @features = $seq->get_SeqFeatures(); > > foreach my $feature (@features) { > > print join( > "\t", > $feature->display_name(), > $feature->score() > ), "\n"; > > } > > } > > It seems that $feature->score() is returning undef. > > I foresee three possibilities, in order of decreasing liklihood: > > 1) I'm doing something dumb > 2) I'm expecting the wrong behaviour > 3) I'm seeing some kind of odd bug > > I'm not opposed to spending some time in the debugger if it's 3, but I > think 1 or 2 is far more likely. > > Comments/advice/rotten vegetables? 
> _______________________________________________ > BioSQL-l mailing list > BioSQL-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biosql-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From johnsonm at gmail.com Tue Oct 14 16:03:03 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 14 Oct 2008 11:03:03 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> Message-ID: On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote: > Hi Mark, this was a known bug at some point. In fact, I can't swear that it > was solved, so if you are running the latest bioperl-db version then I think > it hasn't been. Can you confirm? > > -hilmar I just pointed my scriptage to a freshly updated bioperl-live and bioperl-db (updated via svn a few minutes ago). I'm seeing the same problem. So, unless I'm hallucinating, this is still an issue. Shall I open a new bug in bugzilla, or is there an old one you'd like to reopen? From cjfields at illinois.edu Tue Oct 14 17:28:02 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 14 Oct 2008 12:28:02 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> Message-ID: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote: > On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote: >> Hi Mark, this was a known bug at some point. In fact, I can't swear >> that it >> was solved, so if you are running the latest bioperl-db version >> then I think >> it hasn't been. Can you confirm? >> >> -hilmar > > I just pointed my scriptage to a freshly updated bioperl-live and > bioperl-db (updated via svn a few minutes ago). I'm seeing the same > problem. 
So, unless I'm hallucinating, this is still an issue. Shall > I open a new bug in bugzilla, or is there an old one you'd like to > reopen? I think open a new bug so we can track it. Personally I'm not sure how we'd store score data in BioSQL. Is 'score' within the schema? I suppose we could add it as a specific tag value but that seems potentially hackish and prone to naming conflicts. chris From johnsonm at gmail.com Tue Oct 14 19:14:04 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Tue, 14 Oct 2008 14:14:04 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> Message-ID: On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields wrote: > On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote: > >> On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote: >>> >>> Hi Mark, this was a known bug at some point. In fact, I can't swear that >>> it >>> was solved, so if you are running the latest bioperl-db version then I >>> think >>> it hasn't been. Can you confirm? >>> >>> -hilmar >> >> I just pointed my scriptage to a freshly updated bioperl-live and >> bioperl-db (updated via svn a few minutes ago). I'm seeing the same >> problem. So, unless I'm hallucinating, this is still an issue. Shall >> I open a new bug in bugzilla, or is there an old one you'd like to >> reopen? > > I think open a new bug so we can track it. Personally I'm not sure how we'd > store score data in BioSQL. Is 'score' within the schema? I suppose we > could add it as a specific tag value but that seems potentially hackish and > prone to naming conflicts. > > chris Well, if there is no provision for storing score in the schema, it's not a bug, it's an enhancement request. If it worked in the past or is supposed to work, it's a bug. As to where it ends up in the schema, I had been wondering about that. 
I presumed it ended up with the rest of the tags... and I wonder if that's not the most appropriate place for it. It's an attribute that not all features have. In fact, I don't see score as part of SeqFeatureI. Bio::SeqFeature::Generic has its own accessor and storage. Speaking of which, I just tried to go browsing the deobfuscator and it seems to be horked (internal server error). Is there a better contact other than dag at sonsorol.org (server administrator)? Is there a bugzilla queue for website issues? From cjfields at illinois.edu Tue Oct 14 19:38:05 2008 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 14 Oct 2008 14:38:05 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> Message-ID: <4C9FB143-1BE7-441B-B904-B41C2815447A@illinois.edu> On Oct 14, 2008, at 2:14 PM, Mark Johnson wrote: > On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields > wrote: >> On Oct 14, 2008, at 11:03 AM, Mark Johnson wrote: >> >>> On Mon, Oct 13, 2008 at 3:03 PM, Hilmar Lapp wrote: >>>> >>>> Hi Mark, this was a known bug at some point. In fact, I can't >>>> swear that >>>> it >>>> was solved, so if you are running the latest bioperl-db version >>>> then I >>>> think >>>> it hasn't been. Can you confirm? >>>> >>>> -hilmar >>> >>> I just pointed my scriptage to a freshly updated bioperl-live and >>> bioperl-db (updated via svn a few minutes ago). I'm seeing the same >>> problem. So, unless I'm hallucinating, this is still an issue. >>> Shall >>> I open a new bug in bugzilla, or is there an old one you'd like to >>> reopen? >> >> I think open a new bug so we can track it. Personally I'm not sure >> how we'd >> store score data in BioSQL. Is 'score' within the schema? I >> suppose we >> could add it as a specific tag value but that seems potentially >> hackish and >> prone to naming conflicts.
>> >> chris > > Well, if there is no provision for storing score in the schema, it's > not a bug, it's an enhancement request. If it worked in the past or > is supposed to work, it's a bug. As to where it ends up in the > schema, I had been wondering about that. I presumed it ended up with > the rest of the tags...and I wonder if that's not the most appropriate > place for it. It's an attribute that not all features have. In fact, > I don't see score as part of SeqFeatureI. Bio::SeqFeature::Generic > has it's own accessor and storage. Speaking of which, I just tried to > go browsing the deobfuscator and it seems to be horked (internal > server error). Is there a better contact other than dag at sonsorol.org > (server administrator)? Is there a bugzilla queue for website > issues? I would support adding score in as a tag (particularly seeing as it's not part of SeqFeatureI) but I think it requires a bit more discussion. I wouldn't be surprised if it was added as a method primarily to match GFF-like feature data with score information. Hilmar, thoughts? For the website issues you can submit support issues directly to support at helpdesk.open-bio.org (it will probably be faster). chris From johnsonm at gmail.com Wed Oct 15 21:50:53 2008 From: johnsonm at gmail.com (Mark Johnson) Date: Wed, 15 Oct 2008 16:50:53 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> Message-ID: On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields wrote: > I think open a new bug so we can track it. Personally I'm not sure how we'd > store score data in BioSQL. Is 'score' within the schema? I suppose we > could add it as a specific tag value but that seems potentially hackish and > prone to naming conflicts. > > chris Should I open it against BioSQL or bioperl-db? I'd guess the latter unless we need schema changes? 
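On the question of whether a schema change would be needed: a minimal sketch, assuming a heavily simplified subset of the BioSQL tables (the real term, seqfeature and seqfeature_qualifier_value tables have more columns and constraints) and two hypothetical helpers, store_score and fetch_score, that are not part of any Bio* toolkit. It only illustrates that a score can be kept as an ordinary qualifier row keyed by a 'score' term, using SQLite in memory rather than a real BioSQL instance.

```python
import sqlite3

# Simplified, illustrative subset of the BioSQL tables. Assumption: the real
# schema has further columns (ontology_id, bioentry_id, type_term_id, ...).
SCHEMA = """
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    display_name  TEXT
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    term_id       INTEGER NOT NULL REFERENCES term(term_id),
    rank          INTEGER NOT NULL DEFAULT 0,
    value         TEXT NOT NULL,
    PRIMARY KEY (seqfeature_id, term_id, rank)
);
"""


def store_score(conn, seqfeature_id, score):
    """Hypothetical helper: save a feature's score as a 'score' qualifier."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES ('score')")
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = 'score'"
    ).fetchone()
    conn.execute(
        "INSERT INTO seqfeature_qualifier_value "
        "(seqfeature_id, term_id, rank, value) VALUES (?, ?, 1, ?)",
        (seqfeature_id, term_id, str(score)),
    )


def fetch_score(conn, seqfeature_id):
    """Hypothetical helper: read the score back, or None if none was stored."""
    row = conn.execute(
        "SELECT v.value FROM seqfeature_qualifier_value v "
        "JOIN term t ON t.term_id = v.term_id "
        "WHERE v.seqfeature_id = ? AND t.name = 'score'",
        (seqfeature_id,),
    ).fetchone()
    return None if row is None else float(row[0])


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO seqfeature (seqfeature_id, display_name) VALUES (1, 'FEAT0001')"
)
store_score(conn, 1, 100)
print(fetch_score(conn, 1))
```

Under these assumptions the score travels through the existing qualifier table, so the open question becomes one of mapping in the language binding rather than of table layout.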
From hlapp at gmx.net Thu Oct 16 22:03:26 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 16 Oct 2008 18:03:26 -0400 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> Message-ID: Sorry for being slow to respond to this. Please file it against Bioperl-db. It needs to be stored in seqfeature_qualifier_value as the score really only applies as an attribute to computed features. The reason it doesn't get stored right now is that BioPerl treats it as a first-class property rather than an attribute. So Bioperl-db needs to map between the two models. -hilmar On Oct 15, 2008, at 5:50 PM, Mark Johnson wrote: > On Tue, Oct 14, 2008 at 12:28 PM, Chris Fields > wrote: > >> I think open a new bug so we can track it. Personally I'm not sure >> how we'd >> store score data in BioSQL. Is 'score' within the schema? I >> suppose we >> could add it as a specific tag value but that seems potentially >> hackish and >> prone to naming conflicts. >> >> chris > > Should I open it against BioSQL or bioperl-db? I'd guess the latter > unless we need schema changes? 
> _______________________________________________ > BioSQL-l mailing list > BioSQL-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biosql-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Thu Oct 16 22:12:13 2008 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 16 Oct 2008 18:12:13 -0400 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> Message-ID: <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net> On Oct 14, 2008, at 1:28 PM, Chris Fields wrote: > Personally I'm not sure how we'd store score data in BioSQL. Is > 'score' within the schema? I suppose we could add it as a specific > tag value but that seems potentially hackish and prone to naming > conflicts. Well, yes there might be naming conflicts, if someone wanted to add a tag to a seqfeature called 'score', for example. But would there be cases where it would make sense to have a value stored in the feature's $feat->score() method *and* a (semantically) different one as the 'score' tag's value? Quite frankly I'd be hard-pressed to come up with a scenario where that might make sense. So as far as I am concerned I wouldn't actually have a problem with changing the implementation of score() to store/pull the value to/from the tag/value hash. In fact, that's what Bio::SeqFeature::Similarity does for the attributes it adds methods for (such as bits, significance, etc). Thoughts? I'm copying this to the Bioperl list as really it is a BioPerl/Bioperl-db issue.
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at illinois.edu Fri Oct 17 00:32:07 2008 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 16 Oct 2008 19:32:07 -0500 Subject: [BioSQL-l] SeqFeature scores In-Reply-To: <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net> References: <0FC73752-0290-4937-AE6B-974CA53A7865@gmx.net> <16770479-9818-4EF4-8AB4-7BCECB5FC482@illinois.edu> <4B3C9707-11C8-49AC-834E-1961F5816920@gmx.net> Message-ID: <126B4381-F15B-44FF-B91E-3A19523224B5@illinois.edu> On Oct 16, 2008, at 5:12 PM, Hilmar Lapp wrote: > > On Oct 14, 2008, at 1:28 PM, Chris Fields wrote: > >> Personally I'm not sure how we'd store score data in BioSQL. Is >> 'score' within the schema? I suppose we could add it as a specific >> tag value but that seems potentially hackish and prone to naming >> conflicts. > > Well, yes there might be naming conflicts, if someone wanted to add > a tag to a seqfeature called 'score', for example. > > But would there be cases where it would make sense to have value > stored in the feature's $feat->score() method *and* a (semantically) > different one as the 'score' tag's value? Quite frankly I'd be hard- > pressed to come up with a scenario where that might make sense. Agreed. I think it's a very unlikely scenario, frankly, but never hurts to bring it up. > So as far as I am concerned I wouldn't actually have a problem with > changing the implementation of score() to store/pull the value to/ > from the tag/value hash. In fact, that's what B::SF::Similarity does > for the attributes it adds methods for (such as bits, significance, > etc). > > Thoughts? I'm copying this to the Bioperl list as really it is a > BioPerl/Bioperl-db issue. > > -hilmar Makes sense to me. 
I think anything else not supported in BioSQL/bioperl-db should likewise be stored in the tag/value hash, but score() is the only one that comes to mind at the moment. I'll make the change once the bug report is filed so we can track any problems I encounter. chris
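The idea discussed in this thread, an accessor that reads and writes the tag/value hash instead of a separate attribute, can be sketched as follows. This is a language-neutral illustration in Python, not BioPerl code: the Feature class and the round_trip function are hypothetical stand-ins for a sequence feature and for a persistence layer that only knows about tags.

```python
class Feature:
    """Hypothetical stand-in for a sequence feature with a tag/value hash."""

    def __init__(self, display_name, score=None):
        self.display_name = display_name
        self.tags = {}  # tag name -> list of values
        if score is not None:
            self.score = score  # goes through the property setter below

    @property
    def score(self):
        # The score is not a separate attribute; it lives under the 'score' tag.
        values = self.tags.get("score")
        return values[0] if values else None

    @score.setter
    def score(self, value):
        self.tags["score"] = [value]


def round_trip(feature):
    """Simulate a persistence layer that stores and restores tags only."""
    restored = Feature(feature.display_name)
    restored.tags = {tag: list(values) for tag, values in feature.tags.items()}
    return restored


feat = Feature("FEAT0001", score=100)
print(round_trip(feat).score)
```

Because the accessor is only a view onto the tags, any layer that already persists tag/value pairs carries the score along without knowing about it. The trade-off raised above still applies: a user-defined 'score' tag and the accessor would share the same slot.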