[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Lincoln Stein lincoln.stein at gmail.com
Wed Feb 1 17:57:33 UTC 2012


Is this exception appearing at the get_features_by_name() call, or later on
in the script? I don't see how it can be generated in response to
get_features_by_name().

Lincoln

On Wed, Feb 1, 2012 at 11:51 AM, Vivek Krishnakumar <
vivekkrishnakumar at gmail.com> wrote:

> Hi Scott,
>
> Thanks very much for your suggestions. Looks I did miss it somehow
> (confusion was caused because I was using both bioperl-l at googlegroups and
> bioperl-l at open-bio)
>
> Anyway, I had modified my function exactly like your suggestion:
> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
> 'gene');
>
> But doing so just returns the following error:
>
> -------------------- EXCEPTION --------------------
> MSG: segment() called in a scalar context but multiple features match.
> Either call in a list context or narrow your search using the -types
> or -class arguments
>
> STACK Bio::DB::SeqFeature::Store::segment
> /usr/local/packages/perl-5.10.1/lib/5.10.1/Bio/DB/SeqFeature/Store.pm:1322
> STACK main::get_annotation_db_features
> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:899
> STACK main::structural_annotation
> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:660
> STACK toplevel /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:119
> -------------------------------------------
>
> which would suggest to oneself that there are several such features with
> the same ID. But in fact, I was able to verify by querying the database
> that I have only one such locus.
>
> As for your question regarding how $locus is populated, it is populated
> from a CGI parameter passed to the script. I know that I am only passing it
> one locus ID. And as I mentioned earlier in this thread, the warning
> statement I inserted before making the function call shows me that there is
> only one ID in the $locus variable.
>
> My last resort now is to try as you suggested and modify my GFF3 file and
> embed the -class => 'Gene' into the "Name" attribute. While doing so,
> should I also embed the 'mRNA' class into the "Name" attribute of the mRNA
> feature like so:
>
> chr2    working_models  mRNA    30427563    30429139    .   -   .
> ID=mrna_36255;Parent=gene_35804;Name=mRNA:Medtr2g097580.1;conf_class=F
>
> Subsequently, should I modify the function call to include the 'class':
> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
> 'gene', -class => 'Gene');
>
> Thank you.
> Vivek
>
> On Wed, Feb 1, 2012 at 11:26 AM, Scott Cain <scott at scottcain.net> wrote:
>
> > Hi Vivek,
> >
> > I responded to your original email and I suspect you may have missed
> > it.  I'll copy it below.  Another few things: how does $locus get
> > populated?  Are you sure what you expect to be there is?
> >
> > Also, to answer your question about the other bioperl modules you're
> > using: no, I don't think that's interfering.
> >
> > Scott
> >
> > ------------------------------------------------
> > Hello Vivek,
> >
> > In your GFF3, you don't have any features of class "Gene".  In GFF2,
> > the class was the text string that started the ninth column, like
> > this:
> >
> > chr2   .  gene    30427563    30429139    .   -   .  Gene 35804
> >
> > where the class would be Gene and the name (also called group) would
> > be 35804.  Class is not a particularly well defined concept in GFF3,
> > so the easiest way to restore functionality to your script is to
> > change the call from this:
> >
> >  my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => $locus);
> >
> > to this:
> >
> >  my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus,
> > -type => 'gene');
> >
> > I believe (though haven't tested it myself in a very long time) that
> > you can embed class in the name of the feature, like this:
> >
> > chr2  .  gene    30427563    30429139    .   -   .
>  Name=Gene:Medtr2g097580
> >
> > which may or may not be easier, depending on your data and your code
> base.
> >
> > Scott
> >
> >
> > On Wed, Feb 1, 2012 at 10:50 AM, Vivek Krishnakumar
> > <vivekkrishnakumar at gmail.com> wrote:
> > > Hi Lincoln,
> > >
> > > Thanks very much for your suggestions. Not sure how the single
> quotation
> > > marks appeared around the $locus variable. But looks like it was only
> in
> > > the email. Fortunately did not have quotes around the variable in my
> > > original code.
> > >
> > > Now, when I switch over to 'get_features_by_name()', my script does not
> > run
> > > to completion.
> > >
> > > I want to mention that this snippet of code is part of a larger CGI
> > script
> > > that interfaces with the SeqFeature backend DB. When I modify the
> > function
> > > call to $gff_dbh->get_features_by_name($locus), the script just runs
> > > indefinitely and returns absolutely nothing. I did put in a warn
> > statement
> > > to see if the correct locus ID is being passed to the function (I am
> able
> > > to see the warning message in my apache error log), which seems to be
> > fine.
> > > But the moment it reaches the function call step, the CGI script
> freezes
> > up
> > > and I am unable to do anything. It just ends up as a rogue process
> owned
> > by
> > > the 'daemon' user and continues to use up a lot of memory.
> > >
> > > I am using the following BioPerl modules in this CGI script:
> > > use Bio::SeqIO;
> > > use Bio::SearchIO;
> > > use Bio::DB::SeqFeature::Store;
> > > use Bio::SeqFeature::Generic;
> > > use Bio::Graphics;
> > > use Bio::Graphics::Feature;
> > >
> > > Could any of these be interfering with get_features_by_name()?
> > >
> > > Thank you.
> > > Vivek
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> >
> > --
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                   scott at scottcain
> > dot net
> > GMOD Coordinator (http://gmod.org/)                     216-392-3087
> > Ontario Institute for Cancer Research
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>



More information about the Bioperl-l mailing list