[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Vivek Krishnakumar vivekkrishnakumar at gmail.com
Mon Jan 30 20:34:45 UTC 2012


Hello,

I have the following code snippet which is supposed to retrieve a certain 
gene "locus" feature from the backend database and use that to get the 
'start' and 'end' coordinate of the feature based on the 'strand' on which 
it is present.

my ($locus_obj, $gene_models) = get_annotation_db_features($locus, 
$gff_dbh);

sub get_annotation_db_features {
    my ($locus, $gff_dbh) = @_;

    my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => '$locus');

    my ($end5, $end3) = $locus_obj->strand == 1
      ? ($locus_obj->start, $locus_obj->end)
      : ($locus_obj->end, $locus_obj->start);

    my $segment = $gff_dbh->segment($locus_obj->refseq, $end5, $end3);
    my @gene_models =
      $segment->features('processed_transcript:working_models', -attributes 
=> { 'Gene' => $locus });

    #will have to sort the gene models
    return ($locus_obj, \@gene_models);
}

Also, here is a snippet of the GFF3 file that is used to populate the 
backend database:
##gff-version 3
chr2    working_models  gene    30427563    30429139    .   -   .   
ID=gene_35804;Note=Zinc transporter;Name=Medtr2g097580
chr2    working_models  mRNA    30427563    30429139    .   -   .   
ID=mrna_36255;Parent=gene_35804;Name=Medtr2g097580.1;conf_class=F
chr2    working_models  exon    30428491    30429139    .   -   .   
ID=exon_120028;Parent=mrna_36255
chr2    working_models  exon    30427563    30428147    .   -   .   
ID=exon_120029;Parent=mrna_36255
chr2    working_models  CDS 30428491    30429109    .   -   0   
ID=cds_120028;Parent=mrna_36255
chr2    working_models  CDS 30427756    30428147    .   -   2   
ID=cds_120029;Parent=mrna_36255

Considering that this gene locus is unique in the entire GFF file, If this 
above GFF is loaded into the SeqFeature::Store database, you would expect 
that running the following query should yield a count of "1":
SELECT count(f.id<https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2ff.id>) FROM 
feature as f, name as n WHERE (n.id<https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2fn.id>
=f.id<https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2ff.id%2f>
 AND n.name<https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2fn.name> = 
'Medtr2g097580' AND n.display_name > 0);

And in my case, it does yield "1".

But if I use the function *get_feature_by_name*, retrieve the locus object 
and try to get the strand of the locus, I get the following error:

[error] Can't call method "strand" on an undefined value at get_db_features.pl line 910


As I specified earlier, I do not have any problems if the backend is 
Bio::DB::GFF.

As we know that *get_feature_by_name* is in place for backward 
compatibility, I even tried modifying the code snippet to call '*get_feature
s_by_name*' instead and then *shift* out the first locus object from the 
list of matching features and use that, but no matter what, I do not get 
any locus object back from this subroutine!

Could someone please guide me in the right direction and let me know if I 
am making any mistakes here when migrating from GFF2 to GFF3?

Thanks in advance.
~ Vivek



More information about the Bioperl-l mailing list