[Bioperl-l] not getting all exons back when using Bio::DB::GFF

Niels Klitgord niels_klitgord at dfci.harvard.edu
Thu Mar 9 10:23:44 EST 2006


Hello,
  Perhaps I'm not using Bio::DB::GFF correctly, or perhaps I missed a
previous post on this; if so, I apologize.  Occasionally, when I try to
get all the exons of an ORF from a segment, one exon is missing.  I am
using the WormBase WS150 release GFF with the Bio::DB::GFF Perl module
(GFF.pm,v 1.102; maybe I need to upgrade?).
This is the code I was using:

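#!/usr/local/bin/perl -w
use strict;
use Bio::DB::GFF;

# Connect to the MySQL-backed GFF database holding the WS150 release.
my $GFFdb = Bio::DB::GFF->new(-adaptor => 'dbi::mysqlopt',
                              -dsn     => 'dbi:mysql:gff150;host=dome',
                              -user    => 'niels')
    or die("can't open gffDB");

my $gene = 'C08C3.3';

# Fetch the segment(s) for the named CDS and stop early if none is found.
my @seg = $GFFdb->segment(-name => $gene, -class => 'CDS');
die("no segment found for $gene") unless @seg;

print "$gene CDS: ", $seg[0]->abs_start, "\t", $seg[0]->abs_stop, "\n";

# Print every curated exon returned for the segment, sorted by relative start.
my @all_exons = $seg[0]->features('exon:curated');
foreach my $k (sort { $a->start <=> $b->start } @all_exons) {
    print "feature: ", $k->class, "\t", $k->type, "\t", $k->name, "\t",
          $k->abs_start, "\t", $k->abs_stop, "\n";
}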

When I run it, I get:
C08C3.3 CDS: 7783311    7777192
feature: CDS    exon:curated    C08C3.3 7782898 7782816
feature: CDS    exon:curated    C08C3.3 7782130 7782027
feature: CDS    exon:curated    C08C3.3 7778462 7778314
feature: CDS    exon:curated    C08C3.3 7777314 7777192

However, in the raw GFF file (and also in the MySQL database) we see:
CHROMOSOME_III  curated exon    7777192 7777314 .   -   .   CDS "C08C3.3"  <found>
CHROMOSOME_III  curated exon    7778314 7778462 .   -   .   CDS "C08C3.3"  <found>
CHROMOSOME_III  curated exon    7782027 7782130 .   -   .   CDS "C08C3.3"  <found>
CHROMOSOME_III  curated exon    7782816 7782898 .   -   .   CDS "C08C3.3"  <found>

CHROMOSOME_III  curated exon    7783168 7783311 .   -   .   CDS "C08C3.3"  <not found!!>
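
The comparison above can also be done mechanically, without touching the
database. The following is only a sketch: the coordinates are copied by
hand from the script output and the GFF lines above, and the helper names
span_key and missing_exons are made up for illustration (they are not
part of Bio::DB::GFF).

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Curated exon lines for C08C3.3, copied from the raw GFF file
# (fields separated by single spaces here for readability).
my @gff_lines = (
    'CHROMOSOME_III curated exon 7777192 7777314 . - . CDS "C08C3.3"',
    'CHROMOSOME_III curated exon 7778314 7778462 . - . CDS "C08C3.3"',
    'CHROMOSOME_III curated exon 7782027 7782130 . - . CDS "C08C3.3"',
    'CHROMOSOME_III curated exon 7782816 7782898 . - . CDS "C08C3.3"',
    'CHROMOSOME_III curated exon 7783168 7783311 . - . CDS "C08C3.3"',
);

# abs_start/abs_stop pairs printed by the Bio::DB::GFF script.
# On the minus strand the larger coordinate is printed first.
my @returned = (
    [7782898, 7782816],
    [7782130, 7782027],
    [7778462, 7778314],
    [7777314, 7777192],
);

# Normalise a coordinate pair to a strand-independent "low-high" key.
sub span_key {
    my ($x, $y) = @_;
    return join '-', sort { $a <=> $b } $x, $y;
}

# Return the "low-high" keys of curated exons that appear in the GFF
# lines but not among the returned coordinate pairs.
sub missing_exons {
    my ($gff, $got) = @_;
    my %seen;
    $seen{ span_key(@$_) } = 1 for @$got;

    my @missing;
    for my $line (@$gff) {
        my (undef, $source, $method, $start, $stop) = split ' ', $line;
        next unless $source eq 'curated' && $method eq 'exon';
        my $key = span_key($start, $stop);
        push @missing, $key unless $seen{$key};
    }
    return @missing;
}

print "missing exon: $_\n" for missing_exons(\@gff_lines, \@returned);
```

With the data above, the only exon reported as missing is 7783168-7783311,
the one whose end coincides with the CDS start (7783311).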

Am I just using this wrong, or should the last entry be returned as well?
Many thanks in advance,
  Niels


