[Bioperl-l] Error loading GFF3: MSG: xxx doesn't have a primary id ...

Dan Bolser dan.bolser at gmail.com
Fri May 22 11:38:38 UTC 2009


Hi,

I'm using Bio::DB::SeqFeature::Store::GFF3Loader to load GFF3 into a
Bio::DB::SeqFeature::Store database.

I first load in a set of 'clones' in a GFF file that looks like this...

S.lycopersicum-chr4     SGN:chr04.v14.agp       cloned_genomic_insert   7400895 7558294 .       -       .       ID=C04SLm0125H12.1;Alias=89;Ontology_term=SO:0000914
S.lycopersicum-chr4     SGN:chr04.v14.agp       cloned_genomic_insert   7558295 7620759 .       +       .       ID=C04HBa0002B09.1;Alias=90;Ontology_term=SO:0000914
S.lycopersicum-chr4     SGN:chr04.v14.agp       cloned_genomic_insert   7670760 7801908 .       +       .       ID=C04HBa0077O05.2;Alias=92;Ontology_term=SO:0000914


And then I load a bunch of Blast hits from those clones in a GFF file
that looks like this...

S.lycopersicum-chr4     BLASTN  match_part      14263569        14263620        56.0    -       0       Target=BAC10.Contig16 314 365;score=56.0;Parent=C04HBa0107N23.1
S.lycopersicum-chr4     BLASTN  match_part      7565714 7565734 42.1    +       0       Target=BAC10.Contig16 199 219;score=42.1;Parent=C04HBa0002B09.1
S.lycopersicum-chr4     BLASTN  match_part      4309103 4309134 48.1    -       0       Target=BAC10.Contig18 1704 1735;score=48.1;Parent=C04HBa0308B07.2


I'm not 100% sure I got the "tags" part of the latter GFF correct.
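
For reference, here is a minimal sketch (Python, standard library only) of how I understand column 9 of a GFF3 line to break down: tag=value pairs separated by ';', multiple values for one tag separated by ',', and a Target value of the form "target_id start end". The helper name parse_gff3_attributes is made up for illustration and is not part of BioPerl.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split a GFF3 attributes column into a dict of tag -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing ';'
        tag, _, value = pair.partition("=")
        # Multiple values for one tag are comma-separated; escapes are URL-style.
        attributes[tag] = [unquote(v) for v in value.split(",")]
    return attributes


# Column 9 of the second BLASTN line above.
attrs = parse_gff3_attributes(
    "Target=BAC10.Contig16 199 219;score=42.1;Parent=C04HBa0002B09.1"
)
print(attrs)
```

Parsed this way, the line carries three tags: Target, score and Parent. That only shows the column is syntactically splittable; it says nothing about how the loader itself treats those tags.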


I'm getting the following error loading the second GFF file:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: C04HBa0002B09.1 doesn't have a primary id
STACK: Error::throw
STACK: Bio::Root::Root::throw ~/perl5/lib/perl5/Bio/Root/Root.pm:368
STACK: Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree_in_tables ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/GFF3Loader.pm:685
STACK: Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/GFF3Loader.pm:664
STACK: Bio::DB::SeqFeature::Store::GFF3Loader::finish_load ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/GFF3Loader.pm:318
STACK: Bio::DB::SeqFeature::Store::Loader::load_fh ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/Loader.pm:325
STACK: Bio::DB::SeqFeature::Store::Loader::load ~/perl5/lib/perl5/Bio/DB/SeqFeature/Store/Loader.pm:222
STACK: ~/BiO/Util/my_seqfeature_load.plx:44
-----------------------------------------------------------


As you can see the ID C04HBa0002B09.1 (from the Parent tag of the
second GFF) *does* exist in the first GFF.
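
To check that claim across the whole files rather than by eye, something like the following sketch could be used (Python, standard library only). The function names attribute_values and missing_parents are hypothetical, and the sample lines are just a few rows copied from the excerpts above; it compares the two files only and does not reproduce what the loader does internally.

```python
def attribute_values(line, tag):
    """Return the values of one attribute tag from a single GFF3 feature line."""
    # Split into at most 9 columns; column 9 may itself contain spaces (Target).
    fields = line.split(None, 8)
    if line.startswith("#") or len(fields) < 9:
        return []
    for pair in fields[8].strip().split(";"):
        key, _, value = pair.partition("=")
        if key == tag:
            return value.split(",")
    return []


def missing_parents(parent_lines, child_lines):
    """List Parent values in child_lines that have no matching ID in parent_lines."""
    known_ids = {v for line in parent_lines for v in attribute_values(line, "ID")}
    wanted = {v for line in child_lines for v in attribute_values(line, "Parent")}
    return sorted(wanted - known_ids)


clones = [
    "S.lycopersicum-chr4 SGN:chr04.v14.agp cloned_genomic_insert 7558295 7620759 . + . "
    "ID=C04HBa0002B09.1;Alias=90;Ontology_term=SO:0000914",
]
hits = [
    "S.lycopersicum-chr4 BLASTN match_part 7565714 7565734 42.1 + 0 "
    "Target=BAC10.Contig16 199 219;score=42.1;Parent=C04HBa0002B09.1",
    "S.lycopersicum-chr4 BLASTN match_part 4309103 4309134 48.1 - 0 "
    "Target=BAC10.Contig18 1704 1735;score=48.1;Parent=C04HBa0308B07.2",
]
print(missing_parents(clones, hits))  # ['C04HBa0308B07.2'] for this tiny sample
```

With the real files, each list would come from reading the file line by line. C04HBa0308B07.2 is reported here only because the sample includes a single clone row; C04HBa0002B09.1 is found, as expected.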


The features are apparently loaded correctly, and calling 'reindex' on
the database seems to run without error. I tried to look into the
code in the stack trace above, but I'm confused by all the calls to the
LoadHelper.

a) Is this a problem with my GFF?
b) Is this important? (The features are apparently loaded.)
c) Can you fix it? ;-)


Thanks for any tips,
Dan.
