[Bioperl-l] bp_seqfeature_load / Bio::DB::SeqFeature::Store::GFF3Loader problems augmenting Flybase annotation

Cook, Malcolm MEC at stowers-institute.org
Tue Dec 19 19:57:48 UTC 2006


Lincoln and fellow Bio::DB::SeqFeature travelers,

I find that using bp_seqfeature_load.PLS to load subfeatures of genes
already loaded by an earlier run of bp_seqfeature_load.PLS fails with:

------------- EXCEPTION  -------------
MSG: FBgn0017545 doesn't have a primary id
STACK
Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree_in_tables
/home/mec/cvs/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm:682
STACK Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree
/home/mec/cvs/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm:663
STACK Bio::DB::SeqFeature::Store::GFF3Loader::finish_load
/home/mec/cvs/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm:372
STACK Bio::DB::SeqFeature::Store::GFF3Loader::load_fh
/home/mec/cvs/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm:345
STACK Bio::DB::SeqFeature::Store::GFF3Loader::load
/home/mec/cvs/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm:242
STACK toplevel
/home/mec/cvs/bioperl-live/scripts/Bio-SeqFeature-Store/bp_seqfeature_load.PLS:76

Here FBgn0017545 is the ID of a previously loaded gene.

I am unsure how to remedy my situation and welcome any advice on a correct
or improved approach to my problem.

Here's some detail if it helps.  I am developing a pipeline to design
microarray probes capable of distinguishing among splice variants in
Drosophila (using the latest Flybase dmel_r5.1 annotation).  So I

1) load a filtered selection of Flybase annotation using
bp_seqfeature_load.  (for testing purposes, I am using a single gene's
worth of annotation, FBgn0017545.gff, attached).  This is done as
follows:

	> bp_seqfeature_load.PLS  --create FBgn0017545.gff 

2) analyze all the genes in the database and create GFF3 output in which
each feature has a 'Parent' that is a previously loaded gene (i.e.
FBgn0017545).  (These features represent the unique introns, splice
sites, and exonic design targets.  The output of this analysis,
FBgn0017545_matd.gff, is also attached.)

3) load these analysis results into the same database, as follows:

	> bp_seqfeature_load.PLS          FBgn0017545_matd.gff

It is at this point that I get the above error.
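
For anyone who wants to see which files would be affected, here is a minimal sketch that lists every Parent ID a GFF3 file references but does not itself define with an ID= attribute. The function name orphan_parents is hypothetical (it is not part of BioPerl), and the sketch assumes tab-separated columns with plain ID= and Parent= attributes in column 9. It only inspects the files; it says nothing about what the loader does internally.

```shell
# Hypothetical helper, not part of BioPerl: print each Parent ID that is
# referenced in the given GFF3 file(s) but never defined there via ID=.
orphan_parents() {
    awk -F'\t' '
        /^#/ || NF < 9 { next }          # skip comments, directives, short lines
        {
            n = split($9, attrs, ";")    # column 9 holds the attributes
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^ID=/) {
                    defined[substr(attrs[i], 4)] = 1
                } else if (attrs[i] ~ /^Parent=/) {
                    # Parent may list several comma-separated IDs
                    m = split(substr(attrs[i], 8), parents, ",")
                    for (j = 1; j <= m; j++) referenced[parents[j]] = 1
                }
            }
        }
        END {
            for (p in referenced) if (!(p in defined)) print p
        }
    ' "$@" | sort
}
```

Run on FBgn0017545_matd.gff alone, I would expect this to report FBgn0017545, since that gene is defined only in the first file; run on both files together it should report nothing.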

However, I don't get any error and the data loads fine if I load the two
files together, as follows:

	> bp_seqfeature_load.PLS --create <(cat FBgn0017545.gff FBgn0017545_matd.gff)
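
The <( ... ) form above is bash process substitution. As a rough equivalent for shells that lack it, the files can be concatenated into a temporary file first. The helper name combine_gff is hypothetical, and the load command in the comments is simply the same invocation as above pointed at the temporary file.

```shell
# Hypothetical helper: concatenate the given GFF3 files into a temporary
# file and print that file's path on stdout.
combine_gff() {
    out=$(mktemp) || return 1
    cat "$@" > "$out" || { rm -f "$out"; return 1; }
    printf '%s\n' "$out"
}

# Intended use (requires bp_seqfeature_load.PLS on the PATH):
#   combined=$(combine_gff FBgn0017545.gff FBgn0017545_matd.gff)
#   bp_seqfeature_load.PLS --create "$combined"
#   rm -f "$combined"
```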

So, I suspect that either I am misunderstanding when/how to use
bp_seqfeature_load.PLS or else this use case has not yet arisen and must
be provided for somehow.

I am running against bioperl-live.

Thanks for your thoughts and assistance,

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
 



