[Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated

Allen Day allenday at ucla.edu
Tue Jan 25 04:45:28 EST 2005


On Tue, 25 Jan 2005, Marc Logghe wrote:

> Hi Allen,
> Thanks for the fixes !

no problem.  let me know if you find more stuff like this, i'm trying to
clean up all the calls to SeqFeatureI inheritors to use the interface
methods rather than subclass-specific methods.
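
as a rough illustration of what i mean (a sketch only, not the actual
FTHelper change): code that sticks to get_all_tags and get_tag_values
should work with any feature class that provides them.  My::MockFeature
and tags_as_hash are made-up names, there only so the snippet runs without
bioperl installed; a real SeqFeatureI object would take the mock's place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect tag/value pairs using only the interface methods
# get_all_tags and get_tag_values, never subclass-specific ones.
sub tags_as_hash {
    my ($feat) = @_;
    my %tags;
    for my $tag ($feat->get_all_tags) {
        # "$_" forces string context in case the values are objects
        $tags{$tag} = [ map { "$_" } $feat->get_tag_values($tag) ];
    }
    return \%tags;
}

# Hypothetical stand-in feature class (assumption, not part of bioperl).
package My::MockFeature;
sub new            { my ($class, %tags) = @_; return bless { tags => {%tags} }, $class }
sub get_all_tags   { return sort keys %{ $_[0]{tags} } }
sub get_tag_values { my ($self, $tag) = @_; return @{ $self->{tags}{$tag} || [] } }

package main;

my $feat = My::MockFeature->new(gene => ['R12H7.1'], note => ['frame=.']);
my $tags = tags_as_hash($feat);
print "$_=@{ $tags->{$_} }\n" for sort keys %$tags;
```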

> Like you suggested, I got the tag values when using stringification overload, so that is solved (I don't want to commit that myself though, seems too tricky to me ;-).
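
for anyone following along, the overload idea looks roughly like this.
it is a minimal sketch with a made-up class (My::SimpleValue is not the
real Bio::Annotation::SimpleValue, and the real classes may need more
care than this):

```perl
use strict;
use warnings;

# Hypothetical stand-in for an annotation value class (assumption).
package My::SimpleValue;

# '""' is invoked whenever the object is used as a string;
# fallback => 1 lets Perl derive eq, ne and . from it.
use overload '""' => sub { $_[0]->value }, fallback => 1;

sub new   { my ($class, %args) = @_; return bless { value => $args{-value} }, $class }
sub value { return $_[0]{value} }

package main;

my $val = My::SimpleValue->new(-value => 'R12H7.1');
# Without the overload this would print something like
# /gene="My::SimpleValue=HASH(0x...)"
print qq{/gene="$val"\n};
```
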
> What is not so nice is that I lose my split features:
>      gene            join(8311..8422,8852..8887,8940..9090,9142..9233,
>                      9721..9848,10296..10714,10835..10934,11584..11706)
>                      /gene="R12H7.1"
>      CDS             join(8311..8422,8852..8887,8940..9090,9142..9233,
>                      9721..9848,10296..10714,10835..10934,11584..11706)
> 
> 
> becomes now:
> 
>      gene            8311..8422
>                      /note="frame=."
>                      /gene="R12H7.1"
>      CDS             8311..8422
> 
> I tried to solve this issue by using the unflattener, but that did not work out well either :-(
> My actual question is now: is there a way, using whatever system, to preserve the split feature structure ? That was actually what I was trying to do in the first place: reconstruct the original feature object starting from gff. Any ideas on that ?

oh.  i don't know anything about this.  never had to deal with split
locations before.  is this concept equivalent to a GFF3 Target attribute?  
maybe Scott Cain or Chris Mungall have something to say here.  i think
Scott is back from vacation tomorrow.
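
for reference, the GFF3 spec describes a discontinuous feature as several
lines that share one ID attribute, while Target describes an alignment.
whether Bio::Tools::GFF actually writes split locations that way is an
assumption here.  if it does, the pieces could be regrouped along these
lines (plain perl, no bioperl calls; locations_from_gff3 is a made-up
name, and strand/complement handling is left out):

```perl
use strict;
use warnings;

# Group GFF3 rows by ID and rebuild a GenBank-style location string.
# Assumes every segment of a split feature carries the same ID attribute.
sub locations_from_gff3 {
    my (@lines) = @_;
    my (%segments, @order);
    for my $line (@lines) {
        next if $line =~ /^#/ or $line !~ /\S/;   # skip comments and blanks
        chomp $line;
        my ($type, $start, $end, $attrs) = (split /\t/, $line)[2, 3, 4, 8];
        next unless defined $attrs and $attrs =~ /(?:^|;)ID=([^;]+)/;
        my $id = $1;
        push @order, $id unless $segments{$id};   # remember first-seen order
        push @{ $segments{$id}{parts} }, [ $start, $end ];
        $segments{$id}{type} = $type;
    }
    my @out;
    for my $id (@order) {
        my @parts = map  { "$_->[0]..$_->[1]" }
                    sort { $a->[0] <=> $b->[0] } @{ $segments{$id}{parts} };
        my $loc = @parts > 1 ? 'join(' . join(',', @parts) . ')' : $parts[0];
        push @out, [ $segments{$id}{type}, $loc ];
    }
    return @out;
}

# Illustrative rows only; the ID value "cds1" is invented for the example.
my @gff = (
    join("\t", 'Z50755', '.', 'CDS', 8311, 8422, '.', '+', 0, 'ID=cds1'),
    join("\t", 'Z50755', '.', 'CDS', 8852, 8887, '.', '+', 2, 'ID=cds1'),
);
printf "%-15s %s\n", @$_ for locations_from_gff3(@gff);
```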

> 
> Also, do you think it will be possible to convert the Bio::SeqFeature::Annotated features into persistent ones so that these can be stored in BioSQL ? I'll try to test that out today.

no idea.  my guess is not without substantial effort.

-allen

> Cheers,
> Marc
> 
> 
> 
> 
> > -----Original Message-----
> > From: Allen Day [mailto:allenday at ucla.edu]
> > Sent: Tuesday, January 25, 2005 12:55 AM
> > To: Marc Logghe
> > Cc: Bioperl (E-mail)
> > Subject: Re: [Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated
> > 
> > 
> > Marc,
> > 
> > The problem was that Bio::SeqIO::FTHelper was making calls assuming it
> > had a Bio::SeqFeature::Generic instance.  I've updated it to make calls
> > compliant with the Bio::SeqFeatureI interface, and the script below now
> > at least runs using "option 1".
> > 
> > "option 2" will not work, at least for now, because Bio::DB::GenBank is
> > creating a SeqIO that holds Bio::SeqFeature::Generic objects, and these
> > are difficult to deal with because their internal data structures are
> > different from those of a Bio::SeqFeature::Annotated.  I like the
> > technique used below to bridge to Bio::FeatureIO via a Bio::Tools::GFF
> > intermediary -- very clever.
> > 
> > You'll also notice that the GenBank-formatted file output by the script
> > doesn't look quite right; the FEATURES section looks something like:
> > 
> > FEATURES             Location/Qualifiers
> >      Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)1..20975
> >                      /source="Bio::Annotation::SimpleValue=HASH(0x9bcdbe0)"
> >                      /mol_type="Bio::Annotation::SimpleValue=HASH(0xa3dab1c)"
> >                      /seq_id="Bio::Annotation::SimpleValue=HASH(0xa214de0)"
> >                      /score="Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
> >                      /frame="Bio::Annotation::SimpleValue=HASH(0xa439b04)"
> >                      /chad="Bio::Annotation::Comment=HASH(0xa3da9b4)"
> >                      /note="score=Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
> >                      /note="frame=Bio::Annotation::SimpleValue=HASH(0xa439b04)"
> >                      /db_xref="Bio::Annotation::SimpleValue=HASH(0xa3daaf8)"
> >                      /clone="Bio::Annotation::SimpleValue=HASH(0xa3dab28)"
> >                      /strain="Bio::Annotation::SimpleValue=HASH(0xa3dabb8)"
> >                      /phase="Bio::Annotation::SimpleValue=HASH(0xa3d935c)"
> >                      /chromosome="Bio::Annotation::SimpleValue=HASH(0xa3dac00)"
> >                      /type="Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)"
> >                      /organism="Bio::Annotation::SimpleValue=HASH(0xa3dac48)"
> > 
> > because Bio::SeqFeature::Annotated holds annotations as object
> > references rather than strings.  We can fix this with a stringification
> > overload, but I noticed that the code to do this exists in the
> > Bio::Annotation::* classes but is commented out, and I'm not sure why.
> > Maybe Hilmar can shed some light on this.
> > 
> > -Allen
> > 
> > 
> > 
> > On Mon, 24 Jan 2005, Marc Logghe wrote:
> > 
> > > Hi all,
> > > I have some problems with Bio::FeatureIO and Bio::SeqFeature::Annotated. But maybe these modules are not designed for the things I had in mind.
> > > My initial goal seemed pretty straightforward. It turned out differently.
> > > I have a gff file containing features of a bunch of bioentries sitting in BioSQL.
> > > I wanted to turn the gff into feature objects, add them to the bioentries, and save them back into the database.
> > > As a test I fetch a genbank record, strip the features and convert them to gff. The gff is again converted to features and added to the stripped seq object.
> > > The test script looks like this:
> > > ========================================================
> > > #!/usr/bin/perl
> > > use strict;
> > > use Bio::SeqIO;
> > > use Bio::Tools::GFF;
> > > use Bio::FeatureIO;
> > > use IO::String;
> > > use Bio::DB::GenBank;
> > > 
> > > use Data::Dumper;
> > > 
> > > *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
> > > 
> > > my $gff;
> > > my $gffio = IO::String->new($gff);
> > > 
> > > my $db = Bio::DB::GenBank->new;
> > > my $sout = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'genbank');
> > > my $seq = $db->get_Seq_by_acc('Z50755');
> > > 
> > > my @feat = $seq->remove_SeqFeatures;
> > > 
> > > # writing option 1
> > > my $fout = Bio::Tools::GFF->new(-fh => $gffio, -gff_version => 3);
> > > # writing option 2 (alternative to option 1; enable only one of the two)
> > > # my $fout = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
> > > 
> > > $fout->write_feature(@feat);
> > > 
> > > $gffio = IO::String->new($gff);
> > > 
> > > my $fin = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
> > > 
> > > while (my $feat = $fin->next_feature)
> > > {
> > >  $seq->add_SeqFeature($feat);
> > > }
> > > print Data::Dumper->Dump([$seq],['seq']);
> > > 
> > > $sout->write_seq($seq);
> > > ========================================================
> > > 
> > > First, I had an issue when writing the features to gff using Bio::FeatureIO (writing option 2):
> > > 
> > > ------------- EXCEPTION: Bio::Root::Exception -------------
> > > MSG: only Bio::SeqFeature::Annotated objects are writeable
> > > STACK: Error::throw
> > > STACK: Bio::Root::Root::throw /home/marcl/src/bioperl/bioperl-live/Bio/Root/Root.pm:328
> > > STACK: Bio::FeatureIO::gff::write_feature /home/marcl/src/bioperl/bioperl-live/Bio/FeatureIO/gff.pm:259
> > > STACK: ./test.pl:25
> > > -----------------------------------------------------------
> > > 
> > > Therefore, I used Bio::Tools::GFF to write (writing option 1). But then I ran into trouble when it came to dumping the sequence in genbank format:
> > > Can't locate object method "all_tags" via package "Bio::SeqFeature::Annotated" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqIO/FTHelper.pm line 212, <GEN1> line 52.
> > > 
> > > I tried to fix this by adding the line
> > > *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
> > >  
> > > But in vain:
> > > Can't locate object method "get_all_tags" via package "Bio::Annotation::Collection" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqFeature/Annotated.pm line 547, <GEN1> line 52.
> > > 
> > > Regards,
> > > Marc
> > > 
> > > 
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > 
> > 
> 

