[Bioperl-l] <no subject>

Lincoln Stein lstein at cshl.edu
Mon Mar 6 16:31:47 UTC 2006


Hi,

Since I wrote the last message I have done some more testing and have 
determined that the FlyBase GFF3 files cannot be stored in Bio::DB::GFF due 
to limitations in the Bio::DB::GFF data model. The issue is that Bio::DB::GFF 
can only store one level of parentage, and not the two levels needed by 
FlyBase genes.

Here is a quick fix to preprocess the GFF3 files so that they can be used by 
Bio::DB::GFF:

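	while (<>) {
		# GFF3 columns are tab-separated; column 3 (index 2) is the feature type.
		my @fields = split /\t/;
		# Leave comment lines and non-mRNA features untouched.
		next unless defined $fields[2] && $fields[2] eq 'mRNA';
		# Store the mRNA's parent under a "Gene" attribute instead of "Parent".
		s/Parent=([^;]+)/Gene=$1/;
	} continue {
		# The continue block also runs after "next", so every line is printed.
		print;
	}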

This turns the "Parent" field of mRNA lines into a "Gene" attribute. You can 
then find all transcripts corresponding to a particular gene in much the way 
you tried earlier:

 my $tcs = $tg->features(-types      => 'processed_transcript',
                         -attributes => {Gene => $gene},
                         -iterator   => 1);
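
To make the effect of the preprocessing step concrete, here is a minimal 
sketch that applies the same substitution to a few records held in memory. 
The helper name rewrite_mrna_parent and the sample lines are made up for 
illustration; real FlyBase records will look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rewrite one GFF3 line so that
# an mRNA's Parent attribute becomes a Gene attribute.
sub rewrite_mrna_parent {
    my ($line) = @_;
    my @fields = split /\t/, $line;
    # Only touch mRNA features; comments and other feature types pass through.
    return $line unless defined $fields[2] && $fields[2] eq 'mRNA';
    $line =~ s/Parent=([^;]+)/Gene=$1/;
    return $line;
}

# Made-up records for illustration only.
my @sample = (
    "##gff-version 3\n",
    join("\t", qw(4 example gene 100 900 . + .), 'ID=gene1') . "\n",
    join("\t", qw(4 example mRNA 100 900 . + .), 'ID=tx1;Parent=gene1') . "\n",
    join("\t", qw(4 example exon 100 300 . + .), 'ID=ex1;Parent=tx1') . "\n",
);

print rewrite_mrna_parent($_) for @sample;
```

In this sketch only the mRNA record changes (its attributes become 
ID=tx1;Gene=gene1). The exon record keeps Parent=tx1, so one level of 
parentage remains in the file.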

I am going back to work on Bio::DB::GFF3, which will fix this problem.

Lincoln

On Monday 06 March 2006 00:02, Marco Blanchette wrote:
> Dear all--
>
> I am trying to forge my first bioperl weapons with the
> Bio::DB::GFF and Bio::Graphics modules. My goal is to display genes with
> their underlying mRNAs and later on add additional useful info (i.e. binding
> sites for our preferred proteins).
>
> I loaded the GadFly gff3 annotation in a mysql database using
> bulk_load_gff.pl and I am trying to pass a Bio::SeqFeatureI to the
> Bio::Graphics::add_feature method.
>
> My understanding is that:
> my $tcs = $tg->features(-types =>'processed_transcript',
>                                         -attributes => {Parent => $gene},
>                                         -iterator => 1);
>
> produces a Bio::SeqIO object that can be iterated through with the next_seq
> method to get a Bio::Seq object, from which a Bio::SeqFeatureI can be
> extracted using the get_SeqFeatures method.
>
> Somehow, my script does not produce the expected results. Could somebody
> put me back on the right track?
>
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Bio::DB::GFF;
> use Bio::Graphics;
>
> my $dmdb = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
>                                    -dsn => "chr4",
>                                    );
>
>
> my @genes = ('CG2041'); ##a gene on the fourth chromosome
>
> foreach my $gene (@genes){
>
>     my $geneseg = $dmdb->segment(-name => $gene, -merge);
>
>     if ($geneseg){
>
>     my @tgs = $geneseg->features(-types => 'gene');
>
>     for my $tg (@tgs){
>
>         my $length = $tg->length();
>
>         my $panel = Bio::Graphics::Panel->new(-length => $length,
>                                               -width  => 800);
>
>         my $track = $panel->add_track(    -glyph => 'generic',
>                                         -label  => 1);
>
>         my $tcs = $tg->features(-types =>'processed_transcript',
>                                         -attributes => {Parent => $gene},
>                                         -iterator => 1);
>
>         while ( my $tc = $tcs->next_seq ){
>             $track->add_feature($tc->get_SeqFeatures);
>         }
>
>         print $panel->png;
>     }
> }
> }
>
> Many thanks
>
>
> Marco Blanchette, Ph.D.
>
> mblanche at berkeley.edu
>
> Donald C. Rio's lab
> Department of Molecular and Cell Biology
> 16 Barker Hall
> University of California
> Berkeley, CA 94720-3204
>
> Tel: (510) 642-1084
> Cell: (510) 847-0996
> Fax: (510) 642-6062
>

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
FOR URGENT MESSAGES & SCHEDULING, 
PLEASE CONTACT MY ASSISTANT, 
SANDRA MICHELSEN, AT michelse at cshl.edu


