[Bioperl-l] reading gff3?

Smithies, Russell Russell.Smithies at agresearch.co.nz
Fri Mar 13 00:44:14 UTC 2009


Thanx guys,
I was fairly sure it _should_ work  :-)

The trick is to call next_seq() AFTER you've read all the features.
Also, the GFF output from GBrowse doesn't work with Bio::FeatureIO as-is: it has a few extra pragmas and is missing the ##FASTA line.

Here's what I ended up with:

-------------------------
#!perl -w
use strict;
use Bio::FeatureIO;
use Bio::SeqIO;
use Bio::PrimarySeq;

my $gff_in  = Bio::FeatureIO->new(-file => "test.gff", -format => "GFF");
my $seq_out = Bio::SeqIO->new(-fh => \*STDOUT, -format => "fasta");

# Collect the CDS features; this also consumes the whole feature section.
my @cds;
while ( my $feat = $gff_in->next_feature() ) {
    push @cds, $feat if $feat->primary_tag =~ /CDS/;
}

## MUST be after you've read the features!!
my $seq = $gff_in->next_seq();

# Write each CDS subsequence as its own FASTA record.
foreach my $c (@cds) {
    my $seqobj = Bio::PrimarySeq->new(
        -seq => $seq->subseq($c->location),
        -id  => join("_", $c->primary_tag, $c->start, $c->end),
    );
    $seq_out->write_seq($seqobj);
}

-----------------------------------
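For the GBrowse problem, one possible workaround is to put the missing ##FASTA line back before parsing. This is only a rough sketch, not what I actually ran: add_fasta_directive() is a made-up helper (not part of BioPerl), it assumes the sequence section starts at the first line beginning with ">", and it does nothing about the extra pragmas.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given the lines of a GFF3 file,
# return them with a "##FASTA" directive inserted before the first FASTA
# header line, unless the directive is already present.
sub add_fasta_directive {
    my @lines = @_;
    my @out;
    my $in_fasta = 0;
    for my $line (@lines) {
        if (!$in_fasta) {
            if ($line =~ /^##FASTA\b/) {
                $in_fasta = 1;               # directive already there
            }
            elsif ($line =~ /^>/) {
                push @out, "##FASTA\n";      # add the missing directive
                $in_fasta = 1;
            }
        }
        push @out, $line;
    }
    return @out;
}

# Small in-memory example (made-up data) with the directive missing.
my @example = (
    "##gff-version 3\n",
    "ctg1\tsrc\tCDS\t1\t9\t.\t+\t0\tID=cds1\n",
    ">ctg1\n",
    "ATGAAATAG\n",
);
print add_fasta_directive(@example);
```

To fix a real file you would read its lines, pass them through the helper, and write the result to a new file before handing that to Bio::FeatureIO.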

--Russell



> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Friday, 13 March 2009 12:20 p.m.
> To: BioPerl List
> Subject: Re: [Bioperl-l] reading gff3?
> 
> http://bioperl.org/cgi-bin/deob_interface.cgi?Search=Search&module=Bio%3A%3AFeatureIO%3A%3Agff&sort_order=by+method&search_string=Bio%3A%3AFeatureIO%3A%3Agff
> 
> Method: fasta_mode
> 
> And comment in the next_seq() method:
> 
> "access the FASTA section (if any) at the end of the GFF stream.  note that
> this method will return undef if not all features in the stream have been
> handled"
> 
> From a quick read through the code, it seems that once you've gotten all the
> features, you should be able to call next_seq() to get the fasta information.
> 
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org
> > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> > Smithies, Russell
> > Sent: Thursday, March 12, 2009 3:42 PM
> > To: 'BioPerl List'
> > Subject: [Bioperl-l] reading gff3?
> >
> > What's the trick to reading the fasta attached to gff files?
> > Bio::FeatureIO and Bio::Tools::GFF both seem to ignore it
> > (unless I'm doing it wrong)
> >
> > What I'm trying to do is read in a gff3 file (with attached
> > fasta) then get the sequence for the CDS features contained within.
> >
> > Any ideas?
> >
> > Thanx,
> >
> > --Russell
> >
> >
> > Russell Smithies
> > Bioinformatics Applications Developer
> > T +64 3 489 9085
> > E  russell.smithies at agresearch.co.nz
> > Invermay  Research Centre
> > Puddle Alley,
> > Mosgiel,
> > New Zealand
> > T  +64 3 489 3809
> > F  +64 3 489 9174
> > www.agresearch.co.nz
> >
> > Toitu te whenua, Toitu te tangata
> > Sustain the land, Sustain the people
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> 



