[Bioperl-l] Extracting sequences from GFF3

Scott Cain scott at scottcain.net
Sat Sep 18 11:13:23 UTC 2010


Hi Dave,

I would use Bio::DB::SeqFeature::Store (either with a database on the
backend or a flat file if a database isn't warranted):

  use Bio::DB::SeqFeature::Store;

  my $db = Bio::DB::SeqFeature::Store->new( -adaptor => 'memory',
                                            -dir     => 'path/to/file' );

  # Warning: this returns a string, and not a PrimarySeq object
  my $sequence = $db->fetch_sequence('Chr1', 5000, 6000);
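
In case a complete script helps, here is a minimal sketch of the same
approach. The GFF3 content, the feature line and the temp file are all
made up for illustration, and I am assuming the memory adaptor picks up
the ##FASTA section when it loads the file. Leaving out start and end
should give you the whole sequence for that ID, which is what you asked
for; wrap the string in a Bio::PrimarySeq yourself if you need an object.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::DB::SeqFeature::Store;
use Bio::PrimarySeq;

# Write a tiny, hypothetical GFF3 file with an embedded ##FASTA section.
my ($fh, $gff_path) = tempfile(SUFFIX => '.gff3', UNLINK => 1);
print {$fh} <<"END_GFF";
##gff-version 3
##sequence-region Chr1 1 20
Chr1\texample\tgene\t3\t12\t.\t+\t.\tID=gene1;Name=demo
##FASTA
>Chr1
ACGTACGTACGTACGTACGT
END_GFF
close $fh or die "Cannot close $gff_path: $!";

# Load the file into an in-memory feature store (no database needed).
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'memory',
    -dir     => $gff_path,
);

# Both calls return plain strings, not sequence objects.
my $subseq = $db->fetch_sequence('Chr1', 5, 8);   # a sub-range
my $whole  = $db->fetch_sequence('Chr1');         # assumed: entire sequence

# Wrap the string by hand when an object is needed.
my $seq_obj = Bio::PrimarySeq->new(-id => 'Chr1', -seq => $whole);

print 'Chr1:5-8 = ', $subseq, "\n";
print $seq_obj->id, ' length = ', $seq_obj->length, "\n";
```

For a large GFF3 file you would probably want one of the database
adaptors instead of 'memory', since the memory adaptor loads everything
each time the script starts.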

Scott


On Sat, Sep 18, 2010 at 11:45 AM, David Breimann
<david.breimann at gmail.com> wrote:
> As you know, GFF3 files can contain FASTA sequences after the features.
>
> How do I extract a specific FASTA sequence given its ID?
>
> I tried:
>
> use Bio::Tools::GFF;
> use Data::Dumper;
>
> my $gffio = Bio::Tools::GFF->new(
>    -file =>
>        "/path/to/file.gff",
>    -gff_version => 3
> );
>
> print Dumper $gffio->get_seqs();
>
> but $gffio->get_seqs() seems to return nothing, although the GFF3 has
> sequences and is also valid.
>
> By the way, I am able to parse the features themselves (using
> $gffio->next_feature()).
>
>
> Thanks,
>
> Dave



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


