[Bioperl-l] parse genbank file

Andrew Walsh walsh at cenix-bioscience.com
Wed Aug 3 03:13:47 EDT 2005


Hello,

There is only one sequence in the file (namely, NC_003212).  The genes 
are annotated as features on that sequence, not as separate sequence 
records.  So you have to ask the sequence object for its features and 
keep the ones whose primary tag is 'gene'.

e.g.

use Carp qw(confess);   # confess() is exported by Carp

my $gene_seq_feats = get_list_seq_feats_by_primary_tag($seq_obj, 'gene');

# Return a reference to a list of the top-level features on $seq_obj
# whose primary tag equals $tag.
sub get_list_seq_feats_by_primary_tag {
     my ($seq_obj, $tag) = @_;
     ref $seq_obj or
         confess "Seq obj not defined!";
     my @features = $seq_obj->top_SeqFeatures();
     my @list = ();
     for my $feat (@features) {
         if ($feat->primary_tag eq $tag) {
             push @list, $feat;
         }
     }
     return \@list;
}
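
Combined with your loop, a minimal sketch might look like the 
following.  It is untested against your file and makes two assumptions: 
that each gene feature carries a 'locus_tag' or 'gene' qualifier (common 
in NCBI bacterial genome records, but not guaranteed), and that your 
BioPerl version provides get_SeqFeatures, which I use here in place of 
top_SeqFeatures.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# File name taken from the original question.
my $file = "NC_003212.gbk";

my $stream = Bio::SeqIO->new(-file => $file, -format => 'GenBank');

while (my $seq = $stream->next_seq) {
    # Keep only the top-level features annotated as genes.
    my @genes = grep { $_->primary_tag eq 'gene' } $seq->get_SeqFeatures;

    for my $gene (@genes) {
        # Assumption: a gene is named by its 'locus_tag' qualifier,
        # or failing that by its 'gene' qualifier.
        my ($name) = $gene->has_tag('locus_tag') ? $gene->get_tag_values('locus_tag')
                   : $gene->has_tag('gene')      ? $gene->get_tag_values('gene')
                   :                               ('unnamed');

        # One tab-separated line per gene: record ID, name, start, end, strand.
        print join("\t", $seq->display_id, $name,
                   $gene->start, $gene->end, $gene->strand), "\n";
    }
}
```

The while loop still runs only once for this file, because there is 
only one sequence record; the per-gene output comes from the inner loop 
over the features.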

HTH,

Andrew


Guido Dieterich wrote:
> Hi,
> 
> I want to parse a GenBank file (Listeria innocua).
> 
> this is a part of the code ...
> <code>
> 
> my $file = "NC_003212.gbk";
> 
> my $stream = Bio::SeqIO->new(-file => $file, -format => 'GenBank');
> 
> while( my $seq = $stream->next_seq ) {
>     print $seq->display_id;
> }
> 
> </code>
> 
> 
> output:
> 
> NC_003212
> 
> I just get the NC ID for this file, but not for the genes within ...
> 
> 
> ?????
> 
> Greetings
> 
> Guido
> 


-- 
------------------------------------------------------------------
Andrew Walsh, M.Sc.
Bioinformatics Software Engineer
IT Unit
Cenix BioScience GmbH
Tatzberg 47
01307 Dresden
Germany
Tel. +49-351-4173 137
Fax  +49-351-4173 109

public key: http://www.cenix-bioscience.com/public_keys/walsh.gpg
------------------------------------------------------------------


