[Bioperl-l] need help in parsing KEGG data

Mark Johnson johnsonm at gmail.com
Mon Aug 18 13:26:26 UTC 2008


On Mon, Aug 18, 2008 at 6:45 AM, neeti somaiya <neetisomaiya at gmail.com> wrote:

> I am fetching data from the KEGG gene .ent file, which is available here:
> ftp://ftp.genome.jp/pub/kegg/genes/organisms/hsa/H.sapiens.ent
>
> I am using Bio::SeqIO with file format of type KEGG. I am trying to fetch
> gene names and pathways in which they participate. I am getting the gene
> names fine. But this method
>
> "for my $pathway ( $seq->annotation->get_Annotations('pathway') ){
> }"
>
> doesn't seem to be working. I am not able to get the data for the pathways in
> which the gene is involved.
>
> Can someone please suggest how I can get the pathway data of genes from the
> KEGG ent file?

What exactly do you mean by "doesn't seem to be working" and what
version of BioPerl are you using?  The code below seems to function as
expected with BioPerl 1.5.2, producing output like this:

hsa04612  Antigen processing and presentation
hsa01430  Cell Communication
hsa04020  Calcium signaling pathway
hsa04080  Neuroactive ligand-receptor interaction
hsa04540  Gap junction
...
...
...

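#!/usr/bin/env perl

use strict;
use warnings;

use Bio::SeqIO;

# Path to the KEGG gene file (e.g. H.sapiens.ent) is the first argument
my $file = shift @ARGV or die "Usage: $0 <KEGG .ent file>\n";

my $seqio = Bio::SeqIO->new(-format => 'kegg', -file => $file);

# Each entry in the file comes back as one sequence object
while (my $seq = $seqio->next_seq()) {

    foreach my $pathway ($seq->annotation->get_Annotations('pathway')) {

        ## $pathway should be a Bio::Annotation::Comment
        print $pathway->text(), "\n";

    }

}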
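
If the loop prints nothing for you, one way to narrow it down is to list
which annotation keys the parser actually stored for each entry, and how
many values sit under each key. The following is only a rough sketch: it
assumes get_all_annotation_keys() is available on the annotation
collection in your BioPerl version, and annotation_summary is a helper
name made up for this example.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): for one sequence object,
# return one tab-separated line per annotation key:
#   display id, key name, number of annotations stored under that key
sub annotation_summary {
    my ($seq) = @_;

    my $collection = $seq->annotation;
    my $id         = $seq->display_id // '(no id)';

    my @lines;
    foreach my $key (sort $collection->get_all_annotation_keys()) {
        my @values = $collection->get_Annotations($key);
        push @lines, join("\t", $id, $key, scalar @values);
    }

    return @lines;
}

# Only parse a file when one is given on the command line
if (@ARGV) {
    require Bio::SeqIO;

    my $seqio = Bio::SeqIO->new(-format => 'kegg', -file => $ARGV[0]);

    while (my $seq = $seqio->next_seq()) {
        print "$_\n" foreach annotation_summary($seq);
    }
}
```

If 'pathway' never shows up among the keys, the parser is probably not
picking up the PATHWAY field in your version or input file; if it shows
up with a non-zero count, the problem is more likely in how the values
are being printed.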


