[Bioperl-l] Parsing entrezgene with Bio::SeqIO

Stefan Kirov skirov at utk.edu
Thu Mar 16 16:29:10 UTC 2006

Do this:
my @dblinks = $ann->get_Annotations('dblink');
foreach my $dblink (@dblinks) {
    # skip links that point to databases other than KEGG
    next unless ($dblink->database eq 'KEGG');
    print $dblink->primary_id, "\t", $dblink->url, "\n";
}
This works for me; hopefully it will for you too. Let me know if
something is not right.

Liisa Koski wrote:

>I'm using Bio::SeqIO to parse the EntrezGene file Homo_sapiens (from 
>I'm using bioperl-1.5.1.
>I want to extract the KEGG annotations.
>See code below.
>use Bio::SeqIO;
>use Bio::ASN1::EntrezGene;
>my $seqio = Bio::SeqIO->new(-format => 'entrezgene',
>                            -file   => 'Homo_sapiens');
>while (my $gene = $seqio->next_seq){
>    print "\n",$gene->id, "\t", $gene->accession_number, "\n";
>    my $ann = $gene->annotation();
>    foreach my $key ( $ann->get_all_annotation_keys() ) {
>        my @values = $ann->get_Annotations($key);
>        foreach my $value ( @values ) {
>            print $key, "\t", "=", "\t", $value->as_text,"\n";
>        }
>    }
>}
>Unfortunately the only KEGG annotation I see in the results looks like:
>dblink  =       Direct database link to  in database KEGG 
>(Notice the space between 'to  in')
>Anyone have any ideas how to get the KEGG annotation results?
>Note: I also tried parsing the file 
>but I got the below error:
>./entrez_gene_seqio.pl Homo_sapiens.ags
>Data Error: none conforming data found on line 1 in Homo_sapiens.ags!
>first 20 (or till end of input) characters including the non-conforming data:
> at /netshare/home/koski/perl_modules/bioperl-live/Bio/SeqIO/entrezgene.pm 
>line 138