[Bioperl-l] parsing entrezgene file (lost data)

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Jul 5 22:26:41 UTC 2011


Bio::ASN1::EntrezGene is not the easiest module to work with, but you can access everything if you try hard enough.
I used it last year for transforming ASN.1 gene records from NCBI into fully annotated Wiki pages, and it was very successful, though I got sick of typing so many curly brackets ;-)
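
One way to cut down on the curly brackets, and on "undefined value" warnings when a record lacks a product, is a small path-walking helper. The following is only a sketch: dig() is a hypothetical helper, not part of Bio::ASN1::EntrezGene, and the record is a mock that merely imitates the nesting used in the quoted script below (the values are placeholders, not real NCBI data).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::ASN1::EntrezGene): follow a list of
# hash keys / array indices through a nested structure and return undef,
# without autovivifying anything, as soon as a step is missing.
sub dig {
    my ($node, @path) = @_;
    for my $step (@path) {
        if (ref $node eq 'HASH') {
            return undef unless exists $node->{$step};
            $node = $node->{$step};
        }
        elsif (ref $node eq 'ARRAY') {
            return undef unless $step =~ /^\d+$/ && $step < @$node;
            $node = $node->[$step];
        }
        else {
            return undef;    # hit a plain scalar before the path ended
        }
    }
    return $node;
}

# Mock product entry; only the shape mirrors the quoted script.
my $product = {
    accession => 'NUC_ACC',
    seqs      => [ { whole => [ { gi => 'NUC_GI' } ] } ],
    products  => [
        {
            accession => 'PROT_ACC',
            seqs      => [ { whole => [ { gi => 'PROT_GI' } ] } ],
        },
    ],
};

my $nuc_gi  = dig($product, qw(seqs 0 whole 0 gi));
my $prot_gi = dig($product, qw(products 0 seqs 0 whole 0 gi));
my $missing = dig($product, qw(products 1 accession));   # no second product

print "nucleotide gi: $nuc_gi\n";
print "protein gi:    $prot_gi\n";
print "second product: ", (defined $missing ? $missing : '(not present)'), "\n";
```

With a real parser the same helper could be called on each entry of the products array; whether every record actually has this shape is an assumption, so Data::Dumper is still the way to check.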


--Russell


> -----Original Message-----
> From: carandraug at gmail.com [mailto:carandraug at gmail.com] On Behalf Of
> Carnë Draug
> Sent: Wednesday, 6 July 2011 10:16 a.m.
> To: Smithies, Russell
> Cc: bioperl mailing list
> Subject: Re: [Bioperl-l] parsing entrezgene file (lost data)
> 
> On 5 July 2011 22:55, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
> > It is in there, just takes a bit of getting at.
> > Frequent use of Data::Dumper to work out where you are helps.
> >
> >
> >
> > use warnings;
> > use strict;
> > use Bio::ASN1::EntrezGene;
> > use Data::Dumper;
> >
> > my $parser = Bio::ASN1::EntrezGene->new('file' => "entrezgene.asn");
> > while(my $result = $parser->next_seq){
> >    $result = $result->[0] if(ref($result) eq 'ARRAY');
> >    foreach my $l (@{$result->{locus}}){
> >        foreach my $p (@{$l->{products}}){
> >
> >          my $nuc_gi = $p->{seqs}->[0]->{whole}->[0]->{gi};
> >          my $nuc_acc = $p->{accession};
> >
> >          my $prot_gi = $p->{products}->[0]->{seqs}->[0]->{whole}->[0]->{gi};
> >          my $prot_acc = $p->{products}->[0]->{accession};
> >
> >          print "$nuc_gi, $nuc_acc\t$prot_gi, $prot_acc \n";
> >        }
> >    }
> > }
> >
> 
> Hmm.. I see it now but it's still not there when using the Bio::SeqIO
> module (I just tried with Bio::ASN1::EntrezGene as in your example and
> I can see it now). I thought that using the specific module was not
> recommended.
> 
> I just cloned the bioperl repo but the modules code is too much for
> me. It seems that Bio::SeqIO uses the Bio::SeqIO::entrezgene module
> instead of Bio::ASN1::EntrezGene . But then Bio::SeqIO::entrezgene
> does use Bio::ASN1::EntrezGene on the initializing method (this is the
> line from the module code)
> 
>     $self->{_parser} = Bio::ASN1::EntrezGene->new( file => $param{-file} );
> 
> So I have no idea what's wrong. Still, it's nice to have a workaround
> for now. Thank you,
> 
> Carnë