[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/931

Joel Martin j_martin@mhgc.lbl.gov
Wed, 21 Mar 2001 14:01:40 -0800 (PST)


NT contigs do contain sequence, but Entrez doesn't display it in the default
genbank format. You have to request fasta format instead. Calling

$dbobj->request_format("fasta");

before

my $seq = $dbobj->get_Seq_by_acc($accession);

makes it work for me.
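
Folded into the script from the original report, that would look roughly
like the sketch below. It assumes request_format() behaves as described
above; the usage check and the undef check are additions for illustration,
and the unused SeqIO handle from the original script is left out.

```perl
#!/usr/local/bin/perl -w

use strict;
use Bio::DB::GenBank;

# Accession to look up, e.g. NT_004705 (added usage check, not in the original)
my $accession = shift @ARGV or die "usage: $0 accession\n";

my $dbobj = Bio::DB::GenBank->new();

# Ask Entrez for fasta instead of the default genbank format;
# this has to happen before the record is fetched.
$dbobj->request_format("fasta");

my $seq = $dbobj->get_Seq_by_acc($accession);

# Added guard: stop with a message if nothing usable came back
die "could not retrieve $accession\n" unless defined $seq;

my $length = length $seq->seq();
print "$accession has $length base pairs\n";
```

Note that a fasta request only returns the sequence and description line,
so any feature annotation in the genbank record is not available this way.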

Joel

On Wed, 21 Mar 2001, Jason Stajich wrote:

> Most NT contigs do not contain any sequence; they are just an
> annotation with references to clones.  So if you look at the record on
> NCBI there is no sequence, and bioperl is really not sure what to do with
> this.  Admittedly it should not balk, but that is the reason it is not
> working for the NT accessions you list.  If it really did work in 0.6.2
> then it is probably because we were using a different CGI script to query.
> I really don't know what all the appropriate web querying points are
> for Entrez, so I only followed the instructions on the NCBI site, and
> that is the info you are getting back.  You might try querying with
> the batch mode and see if it does anything different.
> 
> <<jason wishing there was a simple ncbi corba db model that we could just
> query>>
> 
> -Jason
> 
> 
> 
> On Wed, 21 Mar 2001 bioperl-bugs@bioperl.org wrote:
> 
> > JitterBug notification
> >
> > new message incoming/931
> >
> > Message summary for PR#931
> > 	From: Joe Ryan <jfryan@nhgri.nih.gov>
> > 	Subject: bug in entrez retrieval
> > 	Date: Wed, 21 Mar 2001 11:58:33 -0500
> > 	0 replies 	0 followups
> >
> > ====> ORIGINAL MESSAGE FOLLOWS <====
> >
> > Dear bioperl developers,
> >
> > I have recently started having problems with some code which uses
> > bioperl to retrieve sequences from Entrez.  Some (probably most)
> > accessions work fine, but some valid accessions are not being retrieved.
> >
> > Before upgrading from version 0.6.2 of bioperl we were having problems
> > with accession numbers that include a version (e.g. asking for
> > NT_004705.1 would return NT_004705.2, which was the latest version
> > of the sequence).
> >
> > After upgrading to version 0.7.0 we now have a bunch of accessions
> > that fail completely.
> >
> > The following is some code which shows the problem.
> >
> > NAME: get_nt_length.pl
> > ---------------------------------------------------------------------------
> > #!/usr/local/bin/perl -w
> >
> > use strict;
> > use Bio::DB::GenBank;
> > use Bio::SeqIO;
> >
> > my $accession = shift @ARGV;
> > my $out = Bio::SeqIO->new('-fh' => \*STDOUT, '-format' => 'Fasta');
> > my $dbobj = Bio::DB::GenBank->new();
> > my $seq = $dbobj->get_Seq_by_acc($accession);
> > my $length = length $seq->seq();
> > print "$accession has $length base pairs\n";
> >
> > ---------------------------------------------------------------------------
> >
> > The following accessions work: AF284033, NM_002739, BG370814
> > The following fail: NT_004705, NT_019547
> >
> > Could someone let me know whether this is a known bug and, if so,
> > when it might be fixed, or whether I am doing something wrong on
> > my end?  I may be able to delve into the code a bit if it looks
> > like none of you will be able to get to it soon.  If someone could
> > point me to the module that I should check, that would save me some
> > time.  I was also considering using "idfetch" from the NCBI toolkit
> > as an alternative.
> >
> > Thanks,
> > Joe
> > --
> > Joseph Ryan <jfryan@nhgri.nih.gov>
> > National Human Genome Research Institute
> >
> >
> >
> >
> > _______________________________________________
> > Bioperl-guts-l mailing list
> > Bioperl-guts-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-guts-l
> >
> 
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center
> http://www.chg.duke.edu/