[Bioperl-l] Re: [Bioperl-guts-l] problems with Bio::DB::GenBank

Jason Eric Stajich jason@cgt.mc.duke.edu
Thu, 20 Sep 2001 15:59:31 -0400 (EDT)


(bioperl-guts-l is for CVS messages and bioperl administration;
bioperl-l is a better place to ask questions.)

Bio::DB::GenBank is always going to be problematic because of the way we
connect to NCBI.  HTTP connections are frequently dropped, and their site
frequently fails to return proper data (the [<html] in your exception
suggests the server sent back an HTML page rather than a FASTA record).
I am at a loss for a solution at this point, other than building a
netentrez wrapper/XS implementation (Johnathan Epstein has mentioned he
might be able to lead this effort), as the problem is really on NCBI's
side: they drop connections periodically.  You can use Bio::DB::EMBL
instead, which will be slower from California but has typically been much
more reliable.

I would suggest building the GenBank object as

my $db = Bio::DB::GenBank->new(-verbose => 1);

to get more verbose output.
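
A minimal sketch of guarding the fetch so the script does not die on an
undefined result, and of falling back to EMBL when GenBank fails.
fetch_with_fallback is a hypothetical helper (not part of BioPerl); it
assumes only that each database object offers get_Seq_by_acc, which both
Bio::DB::GenBank and Bio::DB::EMBL provide.  It has not been run against
the live servers.

```perl
use strict;
use warnings;

# Try each database handle in turn and return the first sequence object
# obtained.  fetch_with_fallback is a hypothetical helper, not a BioPerl
# function; each handle only needs a get_Seq_by_acc method.
sub fetch_with_fallback {
    my ($acc, @dbs) = @_;
    for my $db (@dbs) {
        my $seq;
        # eval traps exceptions thrown while fetching or parsing
        my $ok = eval { $seq = $db->get_Seq_by_acc($acc); 1 };
        if (!$ok) {
            warn ref($db), " failed for $acc: $@";
            next;
        }
        return $seq if defined $seq;
        warn ref($db), " returned nothing for $acc\n";
    }
    return;    # every source failed
}

# Usage (needs BioPerl and network access):
#   use Bio::DB::GenBank;
#   use Bio::DB::EMBL;
#   my $seq = fetch_with_fallback('AI834759',
#       Bio::DB::GenBank->new(-verbose => 1), Bio::DB::EMBL->new);
#   print $seq->seq, "\n" if defined $seq;
```

The caller still has to check the return value, since every source can
fail on the same run.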

As an aside, you will also get shut off periodically by NCBI if you make a
lot of queries to their site (i.e. by running this script repeatedly).  I
think this is done selectively by IP address, but I have not seen any
documentation for it, only empirical reports from other people
experimenting.
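
One way to reduce the chance of being shut off is to pause between
requests.  The sketch below is only an illustration: fetch_politely is a
hypothetical helper, and the right pause length is a guess, since I know
of no documented limit.

```perl
use strict;
use warnings;

# Call $fetch->($id) for each id, sleeping $pause whole seconds between
# calls.  fetch_politely and the pause length are illustrative
# assumptions, not BioPerl features or NCBI rules.
sub fetch_politely {
    my ($fetch, $pause, @ids) = @_;
    my %result;
    for my $i (0 .. $#ids) {
        sleep $pause if $i > 0 && $pause > 0;
        # a failed fetch is stored as undef instead of killing the loop
        $result{ $ids[$i] } = eval { $fetch->($ids[$i]) };
    }
    return \%result;
}

# Usage (needs BioPerl and network access):
#   use Bio::DB::GenBank;
#   my $gb   = Bio::DB::GenBank->new;
#   my $seqs = fetch_politely(sub { $gb->get_Seq_by_acc(shift) },
#                             3, 'AI834759');
```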

-jason
On Thu, 20 Sep 2001, Brian C. Thomas wrote:

> Hi
>
> I am having a bit of a problem with Bio::DB::GenBank.
> I have used this module numerous times in the past, and I don't know
> why I am getting this response now.
>
> Here's the code...
>
> ------------------------------------------
> #!/usr/bin/perl -w
> use Bio::DB::GenBank;
>
> $gb = new Bio::DB::GenBank;
> my($id) = "AI834759";
> $seqobj = $gb->get_Seq_by_id($id);
> print $seqobj->seq() . "\n";
> ------------------------------------------
>
> here's the output...
>
> ------------------------------------------
> > perl /tmp/test_webget.pl
> Can't call method "seq" on an undefined value at /tmp/test_webget.pl line 7.
> ------------------------------------------
>
> when I add in $gb->request_format('fasta'), I get this output...
> ------------------------------------------
> > perl /tmp/test_webget.pl
> -------------------- EXCEPTION --------------------
> MSG: Attempting to set the sequence to [<html] which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/lib/perl5/site_perl/Bio/PrimarySeq.pm:243
> STACK Bio::PrimarySeq::new /usr/lib/perl5/site_perl/Bio/PrimarySeq.pm:218
> STACK Bio::Seq::new /usr/lib/perl5/site_perl/Bio/Seq.pm:132
> STACK Bio::SeqIO::fasta::next_primary_seq /usr/lib/perl5/site_perl/Bio/SeqIO/fasta.pm:130
> STACK Bio::SeqIO::fasta::next_seq /usr/lib/perl5/site_perl/Bio/SeqIO/fasta.pm:85
> STACK Bio::DB::WebDBSeqI::get_Seq_by_id /usr/lib/perl5/site_perl/Bio/DB/WebDBSeqI.pm:141
> STACK toplevel /tmp/test_webget.pl:7
> -------------------------------------------
> ------------------------------------------
>
> Any thoughts?
>
> Thanks,
>
> BCT
> _______________________________________________
> Bioperl-guts-l mailing list
> Bioperl-guts-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-guts-l
>

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu