[Bioperl-l] Limit on number of Bio::Seqs returned from Bio::DB::GenBank::get_Stream_by_batch

CHALFANT_CHRIS_M@Lilly.com CHALFANT_CHRIS_M@Lilly.com
Mon, 29 Apr 2002 09:36:07 -0500


Ewan and Lincoln,

Thanks for the replies.  We do download and index the GenBank flat files, 
and we are able to serve most of our record requests that way.  There 
are still cases, however, when we are unable to find a record in our local 
system (our local system is out of sync or unavailable, the accession 
number requested is invalid, the record has been removed from GenBank at 
the submitter's request, etc.).  In these cases, we would like to fall back 
to the web.  We specifically fall back to the web for GenPept records, as 
we don't have access to flat files for them.
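
A minimal sketch of the local-first, web-second lookup described above, 
in plain Perl.  The name fetch_with_fallback and the [name, coderef] 
source pairs are hypothetical, chosen only for illustration; in practice 
each coderef would wrap the real local index lookup or the real remote 
query.

```perl
use strict;
use warnings;

# Try each source in order and return the first record found, together
# with the name of the source that supplied it.
# Each source is a [name, coderef] pair (a hypothetical convention):
# the coderef takes an ID and returns a record or undef, and it may die
# (for example when the local system is unavailable).
sub fetch_with_fallback {
    my ($id, @sources) = @_;
    for my $source (@sources) {
        my ($name, $lookup) = @$source;
        my $record;
        my $ok = eval { $record = $lookup->($id); 1 };
        warn "$name lookup failed for $id: $@" unless $ok;
        return ($record, $name) if defined $record;
    }
    return;    # not found in any source
}
```

An ID missing from the first source, or a source that dies, simply moves 
on to the next one; an empty list comes back only when every source has 
been tried.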

I would like to stick with BioPerl, but may implement the Boulder::Genbank 
fix (in fact, it's what I used before BioPerl 1.0) in the short term.  Is 
there any intention to beef up Bio::DB::GenBank to use batch retrieval for 
requests on the order of 10,000-20,000 records?  Or to fold in the code 
from Boulder::Genbank?
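
A rough sketch of client-side batching for a large request, in plain 
Perl.  It assumes the per-request limit really is 20, which is only the 
number observed in this thread and not a documented value.  The names 
fetch_in_batches and $fetch_batch are hypothetical; the callback would 
wrap whatever remote call is actually used.

```perl
use strict;
use warnings;

# Split a list of IDs into slices of at most $size and pass each slice
# (as an array ref) to a caller-supplied fetch routine.  Whatever the
# routine returns for each slice is collected, in order.
sub fetch_in_batches {
    my ($ids, $size, $fetch_batch) = @_;
    die "batch size must be a positive number\n" unless $size && $size > 0;
    my @pending = @$ids;    # copy so the caller's list is left intact
    my @results;
    while (my @batch = splice(@pending, 0, $size)) {
        push @results, $fetch_batch->(\@batch);
    }
    return @results;
}
```

With 45 IDs and a size of 20, the callback is invoked three times, with 
20, 20 and 5 IDs.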

Chris





Lincoln Stein <lstein@cshl.org>
Sent by: bioperl-l-admin@bioperl.org
04/26/2002 09:14 AM
Please respond to lstein

 
        To:     Ewan Birney <birney@ebi.ac.uk>, CHALFANT_CHRIS_M@lilly.com
        cc:     bioperl-l@bioperl.org
        Subject:        Re: [Bioperl-l] Limit on number of Bio::Seqs returned from 
Bio::DB::GenBank::get_Stream_by_batch



Genbank/Entrez has a habit of imposing arbitrary limits on downloads, 
complicating Bio::DB::GenBank and brethren considerably. 

This is not an approved, bioperl-compliant, or even supported solution, but 
as a temporary measure you can use the Boulder::Genbank modules 
(http://stein.cshl.org/software/boulder/) to get the genbank entries you need 
if you don't want to download the entire genbank distribution.  This module 
jumps through several hoops in order to reissue requests when Entrez cuts you 
off.  You can then feed the genbank flat files that Boulder returns to 
Bio::SeqIO to get proper bioperl objects.

Lincoln

On Thursday 25 April 2002 17:05, Ewan Birney wrote:
> On Thu, 25 Apr 2002 CHALFANT_CHRIS_M@Lilly.com wrote:
> > Is there an upper limit on the number of sequences returned by
> > Bio::DB::GenBank::get_Stream_by_batch?  I seem to be limited to 20. Is
> > there a way to increase this limit?
>
> It is probably best to download all the data locally and use something
> like Bio::Index::GenBank.
>
> > Chris
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
>
> -----------------------------------------------------------------
> Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
> <birney@ebi.ac.uk>.
> -----------------------------------------------------------------