[Bioperl-l] Batch retrieval partially implemented in Bio::DB::GenBank/GenPept

Chris Fields cjfields at uiuc.edu
Wed May 3 21:09:36 UTC 2006


Just wanted to let you guys know I have added a few bits and pieces to
Bio::DB::Gen* and Bio::DB::NCBIHelper for batch retrieval using
epost/efetch.  I didn't want to break anything too severely, so at the moment
you can only use this through get_seq_stream (i.e. NOT through the get_Stream*
methods yet).  I also added tests to DB.t, a few each for protein and
nucleotide retrieval using batch mode, and so far they all pass fine.

I haven't yet tested the upper limit on the number of sequences to see whether
it's at all comparable to just using efetch, but batch mode seems a bit faster.
The eutils coursebook states that one should only post ~500 IDs at a time (I
think you can get a bit higher though).
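
To make the ~500-per-post guideline concrete, here is a rough sketch of how a
long GI list could be split into batches and turned into epost/efetch
requests.  It is written in Python and is NOT the code that went into
Bio::DB::NCBIHelper; the function names, the 500 cap, and the default rettype
are assumptions for illustration only.  It only builds the requests and never
touches the network.

```python
from urllib.parse import urlencode

# NCBI E-utilities base URL.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
# Assumed cap, taken from the "~500 at a time" guideline; not a hard NCBI limit.
BATCH_SIZE = 500


def chunk_ids(ids, size=BATCH_SIZE):
    """Split a list of GI numbers into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def epost_request(db, gi_batch):
    """Return (url, form_body) for POSTing one batch of GIs to epost."""
    body = urlencode({"db": db, "id": ",".join(str(g) for g in gi_batch)})
    return EUTILS + "/epost.fcgi", body


def efetch_url(db, webenv, query_key, rettype="gb"):
    """Return the efetch URL that reads back a batch stored by epost.

    `webenv` and `query_key` are the values epost returns in its XML reply.
    """
    params = {
        "db": db,
        "WebEnv": webenv,
        "query_key": query_key,
        "rettype": rettype,
        "retmode": "text",
    }
    return EUTILS + "/efetch.fcgi?" + urlencode(params)
```

Each batch would be posted once, and the WebEnv/query_key pair from the epost
reply would then be handed to efetch_url to pull the records back.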

Also, at the moment it only works for GIs (NOT accessions, which apparently
epost does not accept).  If we want to continue using this method for
retrieval then we may need a workaround for accessions.
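
One possible shape for such a workaround, sketched in Python under the
assumption that a GI is always a plain run of digits: split the incoming IDs
so that only GIs go to epost, and everything else is routed some other way
(for example through the existing non-batch efetch path).  The function name
is hypothetical and this is not existing BioPerl behaviour.

```python
def split_ids(ids):
    """Separate GI numbers from everything else (e.g. accessions).

    Assumes a GI is a non-empty string of ASCII digits; any other token is
    treated as a non-GI identifier and returned in the second list.
    """
    gis, others = [], []
    for raw in ids:
        token = str(raw).strip()
        if token.isascii() and token.isdigit():
            gis.append(token)
        else:
            others.append(token)
    return gis, others
```

With this, a mixed list such as 4887706, J00522, 431229 would yield the two
GIs for the batch route and leave J00522 for separate handling.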

CJF

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 




