[Bioperl-l] Bio::DB::GenBank batch mode usage
Chris Fields
cjfields at illinois.edu
Thu Jul 2 19:29:29 UTC 2009
If you are just downloading the records to a file, it might be better
to retrieve the raw records using EUtilities, provided you have
either the accession number or the GI. Downloading via
Bio::DB::GenBank parses each record first, so you then have to write
it back out to a file via Bio::SeqIO.
---------------------------
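use strict;
use warnings;
use Bio::DB::EUtilities;

my @ids = ();    # your GIs or accession numbers here

# efetch returns the raw GenBank records; nothing is parsed on the client
my $factory = Bio::DB::EUtilities->new(
    -eutil   => 'efetch',
    -db      => 'nucleotide',
    -rettype => 'genbank',
    -id      => \@ids,
);

# write the response straight to a file
$factory->get_Response(-file => "records.gb");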
---------------------------
If you have a long list of IDs you can use epost first, then efetch
using the search history. This page has a few recipe scripts:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
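
A minimal sketch of the epost-then-efetch route, in case it helps. It
assumes a BioPerl install that provides Bio::DB::EUtilities and network
access to NCBI. The sub name fetch_via_history, the email address and
the output file name are placeholders of mine, not anything from the
cookbook, and the -db/-rettype values simply mirror the efetch example
above.

```perl
use strict;
use warnings;

# Hypothetical helper: post the IDs with epost, then efetch them
# through the history that epost returns.
sub fetch_via_history {
    my ($ids, $out_file) = @_;

    # Loaded lazily so the file compiles even without BioPerl installed.
    require Bio::DB::EUtilities;

    my $factory = Bio::DB::EUtilities->new(
        -eutil          => 'epost',
        -db             => 'nucleotide',
        -id             => $ids,
        -email          => 'you@example.org',    # placeholder address
        -keep_histories => 1,
    );

    # epost hands back a history (WebEnv plus query key) for the posted IDs
    my $history = $factory->next_History
        or die "epost did not return a history\n";

    # reuse the same factory for efetch, pointing it at that history
    $factory->set_parameters(
        -eutil   => 'efetch',
        -rettype => 'genbank',
        -history => $history,
    );

    # write the raw records straight to the output file
    $factory->get_Response(-file => $out_file);
    return $out_file;
}

# Run only when executed as a script, with IDs given on the command line.
unless (caller) {
    fetch_via_history(\@ARGV, 'records.gb') if @ARGV;
}
```

Saved as, say, fetch_history.pl (any name works), it would be run as
"perl fetch_history.pl ID1 ID2 ..." and should leave the records in
records.gb.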
chris
On Jul 2, 2009, at 1:50 PM, John Tyree wrote:
> I'm trying to use Bio::DB::GenBank to download a large number of files
> by accession number. The docs say not to do this in normal mode to
> reduce server load. There is some kind of helper function associated
> with this.
>
> %params = Bio::DB::GenBank->get_params('batch');
>
> But I don't understand how to use it. If you pass the hash using:
>
> Bio::DB::GenBank->new(%params);
>
> it raises the following and dies:
>
> --------------------- WARNING ---------------------
> MSG: invalid retrieval type tool must be one of
> (pipeline,io_string,tempfile
> ---------------------------------------------------
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: seq_start() must be integer value if set
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib64/perl5/site_perl/5.10.0/Bio/Root/Root.pm:357
> STACK: Bio::DB::NCBIHelper::seq_start
> /usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:416
> STACK: Bio::DB::NCBIHelper::new
> /usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:117
> STACK: Find_Patient_By_AccNo.pl:93
>
> There is a deprecated method called get_Stream_by_batch() but how does
> one achieve batch mode using the proper get_Stream_by_id() ?
>
> Thanks,
> John
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l