[Bioperl-l] Pulling down data from NCBI

Chris Fields cjfields at illinois.edu
Mon Feb 1 21:31:48 UTC 2010


Abhi,

The accession in question is for a record containing a set of sequences, not just one sequence (it's a contig record).  The NCBI web interface performs an esearch on it to return the 34K seqs; the equivalent with EUtilities is:

================================
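use strict;
use warnings;
use feature 'say';    # 'say' is not enabled by default

use Bio::DB::EUtilities;

my $id = 'AAPP01000000[ACCN]';

# esearch with -usehistory keeps the hits on the NCBI history server
my $factory = Bio::DB::EUtilities->new(
       -eutil      => 'esearch',
       -db         => 'nucleotide',
       -term       => $id,
       -usehistory => 'y');

# number of records matching the search
say $factory->get_count;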

# do more here...

================================

The 'do more here' part is covered in the cookbook, and will require you to retrieve the seqs in chunks.
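
For what it's worth, here is a rough sketch of what the chunked retrieval could look like, loosely following the history-based efetch pattern in the cookbook. The helper names (chunk_offsets, fetch_fasta_in_chunks), the chunk size of 500, the retry limit, the output file name, and the email address are all placeholders of mine, not anything prescribed by BioPerl or NCBI, so treat it as a starting point rather than a tested recipe.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the -retstart offsets needed to page through $count records
# in chunks of $retmax (hypothetical helper, for illustration only).
sub chunk_offsets {
    my ($count, $retmax) = @_;
    die "retmax must be positive\n" unless $retmax > 0;
    my @offsets;
    for (my $start = 0; $start < $count; $start += $retmax) {
        push @offsets, $start;
    }
    return @offsets;
}

# Run an esearch for $term, then efetch all hits as FASTA into $outfile,
# one chunk of $retmax records at a time (hypothetical helper).
sub fetch_fasta_in_chunks {
    my ($term, $outfile, $retmax) = @_;
    require Bio::DB::EUtilities;    # loaded lazily; needs BioPerl installed

    my $factory = Bio::DB::EUtilities->new(
        -eutil      => 'esearch',
        -db         => 'nucleotide',
        -term       => $term,
        -email      => 'you@example.org',    # placeholder address
        -usehistory => 'y',
    );
    my $count = $factory->get_count;
    my $hist  = $factory->next_History
        or die "No history data returned\n";

    # Reuse the same factory for efetch; -db carries over from the esearch.
    $factory->set_parameters(
        -eutil   => 'efetch',
        -rettype => 'fasta',
        -history => $hist,
    );

    open my $out, '>', $outfile or die "Can't open $outfile: $!\n";
    for my $retstart (chunk_offsets($count, $retmax)) {
        $factory->set_parameters(-retmax => $retmax, -retstart => $retstart);
        my ($tries, $buf) = (0, '');
        until (eval {
            $buf = '';    # discard any partial chunk from a failed attempt
            $factory->get_Response(-cb => sub { $buf .= $_[0] });
            1;
        }) {
            die "Server error: $@" if ++$tries >= 5;    # arbitrary limit
            warn "Server error, retry #$tries\n";
        }
        print {$out} $buf;    # write the chunk only once it arrived whole
    }
    close $out or die "Can't close $outfile: $!\n";
    return $count;
}

# Only touch the network when a search term is given on the command line.
if (@ARGV) {
    my ($term, $outfile) = @ARGV;
    my $n = fetch_fasta_in_chunks($term, $outfile // 'seqs.fa', 500);
    print "Requested $n records\n";
}
```

You would run it as something like: perl fetch_chunks.pl 'AAPP01000000[ACCN]' seqs.fa (the script name is made up). Buffering each chunk before writing means a retry after a server hiccup does not leave half a chunk duplicated in the output file.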

chris

On Feb 1, 2010, at 2:45 PM, Abhishek Pratap wrote:

> Thank you guys for the very quick responses.  My bad, I trusted my fingers.
> 
> Now that this is working, the output that I am getting is not what I
> want. I am sure I am missing the correct way of doing it. If I
> search the Nucleotide db at NCBI for the accession number
> "AAPP01000000", I see some 34K records. What I need to do is pull down
> those sequences as FASTA files.
> 
> I am referring to
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook  but didn't quite
> find a similar example.
> 
> Thanks!
> -Abhi
> 
> On Mon, Feb 1, 2010 at 3:39 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:
>> Looks like you've misspelled one of the parameters. It should be
>> 'efetch' not 'efecth'
>> 
>> Kevin Brown
>> Center for Innovations in Medicine
>> Biodesign Institute
>> Arizona State University
>> 
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>> Abhishek Pratap
>>> Sent: Monday, February 01, 2010 1:36 PM
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] Pulling down data from NCBI
>>> 
>>> Hi All
>>> 
>>> I am looking to batch download some 34K nucleotide sequences
>>> corresponding to an NCBI accession number. I tried the following and
>>> am getting an error. Does it have anything to do with the recent update
>>> to the code that Chris was discussing?
>>> 
>>> 
>>> 
>>> my $factory = Bio::DB::EUtilities->new(
>>>     -eutil  => 'efecth',
>>>     -db     => 'nucleotide',
>>>     -retype => 'fasta',
>>>     -id     => $id
>>> );
>>> 
>>> 
>>> ----------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: efecth not supported
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw
>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:357
>>> STACK: Bio::Tools::EUtilities::EUtilParameters::eutil
>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Tools/EUtilities/EUtilParameters.pm:452
>>> STACK: Bio::Root::RootI::_set_from_args
>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/RootI.pm:546
>>> STACK: Bio::Tools::EUtilities::EUtilParameters::new
>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Tools/EUtilities/EUtilParameters.pm:193
>>> STACK: Bio::DB::EUtilities::new
>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/DB/EUtilities.pm:74
>>> STACK: ./getDatafromNCBI.pl:9
>>> 
>>> 
>>> -Abhi
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>> 
> 




