[Bioperl-l] downloading multiple contigs from ncbi nucleotide database

Rohit Ghai ghai.rohit at gmail.com
Fri Aug 21 13:40:02 UTC 2009


Thanks! I have made the change. No errors yet, so I'm keeping my fingers
crossed.

cheers
Rohit
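
For anyone finding this thread later, here is a minimal sketch of combining
the longer timeout with a simple retry loop, so that one failed request does
not stop the whole run. The helper name with_retries, the retry count and the
delay are hypothetical and not part of BioPerl. It also assumes that
get_Response dies (throws) when the request times out, which is worth
checking against your BioPerl version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): call $action up to $max_tries
# times, sleeping $delay seconds after each failed attempt. Returns 1 on the
# first success and 0 if every attempt died.
sub with_retries {
    my ($action, $max_tries, $delay) = @_;
    for my $attempt (1 .. $max_tries) {
        # eval {} traps any die(), so a thrown exception ends only this attempt
        my $ok = eval { $action->($attempt); 1 };
        return 1 if $ok;
        my $err = $@ || "unknown error\n";
        warn "attempt $attempt of $max_tries failed: $err";
        sleep $delay if $delay && $attempt < $max_tries;
    }
    return 0;
}

# Usage with the factory from the original message. It needs BioPerl and
# network access, so it is left commented out here:
#
#   $factory->ua->timeout(600);   # 600 is an arbitrary value above 180 s
#   with_retries(sub { $factory->get_Response(-file => $genbank_file) }, 3, 10)
#       or warn "giving up on $ele\n";
```

Using a plain eval block avoids depending on the try/catch syntax, at the
cost of not distinguishing exception classes.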

On Fri, Aug 21, 2009 at 2:50 PM, Mark A. Jensen <maj at fortinbras.us> wrote:

> Hi Rohit-
> Re: timeout, you could try
> $factory->ua->timeout($number_greater_than_180_sec)
> before issuing the request.
> cheers MAJ
> ----- Original Message -----
> From: "Rohit Ghai" <ghai.rohit at gmail.com>
> To: <Bioperl-l at lists.open-bio.org>
> Sent: Friday, August 21, 2009 7:34 AM
> Subject: [Bioperl-l] downloading multiple contigs from ncbi nucleotide
> database
>
>
>> Hello all
>>
>> I would like to download the wgs sequences of the unfinished genomes from
>> ncbi.
>> (genomes in progress) from http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi
>>
>> here's an example accession
>>
>> NZ_ACVD00000000
>>
>> and here's the link to the accession at genbank
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/NZ_ACVD00000000
>>
>> This record contains the accessions that belong to this record in the
>> following line in the genbank output
>>
>> WGS         NZ_ACVD01000001-NZ_ACVD01000139
>>
>> NZ_ACVD01000001-NZ_ACVD01000139 is the range of accession numbers for
>> the contigs that belong to this record.
>>
>> here's a link
>>
>>
>> http://www.ncbi.nlm.nih.gov/sites/entrez?db=Nucleotide&cmd=Search&term=NZ_ACVD01000001:NZ_ACVD01000139[PACC]
>>
>>
>> The bioperl related question is...
>>
>> Since these are unassembled genomes, there are several contigs for each
>> one, and they are all available in this record.
>>
>> Is it possible to download a range without trying to recreate each
>> accession
>> number?
>>
>> Alternatively, each contig can be downloaded individually, which would
>> mean generating the following accessions
>>
>> NZ_ACVD01000001
>> NZ_ACVD01000002
>> NZ_ACVD01000003
>> .
>> .
>> .
>> NZ_ACVD01000139
>>
>> from  NZ_ACVD01000001-NZ_ACVD01000139
>>
>>
>> I can recreate these numbers and download each one separately. However,
>> sometimes I get a timeout exception
>> and the whole thing stops.
>>
>> The code (copied shamelessly from the BioPerl website; it works great for
>> single accessions):
>>
>> my $id = "NZ_ACVD00000000";
>> my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>>                                        -db      => 'nucleotide',
>>                                        -id      => $id,
>>                                        -rettype => 'gbwithparts');
>>
>> $factory->get_Response(-file => 'fullcontig.gb');
>>
>>
>> I did try to catch the exceptions from get_Response, but it's not
>> working as expected. Maybe someone can point out what I'm doing wrong
>> here. For some reason, the code never seems to reach any print statement
>> in the catch construct.
>>
>> $ele = "somecontig id";
>>
>> try {
>>     print "\t[$numtries] TRYING TO DOWNLOAD $ele...\n";
>>     $factory->get_Response(-file => "$genbank_file");
>> } catch Bio::Root::Exception with {
>>     my $err = shift;
>>     if (! defined $err) {
>>         print "MAY HAVE DOWNLOADED $ele..\n";
>>     } else {
>>         print "PROBABLE TIMEOUT ERROR\n";
>>         print "$err\n";
>>     }
>> };
>>
>>
>> Or is it possible to somehow increase the timeout for the get_Response
>> method?
>>
>> thanks in advance!
>>
>>
>> regards
>>
>> Rohit
>
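
The accession list described in the question (NZ_ACVD01000001 through
NZ_ACVD01000139) can be generated from the WGS range string rather than typed
out. A minimal sketch in plain Perl follows. The function name
expand_wgs_range is hypothetical, and it assumes both ends of the range share
the same prefix and a zero-padded numeric suffix, as in the example record.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "PREFIX<first>-PREFIX<last>" into a list of
# accessions, preserving the zero-padded width of the numeric part.
sub expand_wgs_range {
    my ($range) = @_;
    my ($first, $last) = split /-/, $range, 2;
    die "not a range: $range\n" unless defined $last;

    # Split each end into a non-numeric prefix and its trailing digits
    my ($prefix, $start) = $first =~ /^(.*?)(\d+)$/
        or die "no trailing digits in $first\n";
    my ($last_prefix, $end) = $last =~ /^(.*?)(\d+)$/
        or die "no trailing digits in $last\n";
    die "prefixes differ in $range\n" unless $prefix eq $last_prefix;

    # int() forces numeric iteration; sprintf restores the leading zeros
    my $width = length $start;
    return map { sprintf '%s%0*d', $prefix, $width, $_ }
           int($start) .. int($end);
}

my @ids = expand_wgs_range('NZ_ACVD01000001-NZ_ACVD01000139');
print scalar(@ids), " accessions, $ids[0] to $ids[-1]\n";
```

Each element of @ids can then be passed as the -id value of an efetch
request like the one in the original message.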


