[Bioperl-l] Downloading a sequence in genbank format - related problem

Georg Otto georg.otto at tuebingen.mpg.de
Wed May 16 09:19:06 UTC 2007


Dear all,

I have a problem that also has to do with downloading data from GenBank,
so I am putting it in this thread.

I am trying to get all entries for the organism Danio rerio using
something like this:


use strict;
use warnings;

use Bio::Seq;
use Bio::SeqIO;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

# Query the nucleotide database for all Danio rerio entries.
my $query = "Danio rerio[ORGN]";
my $query_obj = Bio::DB::Query::GenBank->new(-db    => 'nucleotide',
                                             -query => $query);
my $gb_obj = Bio::DB::GenBank->new;
my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);

# Open the output stream once, in append mode, rather than once per sequence.
my $out = Bio::SeqIO->new(-format => 'fasta',
                          -file   => '>>output.fas');

while (my $seq_obj = $stream_obj->next_seq) {
  $out->write_seq($seq_obj);
}


However, the download aborts after a few thousand entries. I do not
think this is due to the request itself or to problems with specific
entries, since the number of sequences transferred before it stops
varies from run to run. It seems more likely that GenBank is
terminating the connection.

Does anybody have a suggestion for a better strategy to achieve what I
want (e.g. a different kind of query, or a way to resume the download at
the point where it terminated)?
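
The "resume" idea could be sketched roughly as follows, in plain Perl and
independent of BioPerl. Everything in it is an assumption made for
illustration: the helper names (batches, load_done, fetch_resumable), the
checkpoint file of already-fetched IDs, and the caller-supplied fetch
callback. It is not taken from BioPerl and has not been run against GenBank.

```perl
use strict;
use warnings;
use IO::Handle;

# Split a list of IDs into batches of at most $size elements.
sub batches {
    my ($size, @ids) = @_;
    my @out;
    push @out, [ splice(@ids, 0, $size) ] while @ids;
    return @out;
}

# Read the IDs recorded by an earlier run (one per line).
# Returns an empty hash ref if the checkpoint file does not exist yet.
sub load_done {
    my ($path) = @_;
    my %done;
    return \%done unless -e $path;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    while (my $id = <$fh>) {
        chomp $id;
        $done{$id} = 1 if length $id;
    }
    close $fh;
    return \%done;
}

# Fetch every ID not yet listed in the checkpoint file, batch by batch.
# $arg{fetch} is a hypothetical caller-supplied code ref that receives an
# array ref of IDs and dies on failure. A failed batch is retried up to
# $arg{retries} times; IDs are recorded only after their batch succeeds.
# Returns the number of IDs that were still outstanding at the start.
sub fetch_resumable {
    my (%arg) = @_;
    my $done = load_done($arg{checkpoint});
    my @todo = grep { !$done->{$_} } @{ $arg{ids} };

    open my $log, '>>', $arg{checkpoint}
        or die "Cannot append to $arg{checkpoint}: $!";
    $log->autoflush(1);    # keep the checkpoint current if the run dies

    for my $batch (batches($arg{batch_size}, @todo)) {
        my $ok;
        for my $attempt (1 .. $arg{retries}) {
            $ok = eval { $arg{fetch}->($batch); 1 };
            last if $ok;
            warn "Batch failed (attempt $attempt): $@";
            sleep $arg{delay} if $attempt < $arg{retries};
        }
        die "Giving up after $arg{retries} attempts\n" unless $ok;
        print {$log} "$_\n" for @$batch;
    }
    close $log;
    return scalar @todo;
}
```

With BioPerl, the ID list might come from $query_obj->ids, and the callback
might pass each batch to $gb_obj->get_Stream_by_id and write the resulting
records with Bio::SeqIO. Rerunning the script after an abort would then skip
the IDs already listed in the checkpoint file.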

Best,

Georg


"Diogo Tschoeke" <diogoat at gmail.com> writes:

> Dear All,
>
> I need to download a lot of Leishmania major sequences in GenBank
> format.
> I can't download them from the NCBI web page, because the downloaded
> files are corrupted when I use a browser.
> So I looked for a script to download these files and found
> something like this:
>
>
> #########################################################
> use strict;
> use warnings;
>
> use Bio::Seq;
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
> my $query = Bio::DB::Query::GenBank->new
>                                 (-query   =>'Leishmania major [Organism]',
>                                 -db      => 'nucleotide');
> my $gb = new Bio::DB::GenBank;
> my $seqio = $gb->get_Stream_by_query($query);
>
> my $out = Bio::SeqIO->new(-format => 'genbank',
>                           -file => '>>teste6.gb');
> $out->write_seq($seqio);
> #########################################################
>
> And the system returns these errors:
> [diogo1 at genome perl]$ perl teste6.pl
>
> -------------------- WARNING ---------------------
> MSG:  Bio::SeqIO::genbank=HASH(0x96c0f08) is not a SeqI compliant module.
> Attempting to dump, but may fail!
> ---------------------------------------------------
> Can't locate object method "seq" via package "Bio::SeqIO::genbank" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 692.
>
> Any ideas?
>
> Thanks,
>
> Diogo Tschoeke
> Laboratory of Molecular Biology of Trypanosomatides
> Fundação Osvaldo Cruz - Fiocruz RJ, Brazil
> http:biowebdb.org <http://www.ncbs.res.in/>




