[Bioperl-l] Pulling down data from NCBI

Robert Bradbury robert.bradbury at gmail.com
Tue Feb 2 13:57:53 UTC 2010


What species/chromosome is this?

> > my $id = 'AAPP01000000[ACCN]';
>
>
One can usually download the genome sequence files, chromosome files, or
fasta files from the various FTP sites (almost all of the major genomes have
them), and that would generally be the fastest way to do it.  Or simply
look up the genome sequence at NCBI and download it using a web browser.
There is standard documentation on how to convert genome sequences into
fasta files.
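
If the download has to be scripted rather than done by hand, a plain
fetch-and-decompress is all that is needed.  The following is only a minimal
sketch using Python's standard library; the function names fetch and gunzip
are made up for illustration, and no real FTP path is assumed, so the URL has
to be looked up on the FTP site in question.

```python
import gzip
import shutil
import urllib.request


def fetch(url, dest):
    """Stream the file at `url` (ftp://, http://, https:// or file://) to `dest`."""
    # copyfileobj copies in chunks, so a large genome file is never
    # held in memory all at once.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def gunzip(src, dest):
    """Decompress a gzip-compressed download (e.g. a .fa.gz file) into `dest`."""
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest
```

Usage would be something like fetch(url, "chr1.fa.gz") followed by
gunzip("chr1.fa.gz", "chr1.fa"), where url and the file names are
placeholders.  From a shell, pointing wget or curl at the same URL and piping
through gunzip does the same job.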

If you are looking for the "big" genomes which may not be in NCBI yet, go to
the Broad Institute.  Some bacterial sequences may still only be at TIGR or
JGI until they migrate upward.

But using the standard "system" utilities (ftp, wget, curl) is usually a far
better way to do this than wrestling with BioPerl.  FTP and web servers are
written in C or C++, which is compiled and much more efficient than an
interpreted language like Perl.

Robert


