[Biopython] downloading genome Protein table

Peter Cock p.j.a.cock at googlemail.com
Wed Oct 26 15:27:37 UTC 2011


On Wed, Oct 26, 2011 at 4:11 PM, Sheila the angel
<from.d.putto at gmail.com> wrote:
> Hi All,
>
> I am having a problem downloading the genome and other information.
> For example, I ran a query on NCBI Genome for NC_008390.
> Clicking on the result gives the following link:
>
> http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genome&Cmd=ShowDetailView&TermToSearch=19840
> In my web browser I can save this page via File > Save As > out.html
>
> Furthermore, I also want to download the Protein table:
> http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genome&Cmd=Retrieve&dopt=Protein+Table&list_uids=19840
>
> I want to do this for many IDs. Is there a simple way to do it in Biopython?
>
> Thanks in Advance

Hmm, some of that might be available via Bio.Entrez, but I'm not sure.

For the protein table I would personally work with the *.ptt files from
the NCBI FTP site, e.g.

ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Burkholderia_ambifaria_AMMD_uid13490/CP000441.ptt

or:

ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Burkholderia_ambifaria_AMMD_uid58303/NC_008391.ptt

The FTP links are on the page of the first URL you gave. You can also
download all the bacterial *.ptt files as a single tarball:

ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.ptt.tar.gz
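
Once the tarball has been downloaded, something like the following rough
sketch could read every protein table in it using only the Python standard
library. It assumes the usual *.ptt layout (a title line, an "N proteins"
line, then a tab-separated table whose header row includes a "Location"
column of the form start..end). The function names parse_ptt and
iter_ptt_tarball are made up for illustration and are not part of Biopython.

```python
import csv
import io
import tarfile


def parse_ptt(handle):
    """Yield one dict per protein from an open text-mode *.ptt handle.

    Assumed layout: title line, "N proteins" line, then a tab-separated
    table whose first row holds the column names.
    """
    next(handle)  # skip the title line (organism name and coordinates)
    next(handle)  # skip the protein count line
    for row in csv.DictReader(handle, delimiter="\t"):
        # "Location" is assumed to look like "100..400"
        start, end = row["Location"].split("..")
        row["Start"], row["End"] = int(start), int(end)
        yield row


def iter_ptt_tarball(path=None, fileobj=None):
    """Yield (member_name, rows) for each *.ptt file in a gzipped tarball.

    Pass either a filename (path) or an open binary file object (fileobj).
    """
    with tarfile.open(name=path, fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".ptt"):
                raw = tar.extractfile(member)
                # The encoding is a guess; latin-1 never fails to decode
                text = io.TextIOWrapper(raw, encoding="latin-1", newline="")
                yield member.name, list(parse_ptt(text))
```

For example, after saving the tarball locally as all.ptt.tar.gz, you could
loop with "for name, rows in iter_ptt_tarball('all.ptt.tar.gz'):" and pick
out the accessions you care about by file name.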

Typically I work from the GenBank files instead (*.gbk rather than *.ptt).

Peter



