[Biopython] downloading gnome Protein table
Peter Cock
p.j.a.cock at googlemail.com
Wed Oct 26 15:27:37 UTC 2011
On Wed, Oct 26, 2011 at 4:11 PM, Sheila the angel
<from.d.putto at gmail.com> wrote:
> Hi All,
>
> I am having a problem downloading the genome and other information.
> For example, I ran a query on NCBI Genome for NC_008390.
> Clicking on the result gives the following link:
>
> http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genome&Cmd=ShowDetailView&TermToSearch=19840
> In my web browser I can save this page with File > Save As > out.html
>
> I also want to download the Protein table:
> http://www.ncbi.nlm.nih.gov/sites/entrez?Db=genome&Cmd=Retrieve&dopt=Protein+Table&list_uids=19840
>
> I want to do this for many IDs. Is there a simple way to do it in Biopython?
>
> Thanks in Advance
Hmm, some of that might be available via Bio.Entrez, not sure though.
For the protein table I would personally work with the *.ptt files from
the NCBI FTP site, e.g.
ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Burkholderia_ambifaria_AMMD_uid13490/CP000441.ptt
or:
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Burkholderia_ambifaria_AMMD_uid58303/NC_008391.ptt
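Once you have a *.ptt file on disk, reading it needs nothing beyond the
standard library. Here is a minimal sketch, assuming the usual layout: a
title line, a protein count line, then a tab-separated table whose header
row includes a "Location" column in "start..end" form. The function name
parse_ptt and the added "Start"/"End" keys are my own, not part of
Biopython, and the sample rows in the usage comment are made up.

```python
import csv
import io


def parse_ptt(handle):
    """Yield one dict per protein row of an NCBI *.ptt protein table.

    Assumes two free-text header lines followed by a tab-separated
    table whose first row holds the column names.
    """
    handle.readline()  # title line (replicon description), not used here
    handle.readline()  # protein count line, not used here
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # "Location" is assumed to look like "1..1380"
        start, end = row["Location"].split("..")
        row["Start"] = int(start)
        row["End"] = int(end)
        yield row


# Usage with made-up data (real files come from the FTP links above):
# text = (
#     "Example replicon, complete sequence - 1..3000\n"
#     "1 proteins\n"
#     "Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n"
#     "1..1380\t+\t459\t-\tdnaA\tEX_0001\t-\t-\texample protein\n"
# )
# for protein in parse_ptt(io.StringIO(text)):
#     print(protein["Synonym"], protein["Start"], protein["End"])
```

For a real file you would pass an open text handle, e.g.
open("NC_008391.ptt"), instead of the StringIO object.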
The FTP links are on the page of the first URL you gave. You can download
all the "bacteria" *.ptt files as a single tarball:
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.ptt.tar.gz
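If you go the tarball route, you do not have to unpack it first. A rough
sketch using only the standard library tarfile module follows; the
function name iter_ptt_members is hypothetical, and I am assuming the
archive has already been downloaded to a local file and that the tables
are plain ASCII text.

```python
import io
import tarfile


def iter_ptt_members(fileobj):
    """Yield (member_name, text) for each *.ptt file in a .tar.gz archive.

    fileobj must be a seekable binary file object, e.g. open(path, "rb").
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and anything that is not a protein table
            if member.isfile() and member.name.endswith(".ptt"):
                data = archive.extractfile(member).read()
                yield member.name, data.decode("ascii", errors="replace")


# Usage, assuming all.ptt.tar.gz was saved in the current directory:
# with open("all.ptt.tar.gz", "rb") as handle:
#     for name, text in iter_ptt_members(handle):
#         print(name, len(text.splitlines()))
```

Each yielded text can be wrapped in io.StringIO if you want a file-like
handle for further parsing.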
Typically I work from the GenBank files instead (*.gbk rather than *.ptt).
Peter