[BioPython] how to retrieve data from PDB

Peter biopython at maubp.freeserve.co.uk
Tue Mar 31 10:08:08 UTC 2009


On Tue, Mar 31, 2009 at 10:45 AM, chen Ku <biopython.chen at gmail.com> wrote:
> Dear Peter,
> thanks for the idea. I think I need to download all the pdb
> files first and then can use command on python mode. Can you please write
> one syntax to start with or give me the practical documentation so that I
> can try out and play with this PDBList.

Hi Chen,

To learn about the PDBList functionality, see page 4 of "The Biopython
Structural Bioinformatics FAQ" - this has some examples:
http://biopython.org/DIST/docs/cookbook/biopdb_faq.pdf

You can also read about PDBList from the built-in help:
>>> from Bio import PDB
>>> help(PDB.PDBList)
Or online at http://biopython.org/DIST/docs/api/Bio.PDB.PDBList%27.PDBList-class.html

If you really do want to download all 56,000+ PDB files (and I don't
think this is a good idea), consider using the command line tool rsync
instead of Python, see:
http://www.pdb.org/pdb/general_information/news_publications/newsletters/2003q3/focus_rsync.html

However, as I said before, you only want transcription factors with
DNA, so at most you'll need to download the 2250 protein structures in
complex with nucleotides.  I strongly urge you to find out more about
searching the PDB in order to get a list of just the few PDB reference
codes that you'll actually need - and download just those.
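
As a rough sketch of that "download just those" step, assuming you have
saved your search results as plain text containing four-character PDB
codes: the helper names parse_pdb_ids and download_structures and the
directory name "pdb_files" are made up for illustration, and the only
Biopython call relied on is PDBList.retrieve_pdb_file with its pdir
argument.

```python
def parse_pdb_ids(text):
    """Return unique, lower-cased PDB codes found in whitespace- or
    comma-separated text, keeping the order of first appearance."""
    ids = []
    seen = set()
    for token in text.replace(",", " ").split():
        code = token.lower()
        # Classic PDB codes are four alphanumeric characters starting with a digit.
        if len(code) == 4 and code[0].isdigit() and code.isalnum():
            if code not in seen:
                seen.add(code)
                ids.append(code)
    return ids


def download_structures(pdb_ids, downloader, pdir="pdb_files"):
    """Fetch each code via downloader.retrieve_pdb_file and return a
    dict mapping code -> whatever that call returns (a local file path
    when the downloader is a Bio.PDB.PDBList instance)."""
    paths = {}
    for code in pdb_ids:
        paths[code] = downloader.retrieve_pdb_file(code, pdir=pdir)
    return paths


# Usage (assumes Biopython is installed and you have network access;
# the two codes are only example input):
#   from Bio.PDB import PDBList
#   ids = parse_pdb_ids("1tup, 1a02")
#   download_structures(ids, PDBList())
```

Passing the PDBList object in as an argument keeps the parsing and
looping separate from the network access, so you can check your list of
codes before downloading anything.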

Peter



