[Biopython-dev] [Biopython - Feature #3271] Updates to PDBList.py- downloading PDB structures

redmine at redmine.open-bio.org
Sun Jul 31 20:22:05 UTC 2011


Issue #3271 has been updated by Eric Talevich.


Hi David,

Thanks for doing this. Overall, I agree with your solution. I peppered your proposed fix with review comments on GitHub:
https://github.com/DavidCain/biopython/commit/e6eef7e2a8117b6de4e9fdea3b4bd77575d383cf

Once you've looked at it again, can you submit your pdb-fixes branch as a pull request on GitHub? (If not, no worries, I can cherry-pick it. Just let us know when you're ready.)

-Eric
----------------------------------------
Feature #3271: Updates to PDBList.py- downloading PDB structures
https://redmine.open-bio.org/issues/3271

Author: David Cain
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: 
Target version: 1.57
URL: https://github.com/DavidCain/biopython


PDBList.py is somewhat out of date: it supports .Z compression, but the ftp://ftp.wwpdb.org/ server only offers .gz archives. It also relies on a system utility to decompress the downloaded archives. The default, gunzip, works well enough on POSIX systems, but Windows users have to install a command-line tool such as 7-Zip.

I've rewritten it to use Python's gzip module and to ignore the 'compression' parameter (all files are .gz anyway). I kept the 'uncompress' and 'compression' parameters for backwards compatibility, and the user can still override the default and use a system decompression tool if desired. I'm not sure this is the best way to handle it, since retrieve_pdb_file() would work just fine with system decompression and the 'compression' parameter removed entirely.
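For illustration only, here is a minimal sketch of decompressing a downloaded .gz archive with the standard gzip module instead of an external tool. It is not the code from the attached PDBList.py; the function name gunzip_file and its parameters are hypothetical, and it is written for Python 3.

```python
import gzip
import os
import shutil


def gunzip_file(gz_path, out_path=None, keep_archive=False):
    """Decompress gz_path with the gzip module and return the output path.

    Hypothetical helper, not part of Bio.PDB.PDBList.
    """
    if out_path is None:
        # Strip a trailing ".gz"; otherwise add a suffix so the
        # archive itself is never overwritten.
        if gz_path.endswith(".gz"):
            out_path = gz_path[:-3]
        else:
            out_path = gz_path + ".out"
    # Stream the data so large structure files are not held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if not keep_archive:
        os.remove(gz_path)
    return out_path
```

Because this relies only on the standard library, it should behave the same on Windows and POSIX systems without gunzip or 7-Zip installed.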

Also, when retrieve_pdb_file() is called repeatedly, urllib can open too many FTP connections and crash the caller, for example a script downloading several structures in succession. Switching to urllib2 removes this issue.
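As a rough sketch of the idea, the snippet below downloads files one at a time and closes each connection before opening the next. It is an assumption about one way to avoid piling up connections, not the change made in the branch. It is written for Python 3, where urllib2's urlopen lives in urllib.request, and the names fetch_to_file and fetch_many are hypothetical.

```python
import shutil
from contextlib import closing
from urllib.request import urlopen


def fetch_to_file(url, dest_path):
    """Download url to dest_path, closing the connection before returning.

    Hypothetical helper, not part of Bio.PDB.PDBList.
    """
    # closing() guarantees response.close() runs even if the copy fails.
    with closing(urlopen(url)) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


def fetch_many(urls_and_paths):
    """Fetch (url, dest_path) pairs sequentially, one open connection at a time."""
    return [fetch_to_file(url, path) for url, path in urls_and_paths]
```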

My GitHub branch is linked, and the only file I've modified (PDBList.py) is attached.





