[Biopython-dev] Benchmarking PDBParser

Chad Davis chad.a.davis at gmail.com
Wed May 4 13:55:04 UTC 2011


I'd be very interested in this as well.
I'm working on some modifications (in the alpha stages still) to the
BioPerl PDB parser (based on the Perl Data Language, analogous to
NumPy) and would be interested to compare all four parsers (Biopython
old and new, BioPerl old and new).
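
For a like-for-like comparison, the timing loop could look roughly like
the sketch below. This is not João's actual benchmark script; the
function names benchmark() and format_report() are made up for
illustration, and the report layout just mirrors the two lines in the
.time files quoted further down.

```python
import time


def benchmark(parse, paths):
    """Call parse(path) for every path and time the whole run.

    Returns (total_seconds, mean_milliseconds_per_structure).
    `parse` is any callable taking a file path (hypothetical hook).
    """
    start = time.perf_counter()
    count = 0
    for path in paths:
        parse(path)
        count += 1
    total = time.perf_counter() - start
    # Guard against an empty path list to avoid dividing by zero.
    mean_ms = 1000.0 * total / count if count else 0.0
    return total, mean_ms


def format_report(total, mean_ms):
    """Format the two summary lines in the style of the .time files."""
    return ("Total time spent: %.3fs\n"
            "Average time per structure: %.3fms" % (total, mean_ms))


# Example wiring for Biopython (assumes Biopython is installed and that
# `paths` is a list of PDB file names):
#   from Bio.PDB import PDBParser
#   parser = PDBParser()
#   total, mean_ms = benchmark(
#       lambda p: parser.get_structure("bench", p), paths)
#   print(format_report(total, mean_ms))
```

Passing the parser in as a callable means the same loop can time the old
and new code, as long as each run sees the same file list.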

In my experience, rsync is the best way to download the PDB (just the
divided structures), and I believe the first sync should take several
hours, not several days. It should be as easy as:

rsync -a rsync.wwpdb.org::ftp_data/structures/divided/pdb/ ./pdb

Other options:
http://www.wwpdb.org/downloads.html

Chad


On Wed, May 4, 2011 at 15:23, João Rodrigues <anaryin at gmail.com> wrote:
> Just a word of advice. I tried to download the whole PDB with PDBList.py and
> I ran into an error. Their server shut me down due to too many connections.
> Perhaps adding an exception catcher like the one we have for NCBI servers
> would be useful?
>
> Preliminary results show the current code running about 29% slower per
> structure:
>
> ==> benchmark_CATH-biopython_149.time <==
> Total time spent: 530.686s
> Average time per structure: 46.839ms
>
> ==> benchmark_CATH-biopython_current.time <==
> Total time spent: 686.176s
> Average time per structure: 60.563ms
>
> I'll write a full summary when I finish downloading the PDB and testing it.
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>
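
On the exception-catching suggestion: a minimal sketch of a retry
wrapper with a growing pause between attempts, assuming the dropped
connections surface as IOError. I have not checked what PDBList
actually raises in that case, and retry() and its parameters are
hypothetical names, not part of Biopython.

```python
import time


def retry(func, attempts=3, delay=5.0, backoff=2.0,
          exceptions=(IOError,), sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries are used up.

    After each failed try, wait `delay` seconds and then multiply the
    delay by `backoff`. The last failure is re-raised to the caller.
    `sleep` is injectable so the wrapper can be tested without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff


# Example wiring (assumes Biopython is installed; the PDB code and the
# target directory are placeholders):
#   from Bio.PDB import PDBList
#   pdbl = PDBList()
#   retry(lambda: pdbl.retrieve_pdb_file("1abc", pdir="pdb"))
```

Pausing between attempts should also lower the number of connections
made per minute, which may be what the server objected to.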



