[Biopython-dev] Contribution Structure Comparison

Peter Lackner Peter.Lackner at sbg.ac.at
Mon Sep 5 05:06:03 EDT 2005

Here is the code and a README file. For the moment, setup.py installs
SComPy as a separate module.

On Fri, 2 Sep 2005, Thomas Hamelryck wrote:

> > I'd like to contribute code (Module SComPy) for protein structure
> > comparison to biopython. 
> Hi Peter,
> Sounds great. Can you send the code so we can take a look at it?
> Cheers,
> -- 
> Thomas Hamelryck, Postdoctoral researcher
> Bioinformatics center
> Institute of Molecular Biology and Physiology
> University of Copenhagen      
> Universitetsparken 15 Bygning 10                 
> 2100 Copenhagen, Denmark
> ---
> http://www.binf.ku.dk/users/thamelry/
 Assoc.Prof. Peter Lackner
 Department of Molecular Biology
 University of Salzburg
 Hellbrunner-Str. 34
 A-5020 Salzburg / Austria

 E-mail:  Peter.Lackner at sbg.ac.at

-------------- next part --------------
A non-text attachment was scrubbed...
Name: SComPy1.0.tar.gz
Type: application/x-gunzip
Size: 26561 bytes
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20050905/f712f273/SComPy1.0.tar-0001.bin
-------------- next part --------------
SComPy provides Python code for building protein structure comparison
tools as well as two ready-to-use structure comparison approaches.


SComPy is intended to become part of future Biopython releases.
Meanwhile, SComPy is installed as a separate package
but requires Biopython, release 1.40b or later (http://biopython.org/),
and Numerical Python (http://numeric.scipy.org/).

SComPy was tested on several 32-bit and 64-bit Linux systems and
should also work on any Unix-like system.

Unpack files:

tar xvzf SComPy1.0.tar.gz


cd SComPy1.0

As root run:

python setup.py install

That's it.


As input, SComPy uses structure files with PDB ATOM records, where every
residue is represented by exactly 5 atoms: N, CA, C, O, CB. Other lines must
not be included. The scripts folder contains a Python script to convert
PDB files into the required format. PDB files are searched for locally
or downloaded from http://www.rcsb.org. E.g.:

python get_backbones.py 2hhb 

creates the files 2hhb-A.bb, 2hhb-B.bb, 2hhb-C.bb, 2hhb-D.bb, one for
each chain. Chains may be joined in a single bb file, allowing for
multi-chain comparisons, e.g.:
cat 2hhb-A.bb 2hhb-B.bb > 2hhb-AB.bb
However, if you pack more than one chain into a bb file, each chain
must have a different chain id.
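
The format rules above can be checked mechanically before running a
comparison. The following is only a minimal sketch, not part of SComPy and
not the logic of get_backbones.py. The function name check_bb_lines is
hypothetical, and the sketch assumes that bb files keep the standard
fixed-column PDB ATOM layout (atom name in columns 13-16, chain id in
column 22, residue number plus insertion code in columns 23-27).

```python
from collections import OrderedDict

# The five atoms every residue must have, according to the README.
REQUIRED_ATOMS = ["N", "CA", "C", "O", "CB"]


def check_bb_lines(lines):
    """Return (chain_ids, problems) for an iterable of bb-file lines.

    chain_ids lists the chain ids in order of first appearance.
    problems is a list of human-readable messages; empty means the
    lines satisfy the rules stated in the README.
    """
    residues = OrderedDict()  # (chain id, residue id) -> atom names
    problems = []
    for lineno, line in enumerate(lines, 1):
        if not line.strip():
            continue  # tolerate blank lines
        if not line.startswith("ATOM"):
            problems.append("line %d: not an ATOM record" % lineno)
            continue
        atom = line[12:16].strip()    # columns 13-16
        chain = line[21]              # column 22
        resid = line[22:27].strip()   # columns 23-27
        residues.setdefault((chain, resid), []).append(atom)

    for (chain, resid), atoms in residues.items():
        if sorted(atoms) != sorted(REQUIRED_ATOMS):
            problems.append(
                "chain %s residue %s: atoms %s, expected exactly %s"
                % (chain, resid, atoms, REQUIRED_ATOMS)
            )

    chain_ids = []
    for chain, _resid in residues:
        if chain not in chain_ids:
            chain_ids.append(chain)
    return chain_ids, problems
```

For a joined file, open it and pass the file object, e.g.
check_bb_lines(open("2hhb-AB.bb")). If two chains were concatenated with the
same chain id and overlapping residue numbers, their atoms fall into the
same residue and show up as a problem, which is one way to notice the
missing distinct chain ids.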

python get_backbones.py 2bbm -c A

extracts chain A from 2bbm, and creates file 2bbm-A.0.bb. The "0" indicates
that 2bbm is an NMR file. 2bbm-A.0.bb contains model number 0 from 2bbm. 

After processing the PDB files for input, you may go ahead with
the ready-to-use sample approaches. Please note that we only did a
rough optimization of the parameters. Thus, in some intricate cases
of remote structural similarity, you may need to change some parameters.
Additionally, the refinement step applies only 4 iterations to
save execution time. It may be useful to increase the parameter maxiter.

An example:
Prepare bb files for 2bbm and 4cln, open a Python shell and type:

>>> from SComPy import *
>>> p=Parameter()            # creates new set of parameters
>>> approach2(p,"2bbm-A.0.bb","4cln.bb")

This generates a number of output files:

2bbm-A.0_4cln.eqs  ... a list of equivalent residues for the best
                       superimposable solution
2bbm-A.0_4cln.ali  ... the corresponding alignment
2bbm-A.0_4cln.pml  ... a pymol script for visualizing the result.
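
The output names in this example appear to be built from the two input
names with the .bb suffix removed and joined by an underscore. A minimal
sketch of that pattern, useful for checking which files a run produced,
could look like this. The helper expected_outputs is hypothetical and not
part of SComPy, and the naming rule is inferred only from the single
example above; directory parts of the input paths are assumed to be dropped.

```python
import os


def expected_outputs(bb1, bb2, suffixes=("eqs", "ali", "pml")):
    """Guess the output file names for a comparison of two bb files."""

    def stem(path):
        # Drop any directory part and a trailing ".bb", if present.
        name = os.path.basename(path)
        return name[:-3] if name.endswith(".bb") else name

    prefix = "%s_%s" % (stem(bb1), stem(bb2))
    return ["%s.%s" % (prefix, suffix) for suffix in suffixes]
```

After a run, a list such as
[f for f in expected_outputs("2bbm-A.0.bb", "4cln.bb") if not os.path.exists(f)]
shows which of the expected files are missing from the current directory.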

Additionally you get:


2bbm/4cln contain two similar domains which are not superimposable
at the same time. Approach1 produces several alternative alignments,
where two of them represent the superimpositions of the corresponding
domains. As the two alternative solutions are non-overlapping, we
may join them. *.joined1.pml and *joined2.pml contain the two corresponding

If you prefer rasmol scripts instead of pymol scripts, after
creating the parameter object, type: 
p.mf_format = "rasmol"

Approach2 also detects circular permutations, as long as the
permuted parts are superimposable. You might try this with
2bqp-A and 1nls. The spheres in the rasmol/pymol images
represent the point of permutation (or the point where alternative
alignments are joined).
