Bioperl: Suggestions for Structure Object

Andrew Dalke dalke@bioreason.com
Wed, 23 Dec 1998 21:08:21 -0800


Bent Nagstrup Johansen <Bent.Nagstrup@post.uni-bielefeld.de> mentioned:
> The .mol2 format as used in Sybyl would also be very nice to have.

For those interested, the full format definition is at:
   http://www.tripos.com/TechBriefs/mol2_format/mol2.html

Unfortunately, I've never known the right way to deal with terms
like this one (in the "atom" record definition):
| atom_type (string) = the SYBYL atom type for the atom.

Luckily, Babel has some code describing the process :)
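
One way to sidestep the atom_type question while reading a file is to
keep the field as an opaque string.  What follows is a minimal Python
sketch, assuming the whitespace-separated field order given in the
format definition (atom_id, atom_name, x, y, z, atom_type, then the
optional subst_id, subst_name and charge).  The names Mol2Atom and
parse_mol2_atom are made up for illustration, and nothing here tries
to interpret the type string chemically.

```python
from typing import NamedTuple, Optional


class Mol2Atom(NamedTuple):
    """One line of a mol2 ATOM record (hypothetical container)."""
    atom_id: int
    atom_name: str
    x: float
    y: float
    z: float
    atom_type: str  # opaque SYBYL type string, e.g. "C.3"; not interpreted
    subst_id: Optional[int] = None
    subst_name: Optional[str] = None
    charge: Optional[float] = None


def parse_mol2_atom(line: str) -> Mol2Atom:
    """Split one atom line on whitespace and convert the numeric fields."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}")
    return Mol2Atom(
        atom_id=int(fields[0]),
        atom_name=fields[1],
        x=float(fields[2]),
        y=float(fields[3]),
        z=float(fields[4]),
        atom_type=fields[5],
        # The trailing fields are optional in the format definition.
        subst_id=int(fields[6]) if len(fields) > 6 else None,
        subst_name=fields[7] if len(fields) > 7 else None,
        charge=float(fields[8]) if len(fields) > 8 else None,
    )
```

Calling parse_mol2_atom on a line from the atom section gives back a
record whose atom_type is still the raw string, so the decision about
what it means can be made later (for instance with a Babel-style
lookup table).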


  As long as we're mentioning Sybyl, it uses a programming language
called SPL (Sybyl Programming Language) for driving most of the
software.  I've looked at it.  It's pretty complete for the
functionality they need (a control language with limited analysis
capabilities), but it is definitely a homebrew thing.  People interested
in methods needed for structure analysis might want to take a look
at it.
  We're buying Sybyl, so we have the manuals, including the on-line
manuals, which cannot legally be put on the internet.  By the wonders
of Altavista, you might be able to track down a few places that
accidentally made them accessible.  Why they cannot be made available,
I do not know.

  Continuing in that vein, MSI's (or should I say Pharmacopeia's?)
Cerius2 has a Tcl based SDK with online documentation at:
   http://www.msi.com/support/sdk/pub/scripting/macro_language.html
  I prefer Cerius2's language design over SPL, though I believe
Sybyl makes more of the underlying functionality accessible to
their language.
  (Though sometimes in rather convoluted ways.  For example, one
SPL function, I forget which at the moment, prints some information to
stdout but doesn't make it accessible to SPL.  To capture that data,
the documentation says to redirect stdout to a file and then parse the
tmp file to get the data!)
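
For readers who haven't run into the "redirect stdout, then parse the
file" workaround, here is a rough Python sketch of the same pattern.
It is not SPL and does not use any Sybyl function; the child Python
process stands in for a tool that only prints, and the "key = value"
output format and the names run_and_capture and parse_key_values are
assumptions made up for the example.

```python
import subprocess
import sys
import tempfile


def run_and_capture(cmd):
    """Run cmd with stdout redirected to a temporary file, then read it back."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        subprocess.run(cmd, stdout=tmp, check=True)
        tmp.seek(0)  # rewind before reading what the child wrote
        return tmp.read().splitlines()


def parse_key_values(lines):
    """Turn 'key = value' lines into a dict, ignoring anything else."""
    result = {}
    for line in lines:
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result


# Stand-in for the tool that only prints its results.
cmd = [sys.executable, "-c", "print('natoms = 12'); print('charge = -1')"]
data = parse_key_values(run_and_capture(cmd))
```

The values come back as strings, which is the usual cost of this
approach: every caller has to re-parse what the tool already knew.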

  CCG (Chemical Computing Group) has a system called MOE
(Molecular Operating Environment) which uses SVL (Scientific Vector
Language).  This is probably the best of the languages, but I only saw
a demo of it about 1.5 years ago.  Their web page (chemcomp.com)
has some impressive statements.
  Their language is also homebrew.  It supports native vectors and
matrices, parallelism and threads, cross-platform graphics, and
a built-in "molecule" data type.
  My problems with it were its proprietary nature, lack of classes
and hybrid C/Fortran nature.  I believe they will always have a small
user group and limited library support.  I also believe what they've
done can be redone pretty easily with Python or Perl by a few people
in a year's time or so.  (E.g., the program described at
http://www.chemcomp.com/feature/provalid.htm could be done in VMD
(using Tk for input, Tcl for analysis and HTML for output) in a
day or so.)
  Still, what they've got is something worth looking at for seeing
what people want and can do already.


> Construct multimers based on monomers and transrot matrices.
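
For concreteness, the operation being asked for can be sketched in a
few lines of Python.  This is only an illustration under stated
assumptions: a monomer is a plain list of (x, y, z) tuples, each
"transrot" operator is a 3x3 rotation matrix plus a translation
vector, and the names apply_transrot and build_multimer are
hypothetical.

```python
def apply_transrot(coords, rotation, translation):
    """Return new coordinates R*v + t for every v in coords."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        ))
    return moved


def build_multimer(monomer, operators):
    """Apply each (rotation, translation) pair to the monomer coordinates."""
    return [apply_transrot(monomer, rot, trans) for rot, trans in operators]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot180z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]  # half-turn about the z axis

monomer = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
dimer = build_multimer(monomer, [(identity, (0, 0, 0)), (rot180z, (0, 0, 5))])
```

Each entry of dimer is one copy of the monomer; a real structure
object would carry atom names and residues along with the coordinates.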

  If you need that now, in perl, and don't mind crufty perl4 code,
you might want to look at the "pdblang" program I wrote and
mentioned earlier (in ftp://ftp.ks.uiuc.edu/pub/group/dalke/ ).
  Perl4, being what it was, meant all of the molecule data is stored
as an array of strings, where each string split on (I believe) "\0"
gives you the atom record lines, and each atom record split on "\1"
yields the atom data.  (Back in the old days, we didn't have 1s and
0s -- we had to use ls and Os!  :)
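
To make that layout concrete, here is a small Python sketch of the
same trick.  It is not the pdblang code; it only assumes, as described
above, that "\0" separates atom records within a string and "\1"
separates the fields within a record.  The example field values and
the names pack and unpack are made up.

```python
RECORD_SEP = "\0"  # between atom records (assumed, per the description)
FIELD_SEP = "\1"   # between the fields of one atom record


def pack(atoms):
    """Flatten a list of atom records (lists of fields) into one string."""
    return RECORD_SEP.join(
        FIELD_SEP.join(str(field) for field in atom) for atom in atoms
    )


def unpack(packed):
    """Recover the list of atom records; every field comes back as a string."""
    return [record.split(FIELD_SEP) for record in packed.split(RECORD_SEP)]


atoms = [["N", "ALA", 1, 1.0, 2.0, 3.0], ["CA", "ALA", 1, 2.5, 2.0, 3.0]]
packed = pack(atoms)
records = unpack(packed)
```

Here records[1][0] is "CA" and the coordinates are strings such as
"2.5", so numeric fields need converting again after every unpack,
which is part of why this approach counts as crufty.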

						Andrew Dalke
						dalke@bioreason.com
=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================