[Bioperl-l] dssp script

Robinson, Peter peter.robinson at charite.de
Fri Jan 21 08:30:14 EST 2005


Dear BioPerlers,

I am writing a script to use the BioPerl DSSP module to print out a list of phi and psi angles for all applicable residues  of all chains. Although the results are correct, I get the following error message at the end of each chain:

Argument "" isn't numeric in numeric eq (==) at /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line 1168.

and I am not quite sure where it is coming from. Perhaps I am using the wrong part of the API, but I am trying to get a list of all residues for each chain as follows:

foreach my $ch (@chains) {
  my $ss_elements_pts = $dssp->secBounds($ch);
  print "Chain $ch:\n";
  my $pos = 0;
  my $max = 0;
  foreach my $stretch (@{$ss_elements_pts}) {
    my $start = $stretch->[0];
    my $end = $stretch->[1]; 
    if ($end =~ m/(\d+)/) { $end = $1; }
   
    if ($end  > $max) { $max = $end; }
  }
  ## END is now the last residue in this chain
  for my $res (1..$max) {
    my $residueID = $res . ":" . $ch;
    my ($phi,$psi,$SS,$SSsum,$AA);
    eval { $phi = $dssp->resPhi($residueID);};
	etc.

The full script is appended to the bottom of this mail.


I also noticed what might be a minor bug in the module DSSP/Res.pm; when I use dsspcmbi to analyze a PDB file, it produces a results file with an empty last line. This causes a crash:

Use of uninitialized value in chomp at /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line 1284, <GEN1> line 955.


 If I manually remove this last empty line, there was no error. By adding the following line at Res.pm l.1284, you can fix the problem:


 while ( chomp( $cur = <$file> ) ) {
      next if ($cur =~ m/^\s*$/);  *********************************************
	$res_num = substr( $cur, 0, 5 );
	$res_num =~ s/\s//g;
	$self->{ 'Res' }->[ $res_num ] = &_parseResLine( $cur );
    }
}




Thanks in adavance for any tips! Peter
Peter N. Robinson, M.D.
Institute of Medical Genetics
Charité University Hospital
Augustenburger Platz 1
13353 Berlin
Germany
++49-30-450 569124
peter.robinson at charite.de
http://www.charite.de/ch/medgen/robinson
Beware of bugs in the above code; I have only proved it correct, not tried it. -Donald Knuth, computer scientist (1938- )

########################

#!/usr/bin/perl -w
use IO::File;
use Bio::Structure::SecStr::DSSP::Res;
use Data::Dumper;


=pod
parseDSSP.pl
Script to parse the output of DSSP using the BioPerl module
Bio::Structure::SecStr::DSSP::Res. To use it, process a PDB
file with dssp or dsspcmbi, and pass the resulting file to 
this script. For more information on dssp and BioPerl see the
module documentation at http://bioperl.org

@email peter.robinson at charite.de
21 January, 2005

=cut



my $file = "pdb43ca.dssp";
my $dssp = new Bio::Structure::SecStr::DSSP::Res('-file'=> "$file");

my $pdbID = $dssp->pdbID();
my $auth  = $dssp->pdbAuthor();
my $cmpd = $dssp->pdbCompound();
my $pdb_date = $dssp->pdbDate();
my $header = $dssp->pdbHeader();
my $pdbSource = $dssp->pdbSource();

print "PDB entry $pdbID \n\tauthor:\t$auth",
  "\n\tCompound:\t$cmpd",
  "\n\tDate:\t$pdb_date",
  "\n\tHeader:\t$header",
  "\n\tsource:\t$pdbSource\n\n";

my $totalRes = $dssp->numResidues();
print "Total residue count (all chains):$totalRes\n";


my $surArea= $dssp->totSurfArea();
print "Total accessible surface area:\t$surArea  (square Ang)\n";


my $chainRef = $dssp->chains();
my @chains = sort  @{$chainRef};
print "Chain[s]:\n";
foreach my $ch (@chains) {
  print "\t$ch";
}
print "\n";

my $hb = $dssp->hBonds();
print "H BONDS.\n";
print "TYPE O(I)-->H-N(J): $hb->[0]\n",
   "IN PARALLEL BRIDGES: $hb->[1]\n",
   "IN ANTIPARALLEL BRIDGES $hb->[2]\n",
   "TYPE O(I)-->H-N(I-5) $hb->[3]\n",
   "TYPE O(I)-->H-N(I-4) $hb->[4]\n",
   "TYPE O(I)-->H-N(I-3) $hb->[5]\n",
   "TYPE O(I)-->H-N(I-2) $hb->[6]\n",
   "TYPE O(I)-->H-N(I-1) $hb->[7]\n",
   "TYPE O(I)-->H-N(I+0) $hb->[8]\n",
   "TYPE O(I)-->H-N(I+1) $hb->[9]\n",
   "TYPE O(I)-->H-N(I+2) $hb->[10]\n",
   "TYPE O(I)-->H-N(I+3) $hb->[11]\n",
   "TYPE O(I)-->H-N(I+4) $hb->[12]\n",
   "TYPE O(I)-->H-N(I+5) $hb->[13]\n",
  "\n";

   
 
foreach my $ch (@chains) {
  my $ss_elements_pts = $dssp->secBounds($ch);
  print "Chain $ch:\n";
  my $pos = 0;
  my $max = 0;
  foreach my $stretch (@{$ss_elements_pts}) {
    my $start = $stretch->[0];
    my $end = $stretch->[1]; 
    if ($end =~ m/(\d+)/) { $end = $1; }
   
    if ($end  > $max) { $max = $end; }
  }
  ## END is now the last residue in this chain
  for my $res (1..$max) {
    my $residueID = $res . ":" . $ch;
    my ($phi,$psi,$SS,$SSsum,$AA);
    eval { $phi = $dssp->resPhi($residueID);};
    eval { $psi = $dssp->resPsi($residueID);};
    eval { $SS = $dssp->resSecStr($residueID);};
    eval { $SSsum = $dssp->resSecStrSum($residueID);};
    $AA = $dssp->resAA($residueID);
    $phi = $phi || "n/a";
    $psi = $psi || "n/a";
    $SS = $SS || "-";
    my $SSclass;
    if ($SSsum eq "H") { $SSclass = "helix"; }
    elsif ($SSsum eq "T") { $SSclass = "turn"; }
    elsif ($SSsum eq "B") { $SSclass = "beta"; }
    else { $SSclass = $SSsum; }
    print "$residueID) [$AA] phi:$phi psi:$psi SecStruct: $SS ($SSclass) \n";
  }
}









More information about the Bioperl-l mailing list