[Bioperl-l] Module for secondary structure feature recognition ?

Fri Dec 3 03:01:34 EST 2004

Hi Manisha,
Check out Bio::Structure::SecStr::STRIDE::Res or Bio::Structure::SecStr::DSSP::Res.
These modules parse the output of the STRIDE and DSSP programs. Both of 
these programs take as input a pdb file and calculate secondary 
structure based on similar, but not identical, criteria. You can then 
"objectify" this output using the modules listed above.

Unfortunately, these modules are not well integrated with the rest of 
bioperl (my fault). It may also be somewhat tricky to map PDB residues 
to residues in your sequence alignment - you may have to do another 
alignment for this.
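
As a rough illustration of that extra alignment step, here is a minimal sketch in plain Perl (no BioPerl needed). It assumes you have already aligned the sequence read from the PDB chain against the corresponding sequence from your alignment, and that you hold the two gapped rows as strings along with the PDB residue numbers in chain order. The helper name map_pdb_to_seq and all of the data are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: map PDB residue numbers to 1-based positions in the
# ungapped target sequence, given the two rows of a pairwise alignment.
#   $pdb_row     - gapped sequence taken from the PDB chain
#   $seq_row     - gapped sequence from your own alignment
#   $pdb_numbers - array ref of PDB residue numbers, in chain order
sub map_pdb_to_seq {
    my ($pdb_row, $seq_row, $pdb_numbers) = @_;
    die "alignment rows differ in length\n"
        unless length($pdb_row) == length($seq_row);

    my %map;
    my ($pdb_i, $seq_pos) = (0, 0);
    for my $col (0 .. length($pdb_row) - 1) {
        my $pdb_has_res = substr($pdb_row, $col, 1) ne '-';
        my $seq_has_res = substr($seq_row, $col, 1) ne '-';

        $seq_pos++ if $seq_has_res;
        if ($pdb_has_res) {
            my $resnum = $pdb_numbers->[$pdb_i++];
            # Only record columns where both rows carry a residue
            $map{$resnum} = $seq_pos if $seq_has_res;
        }
    }
    return \%map;
}

# Made-up data: the PDB chain starts at residue 10 and lacks one residue
# that the target sequence has, and vice versa.
my $map = map_pdb_to_seq('MK-LVE', 'MKTL-E', [10, 11, 12, 13, 14]);

for my $resnum (sort { $a <=> $b } keys %$map) {
    print "PDB residue $resnum -> sequence position $map->{$resnum}\n";
}
```

PDB residues that fall opposite a gap in the target sequence simply get no entry in the map, so the calling code has to decide how to treat a secondary structure boundary that lands on one of them.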

The modules are well-documented, though, (try perldoc 
Bio::Structure::SecStr::STRIDE::Res) and the analysis you are describing 
certainly sounds feasible.

I would try this:
1. Do alignment and get a Bio::SimpleAlign of your sequences
2. Map pdb residues to residues of sequences in Bio::SimpleAlign object, 
somehow
3. Run STRIDE on pdb of each sequence, objectify using 
Bio::Structure::SecStr::STRIDE::Res
4. Call secBounds method on each object to get the boundaries of each 
secondary structure element.
5. Map these boundaries through pdb-to-alignment sequence mapping found 
in #2.
6. Extract sequence slices from #1
7. Make brilliant observation(s) about alignments of secondary structure 
elements.
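
Steps 5 and 6 boil down to coordinate conversion, which can be sketched in plain Perl as below. This is only an illustration under stated assumptions: the alignment is held as a hash of gapped strings rather than a Bio::SimpleAlign object, the boundaries are given as [start, end, type] triples already expressed as residue positions in seqA, and the names column_of_residue, %aln and @elements are invented for the example. With a real Bio::SimpleAlign object, column_from_residue_number and slice should cover the same ground.

```perl
use strict;
use warnings;

# Return the 1-based alignment column that holds the $pos-th residue of a
# gapped row, or undef if the row has fewer than $pos residues.
sub column_of_residue {
    my ($row, $pos) = @_;
    my $seen = 0;
    for my $col (1 .. length $row) {
        next if substr($row, $col - 1, 1) eq '-';
        return $col if ++$seen == $pos;
    }
    return;
}

# Made-up alignment and one made-up helix spanning residues 3..6 of seqA
my %aln = (
    seqA => 'MK-TLLEAG',
    seqB => 'MKQTL-EAG',
);
my @elements = ([3, 6, 'H']);

my @blocks;
for my $el (@elements) {
    my ($start, $end, $type) = @$el;

    # Step 5: residue positions -> alignment columns
    my $c1 = column_of_residue($aln{seqA}, $start);
    my $c2 = column_of_residue($aln{seqA}, $end);
    next unless defined $c1 && defined $c2;

    # Step 6: cut the same column range out of every row
    my %rows;
    for my $name (sort keys %aln) {
        $rows{$name} = substr($aln{$name}, $c1 - 1, $c2 - $c1 + 1);
    }
    push @blocks, { type => $type, from => $c1, to => $c2, rows => \%rows };
}

for my $b (@blocks) {
    print "$b->{type} columns $b->{from}-$b->{to}\n";
    print "  $_: $b->{rows}{$_}\n" for sort keys %{ $b->{rows} };
}
```

Repeating this with the boundaries from each structure in turn gives one set of column ranges per sequence, and comparing those ranges is where the consensus regions would come from.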

Good luck,
Ed Green
UC Berkeley

Goel, Manisha wrote:
> Hi All,
> 
> I am new to the list as well as Bioperl, with some experience with perl.
> 
> 
> I need a perl script that does the following:
> 
> I have a multiple sequence alignment, from which I want to extract
> blocks of alignment. This can possibly be taken care of easily by
> Bio::SimpleAlign::slice
> 
> BUT my problem is that I want some module that will decide what
> regions to cut depending on the secondary structure features of the
> sequences.
> 
> In other words.. I want a program to be able to judge the secondary str
> of all the sequences in the multiple sequence alignment and extract the
> region where all sequences have a consensus of the sec str.
> 
> All sequences in the alignment have known pdb structures, so the
> secondary str information could be available in multiple formats- like
> from pdb file itself or dssp output file etc. 
> 
> I have gone through the FAQS and HOWTO's at Bioperl but could not come
> up with anything suitable.
> If anyone can please guide me to any such existing modules that even
> approximate the task.. I should probably be able to put it together to
> fit the bill or at least know where to get started.
> 
> Thanks,
> -Manisha
> 