[BioPython] Fwd: An interface to the maximum-likelihood programs of PHYLIP

Fri Sep 2 10:00:09 EDT 2005

Begin forwarded message:

> From: rjw500 <rjw500 at cs.york.ac.uk>
> Date: September 2, 2005 9:04:44 AM EDT
> To: biopython-dev-owner at biopython.org
> Subject: An interface to the maximum-likelihood programs of PHYLIP
>
>
> Dear Biopython-dev-owner,
>
> I am sorry to trouble you, but I am not sure whom to contact, since
> my initial e-mail to biopython-dev at biopython.org bounced back.
>
> I am studying for an MSc in Information Processing at York  
> University in the UK. As part of the course I am carrying out a  
> short research project with Dr. James Cussens. I chose to develop  
> an interface in Python to the maximum-likelihood programs of the  
> phylogenetic analysis package PHYLIP. The code is based on the  
> modules available in Biopython and I was wondering if you would be  
> interested in incorporating it into the next release of Biopython.
>
> I have written the following classes and modules:
>
>
> - a class to represent a PHYLIP multiple sequence alignment based  
> on Bio.Align.Generic.Alignment
>
> - classes to represent the alphabets required for the sequence data  
> in PHYLIP input files based on Bio.Alphabet
>
> - a lightweight class to allow the conversion of multiple sequence
> alignment objects in other formats, such as ClustalW, that is
> derived from Bio.Align.FormatConvert
>
> - a module to parse PHYLIP input files based on Bio.ParserSupport.  
> This module includes scanners to read the two PHYLIP input file  
> formats, a consumer and several parsers built upon these classes.
>
> - modules for the PHYLIP maximum-likelihood programs dnaml, dnamlk,
> proml and promlk, as well as the PHYLIP programs seqboot, consense
> and treedist, all based upon Bio.Application
>
>
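To illustrate the kind of input the parser module above has to handle, here is a minimal sketch of reading the sequential PHYLIP format. It is not the code described in this message: the function name parse_sequential_phylip is hypothetical, and the sketch assumes the strict layout (a header line with the number of taxa and sites, then names occupying the first 10 columns). The interleaved format is not handled.

```python
def parse_sequential_phylip(text):
    """Parse strict sequential PHYLIP text into a list of (name, sequence) pairs.

    Assumes names occupy the first 10 columns of each record's first line.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PHYLIP input")

    # Header: number of taxa, then number of sites per sequence.
    n_taxa, n_sites = (int(field) for field in lines[0].split()[:2])

    records = []
    i = 1
    for _ in range(n_taxa):
        if i >= len(lines):
            raise ValueError("fewer sequences than the header declares")
        name = lines[i][:10].strip()
        # Whitespace inside the sequence data is layout only, so drop it.
        seq = "".join(lines[i][10:].split())
        i += 1
        # In sequential format a sequence may continue on following lines.
        while len(seq) < n_sites:
            if i >= len(lines):
                raise ValueError(f"sequence {name!r} is shorter than {n_sites} sites")
            seq += "".join(lines[i].split())
            i += 1
        if len(seq) != n_sites:
            raise ValueError(f"sequence {name!r} is longer than {n_sites} sites")
        records.append((name, seq))
    return records
```

Counting characters against the header is what lets a sequence span several lines without any end-of-record marker.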
> These classes and modules allow phylogenetic analysis to be
> automated, which is particularly useful for the maximum-likelihood
> methods, since these can be very time-consuming. A short script can
> be written to analyse a multiple sequence alignment with one of the
> maximum-likelihood programs, and then examine the tree produced by
> bootstrapping to test whether the relationships identified are
> supported by the data.
>
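As a rough sketch of what such automation involves (not the Bio.Application-based modules described in this message): the PHYLIP programs read a file named infile from the working directory, write outfile (and, for the tree programs, outtree), and are driven through an interactive text menu on standard input. The helper names menu_input and run_phylip below are hypothetical, and the menu responses each program expects must be taken from the PHYLIP documentation for the installed version.

```python
import shutil
import subprocess
from pathlib import Path


def menu_input(responses):
    """Turn a list of menu responses into the text sent to the program's stdin."""
    return "".join(f"{response}\n" for response in responses)


def run_phylip(program, workdir, alignment, responses):
    """Run one PHYLIP program in workdir and return the path of its outfile.

    program   -- executable name, e.g. "dnaml" (must be on PATH)
    alignment -- path to a PHYLIP-format input file
    responses -- menu responses, in the order the program asks for them
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # PHYLIP programs look for their input under the fixed name "infile".
    shutil.copyfile(alignment, workdir / "infile")

    # Remove old outputs first; as far as I know the programs otherwise
    # stop to ask what to do with an existing file.
    for stale in ("outfile", "outtree"):
        (workdir / stale).unlink(missing_ok=True)

    subprocess.run(
        [program],
        cwd=workdir,
        input=menu_input(responses),
        text=True,
        check=True,
        stdout=subprocess.DEVNULL,
    )
    return workdir / "outfile"
```

A bootstrap run would chain three such calls (seqboot, then a maximum-likelihood program with its multiple-data-sets option enabled, then consense), copying each step's output to the next step's infile.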
> I realise Biopython currently provides support for the
> distance-matrix programs of PHYLIP through the EMBOSS package
> wrappers. However, the EMBOSS wrappers for the PHYLIP
> maximum-likelihood programs are either incomplete (dnaml and
> dnamlk, which lack the facility to use multiple data sets that is
> critical for bootstrapping) or completely absent (proml and
> promlk). Thus, I decided to extend the existing support for PHYLIP
> by writing an interface to the maximum-likelihood programs of the
> standard PHYLIP package. I also wrote modules for seqboot and
> consense, although these are available via the EMBOSS wrappers, so
> that people who have not installed the EMBOSS package can still
> carry out bootstrap analysis.
>
> I look forward to hearing from you,
>
> Best wishes,
>
> Robert Wilson
>