[Biopython] User Defined Scoring Matrix

Peter Cock p.j.a.cock at googlemail.com
Fri Feb 22 10:35:41 UTC 2013


On Fri, Feb 22, 2013 at 2:19 AM, Willis, Jordan R
<jordan.r.willis at vanderbilt.edu> wrote:
> Hello,
>
> Since I'm not sure which tool to exactly use, I will defer to the
> biopython community since odds are I will be using it. I'm trying to produce
> a multiple sequence alignment with a user defined scoring matrix. When I
> look at Clustalw, there is an option to put in such a matrix, and the help
> indicates that this should be in "blast" format. When I search for blast
> format, they indicate that this is hard coded into the software.

I wouldn't start with ClustalW - it is old, and although it is still widely
used, even the authors are trying to discourage this. They suggest their new
tool Clustal Omega, which has a Biopython wrapper and takes an optional
distance matrix as input via the --distmat-in argument.

from Bio.Align.Applications import ClustalOmegaCommandline
help(ClustalOmegaCommandline)

http://biopython.org/DIST/docs/api/Bio.Align.Applications._ClustalOmega.ClustalOmegaCommandline-class.html
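
As a rough sketch of what the resulting command line might look like, the
snippet below builds the clustalo arguments by hand using only the standard
library, so it does not depend on which Biopython version is installed. The
helper name build_clustalo_command and the file names are made up for
illustration, and it assumes the clustalo binary is on your PATH. Note that
the file given to --distmat-in is a pairwise distance matrix between the
sequences, which may not be the same thing as a residue scoring matrix.

```python
import shlex


def build_clustalo_command(infile, outfile, distmat_in=None, executable="clustalo"):
    """Return the argument list for a Clustal Omega run (hypothetical helper)."""
    # -i / -o are the input sequences and the output alignment.
    args = [executable, "-i", infile, "-o", outfile, "--outfmt=clustal"]
    if distmat_in is not None:
        # Optional precomputed pairwise distance matrix.
        args.append("--distmat-in=" + distmat_in)
    return args


# Placeholder file names, not files from the original question.
cmd = build_clustalo_command("seqs.fasta", "aligned.aln", distmat_in="my_distances.mat")
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) once the input
files exist. With the Biopython wrapper the same option should be available
as a keyword argument; check help(ClustalOmegaCommandline) for the exact name.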

> My end goal is to produce a phylogeny tree using this PSSM, but I have no
> idea how to input this into ClustalW or any multiple sequence alignment
> software. I don't really care which software to use, which wrappers, or how
> I have to do it. I have used biopython to produce this matrix, so I thought
> it would be relatively easy to implement it in any multiple sequence
> alignment software.
>
> I'm not having very good luck and any help would be much appreciated.
>
> Jordan

There are people far more qualified than me to comment on the
goals and if and when you should use a distance based tree (my
understanding is distance based trees are the worst kind, but as
they are computationally inexpensive they can make sense for large
datasets).

Regards,

Peter


