Protein Clustering tool
Farid Chetouani
fchetou at pasteur.fr
Tue Jul 3 09:38:29 UTC 2001
Hello,
Firstly, Frank, thank you for your reply.
I am sorry that my first email was not precise enough.
In fact,
I was wondering whether EMBOSS plans to provide a free clustering tool
that takes a protein FASTA sequence file
and returns a list of protein families.
For instance, thanks to A. Enright & C. Ouzounis,
the GeneRage software is free for academic research
(http://www.ebi.ac.uk/research/cgg/services/rage/),
but the source code is not yet available.
Thank you for your help.
Best regards,
F
PS: please reply to my email address, fchetou at infobiogen.fr
>
> If you wish to construct phylogenetic trees (specifically gene trees)
> from protein sequences so as to infer duplication and
> paralogous/orthologous relationships, then you can use the PHYLIP
> package (available as an EMBASSY application). Genetic distances can be
> calculated using EPROTDIST and the distance matrix created can be input
> into either EFITCH (slower, more accurate tree) or ENEIGHBOR (faster,
> more approximate clustering method, allowing the use of the
> Neighbor-Joining algorithm, or the UPGMA algorithm - use the latter only
> if you have previously tested that the "molecular clock" assumption is
> valid for your dataset).
>
> EPROTDIST, EFITCH and ENEIGHBOR come from version 3.5 of the PHYLIP
> package (http://evolution.genetics.washington.edu). PHYLIP 3.6 has
> recently been released (alpha version). PROTDIST 3.6 gives improved
> distances (it copes with among-site rate heterogeneity to give
> more accurate genetic distances) and there are also improvements to
> NEIGHBOR 3.6 (faster) and to FITCH 3.6. I presume that PHYLIP 3.6 will
> be available as an EMBASSY application once there is confidence that
> there are no serious bugs :-)
>
> I hope that helps,
> Best Wishes,
> Frank
> --
> Frank Wright
> Biomathematics and Statistics Scotland,
> SCRI, DUNDEE DD2 5DA, Scotland
> frank at bioss.sari.ac.uk
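
As an illustration of the distance-matrix clustering described in the
quoted reply, here is a minimal sketch of UPGMA (average linkage) in
plain Python. It is not the PHYLIP NEIGHBOR code; the function name
upgma and the small distance matrix are made up for the example, and in
practice the matrix would come from a program such as EPROTDIST. The
sketch returns only the tree topology, without branch lengths, and, as
noted in the reply, UPGMA is appropriate only if the "molecular clock"
assumption holds for the dataset.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels by UPGMA and return the topology as a Newick-like string.

    names: list of sequence labels.
    dist:  symmetric matrix (list of lists) of pairwise distances.
    Hypothetical helper for illustration only; no branch lengths are computed.
    """
    # Active clusters: id -> (label, number of original sequences inside).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        label_a, size_a = clusters.pop(a)
        label_b, size_b = clusters.pop(b)

        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        new_d = {}
        for c in clusters:
            d_ac = d[tuple(sorted((a, c)))]
            d_bc = d[tuple(sorted((b, c)))]
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

        # Remove distances that involve the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = (f"({label_a},{label_b})", size_a + size_b)
        next_id += 1

    label, _ = next(iter(clusters.values()))
    return label + ";"


# Made-up distances for four sequences A-D (not real data).
names = ["A", "B", "C", "D"]
dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
print(upgma(names, dist))  # (D,(C,(A,B)));
```

Cutting such a tree at a chosen distance threshold would give groups of
related sequences, which is one simple way to go from a distance matrix
to candidate families.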