[BioPython] simple class to generate random trees
Ernesto
e.picardi at unical.it
Wed Dec 21 10:45:38 EST 2005
Hi all,
I wrote a simple class to generate random phylogenetic trees clock (Kuhner
and Felsenstein 1994) and no clock-like (Guindon and Gascuel 2002). I
attached a copy.
I don't know if it could be useful for BioPython users.
Using RandonTree:
>>> from RandomTree import RandomTree
>>> tree = RandomTree()
>>> tree.nobr
0
>>> tree.ntips=4
>>> tree.constant_tree()
'((T3:0.01516,T4:0.01516):0.01332,(T1:0.02643,T2:0.02643):0.00205);'
>>> tree.variable_tree()
'(((T4:0.00050,T3:0.00954):0.00009,T2:0.01591):0.00595,T1:0.00531);'
>>>tree.nobr=1
>>> tree.constant_tree()
'((T2,(T1,T4)),T3);'
>>> tree.variable_tree()
'((T1,(T3,T2)),T4);'
The attributes of tree are:
ntips: number of tips for tree --> default is 10
nobr: 1 for trees without branch lengths --> default is 0
pm: probability of change per unit time --> default is 0.03
shape: gamma shape for variable trees --> default is 0.5
mean: mean of gamma distribution for variable trees --> default is 1
Regards,
Ernesto Picardi
PS: there is also an on-line version at:
http://biologia.unical.it/py_script/tree.html
Ernesto Picardi, PhD
Dept. of Cell Biology
University of Calabria
87036, Arcavacata di Rende (CS)
Italy
Phone: +39 0984 492937
Fax: +39 0984 492911
E-mail: e.picardi at unical.it
-------------- next part --------------
"""
RandomTree is a simple class to generate random rooted trees.
Clock-like trees are generated according to the methodology
of Kuhner and Felsenstein (1994) Mol. Biol. Evol. 11: 459-468,
whereas no clock-like trees are created following Guindon and
Gascuel (2002) Mol. Biol. Evol. 19: 534-543.
Once a clock-like tree is generted, each branch length is
multiplied by a gamma dinstributed factor. If the mean of this
distribution is equal to 1 and the shape fixed to 0.5, then the
departure from molecular clock is strong. The opposite situation
is when gamma shape is fixed to 2.0.
When the RandomTree class is invoked a simple object is created.
It contains:
ntips: number of tips for tree --> default is 10
nobr: 1 for trees without branch lengths --> default is 0
pm: probability of change per unit time --> default is 0.03
shape: gamma shape for variable trees --> default is 0.5
mean: mean of gamma distribution for variable trees --> default is 1
USING this class:
>>> from RandomTree import RandomTree
>>> tree = RandomTree()
>>> tree.nobr
0
>>> tree.ntips=4
>>> tree.constant_tree()
'((T3:0.01516,T4:0.01516):0.01332,(T1:0.02643,T2:0.02643):0.00205);'
>>> tree.variable_tree()
'(((T4:0.00050,T3:0.00954):0.00009,T2:0.01591):0.00595,T1:0.00531);'
>>>tree.nobr=1
>>> tree.constant_tree()
'((T2,(T1,T4)),T3);'
>>> tree.variable_tree()
'((T1,(T3,T2)),T4);'
Copyright (c) 2004-2005, Ernesto Picardi.
This class comes with ABSOLUTELY NO WARRANTY.
"""
import math,string,fpformat,random,re,sys # import of standard modules
class RandomTree:
def __init__(self,alltips=10,nobr=0,pm=0.03,shape=0.5,mean=1):
self.alltips=alltips # number of tips
self.nobr=nobr # use branch lengths
self.pm=pm # probability of change per unit time
self.shape=shape # gamma shape parameter
self.mean=mean # mean of gamma dinstribution
def constant_tree(self): # function to generate a clock-like tree
if self.alltips <=2:
sys.exit('At least three tips. Bye.')
tips=[]
for i in range(1, self.alltips+1):
tips.append("T"+str(i))
Lb=[]
for i in range(len(tips)):
Lb.append(0)
n=1
dictionary={}
while len(tips)!=1:
R=random.random()
tyme=(-(math.log(R))/len(tips))*self.pm
fixtyme=fpformat.fix(tyme,5)
brlens=float(fixtyme)
for i in range(len(tips)):
Lb[i]=Lb[i]+brlens
nodeName = '@node%04i@' % n
s1=random.choice(tips)
i1=str(Lb[tips.index(s1)])
del Lb[tips.index(s1)]
tips.remove(s1)
s2=random.choice(tips)
i2=str(Lb[tips.index(s2)])
del Lb[tips.index(s2)]
tips.remove(s2)
if self.nobr:
nodo="("+s1+","+s2+")"
else:
nodo="("+s1+":"+i1+","+s2+":"+i2+")"
dictionary[nodeName]=nodo
tips.append(nodeName)
Lb.append(0)
n+=1
findNodes=re.compile(r"@node.*?@", re.I) #to identify a node name
lastNode = max(dictionary.keys())
treestring = lastNode
while 1:
nodeList = findNodes.findall(treestring)
if nodeList == []: break
for element in nodeList:
treestring=treestring.replace(element, dictionary[element])
return treestring + ';'
def variable_tree(self): # function to generate a variable tree
treestring=self.constant_tree()
findbr=re.compile(":[0-9]+.[0-9]+[\),]")
allbr=findbr.findall(treestring)
dicbr={}
for i in allbr:
br=(i.split(':'))[1]
brval=eval(br.strip('),'))
beta=float(self.shape)/self.mean
gammafactor=random.gammavariate(self.shape,beta)
newbr=brval*gammafactor
newbr1=fpformat.fix(newbr,5)
dicbr[i]=newbr1
for j in dicbr:
if ',' in j:
treestring=treestring.replace(j,':'+dicbr[j]+',')
elif ')' in j:
treestring=treestring.replace(i,':'+dicbr[i]+')')
return treestring
More information about the BioPython
mailing list