[BioPython] simple class to generate random trees

Ernesto e.picardi at unical.it
Wed Dec 21 10:45:38 EST 2005


Hi all,
I wrote a simple class to generate random phylogenetic trees clock (Kuhner
and Felsenstein 1994) and no clock-like (Guindon and Gascuel 2002). I
attached a copy.
I don't know if it could be useful for BioPython users.

Using RandonTree:

>>> from RandomTree import RandomTree
>>> tree = RandomTree()
>>> tree.nobr
0
>>> tree.ntips=4
>>> tree.constant_tree()
'((T3:0.01516,T4:0.01516):0.01332,(T1:0.02643,T2:0.02643):0.00205);'
>>> tree.variable_tree()
'(((T4:0.00050,T3:0.00954):0.00009,T2:0.01591):0.00595,T1:0.00531);'
>>>tree.nobr=1
>>> tree.constant_tree()
'((T2,(T1,T4)),T3);'
>>> tree.variable_tree()
'((T1,(T3,T2)),T4);'

The attributes of tree are:
ntips: number of tips for tree --> default is 10
nobr: 1 for trees without branch lengths --> default is 0
pm: probability of change per unit time --> default is 0.03
shape: gamma shape for variable trees --> default is 0.5
mean: mean of gamma distribution for variable trees --> default is 1

Regards,
Ernesto Picardi

PS: there is also an on-line version at:
http://biologia.unical.it/py_script/tree.html




Ernesto Picardi, PhD
Dept. of Cell Biology
University of Calabria
87036, Arcavacata di Rende (CS)
Italy
Phone: +39 0984 492937
Fax: +39 0984 492911
E-mail: e.picardi at unical.it



-------------- next part --------------
"""
RandomTree is a simple class to generate random rooted trees.
Clock-like trees are generated according to the methodology 
of Kuhner and Felsenstein (1994) Mol. Biol. Evol. 11: 459-468,
whereas no clock-like trees are created following Guindon and
Gascuel (2002) Mol. Biol. Evol. 19: 534-543.
Once a clock-like tree is generted, each branch length is 
multiplied by a gamma dinstributed factor. If the mean of this
distribution is equal to 1 and the shape fixed to 0.5, then the
departure from molecular clock is strong. The opposite situation
is when gamma shape is fixed to 2.0.

When the RandomTree class is invoked a simple object is created.
It contains:
ntips: number of tips for tree --> default is 10
nobr: 1 for trees without branch lengths --> default is 0
pm: probability of change per unit time --> default is 0.03
shape: gamma shape for variable trees --> default is 0.5
mean: mean of gamma distribution for variable trees --> default is 1

USING this class:
>>> from RandomTree import RandomTree
>>> tree = RandomTree()
>>> tree.nobr
0
>>> tree.ntips=4
>>> tree.constant_tree()
'((T3:0.01516,T4:0.01516):0.01332,(T1:0.02643,T2:0.02643):0.00205);'
>>> tree.variable_tree()
'(((T4:0.00050,T3:0.00954):0.00009,T2:0.01591):0.00595,T1:0.00531);'
>>>tree.nobr=1
>>> tree.constant_tree()
'((T2,(T1,T4)),T3);'
>>> tree.variable_tree()
'((T1,(T3,T2)),T4);'

Copyright (c) 2004-2005, Ernesto Picardi.
This class comes with ABSOLUTELY NO WARRANTY.
"""

import math,string,fpformat,random,re,sys  # import of standard modules

class RandomTree:

      def __init__(self,alltips=10,nobr=0,pm=0.03,shape=0.5,mean=1):
          self.alltips=alltips   # number of tips
          self.nobr=nobr     # use branch lengths
          self.pm=pm         # probability of change per unit time
          self.shape=shape   # gamma shape parameter
          self.mean=mean     # mean of gamma dinstribution

      def constant_tree(self):  # function to generate a clock-like tree
          if self.alltips <=2:
              sys.exit('At least three tips. Bye.')
          tips=[]
	  for i in range(1, self.alltips+1):
              tips.append("T"+str(i))
          Lb=[]
	  for i in range(len(tips)):
              Lb.append(0)
          n=1
	  dictionary={}
	  while len(tips)!=1:
                R=random.random()
                tyme=(-(math.log(R))/len(tips))*self.pm
	        fixtyme=fpformat.fix(tyme,5)
		brlens=float(fixtyme)
		for i in range(len(tips)):
			Lb[i]=Lb[i]+brlens
		nodeName = '@node%04i@' % n
		s1=random.choice(tips)
		i1=str(Lb[tips.index(s1)])
		del Lb[tips.index(s1)]
		tips.remove(s1)
		s2=random.choice(tips)
		i2=str(Lb[tips.index(s2)])
		del Lb[tips.index(s2)]
		tips.remove(s2)
		if self.nobr:
		   nodo="("+s1+","+s2+")"
                else:
		     nodo="("+s1+":"+i1+","+s2+":"+i2+")"
		dictionary[nodeName]=nodo
		tips.append(nodeName)
		Lb.append(0)
		n+=1
          findNodes=re.compile(r"@node.*?@", re.I) #to identify a node name
          lastNode = max(dictionary.keys())
          treestring = lastNode
          while 1:
                nodeList = findNodes.findall(treestring)
                if nodeList == []: break
                for element in nodeList:
                    treestring=treestring.replace(element, dictionary[element])
          return treestring + ';'

      def variable_tree(self):  # function to generate a variable tree
          treestring=self.constant_tree()
          findbr=re.compile(":[0-9]+.[0-9]+[\),]")
          allbr=findbr.findall(treestring)
          dicbr={}
          for i in allbr:
              br=(i.split(':'))[1]
	      brval=eval(br.strip('),'))
              beta=float(self.shape)/self.mean
	      gammafactor=random.gammavariate(self.shape,beta)
              newbr=brval*gammafactor
	      newbr1=fpformat.fix(newbr,5)
	      dicbr[i]=newbr1
          for j in dicbr:
	      if ',' in j:
	         treestring=treestring.replace(j,':'+dicbr[j]+',')
              elif ')' in j:
                   treestring=treestring.replace(i,':'+dicbr[i]+')')
          return treestring




More information about the BioPython mailing list