[BioPython] kcluster and distances
Scott Rifkin
scott.rifkin at yale.edu
Fri Mar 4 17:15:08 EST 2005
The euclidean distance function in cluster.c is:
{ double result = 0.;
  double tweight = 0;
  int i;
  if (transpose==0) /* Calculate the distance between two rows */
  { for (i = 0; i < n; i++)
    { if (mask1[index1][i] && mask2[index2][i])
      { double term = data1[index1][i] - data2[index2][i];
        result = result + weight[i]*term*term;
        tweight += weight[i];
      }
    }
  }
  else
  { for (i = 0; i < n; i++)
    { if (mask1[i][index1] && mask2[i][index2])
      { double term = data1[i][index1] - data2[i][index2];
        result = result + weight[i]*term*term;
        tweight += weight[i];
      }
    }
  }
  if (!tweight) return 0; /* usually due to empty clusters */
  result /= tweight;
  result *= n;
  return result;
}
Why is the result multiplied by n at the end? And why isn't the square
root of the result returned as the distance?
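
To make the arithmetic concrete, here is a minimal standalone sketch of the
same computation for two plain vectors. The function name scaled_sq_distance
and its single-mask signature are hypothetical, written only for this
illustration; they are not part of cluster.c.

```c
#include <math.h>

/* Hypothetical helper, not part of cluster.c: repeats the arithmetic of
   the transpose==0 branch for two plain vectors x and y of length n.
   mask[i] != 0 means both values are present at position i. */
double scaled_sq_distance(int n, const double x[], const double y[],
                          const int mask[], const double weight[])
{
    double result = 0.;
    double tweight = 0.;
    int i;
    for (i = 0; i < n; i++) {
        if (mask[i]) {
            double term = x[i] - y[i];
            result += weight[i] * term * term;  /* weighted squared difference */
            tweight += weight[i];               /* total weight actually used */
        }
    }
    if (!tweight) return 0;  /* nothing to compare */
    result /= tweight;       /* weighted mean squared difference */
    result *= n;             /* scaled back up by the full length n */
    return result;
}
```

With x = {1, 2, 3, 4}, y = {2, 4, 6, 8}, all weights 1 and nothing masked,
this returns 30, which is exactly the sum of squared differences (1 + 4 + 9
+ 16); the ordinary Euclidean distance would be sqrt(30). If the last
position is masked out, it returns 14/3 * 4 (about 18.67) rather than 14, so
the multiplication by n only changes the value when some entries are missing
or the weights do not sum to n.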
thanks
scott rifkin