[BioPython] Bio.distance
Peter
biopython at maubp.freeserve.co.uk
Wed Oct 1 12:03:22 EDT 2008
On Wed, Oct 1, 2008 at 4:49 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>
> Hi,
> Under the 'standard' install I do not think that there is any advantage of
> using Bio.cdistance within Bio.kNN. I tested this on a bioinformatics data
> set with almost 1500 data points, 8 explanatory variables and k=9. ...
> Actual maximum times across three runs were under 16.6 seconds with
> it [Bio.cdistance] and under 17.4 seconds without it [Bio.distance using
> Numeric]
It's interesting that the C version is only slightly faster than
Numeric. Of course, as you point out, there are lots of possible
complications here, like lapack and atlas (plus compiler options and
CPU features).
I think your numbers are good support for Michiel's proposal that
we should deprecate Bio.cdistance and Bio.distance and just use numpy
in Bio.kNN. This would simplify our code base and make very little
difference to the speed.
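
For illustration only, here is a rough sketch of what a numpy-only
distance calculation and nearest-neighbour vote might look like. The
function names euclidean_distances and knn_classify are hypothetical
(they are not the Bio.kNN API), and I am assuming a plain Euclidean
distance with an unweighted majority vote:

```python
import numpy as np


def euclidean_distances(query, points):
    """Return the Euclidean distance from query to each row of points.

    Hypothetical helper for illustration, not part of Biopython.
    """
    diff = points - query  # broadcasts query across every row
    return np.sqrt(np.sum(diff * diff, axis=1))


def knn_classify(query, points, labels, k):
    """Return the most common label among the k rows nearest to query.

    Hypothetical helper; uses an unweighted majority vote.
    """
    dist = euclidean_distances(query, points)
    nearest = np.argsort(dist)[:k]  # indices of the k smallest distances
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)
```

The distance calculation is a single vectorised expression, so there
is no Python loop over the data points, which is probably why a
separate C implementation gains so little here.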
Peter