[Biopython-dev] [Bug 2657] Improved Bio/Statistics/lowess.py
bugzilla-daemon at portal.open-bio.org
bugzilla-daemon at portal.open-bio.org
Fri Nov 14 11:11:26 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2657
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-11-14 06:11 EST -------
I've updated CVS to use standard four space indentation, add a doctest and the
copyright statement etc.
James' code makes two code changes (shown against CVS revision 1.9).
67,68c67,68
< h = [numpy.sort(abs(x-x[i]))[r] for i in range(n)]
< w = numpy.clip(abs(([x]-numpy.transpose([x]))/h),0.0,1.0)
---
> h = [numpy.sort(numpy.abs(x-x[i]))[r] for i in range(n)]
> w = numpy.clip(numpy.abs(([x]-numpy.transpose([x]))/h),0.0,1.0)
Due to the historic usage "from Numeric import *" this code did once use
Numeric.abs here, so it makes sense to use numpy.abs now. Probably just an
oversight from the recent Numeric/numpy conversion. This is another reminder
that using "from XXX import *" is a bad idea.
76,80c76,82
< b = numpy.array([sum(weights*y), sum(weights*y*x)])
< A = numpy.array([[sum(weights), sum(weights*x)],
< [sum(weights*x), sum(weights*x*x)]])
< beta = numpy.linalg.solve(A,b)
< yest[i] = beta[0] + beta[1]*x[i]
---
> theta = weights*x
> b_top = sum(weights*y)
> b_bot = sum(theta*y)
> a = sum(weights)
> b = sum(theta)
> d = sum(theta*x)
> yest[i] = (d*b_top-b*b_bot+(a*b_bot-b*b_top)*x[i])/(a*d-b**2)
I can see the point of calculating and caching these:
weights*y
weights*x
sum(weights*x)
Was there a good reason for the name theta for weights*x?
I personally think using an explicit matrix solver is much nicer to read than
that complex hand coded version. Does it really save much time?
My suggestion is just:
76,78c76,81
< b = numpy.array([sum(weights*y), sum(weights*y*x)])
< A = numpy.array([[sum(weights), sum(weights*x)],
< [sum(weights*x), sum(weights*x*x)]])
---
> weights_x = weights*x
> weights_y = weights*y
> sum_weights_x = sum(weights_x)
> b = numpy.array([sum(weights_y), sum(weights_y*x)])
> A = numpy.array([[sum(weights), sum_weights_x],
> [sum_weights_x, sum(weights_x*x)]])
However, I'm going to leave this for Michiel to resolve (given he wrote the
code in the first place).
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
More information about the Biopython-dev
mailing list