[Biopython-dev] [Bug 2693] New: LogisticRegression convergence criterion is too lenient
bugzilla-daemon at portal.open-bio.org
Mon Dec 1 20:01:44 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2693
Summary: LogisticRegression convergence criterion is too lenient
Product: Biopython
Version: Not Applicable
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P3
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: bsouthey at gmail.com
For the example used in the code and the tutorial, both R and SAS give the
following parameter estimates:
Intercept = 18.9622
x1 = -0.0714
x2 = 0.0444
By default, Bio/LogisticRegression.py defines the following parameters:
MAX_ITERATIONS = 500
CONVERGE_THRESHOLD = 0.01
The convergence threshold is too lenient, so the iterations terminate before the
expected values are reached. A more stringent criterion (CONVERGE_THRESHOLD
= 0.000000001) permits convergence to the R/SAS values, provided MAX_ITERATIONS
is greater than 7761 on my system.
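The effect of the threshold can be illustrated with a small, self-contained
sketch. It does not use Biopython or its algorithm: it assumes plain gradient
ascent with a fixed step, a stopping rule based on the change in
log-likelihood, and made-up toy data. The names fit, log_likelihood, XS and YS
are hypothetical and exist only for this illustration.

```python
import math

# Made-up, non-separable toy data (one predictor); not the Biopython example.
XS = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
YS = [0, 0, 1, 0, 1, 1]


def linear_predictor(beta, x):
    """Intercept beta[0] plus the weighted sum of the predictors."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of a logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = linear_predictor(beta, x)
        # log(sigmoid(z)), written so that exp() never overflows
        if z >= 0:
            log_p = -math.log1p(math.exp(-z))
        else:
            log_p = z - math.log1p(math.exp(z))
        log_q = log_p - z  # log(1 - sigmoid(z))
        total += y * log_p + (1 - y) * log_q
    return total


def fit(xs, ys, converge_threshold, max_iterations=100000, step=0.05):
    """Gradient ascent; stops when the log-likelihood changes by less than
    converge_threshold. Returns (beta, iterations_used)."""
    beta = [0.0] * (len(xs[0]) + 1)
    old_llik = log_likelihood(beta, xs, ys)
    for iteration in range(1, max_iterations + 1):
        grad = [0.0] * len(beta)
        for x, y in zip(xs, ys):
            z = linear_predictor(beta, x)
            p = 0.5 * (1.0 + math.tanh(0.5 * z))  # sigmoid(z), overflow-safe
            err = y - p
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + step * g for b, g in zip(beta, grad)]
        llik = log_likelihood(beta, xs, ys)
        if abs(llik - old_llik) < converge_threshold:
            return beta, iteration
        old_llik = llik
    return beta, max_iterations


loose_beta, loose_iterations = fit(XS, YS, converge_threshold=0.01)
tight_beta, tight_iterations = fit(XS, YS, converge_threshold=0.000000001)
print("threshold 0.01:        ", loose_iterations, "iterations", loose_beta)
print("threshold 0.000000001: ", tight_iterations, "iterations", tight_beta)
```

With the lenient threshold the loop stops while the estimates are still moving;
the tighter threshold needs more iterations and ends at a higher
log-likelihood.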
MAX_ITERATIONS and CONVERGE_THRESHOLD are fixed within the
Bio/LogisticRegression.py module, but they should be part of the API of the
train function, for example:
def train(xs, ys, update_fn=None, typecode=None,
          CONVERGE_THRESHOLD=0.000000001, MAX_ITERATIONS=10000):
Note that the algorithm used requires a large number of iterations, and the
train function does not report the degree of convergence attained when
MAX_ITERATIONS is exceeded.
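One way such reporting could look is sketched below. This is not the Biopython
implementation: run_until_converged and make_demo_step are hypothetical names,
and the sketch assumes the optimiser can be wrapped as a function that performs
one update and returns the current log-likelihood.

```python
import warnings


def run_until_converged(step_fn, converge_threshold=0.000000001,
                        max_iterations=10000):
    """Call step_fn() until the log-likelihood it returns changes by less
    than converge_threshold. Returns (converged, iterations, last_diff)."""
    old_llik = None
    diff = None
    for iteration in range(1, max_iterations + 1):
        llik = step_fn()
        if old_llik is not None:
            diff = abs(llik - old_llik)
            if diff < converge_threshold:
                return True, iteration, diff
        old_llik = llik
    # Iteration limit reached: say how close the run actually got.
    warnings.warn(
        "No convergence after %d iterations; last log-likelihood change "
        "was %r (threshold %r)" % (max_iterations, diff, converge_threshold),
        RuntimeWarning,
    )
    return False, max_iterations, diff


def make_demo_step():
    """Stand-in for a real optimiser step: returns -1/k on the k-th call."""
    state = {"k": 0}

    def step():
        state["k"] += 1
        return -1.0 / state["k"]

    return step


converged, iterations, last_diff = run_until_converged(
    make_demo_step(), max_iterations=50)
print(converged, iterations, last_diff)
```

Returning the last difference alongside the warning lets the caller decide
whether the estimates are close enough or whether to rerun with a larger
iteration limit.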
Jeffrey Whitaker provides Python code using an alternative algorithm:
http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/python/logistic_regression.py
Furthermore, update_fn should also be passed the previous log-likelihood or the
difference in log-likelihood, so that the actual convergence can be seen.
Ideally update_fn would be more general than this and able to display more
information, but the attached patches provide the previous llh (old_llik).
from Bio import LogisticRegression

def show_progress(iteration, old_llh, loglikelihood):
    # old_llh is the previous log-likelihood, as supplied by the patched train()
    print("Iteration: %s Old: %s Log-likelihood function: %s Diff: %s"
          % (iteration, old_llh, loglikelihood, old_llh - loglikelihood))

model = LogisticRegression.train(xs, ys, update_fn=show_progress)