[Biopython-dev] [Bug 2629] Updated Bio.NaiveBayes to listfns import

bugzilla-daemon at portal.open-bio.org
Wed Nov 5 10:24:15 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2629





------- Comment #9 from biopython-bugzilla at maubp.freeserve.co.uk  2008-11-05 05:24 EST -------
(In reply to comment #8)
> Created an attachment (id=1037)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1037&action=view) [details]
> Patch to update NaiveBayes
> 
> Hopefully I got this correct, if not just let me know.
> 

At first glance it looks like this patch would remove the Python 2.3 set
workaround.  That is easily fixed.

Also, I would have called the new get_content_freq function _get_content_freq
(leading underscore denoting private) as this is an implementation detail that
doesn't need to be part of the public API.

I'm curious what your other implementations looked like, as this one does not
look that clear to me at first read:

    p_contents=1.0/len(contents)
    content_freqs={}
    for cval in contents:
        vcount=content_freqs.get(cval,0)+p_contents
        content_freqs.update({cval:vcount})

In particular, why use the dict update method?
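
For comparison, here is a minimal sketch of the same loop using plain item
assignment instead of update. The function name content_freqs_by_assignment
is hypothetical (it is not in the patch); the loop body otherwise mirrors the
quoted code:

```python
def content_freqs_by_assignment(contents):
    """Hypothetical helper: same running totals as the quoted loop."""
    p_contents = 1.0 / len(contents)
    content_freqs = {}
    for cval in contents:
        # Plain item assignment; no temporary one-item dict is built
        # on every pass, as update({cval: vcount}) would do.
        content_freqs[cval] = content_freqs.get(cval, 0) + p_contents
    return content_freqs
```

This should give the same result as the update call, and I find it easier to
read.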

Given the possible rounding issues, does rescaling (dividing by the number of
elements) at the start really save much time compared with dividing each
total at the end?  I would feel happier with the division at the end (as done
in the listfns code).
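
To illustrate the rounding concern, here is a rough sketch of both orderings.
The function names are made up for this example and neither is the listfns
code itself; the second one just counts with integers and divides once per
total at the end:

```python
def freqs_divide_first(contents):
    """Hypothetical: add 1/N per item, as in the quoted patch."""
    p_contents = 1.0 / len(contents)
    freqs = {}
    for cval in contents:
        # Each addition can introduce a small floating point error.
        freqs[cval] = freqs.get(cval, 0) + p_contents
    return freqs


def freqs_divide_last(contents):
    """Hypothetical: count with exact integers, divide once at the end."""
    counts = {}
    for cval in contents:
        counts[cval] = counts.get(cval, 0) + 1
    total = float(len(contents))
    freqs = {}
    for cval in counts:
        # A single division per distinct value, so only one rounding step.
        freqs[cval] = counts[cval] / total
    return freqs


same_ten = ["a"] * 10
print(freqs_divide_first(same_ten))  # {'a': 0.9999999999999999}
print(freqs_divide_last(same_ten))   # {'a': 1.0}
```

With ten identical items the divide-first version accumulates 0.1 ten times
and ends up just short of 1.0, while the divide-last version gives exactly
1.0.  The difference is tiny, but it is the kind of rounding issue I mean.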

