[Biopython-dev] Deprecating Bio.mathfns, Bio.stringfns and their C code?
Bruce Southey
bsouthey at gmail.com
Thu Oct 23 16:28:48 UTC 2008
Peter wrote:
> This is about three Biopython "support" modules: Bio.mathfns,
> Bio.listfns, Bio.stringfns, each of which has its own C implementation
> for speed. These haven't been touched for 6 years (which suggests
> they are stable and well tested), but they are now hardly used in
> Biopython.
>
> By removing these we not only reduce the amount of C code in Biopython
> (although here it is optional) which is a good thing for portability
> and supporting other python variants, but we also can reduce the
> "clutter" under the Bio.* namespace, e.g.
>
> >>> import Bio
> >>> help(Bio)
> >>>
>
> On 9th Oct I wrote:
>
>> Until recently Bio.mathfns was used in Bio/NaiveBayes.py but that now
>> uses numpy more heavily instead. I think that Bio.mathfns (and its C
>> implementation) are no longer used anywhere in Biopython (and I would
>> be surprised if anyone else is using this module). I'm suggesting
>> deprecating Bio.mathfns and Bio.cmathfns for the next release.
>>
>
> Any objections to deprecating Bio.mathfns and Bio.cmathfns?
>
Nope, the functions used by Bio/NaiveBayes.py are:
mathfns.safe_log (the module also defines safe_log2), which is not very
good because it hard-codes a constant (1E-100) as its lower limit.
mathfns.safe_exp
The other functions included are:
fcmp Compare two floating point numbers, up to a specified precision.
intd Represent a floating point number as an integer.
I presume that you mean adding mathfns.safe_log and mathfns.safe_exp to
Bio/NaiveBayes.py first because these are needed by Bio/NaiveBayes.py.
Note that the safe_log in Bio/MarkovModel.py is not the same as
mathfns.safe_log.
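For reference, a minimal sketch of what local replacements for these two helpers could look like if they were moved into Bio/NaiveBayes.py. The signatures and keyword names here are assumptions for illustration, not copied from Bio.mathfns; the only detail taken from the discussion above is the 1E-100 cutoff, which this sketch exposes as an argument instead of hard-coding it.

```python
import math


def safe_log(n, zero=None, neg=None, floor=1e-100):
    """Return log(n), or a fallback value where log is undefined.

    Assumed behaviour (hypothetical, based on the description above):
    negative input returns `neg`, input below `floor` returns `zero`.
    """
    if n < 0:
        return neg
    if n < floor:
        return zero
    return math.log(n)


def safe_exp(n, over=None):
    """Return exp(n), or `over` if the result overflows a float."""
    try:
        return math.exp(n)
    except OverflowError:
        return over
```

Callers that need a different cutoff can pass floor explicitly, for example safe_log(p, zero=-1000.0, floor=1e-300).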
> On 9th Oct I wrote:
>
>> I think Bio.stringfns and its C implementation Bio.cstringfns are also
>> now unused in Biopython, and like Bio.mathfns and Bio.cmathfns
>> should be deprecated for the next release.
>>
>
> Any objections to deprecating Bio.stringfns and Bio.cstringfns?
>
Nope, as you say these are not used. But just to be clear, the
functions lost are:
splitany Split a string using many delimiters.
find_anychar Find one of a list of characters in a string.
rfind_anychar Find one of a list of characters in a string, from end to
start.
starts_with Check whether a string starts with another string
[DEPRECATED].
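Should anyone miss these, the one-line descriptions above can be covered with the standard library. The following is only a sketch written for current Python: the argument names are assumptions, and it does not try to reproduce any extra options the Bio.stringfns versions may have had.

```python
import re


def splitany(s, delimiters):
    """Split s on any single character found in delimiters."""
    if not delimiters:
        return [s]
    # re.escape keeps characters such as ']' or '^' literal inside the class.
    return re.split("[" + re.escape(delimiters) + "]", s)


def find_anychar(s, chars):
    """Lowest index in s of any character in chars, or -1 if none occurs."""
    hits = [i for i in (s.find(c) for c in chars) if i >= 0]
    return min(hits) if hits else -1


def rfind_anychar(s, chars):
    """Highest index in s of any character in chars, or -1 if none occurs."""
    # str.rfind already returns -1 on a miss, so max() with a default suffices.
    return max((s.rfind(c) for c in chars), default=-1)
```

The starts_with function needs no replacement, since str.startswith does the same job.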
> On 9th Oct I wrote:
>
>> Similarly, Bio.listfns and its C implementation Bio.clistfns might
>> also be deprecated with a little effort ... only three modules
>> currently use Bio.listfns
>>
>
> We could just label Bio.listfns (and Bio.clistfns) as obsolete for the
> next release, or just add a note in the docstring that this might be
> deprecated shortly.
>
Used by:
Bio/MaxEntropy.py
Bio/NaiveBayes.py
Bio/MarkovModel.py
Bio/pairwise2.py
Functions directly used:
itemindex Make an index of the items in the list.
items Get one of each item in a list.
contents Calculate percentage each item appears in a list.
Functions indirectly or not used:
asdict Make the list into a dictionary (for fast testing of
membership).
count Count the number of times each item appears.
intersection Get the items in common between 2 lists.
difference Get the items in 1 list, but not the other.
indexesof Get a list of the indexes of some items in a list.
take Take some items from a list.
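Most of these map onto built-in types. As a rough sketch of pure-Python stand-ins for the three directly used functions plus count, written for current Python: the exact return conventions are assumptions taken from the descriptions above (first-occurrence indexes for itemindex, fractions of 1 for contents), not checked against Bio.listfns.

```python
from collections import Counter


def items(seq):
    """One of each item, in order of first appearance."""
    return list(dict.fromkeys(seq))


def count(seq):
    """Map each item to the number of times it appears."""
    return dict(Counter(seq))


def contents(seq):
    """Map each item to the fraction of the list it makes up."""
    total = len(seq)
    return {item: n / total for item, n in Counter(seq).items()}


def itemindex(seq):
    """Map each item to the index of its first occurrence."""
    index = {}
    for i, item in enumerate(seq):
        index.setdefault(item, i)
    return index
```

The remaining functions reduce to set operations: asdict to set(seq), intersection to set(a) & set(b), and difference to set(a) - set(b).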
Bio.listfns is also used by pairwise2.py, which has its own C
implementation (cpairwise2) that I would also suggest as a candidate for
removal.
At present I do not know enough about Bio/MaxEntropy.py,
Bio/NaiveBayes.py, and Bio/MarkovModel.py to say whether the Bio.listfns
functions are really required, or to port them to numpy. (I may try to
port them, but not soon.)
In summary, I have no objection to removing the C code associated with
these modules.
Bruce