[Biopython-dev] motifs module: motif similarity measures

Sefa Kilic sefa1 at umbc.edu
Mon Jun 9 13:51:55 UTC 2014


Hello all,

If I am not mistaken, the Motif module is now deprecated and is being
replaced by the "motifs" module. I use the built-in motif similarity
measures frequently, and the new module contains only one: "dist_pearson".

If no one else is actively working on it, I would like to add

- dist_product
- dist_dpq

functions to the motifs module; both already exist in the deprecated module.

I would also like to add a few more motif similarity measures, such as

- Average log-likelihood ratio (Wang and Stormo, 2003)
- Euclidean distance
- Sandelin-Wasserman similarity (Sandelin and Wasserman, 2004)

For an overview, see the Methods section here:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852410/

All these functions measure similarity/distance between two columns. So,
instead of having two functions for each measure (e.g. dist_pearson for
motif similarity, which calls dist_pearson_at for motif column similarity),
I think it would be neat to have one wrapper function that takes the
column-similarity function as a parameter.
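
To make the wrapper idea concrete, here is a minimal sketch in plain Python.
It is not the Biopython API: the names compare_motifs, euclidean_column and
min_overlap are hypothetical, a motif is assumed to be a list of columns, and
each column is assumed to be a sequence of base frequencies in a fixed order
(A, C, G, T). The wrapper slides one motif along the other and averages the
column distance over the overlapping columns.

```python
import math


def euclidean_column(p, q):
    """Euclidean distance between two columns of base frequencies."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def compare_motifs(m1, m2, column_dist, min_overlap=1):
    """Slide m2 along m1; return (lowest mean column distance, offset).

    offset is the position of m2's first column relative to m1's first
    column. Hypothetical helper, not part of Bio.motifs.
    """
    best = None
    for offset in range(-(len(m2) - min_overlap), len(m1) - min_overlap + 1):
        # Pair up the columns that overlap at this offset.
        pairs = [
            (m1[i], m2[i - offset])
            for i in range(len(m1))
            if 0 <= i - offset < len(m2)
        ]
        score = sum(column_dist(p, q) for p, q in pairs) / len(pairs)
        if best is None or score < best[0]:
            best = (score, offset)
    return best


# Toy motifs: m2 matches the last two columns of m1.
m1 = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
m2 = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
score, offset = compare_motifs(m1, m2, euclidean_column)
```

Any other column-level measure with the same two-argument signature could be
passed in place of euclidean_column. A similarity (rather than a distance)
would need the comparison flipped, and a larger min_overlap keeps very short
overlaps from winning by chance.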

What do you think? Thank you.

-- 
Sefa Kilic