[Biopython] I've written a library for executing fuzzy searches...
c0d3g33k
c0d3g33k at gmail.com
Fri Nov 15 20:12:40 UTC 2013
Hi Tal,
This is only tangentially related to your original post, but I thought
I'd point out the existence of Simmetrics, a Java-based similarity
metrics library (GPL v2). I thought that at some point there was a
Python port, but I could be confusing that with using the library myself
under Jython. Though it is implemented in Java, it might provide a
solid foundation for a Python library/API should you find it
interesting. It's fairly comprehensive, so it might at least provide
inspiration for extending your current efforts. It seems to be
unmaintained at present, but the source code is available both on the
original SourceForge page and on GitHub, where someone cloned the project.
http://sourceforge.net/projects/simmetrics/
https://github.com/Simmetrics/simmetrics
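For context, the kind of search described in the quoted message (finding the
best substring within a given maximum Levenshtein distance of a pattern) can
be sketched in plain Python. This is only a rough, brute-force illustration
built on the textbook dynamic-programming edit distance. It is not how
fuzzysearch or Simmetrics implement it, and the names levenshtein and
best_fuzzy_match are made up for this example.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def best_fuzzy_match(pattern, text, max_dist):
    """Return (start, end, distance) for the closest substring of text,
    or None if no substring is within max_dist edits of pattern.

    Hypothetical helper: tries every candidate substring, so it is far
    slower than a real fuzzy-search library. Ties keep the earliest match.
    """
    best = None
    m = len(pattern)
    for start in range(len(text) + 1):
        # A match can only differ in length from the pattern by max_dist.
        for length in range(max(0, m - max_dist), m + max_dist + 1):
            end = start + length
            if end > len(text):
                break
            dist = levenshtein(pattern, text[start:end])
            if dist <= max_dist and (best is None or dist < best[2]):
                best = (start, end, dist)
    return best


if __name__ == "__main__":
    # One extra T in the text, so the best match is one edit away.
    print(best_fuzzy_match("GATTACA", "CCGATTTACACC", 1))
```

The same functions accept any sliceable sequence of comparable items, so a
plain str taken from a sequence object would work as input here.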
On 11/15/2013 2:08 PM, Tal Einat wrote:
> Hi Martin!
>
> I'm really excited to get such a response! I would love feedback and
> suggestions on how this could be made more useful for Biological uses. If
> you could expand on specific biological use-cases and their details, for
> example, that would be lovely!
>
> - Tal
>
>
> Tal Einat wrote:
>>> Hi everyone,
>>>
>>> (I'm not on this list, so please make sure to reply to me as well as the
>>> list.)
>>>
>>> In response to a stackoverflow
>>> question<http://stackoverflow.com/questions/19725127/>,
>>> I've written a Python library for fuzzy searches called
>>> 'fuzzysearch'<https://github.com/taleinat/fuzzysearch>.
>>> Currently, it allows searching for a string inside a longer string,
>>> returning the best sub-string that matches within a given maximum
>>> Levenshtein distance. This is done quite efficiently, and there is more
>>> optimization to be done, as needed.
>>>
>>> Is there any interest in this library and its further development? One
>>> thing which I think might be useful is support for BioPython Sequence
>>> types.
>>> This is open-source with a very liberal license (the MIT license).
>>>
>>> I'd be happy to collaborate on this!
>>>
>>> - Tal Einat
>>> _______________________________________________
>>> Biopython mailing list - Biopython at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython
>>>
>>>