[BioPython] help for searching overlapping occurrences
Andrew Dalke
dalke at dalkescientific.com
Wed Oct 22 10:51:25 EDT 2003
Alessandro:
> I am looking for a method like finditer from module re but
> returning all the occurrences of a pattern in a string even
> if they overlap each other.
There is no such function.
Such a search can take exponential time, because every
ambiguous branch must be taken.
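For short strings, one brute-force workaround is to test every
(start, end) slice against the pattern. The sketch below is only an
illustration, not a function from the re module: the name
all_overlapping_spans is made up here, and it assumes a Python 3
where compiled patterns have a fullmatch method. It makes a quadratic
number of regexp calls, so it is only practical on small inputs.

```python
import re

def all_overlapping_spans(pattern, text):
    """Yield (start, end) for every slice of text that the pattern matches in full.

    Hypothetical helper: it simply tries every slice, so overlapping and
    nested occurrences are all reported.
    """
    compiled = re.compile(pattern)
    n = len(text)
    for start in range(n + 1):
        for end in range(start, n + 1):
            # fullmatch with pos/endpos requires the match to cover
            # exactly text[start:end].
            if compiled.fullmatch(text, start, end):
                yield (start, end)
```

For example, list(all_overlapping_spans("a+", "aaa")) reports six
spans, where finditer would report only the single span (0, 3).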
If you want, I have a very experimental pure Python regular
expression engine. The NFA portion is decent, but doesn't
handle all regexp syntax (e.g., non-greedy matches). You
could modify that to explore all paths rather than just take
one. It also includes the ability to turn simple regexps
into a DFA, which is much faster but limited to a smaller
number of patterns.
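To show what "explore all paths" means, here is a toy backtracking
matcher. It is not the engine mentioned above; the tuple-based
pattern representation and the name match_ends are assumptions made
up for this sketch, and it covers only literals, sequences,
alternation and star. Instead of stopping at the first successful
path, the generator yields the end position of every path.

```python
def match_ends(node, text, pos):
    """Yield every end position at which node can match text starting at pos.

    Toy pattern nodes (hypothetical representation):
      ("lit", "abc")        literal string
      ("seq", [n1, n2...])  concatenation
      ("alt", [n1, n2...])  alternation
      ("star", n)           zero or more repetitions
    The same end may be yielded more than once if several paths reach it.
    """
    kind = node[0]
    if kind == "lit":
        if text.startswith(node[1], pos):
            yield pos + len(node[1])
    elif kind == "seq":
        items = node[1]
        if not items:
            yield pos
        else:
            # Try every way the first item can match, then the rest.
            for mid in match_ends(items[0], text, pos):
                yield from match_ends(("seq", items[1:]), text, mid)
    elif kind == "alt":
        # Take every branch rather than only the first that succeeds.
        for option in node[1]:
            yield from match_ends(option, text, pos)
    elif kind == "star":
        yield pos  # zero repetitions
        for mid in match_ends(node[1], text, pos):
            if mid > pos:  # skip empty iterations to avoid looping forever
                yield from match_ends(node, text, mid)
    else:
        raise ValueError("unknown node kind: %r" % (kind,))
```

Calling match_ends for each start position in the string then gives
every overlapping occurrence, at the cost of visiting every path.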
You can also use Perl's regexp engine. That has the ability
to call arbitrary Perl code to decide if a match occurs. You
would just need to write a hook which saves the current match
information and rejects it, forcing the engine to backtrack
and find another possibility.
Andrew