[BioPython] help for searching overlapping occurrences

Andrew Dalke dalke at dalkescientific.com
Wed Oct 22 10:51:25 EDT 2003


Alessandro:
> I am looking for a method like finditer from module re but
> returning all the occurrences of a pattern in a string, even
> if they overlap each other.

There is no such function.

Such a search can take exponential time, as every ambiguous
branch must be taken.
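
As a rough illustration of the two senses of "overlapping", here is
a sketch using only the standard re module.  The function names are
made up for this example.  The first wraps the pattern in a
zero-width lookahead, which gives at most one match per start
position (the one the engine prefers).  The second brute-forces every
(start, end) pair, which is quadratic in the number of match attempts
but does report every substring the pattern can match.  Both assume a
pattern without backreferences or anchors, since wrapping or slicing
can change what those mean.

```python
import re

def overlapping_by_start(pattern, text):
    """At most one match per start position, via a zero-width lookahead."""
    # Wrapping the pattern in (?=(...)) consumes no text, so finditer
    # retries at every position; group 1 holds the actual match.
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start(1), m.end(1)) for m in lookahead.finditer(text)]

def all_spans(pattern, text):
    """Every (start, end) whose substring matches the whole pattern."""
    compiled = re.compile(pattern)
    n = len(text)
    return [(i, j)
            for i in range(n + 1)
            for j in range(i, n + 1)
            if compiled.fullmatch(text, i, j)]

print(overlapping_by_start("ATA", "ATATATA"))
print(all_spans("a+", "aaa"))
```

For a fixed-length motif such as ATA the two give the same answer;
they differ only when the pattern can match several lengths from the
same starting point.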

If you want, I have a very experimental pure Python regular
expression engine.  The NFA portion is decent, but it doesn't
handle all regexp syntax (e.g., non-greedy matches).  You
could modify that to explore all paths rather than just take
one.  It also includes the ability to turn simple regexps
into a DFA, which is much faster but limited to a smaller
number of patterns.
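
To give a feel for what "explore all paths" means, here is a toy
backtracking matcher written as a generator.  It is not the engine
mentioned above, just a sketch under heavy assumptions: it only
understands literal characters, "." and a postfix "*", and the names
match_ends and all_matches are invented for the example.  Where an
ordinary matcher would return at the first success, this one yields
and keeps going, so every branch is eventually tried.

```python
def match_ends(pattern, text, pos):
    """Yield every index at which pattern can end when started at pos."""
    if not pattern:
        yield pos                      # nothing left to match: success
        return
    first, rest = pattern[0], pattern[1:]
    if rest[:1] == "*":
        after_star = rest[1:]
        i = pos
        # Branch 1: the starred item matches zero characters.
        yield from match_ends(after_star, text, i)
        # Further branches: one more repetition each time round.
        while i < len(text) and first in (".", text[i]):
            i += 1
            yield from match_ends(after_star, text, i)
    elif pos < len(text) and first in (".", text[pos]):
        yield from match_ends(rest, text, pos + 1)

def all_matches(pattern, text):
    """Collect every distinct (start, end) span, overlapping or not."""
    spans = set()                      # different paths may give the same span
    for start in range(len(text) + 1):
        for end in match_ends(pattern, text, start):
            spans.add((start, end))
    return sorted(spans)

print(all_matches("ab*", "abb"))
```

The set is there because ambiguous patterns such as "a*a*" can reach
the same span along many different paths, which is also where the
exponential cost comes from.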

You can also use Perl's regexp engine.  That has the ability
to call arbitrary Perl code to decide if a match occurs.  You
would just need to write a hook which saves the current match
information and rejects it, forcing the engine to backtrack
and find another possibility.

					Andrew
					dalke at dalkescientific.com


