[BioPython] help for searching overlapping occurrences

Andrew Dalke dalke at dalkescientific.com
Wed Oct 22 11:58:36 EDT 2003


Jeff:
> Would the following (untested) code do what Alessandro wants?
>
> import re
>
> def finditer_overlapped(pattern, string):
>   for i in range(len(string)):
>     m = re.match(pattern, string[i:])
>     if m:
>       yield m

Consider the pattern

   a(bc|bcd)

when searched against

   abcd

Alessandro wanted
> all the occurrences of a pattern in a string, even if they overlap
> each other.

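which I take to mean he wants both the "abc" and the "abcd" matches.
Python's re module tries the alternatives left to right and stops at
the first one that succeeds, so it finds only "abc" (a POSIX
leftmost-longest matcher would instead find only "abcd").
The scanner code you wrote calls re.match once per start position,
so it yields at most one match per position and can't return both
of those possibilities.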
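
To make the difference concrete, here is a minimal sketch. The first
function is the scanner from above. The second, all_matches, is a
hypothetical brute-force helper of my own naming, not anything in
Biopython or the re module: it tests every (start, end) span with
fullmatch, which assumes Python 3.4 or later. It makes a quadratic
number of regex calls, and lookaheads cannot see past the end of each
span, so treat it as an illustration rather than a recommended
implementation.

```python
import re

def finditer_overlapped(pattern, string):
    # The scanner from the quoted message: at most one match per start.
    for i in range(len(string)):
        m = re.match(pattern, string[i:])
        if m:
            yield m

def all_matches(pattern, string):
    # Hypothetical brute-force helper: try every (start, end) span and
    # keep the spans that the pattern matches in full.
    compiled = re.compile(pattern)
    for start in range(len(string)):
        for end in range(start, len(string) + 1):
            if compiled.fullmatch(string, start, end):
                yield start, end, string[start:end]

scanner_hits = [m.group(0) for m in finditer_overlapped(r"a(bc|bcd)", "abcd")]
brute_hits = [text for _, _, text in all_matches(r"a(bc|bcd)", "abcd")]
print(scanner_hits)  # ['abc']
print(brute_hits)    # ['abc', 'abcd']
```

The scanner reports only "abc", while the span-by-span search reports
both "abc" and "abcd" starting at position 0.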

					Andrew
					dalke at dalkescientific.com


