[Biopython] Find Sub-sequence with Variable positions

Peter Cock p.j.a.cock at googlemail.com
Mon Jul 8 14:06:36 UTC 2013


On Mon, Jul 8, 2013 at 2:19 PM, Jurgens de Bruin <debruinjj at gmail.com> wrote:
> Hi,
>
> I hope someone can help me with the following:
>
> I want to find a sub-sequence within a sequence, but the catch is that the
> sub-sequence contains positions that are variable and do not have to
> match 100%.
> For example:
> if the following is the sub-sequence, all the positions have to match, but
> position 5 (A) can be any of the 4 bases (ACGT) within the query-seq.
> ACGTACGTACGT
>
> Thanks!!!

You could use a regular expression to do that - in Python, or at the
command line with something like EMBOSS dreg or fuzznuc:

http://emboss.open-bio.org/rel/rel6/apps/dreg.html
http://emboss.open-bio.org/rel/rel6/apps/fuzznuc.html
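
For the Python route, here is a minimal sketch using only the standard
library re module. The helper names build_pattern and find_hits and the
example query string are made up for illustration; they are not part of
Biopython. It assumes plain A/C/G/T sequences and 1-based positions, as in
your example, and wraps the pattern in a lookahead so that overlapping
hits are also reported:

```python
import re

def build_pattern(subseq, variable_positions):
    """Compile a regex for subseq in which the given 1-based
    positions may be any of A, C, G or T (hypothetical helper)."""
    parts = []
    for pos, base in enumerate(subseq.upper(), start=1):
        if pos in variable_positions:
            parts.append("[ACGT]")          # variable position: any base
        else:
            parts.append(re.escape(base))   # fixed position: exact match
    # Lookahead wrapper so that overlapping matches are all found.
    return re.compile("(?=(" + "".join(parts) + "))")

def find_hits(pattern, sequence):
    """Return (0-based start, matched text) for every hit."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Position 5 of the sub-sequence may be any base; the rest must match.
pattern = build_pattern("ACGTACGTACGT", {5})
query = "TTACGTGCGTACGTAA"  # made-up query sequence
hits = find_hits(pattern, query)
print(hits)
```

If the query is a Biopython Seq object rather than a string, passing
str(my_seq) to find_hits should work, since the re module operates on
plain strings.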

Peter


