[BioPython] How to test Sequence objects for equality?

Peter biopython at maubp.freeserve.co.uk
Sat Mar 29 16:13:22 UTC 2008


>  Hello Iddo, thank you for the quick response!
>
>  Extracting the strings and comparing is good for exact matches, but I
>  also need to match sequences with ambiguities. Is there no such
>  function in BioPython?
>
>  Unfortunately sequence alignment is not what I'm trying to do, so much
>  so that I can't think of a way to transform my problem into a sequence
>  alignment problem. I really do need to compare pairs of sequences one
>  by one, as efficiently as possible.

So you want to know if two ambiguous sequences are "compatible"?  In
some cases that looks simple and well defined:

ACT and ACA -> False
ACT and ACN -> True
ACY and ACN -> True
ACY and ACR -> False
ACY and ACM -> Maybe

That last example is a doubly ambiguous comparison: Y (T or C) against
M (A or C).  If both are really a C, then yes, ACY and ACM would be
compatible.  But they might not be: the Y could be a T, or the M an A.
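
As a rough sketch of the table above (not an existing Biopython
function), one way to get those five answers is to expand each IUPAC
letter into the set of bases it can stand for and compare position by
position.  The name compatible() and the rule used here are assumptions
for illustration: disjoint sets give False, partially overlapping sets
give "Maybe", and a set contained in the other gives True.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(seq1, seq2):
    """Return True, False or "Maybe" for two ambiguous DNA strings.

    Hypothetical helper, assuming equal-length, unaligned sequences.
    """
    if len(seq1) != len(seq2):
        return False
    result = True
    for a, b in zip(seq1.upper(), seq2.upper()):
        set_a, set_b = set(IUPAC_DNA[a]), set(IUPAC_DNA[b])
        if not set_a & set_b:
            # No base in common, e.g. Y (C/T) against R (A/G).
            return False
        if not (set_a <= set_b or set_b <= set_a):
            # Partial overlap, e.g. Y (C/T) against M (A/C).
            result = "Maybe"
    return result


for pair in [("ACT", "ACA"), ("ACT", "ACN"), ("ACY", "ACN"),
             ("ACY", "ACR"), ("ACY", "ACM")]:
    print(pair[0], "and", pair[1], "->", compatible(*pair))
```

Note that the string "Maybe" is truthy in Python, so a caller would have
to test for it explicitly; whether it should count as a match is exactly
the part that is not well defined.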

>  On a side note, I was surprised by having == return False for
>  identical sequences. To make BioPython less confusing, may I suggest
>  either disabling comparison of sequences or making such comparison do
>  the Right Thing?

As I tried to explain in my other email - I don't think there is a
clear "Right Thing" that would suit everyone.  So maybe you are right
- some sort of exception would make sense...
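
To make the exception idea concrete, here is a minimal sketch using a
made-up wrapper class.  StrictSeq is hypothetical and is not how
Biopython's Seq object behaves; it only shows what "refuse to compare,
make the caller choose" could look like.

```python
class StrictSeq:
    """Hypothetical sequence wrapper; not a Biopython class."""

    def __init__(self, data):
        self.data = str(data)

    def __str__(self):
        return self.data

    def __eq__(self, other):
        # Refuse to guess what "equal" should mean for sequences.
        # (Defining __eq__ without __hash__ also makes instances
        # unhashable in Python 3.)
        raise TypeError(
            "Sequence comparison is ambiguous; compare str(seq) values "
            "or use an explicit comparison function instead."
        )


a = StrictSeq("ACT")
b = StrictSeq("ACT")
print(str(a) == str(b))  # explicit string comparison still works
try:
    a == b
except TypeError as err:
    print("TypeError:", err)
```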

Peter


