[BioPython] How to test Sequence objects for equality?
Peter
biopython at maubp.freeserve.co.uk
Sat Mar 29 16:05:31 UTC 2008
On Sat, Mar 29, 2008 at 12:38 PM, Tal Einat <taleinat at gmail.com> wrote:
> Hello,
>
> I'm new to BioPython, but I've managed to stumble in my very first
> steps. Could someone help explain this behavior?
>
> >>> from Bio.Seq import Seq
> >>> from Bio.Alphabet import IUPAC
> >>> Seq('A', IUPAC.unambiguous_dna) == Seq('A', IUPAC.unambiguous_dna)
> False
This is a little tricky, because to decide whether two sequences are
equal Biopython would have to consider both the sequence itself and
its alphabet. For example, which of the following would you say are
equal?
Seq('A', IUPAC.unambiguous_dna)
Seq('A', IUPAC.ambiguous_dna)
Seq('A', IUPAC.unambiguous_rna)
Seq('A', IUPAC.ambiguous_rna)
Seq('A', IUPAC.protein)
etc
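As a rough illustration of why the original comparison printed False: assuming Seq defines no equality method of its own (which the result above suggests), Python falls back to comparing object identity, so two separately built objects never compare equal. The PlainSeq class below is a hypothetical stand-in written for this sketch, not Biopython code, and the alphabet is just a label string:

```python
class PlainSeq(object):
    """Hypothetical stand-in for a sequence class that defines no __eq__."""

    def __init__(self, data, alphabet):
        self.data = data          # the sequence letters
        self.alphabet = alphabet  # a plain label here, not a real alphabet object

    def __str__(self):
        return self.data


a = PlainSeq("A", "unambiguous_dna")
b = PlainSeq("A", "unambiguous_dna")

same_object = (a == a)          # identity comparison: the very same object
distinct_objects = (a == b)     # identity comparison: two separate objects
by_string = (str(a) == str(b))  # compares the letters only
```

Here same_object is True, distinct_objects is False even though the contents match, and by_string is True.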
In practice you are unlikely to be comparing DNA to RNA, or to
proteins, so all you really care about is the sequence string itself.
Compare that instead:
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
alpha = Seq('ACG', IUPAC.unambiguous_dna)
beta = Seq('ACG', IUPAC.ambiguous_dna)
gamma = Seq('ACN', IUPAC.ambiguous_dna)
print(str(alpha) == str(beta))   # True, both are 'ACG'
print(str(beta) == str(gamma))   # False, 'ACG' versus 'ACN'
NOTE - If you are using an older version of Biopython, do this instead:
print(alpha.tostring() == beta.tostring())   # True
print(beta.tostring() == gamma.tostring())   # False
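If the same script has to cope with both cases, the two approaches can be folded into one helper. This is only a sketch: seq_text and same_letters are hypothetical names, and it assumes, as the note above suggests, that tostring() is the reliable way to get the letters whenever the method exists. The two small classes are stand-ins so the example runs without Biopython installed:

```python
def seq_text(seq):
    """Return the letters of a sequence-like object as a plain string."""
    tostring = getattr(seq, "tostring", None)
    if callable(tostring):
        # Assumed older-style object: prefer its tostring() method.
        return tostring()
    # Otherwise rely on str() giving the bare sequence.
    return str(seq)


def same_letters(first, second):
    """Compare two sequence-like objects by their letters, ignoring alphabets."""
    return seq_text(first) == seq_text(second)


class OldStyleSeq(object):
    """Hypothetical stand-in whose str() is not the bare sequence."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        return "Seq(%r)" % self.data

    def tostring(self):
        return self.data


class NewStyleSeq(object):
    """Hypothetical stand-in whose str() is the bare sequence."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        return self.data


old_vs_new = same_letters(OldStyleSeq("ACG"), NewStyleSeq("ACG"))
different = same_letters(NewStyleSeq("ACG"), NewStyleSeq("ACN"))
```

With these stand-ins old_vs_new is True and different is False. Note that the helper deliberately ignores the alphabet, so it would also report a DNA 'A' and a protein 'A' as having the same letters.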
Peter