[Biopython-dev] antiparallel ?
Andrew Dalke
dalke at acm.org
Fri Aug 11 13:11:16 EDT 2000
thomas at cbs.dtu.dk wrote:
>How are people changing sequences to antiparallel with biopython ?
>Currently I use
>
>    def complement(self, seq):
>        return string.join(map(lambda x:IUPACData.ambiguous_dna_complement[x],
>                               map(None,seq)),'')
Two things here. First, I like working in Seq space rather than as
strings. Which means I just realized there's no way to get the complement
table for an alphabet. (Well, there is a way using the PropertyManager
and setting the values in IUPACEncodings. It's just not being done.)
If there were, then this would be a function in utils (not a method) and
work like:
    def complement(seq):
        alphabet = seq.alphabet
        table = default_manager.resolve(alphabet, "complement_table")
        new_data = []
        for c in seq.data:
            new_data.append(table[c])
        return Seq(string.join(new_data, ''), alphabet)
If I weren't trying to get things done for BOSC, I would fix things now :(
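As a rough sketch of the same table-lookup idea on plain strings, the loop can be tried without any of the Seq or PropertyManager machinery. The COMPLEMENT_TABLE dictionary here is a hypothetical stand-in for a per-alphabet complement table, and the code is written for current Python, so "".join takes the place of string.join:

```python
# Hypothetical stand-in for a per-alphabet complement table. A real table
# (for example IUPACData.ambiguous_dna_complement) covers more letters.
COMPLEMENT_TABLE = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def complement(data, table=COMPLEMENT_TABLE):
    """Return the complement of the string `data`, letter by letter."""
    new_data = []
    for c in data:
        # A letter missing from the table raises KeyError.
        new_data.append(table[c])
    return "".join(new_data)


print(complement("GATTACA"))  # prints CTAATGT
```

Passing the table as a parameter keeps the function independent of any one alphabet, which is the point of looking the table up rather than hard-coding it.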
Second, there's no need to do the map(None, seq) since a string is a
sequence-like object. That is,
    def spam(c):
        print "Character", repr(c)
        return c
    map(spam, "Andrew")
prints
    Character 'A'
    Character 'n'
    Character 'd'
    Character 'r'
    Character 'e'
    Character 'w'
    ['A', 'n', 'd', 'r', 'e', 'w']
Also, doing the map(lambda x: IUPACData.ambiguous_dna_complement[x], ...)
is slower than
    x = []
    for c in seq:
        x.append(IUPACData.ambiguous_dna_complement[c])
    return string.join(x, '')
because the lambda introduces the function call overhead. Also, using
a loop is easier for most people to understand.
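If you want to check the speed difference on your own interpreter rather than take it on faith, a small timing sketch along these lines should do. The TABLE dictionary and the two function names are made up for the comparison, the code is written for current Python, and the relative timings will vary between Python versions, so no numbers are claimed here:

```python
import timeit

# Hypothetical small complement table, used only for this timing sketch.
TABLE = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def complement_map(seq):
    # map + lambda: one extra Python-level function call per letter.
    return "".join(map(lambda x: TABLE[x], seq))


def complement_loop(seq):
    # Plain loop: the dictionary lookup happens inline.
    x = []
    for c in seq:
        x.append(TABLE[c])
    return "".join(x)


seq = "GATTACA" * 1000

# Both versions must give the same answer before timing means anything.
assert complement_map(seq) == complement_loop(seq)

for func in (complement_map, complement_loop):
    seconds = timeit.timeit(lambda: func(seq), number=100)
    print(func.__name__, seconds)
```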
>    def reverse(self, seq):
>        r = map(None, seq)
>        r.reverse()
>        return string.join(r,'')
Instead of "r = map(None, seq)" try "r = list(seq)".
>    def antiparallel(self, seq):
>        s = self.complement(seq)
>        s = self.reverse(s)
>        return s
If you are interested in performance, you could repeat the code for
complement, except adding a ".reverse()" before the string.join. This
would prevent the extra conversion from list -> string -> list.
Is it usually called "antiparallel"? I'm used to "rc" or
"reverse_complement". I believe bioperl calls it "rc", so for
consistency that is what I would lean towards - except that it's too
small a name for my preferences.
Andrew