[Biopython-dev] Sorting alignments

Peter Cock p.j.a.cock at googlemail.com
Sat Jun 23 16:12:06 UTC 2012


On Sat, Jun 23, 2012 at 5:04 PM, Eric Talevich wrote:
> On Sat, Jun 23, 2012 at 9:48 AM, Peter Cock wrote:
>>
>> Hi all,
>>
>> This branch extends the MultipleSeqAlignment's sort method to
>> accept a key function and a reverse option (just like lists under
>> Python 3 - there is no need for a cmp argument):
>>
>> https://github.com/peterjc/biopython/tree/align-sort
>>
>> This was prompted by a BioStars question,
>>
>> http://www.biostars.org/post/show/47562/is-there-a-way-to-sort-a-biopython-alignment-by-a-feature-other-then-id/
>>
>> Does this seem like a good idea?
>
>
> Seems cool to me. I've normally had to operate on the alignment._records
> attribute to do this sort of thing, so it's nice to have an officially
> sanctioned method.

OK - applied to the trunk.

>>
>> Can anyone think of a nicer
>> example for custom sort ordering for the doctest or Tutorial?
>
>
> I sort by unaligned sequence length sometimes. In an alignment, that is the
> reverse of gappiness:
>
> >>> aln.sort(key=lambda rec: rec.seq.count('-'), reverse=True)
>
> Other use cases could include sorting by sequence weights, given a function
> for calculating them, or according to the ordering in a phylogenetic tree.
>
> -E
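
The tree-ordering use case might look roughly like the sketch below. It uses plain (id, sequence) tuples as stand-ins for SeqRecord objects, and the leaf order and sequences are made-up illustrative data. The point is only that a lookup table of ranks works as a key function.

```python
# Hypothetical leaf order, e.g. as read off a phylogenetic tree top to bottom.
tree_order = ["gamma", "alpha", "beta"]

# Map each name to its position so the key function is a cheap dict lookup.
rank = {name: position for position, name in enumerate(tree_order)}

# Stand-in records: (identifier, aligned sequence) tuples, illustrative only.
records = [
    ("alpha", "ACGT-ACGT"),
    ("beta", "AC---ACGT"),
    ("gamma", "ACGTTACGT"),
]

# Sort the records into the same order as the tree leaves.
records.sort(key=lambda record: rank[record[0]])
ordered_ids = [identifier for identifier, _sequence in records]
```

With a real alignment the key would presumably be something like lambda rec: rank[rec.id], and the names could be collected from a Bio.Phylo tree via get_terminals(), assuming the leaf names match the record ids.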

The gappiness sort makes sense - that seems like a good
example for the main tutorial.
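
As a rough sketch of what such a tutorial example could show, here is the gappiness ordering using plain (id, sequence) tuples as stand-ins for SeqRecord objects. The data and the gap_count helper are hypothetical. Since list.sort takes the same key and reverse arguments, the behaviour should carry over.

```python
# Stand-in records: (identifier, aligned sequence) tuples, illustrative only.
records = [
    ("alpha", "ACGT-ACGT"),
    ("beta", "AC---ACGT"),
    ("gamma", "ACGTTACGT"),
]


def gap_count(record):
    """Return the number of gap characters in the aligned sequence."""
    _identifier, sequence = record
    return sequence.count("-")


# Most gaps first, mirroring sort(key=..., reverse=True).
records.sort(key=gap_count, reverse=True)
sorted_ids = [identifier for identifier, _sequence in records]
```

On a real alignment the equivalent call would be aln.sort(key=lambda rec: rec.seq.count('-'), reverse=True), as in Eric's example.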

Thanks,

Peter


