[Biopython-dev] [Bug 2552] Adding alignments

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Nov 13 10:19:29 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2552





------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2008-11-13 05:19 EST -------
(In reply to comment #1)
> (In reply to comment #0)
> > This is related to the very broad alignment bug 1944.
> > 
> > Given two alignments, it can make sense to talk about adding them together.
> 
> Actually, this is a very common procedure in phylogenetic analyses, where
> multiple genes/loci are combined into a "super" matrix for a set of taxa.

This was one of the use cases I originally had in mind here (with hindsight I
should have mentioned this in the original proposal).  Another potential use
for this is in combination with extracting sub-alignments by column (see Bug
2551) - for example to remove some middle region of an alignment by selecting
the two end regions and adding them together, e.g. new_align = align[:,:10] +
align[:,20:] to remove the region from columns 10 to 20.
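
To make the slicing example concrete, here is a rough sketch using a plain
Python list of (id, sequence) tuples as a toy stand-in for an alignment.  It
is not the Biopython API; the helper names columns and add_by_column are
hypothetical, and a real alignment object may well behave differently.

```python
# Toy stand-in for an alignment: a list of (id, aligned_sequence) pairs.
# This is NOT the Biopython API, just an illustration of the idea.

def columns(alignment, start, end):
    """Return the sub-alignment covering columns start..end (end exclusive)."""
    return [(name, seq[start:end]) for name, seq in alignment]


def add_by_column(left, right):
    """Join two alignments side by side; rows must match by position and ID."""
    if [name for name, _ in left] != [name for name, _ in right]:
        raise ValueError("Alignments must have the same IDs in the same order")
    return [(name, a + b) for (name, a), (_, b) in zip(left, right)]


align = [
    ("alpha", "ACGTACGTACGTACGTACGTACGT"),
    ("beta", "ACGTACGAACGTACCTACGTACGA"),
]

# Equivalent in spirit to: new_align = align[:,:10] + align[:,20:]
new_align = add_by_column(columns(align, 0, 10), columns(align, 20, None))
```

Here the two slices come from the same alignment, so the row IDs agree
trivially and the addition cannot fail.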

As described in my original proposal, adding two alignments "by column" would
require they have the same number of rows, and the same IDs (possibly in a
different order - this is not essential, as making the user think about their
preferred sort order seems fine to me).
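
One possible reading of that rule, sketched with toy (id, sequence) tuples
rather than real Biopython objects: rows are paired by ID, and the result
keeps the row order of the first alignment.  The function name
add_alignments and the ordering policy are assumptions for illustration; a
stricter version could simply insist on identical ordering.

```python
def add_alignments(left, right):
    """Column-wise addition of two alignments given as lists of (id, seq).

    Assumption: rows are paired by ID and the result keeps the row order
    of `left`.
    """
    left_ids = [name for name, _ in left]
    right_ids = [name for name, _ in right]
    if len(left_ids) != len(right_ids):
        raise ValueError("Alignments have different numbers of rows")
    if len(set(left_ids)) != len(left_ids) or len(set(right_ids)) != len(right_ids):
        raise ValueError("IDs must be unique within each alignment")
    if set(left_ids) != set(right_ids):
        raise ValueError("Alignments do not share the same IDs")
    right_by_id = dict(right)
    return [(name, seq + right_by_id[name]) for name, seq in left]


gene1 = [("human", "ACGT"), ("mouse", "ACGA"), ("chick", "AC-A")]
gene2 = [("mouse", "TTG"), ("chick", "TAG"), ("human", "TTG")]

# Rows in gene2 are in a different order; they are matched up by ID.
combined = add_alignments(gene1, gene2)
```

Any mismatch in the IDs raises an error rather than being silently repaired.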

I suppose using any common subset of shared names is also well defined, or
automatically including null sequences for missing entries (as Frank suggested
in comment 2), but I would much prefer to keep any alignment addition simple
and explicit - no "magic".
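
For contrast, this is roughly what the padding alternative could look like,
again with toy (id, sequence) tuples.  The name add_with_padding, the gap
character and the choice of row order are all assumptions; this is shown only
to illustrate the kind of implicit behaviour being argued against.

```python
def add_with_padding(left, right, gap="-"):
    """Column-wise addition that fills in all-gap rows for missing IDs."""
    left_by_id, right_by_id = dict(left), dict(right)
    left_width = len(left[0][1]) if left else 0
    right_width = len(right[0][1]) if right else 0
    # Keep IDs from `left` first, then any seen only in `right`.
    ids = [name for name, _ in left]
    ids += [name for name, _ in right if name not in left_by_id]
    return [
        (
            name,
            left_by_id.get(name, gap * left_width)
            + right_by_id.get(name, gap * right_width),
        )
        for name in ids
    ]


gene1 = [("human", "ACGT"), ("mouse", "ACGA")]
gene2 = [("mouse", "TTG"), ("chick", "TAG")]

# "human" is missing from gene2 and "chick" from gene1; both get gap rows.
padded = add_with_padding(gene1, gene2)
```

Note how the result quietly gains a row and two stretches of gaps that
neither input contained.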

More generally you could consider adding any two alignments "by column" if they
have the same number of rows, but first we'd have to talk about adding
SeqRecord objects.  This means doing something sensible with the annotation, in
particular the id and name.  I was hoping to avoid this.
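
A small sketch of why the annotation is awkward, using a hypothetical Record
dataclass as a stand-in for SeqRecord (not the real class).  The policy shown
here, keeping the id and name only when both sides agree and otherwise falling
back to a placeholder, is just one assumption among several plausible ones
(keep the left value, join the two, or raise an error).

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a SeqRecord: just id, name and sequence."""
    id: str
    name: str
    seq: str


def add_records(a, b):
    """Concatenate two records, keeping id/name only when both agree.

    Assumption: conflicting annotation falls back to a placeholder string.
    """
    return Record(
        id=a.id if a.id == b.id else "<unknown id>",
        name=a.name if a.name == b.name else "<unknown name>",
        seq=a.seq + b.seq,
    )


# Matching annotation is preserved; conflicting annotation is dropped.
same = add_records(Record("X1", "geneA", "ACGT"), Record("X1", "geneA", "TTG"))
mixed = add_records(Record("X1", "geneA", "ACGT"), Record("Y2", "geneB", "TTG"))
```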

Once Biopython 1.49 is out, dealing with this bug is certainly on my todo list,
especially now that we have some positive responses.

