[Biopython-dev] [Bug 2552] Adding alignments

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Nov 13 11:18:01 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2552





------- Comment #4 from fkauff at biologie.uni-kl.de  2008-11-13 06:18 EST -------
(In reply to comment #3)
>
> > 
> > Actually, this is a very common procedure in phylogenetic analyses, where
> > multiple genes/loci are combined into a "super" matrix for a set of taxa.
> 
> This was one of the use cases I originally had in mind here (with hindsight I
> should have mentioned this in the original proposal).  Another potential use
> for this is in combination with extracting sub-alignments by column (see Bug
> 2551) - for example to remove some middle region of an alignment by selecting
> the two end regions and adding them together, e.g. new_align = align[:,:10] +
> align[:,20:] to remove the region from columns 10 to 20.

The Nexus parser can already handle this by rewriting the data set:

    nexobject.write_nexus_data(filename='new.nex', exclude=range(10, 21), delete=['list', 'of', 'taxa', 'to', 'delete'])

where the indices of remaining character sets and character partitions get
recalculated.
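
To make the column arithmetic concrete, here is a minimal sketch in plain
Python. It is not the Nexus module's code; the helper drop_columns and the
toy data are made up for illustration, and it assumes the exclude list holds
0-based column indices. It shows that excluding indices 10..19 gives the same
result as the slice-and-add form align[:,:10] + align[:,20:]. Note that
range(10, 21) would additionally drop column 20.

```python
def drop_columns(rows, exclude):
    """Return a copy of rows (name -> sequence string) without the
    0-based column indices listed in exclude."""
    excluded = set(exclude)
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in excluded)
        for name, seq in rows.items()
    }


# Toy alignment: two rows of equal length (24 columns).
rows = {
    "taxon_a": "ACGTACGTACGTACGTACGTACGT",
    "taxon_b": "TTTTCCCCGGGGAAAATTTTCCCC",
}

# Excluding columns 10..19 ...
trimmed = drop_columns(rows, range(10, 20))

# ... matches keeping the two end regions and joining them.
sliced = {name: seq[:10] + seq[20:] for name, seq in rows.items()}
```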


> 
> As described in my original proposal, adding two alignments "by column" would
> require they have the same number of rows, and the same IDs (possibly in a
> different order - this is not essential as making the user think about their
> preferred sort order seems fine to me).
> 
> I suppose using any common subset of shared names is also well defined, or
> automatically including null sequences for missing entries (as Frank suggested
> in comment 2), but I would much prefer to keep any alignment addition simple
> and explicit - no "magic".
> 

Yes, names missing from one of the alignments are given missing-character
entries.
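
A rough sketch of that padding behaviour, again in plain Python and not taken
from the Nexus module: the function combine_by_name, the helper _width, the
'?' missing symbol, and the toy gene data are all assumptions for
illustration.

```python
def _width(rows):
    """Return the common row length, or raise if rows differ in length."""
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) > 1:
        raise ValueError("rows differ in length; not an alignment")
    return lengths.pop() if lengths else 0


def combine_by_name(first, second, missing="?"):
    """Join two alignments (name -> sequence string) column-wise.

    A name present in only one alignment is padded with the missing
    symbol across the full width of the other alignment.
    """
    w1, w2 = _width(first), _width(second)
    # Keep the order of the first alignment, then append new names.
    names = list(first) + [name for name in second if name not in first]
    return {
        name: first.get(name, missing * w1) + second.get(name, missing * w2)
        for name in names
    }


gene1 = {"human": "ACGT", "mouse": "ACGA"}
gene2 = {"mouse": "TTG", "chick": "TTC"}
combined = combine_by_name(gene1, gene2)
```

The stricter behaviour proposed in comment #3 would instead raise an error
whenever the two sets of names differ.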

> More generally you could consider adding any two alignments "by column" if they
> have the same number of rows, but first we'd have to talk about adding
> SeqRecord objects.  This means doing something sensible with the annotation, in
> particular the id and name.  I was hoping to avoid this.
> 
> Once Biopython 1.49 is out, dealing with this bug is certainly on my todo list,
> especially now that we have some positive responses.
> 

