[Biopython-dev] [Bug 2552] Adding alignments
bugzilla-daemon at portal.open-bio.org
Thu Nov 13 11:18:01 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2552
------- Comment #4 from fkauff at biologie.uni-kl.de 2008-11-13 06:18 EST -------
(In reply to comment #3)
>
> >
> > Actually, this is a very common procedure in phylogenetic analyses, where
> > multiple genes/loci are combined into a "super" matrix for a set of taxa.
>
> This was one of the use cases I originally had in mind here (with hindsight I
> should have mentioned this in the original proposal). Another potential use
> for this is in combination with extracting sub-alignments by column (see Bug
> 2551) - for example to remove some middle region of an alignment by selecting
> the two end regions and adding them together, e.g. new_align = align[:,:10] +
> align[:,20:] to remove the region from columns 10 to 20.
The Nexus parser can already handle this by rewriting the data set:
>>> nexobject.write_nexus_data(filename='new.nex', exclude=range(10, 21), delete=['list', 'of', 'taxa', 'to', 'delete'])
The indices of the remaining character sets and character partitions are
recalculated.
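
For readers without a Nexus file to hand, the column removal discussed above
can be sketched in plain Python. This is only an illustration under stated
assumptions: drop_columns is a hypothetical helper, not part of Biopython or
Bio.Nexus, and the alignment is modelled as a dict mapping taxon names to
equal-length strings.

```python
def drop_columns(alignment, start, stop):
    """Return a copy of the alignment with columns start..stop-1 removed.

    alignment is a dict of taxon name -> sequence string (all equal length).
    Hypothetical helper for illustration; not a Biopython function.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("sequences differ in length")
    # Same idea as align[:, :start] + align[:, stop:] in the quoted text.
    return {name: seq[:start] + seq[stop:] for name, seq in alignment.items()}


# Made-up example data, 24 columns per taxon.
example = {
    "taxon_a": "ACGTACGTACGTACGTACGTACGT",
    "taxon_b": "ACGTTCGTACGAACGTACGTACCT",
}
trimmed = drop_columns(example, 10, 20)
```

As with Python slices, the stop column is excluded, so this call drops
columns 10 through 19 (counting from zero), whereas range(10, 21) lists
columns 10 through 20.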
>
> As described in my original proposal, adding two alignments "by column" would
> require they have the same number of rows, and the same IDs (possibly in a
> different order - this is not essential as making the user think about their
> preferred sort order seems fine to me).
>
> I suppose using any common subset of shared names is also well defined, or
> automatically including null sequences for missing entries (as Frank suggested
> in comment 2), but I would much prefer to keep any alignment addition simple
> and explicit - no "magic".
>
Yes, missing names are given missing character entries.
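
The two behaviours discussed here (strict addition that requires identical
names, versus padding absent taxa with missing characters) can be contrasted
in a small plain-Python sketch. The names concatenate, _width and
fill_missing are hypothetical and not taken from Biopython; alignments are
modelled as dicts of taxon name to sequence string, and '?' is assumed as the
missing symbol.

```python
def _width(alignment):
    """Return the common sequence length, or raise if empty or ragged."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("alignment is empty or ragged")
    return lengths.pop()


def concatenate(first, second, fill_missing=False, missing="?"):
    """Join two alignments (dicts of name -> sequence) column-wise.

    Hypothetical helper for illustration; not a Biopython function.
    With fill_missing=False both alignments must have identical names.
    With fill_missing=True a taxon absent from one alignment is padded
    with the missing symbol for the full width of that alignment.
    """
    width_first = _width(first)
    width_second = _width(second)
    if not fill_missing and set(first) != set(second):
        raise ValueError("alignments have different taxon names")
    # Keep the order of the first alignment, then append new names.
    names = list(first) + [name for name in second if name not in first]
    return {
        name: first.get(name, missing * width_first)
        + second.get(name, missing * width_second)
        for name in names
    }


# Made-up example data: two loci with partly overlapping taxa.
gene1 = {"human": "ACGT", "mouse": "ACGA"}
gene2 = {"human": "TTG", "chimp": "TTC"}
supermatrix = concatenate(gene1, gene2, fill_missing=True)
```

With the default fill_missing=False the same call raises ValueError, which
corresponds to the simple and explicit behaviour preferred in the quoted
comment.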
> More generally you could consider adding any two alignments "by column" if they
> have the same number of rows, but first we'd have to talk about adding
> SeqRecord objects. This means doing something sensible with the annotation, in
> particular the id and name. I was hoping to avoid this.
>
> Once Biopython 1.49 is out, dealing with this bug is certainly on my todo list,
> especially now that we have some positive responses.
>