[Biopython-dev] [Bug 2552] New: Adding alignments
bugzilla-daemon at portal.open-bio.org
Mon Jul 28 09:48:56 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2552
Summary: Adding alignments
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
This is related to the very broad alignment bug 1944.
Given two alignments, it can make sense to talk about adding them together.
However, we can add either by row or by column.
For example, consider this alignment, a:
DNAAlphabet() alignment with 3 rows and 14 columns
ACGATCAGCTAGCT Alpha
CCGATCAGCTAGCT Beta
ACGATGAGCTAGCT Gamma
Doing a+a by column would give:
DNAAlphabet() alignment with 3 rows and 28 columns
ACGATCAGCTAGCTACGATCAGCTAGCT Alpha
CCGATCAGCTAGCTCCGATCAGCTAGCT Beta
ACGATGAGCTAGCTACGATGAGCTAGCT Gamma
This sort of operation is often done to combine alignments from multiple genes
(after first sorting the rows to ensure the species names are in the same
order). Implementing this would ideally require the ability to add SeqRecord
objects together, doing something sensible with the annotation and in
particular the identifiers.
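As a rough illustration of the by-column case, here is a minimal sketch in plain
Python. It does not use Biopython: an alignment is modelled as a list of
(identifier, sequence) tuples, and the function name add_by_column is
hypothetical. It sorts both inputs by identifier first, then joins the
sequences row by row.

```python
# Toy model (an assumption for illustration, not the Biopython API):
# an alignment is a list of (identifier, sequence) tuples.

def add_by_column(a, b):
    """Concatenate two alignments column-wise, matching rows by identifier."""
    if len(a) != len(b):
        raise ValueError("alignments must have the same number of rows")
    # Sort both alignments so the identifiers line up row for row.
    a_sorted = sorted(a, key=lambda row: row[0])
    b_sorted = sorted(b, key=lambda row: row[0])
    combined = []
    for (id_a, seq_a), (id_b, seq_b) in zip(a_sorted, b_sorted):
        if id_a != id_b:
            raise ValueError("identifier mismatch: %r vs %r" % (id_a, id_b))
        combined.append((id_a, seq_a + seq_b))
    return combined


gene1 = [("Alpha", "ACGATCAGCTAGCT"),
         ("Beta", "CCGATCAGCTAGCT"),
         ("Gamma", "ACGATGAGCTAGCT")]
# Same species as gene1, but listed in a different order.
gene2 = [("Gamma", "ACGATGAGCTAGCT"),
         ("Alpha", "ACGATCAGCTAGCT"),
         ("Beta", "CCGATCAGCTAGCT")]

for identifier, sequence in add_by_column(gene1, gene2):
    print(sequence, identifier)
```

With real SeqRecord objects the open question is what to do with the
annotation; this sketch sidesteps it by carrying only the identifier.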
Doing a+a by row would give:
DNAAlphabet() alignment with 6 rows and 14 columns
ACGATCAGCTAGCT Alpha
CCGATCAGCTAGCT Beta
ACGATGAGCTAGCT Gamma
ACGATCAGCTAGCT Alpha
CCGATCAGCTAGCT Beta
ACGATGAGCTAGCT Gamma
This particular example, a+a, is perhaps unrealistic due to the repeated
identifiers, but I imagine there are some real use cases for this operation.
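The by-row case can be sketched the same way. Again this is plain Python on a
toy model (a list of (identifier, sequence) tuples), not Biopython code, and
the names add_by_row and repeated_identifiers are hypothetical. The second
helper only reports the repeated identifiers mentioned above; whether they
should be an error is left open.

```python
from collections import Counter

# Toy model (an assumption for illustration, not the Biopython API):
# an alignment is a list of (identifier, sequence) tuples.

def add_by_row(a, b):
    """Stack two alignments row-wise; all sequences must share one length."""
    rows = list(a) + list(b)
    lengths = {len(sequence) for _, sequence in rows}
    if len(lengths) > 1:
        raise ValueError("all sequences must have the same number of columns")
    return rows


def repeated_identifiers(alignment):
    """Return the identifiers that occur more than once, sorted."""
    counts = Counter(identifier for identifier, _ in alignment)
    return sorted(name for name, count in counts.items() if count > 1)


a = [("Alpha", "ACGATCAGCTAGCT"),
     ("Beta", "CCGATCAGCTAGCT"),
     ("Gamma", "ACGATGAGCTAGCT")]

stacked = add_by_row(a, a)
print(len(stacked), "rows")
print("repeated:", repeated_identifiers(stacked))
```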
More generally, suppose we have two alignments a and b. Treating each
alignment as a list of SeqRecord objects, you might expect:
a.extend(b) -> addition by row
a+b -> addition by row
However, I would suggest for alignment objects:
a.extend(b) -> addition by row, requires all sequences to be the same length
(same number of columns)
a+b -> addition by column, requires the same number of sequences (rows)
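To make the suggested semantics concrete, here is a minimal sketch of a class
with extend() adding rows and + adding columns. ToyAlignment is a hypothetical
stand-in, not an existing Biopython class, and it stores plain
(identifier, sequence) tuples rather than SeqRecord objects. Keeping the
left-hand identifiers in __add__ is an arbitrary choice made for the example.

```python
class ToyAlignment:
    """Hypothetical stand-in for an alignment object (not Biopython code)."""

    def __init__(self, rows):
        self.rows = list(rows)  # list of (identifier, sequence) tuples
        if len({len(seq) for _, seq in self.rows}) > 1:
            raise ValueError("all sequences must be the same length")

    def width(self):
        """Number of columns, or 0 for an empty alignment."""
        return len(self.rows[0][1]) if self.rows else 0

    def extend(self, other):
        """Addition by row, in place; requires the same number of columns."""
        if self.rows and other.rows and self.width() != other.width():
            raise ValueError("alignments must have the same number of columns")
        self.rows.extend(other.rows)

    def __add__(self, other):
        """Addition by column; requires the same number of rows."""
        if len(self.rows) != len(other.rows):
            raise ValueError("alignments must have the same number of rows")
        # Arbitrary choice for this sketch: keep the left-hand identifiers.
        return ToyAlignment(
            (left_id, left_seq + right_seq)
            for (left_id, left_seq), (_, right_seq) in zip(self.rows, other.rows)
        )


rows = [("Alpha", "ACGATCAGCTAGCT"),
        ("Beta", "CCGATCAGCTAGCT"),
        ("Gamma", "ACGATGAGCTAGCT")]
a = ToyAlignment(rows)
b = ToyAlignment(rows)

by_column = a + b          # new alignment, a and b are unchanged
print(len(by_column.rows), "rows,", by_column.width(), "columns")

a.extend(b)                # a is modified in place
print(len(a.rows), "rows,", a.width(), "columns")
```

Note the asymmetry: + returns a new object, while extend() changes a in
place, mirroring how Python lists behave.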