[Biopython-dev] Bio.AlignIO
Michiel de Hoon
mdehoon at c2b2.columbia.edu
Wed Jul 25 14:44:56 UTC 2007
Peter wrote:
> Personally I see an alignment as both an array of characters (i.e. amino
> acid residues or nucleotides), and a list of sequences.
>
> In the same way that a Numeric or NumPy array lets you iterate over
> rows, yet also access individual elements, we could allow iteration of
> SeqRecords and also allow access to individual letters.
How about the following:
-Iterating over the alignment yields the SeqRecords in the alignment
-An index of the form [xxx] returns the corresponding SeqRecord
-An index of the form [xxx:yyy:zzz] returns an Alignment object
containing the SeqRecords in rows [xxx:yyy:zzz]
(compare to the current method get_all_seqs()).
-An index of the form [xxx,:] returns the Seq object of the SeqRecord in
row xxx (this is currently done by the get_seq_by_num() method).
-An index of the form [xxx:yyy:zzz,:] returns a list of Seq objects
-An index of the form [:,www] returns a string containing the characters
at column www (which is currently done by the get_column() method).
-An index of the form [xxx:yyy:zzz,www] returns a string containing the
characters at column www using only the rows xxx:yyy:zzz.
-An index of the form [xxx,www] returns a string containing the
character of the sequence in row xxx at column www.
This is more-or-less how Numerical Python arrays work, except that we'll
be returning SeqRecord/Seq/string objects depending on the indices.
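A rough sketch of how the rules above could map onto __getitem__ follows. It is only an illustration under stated assumptions: ToyRecord and ToyAlignment are hypothetical stand-ins, not existing Biopython classes, and plain Python strings are used where the proposal says Seq object.

```python
class ToyRecord:
    """Hypothetical stand-in for SeqRecord; .seq is a plain str, not a Seq."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq


class ToyAlignment:
    """Hypothetical alignment class illustrating the proposed indexing rules."""

    def __init__(self, records):
        self._records = list(records)

    def __iter__(self):
        # Iteration yields the records, one per row.
        return iter(self._records)

    def __len__(self):
        return len(self._records)

    def __getitem__(self, index):
        if isinstance(index, int):
            # [xxx] -> the record in that row
            return self._records[index]
        if isinstance(index, slice):
            # [xxx:yyy:zzz] -> a new alignment holding those rows
            return ToyAlignment(self._records[index])
        if isinstance(index, tuple) and len(index) == 2:
            row, col = index
            if isinstance(col, int):
                if isinstance(row, int):
                    # [xxx,www] -> a single character
                    return self._records[row].seq[col]
                if isinstance(row, slice):
                    # [:,www] and [xxx:yyy:zzz,www] -> column as a string
                    return "".join(r.seq[col] for r in self._records[row])
            elif col == slice(None, None, None):
                if isinstance(row, int):
                    # [xxx,:] -> the sequence of that row
                    return self._records[row].seq
                if isinstance(row, slice):
                    # [xxx:yyy:zzz,:] -> list of sequences
                    return [r.seq for r in self._records[row]]
        # Anything else (e.g. column slices) is outside the proposal.
        raise TypeError("unsupported index: %r" % (index,))


# Made-up example data, purely for illustration.
aln = ToyAlignment([
    ToyRecord("a", "ACGT"),
    ToyRecord("b", "AC-T"),
    ToyRecord("c", "GCGT"),
])
first_record = aln[0]          # a ToyRecord
sub_alignment = aln[0:2]       # a ToyAlignment with two rows
column = aln[:, 2]             # a string built from column 2
letter = aln[2, 3]             # a one-character string
```

The dispatch relies only on the fact that Python passes a tuple to __getitem__ for comma-separated indices, so the return type can be chosen from the types of the row and column parts.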
--Michiel.