[Biopython-dev] Bio.AlignIO
    Michiel de Hoon 
    mdehoon at c2b2.columbia.edu
       
    Wed Jul 25 14:44:56 UTC 2007
    
    
  
Peter wrote:
> Personally I see an alignment as both an array of characters (i.e. amino 
> acid residues or nucleotides), and a list of sequences.
> 
> In the same way that a Numeric or NumPy array lets you iterate over 
> rows, yet also access individual elements, we could allow iteration of 
> SeqRecords and also allow access to individual letters.

How about the following:
-Iterating over an alignment yields its SeqRecords
-An index of the form [xxx] returns the corresponding SeqRecord
-An index of the form [xxx:yyy:zzz] returns an Alignment object 
containing the SeqRecords in rows [xxx:yyy:zzz]
(compare to the current method get_all_seqs()).
-An index of the form [xxx,:] returns the Seq object of the SeqRecord at 
xxx (this is currently done by the get_seq_by_num() method).
-An index of the form [xxx:yyy:zzz,:] returns a list of Seq objects
-An index of the form [:,www] returns a string containing the characters 
at column www (which is currently done by the get_column() method)
-An index of the form [xxx:yyy:zzz,www] returns a string containing the 
characters at column www using only the rows xxx:yyy:zzz.
-An index of the form [xxx,www] returns a one-character string containing 
the letter of the sequence in row xxx at column www.
This is more-or-less how Numerical Python arrays work, except that we'll 
be returning SeqRecord/Seq/string objects depending on the indices.
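
A rough sketch of how these rules could be dispatched in __getitem__, just to make the proposal concrete. Everything in it is made up for illustration: Record stands in for SeqRecord, its seq attribute is a plain string rather than a Seq object, and SketchAlignment is not an existing Biopython class. Column slices other than ":" are not covered by the proposal, so the sketch rejects them.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for SeqRecord; .seq is a plain str here."""
    id: str
    seq: str


class SketchAlignment:
    """Hypothetical container illustrating the proposed indexing rules."""

    def __init__(self, records):
        self._records = list(records)

    def __iter__(self):
        # Iteration yields the records, row by row.
        return iter(self._records)

    def __len__(self):
        return len(self._records)

    def __getitem__(self, index):
        if isinstance(index, int):
            # [xxx] -> one record
            return self._records[index]
        if isinstance(index, slice):
            # [xxx:yyy:zzz] -> a new alignment holding the selected rows
            return SketchAlignment(self._records[index])
        if not (isinstance(index, tuple) and len(index) == 2):
            raise TypeError("expected [row], [rows] or [row, column]")
        row, col = index
        if isinstance(col, slice):
            if col != slice(None):
                raise NotImplementedError("only ':' is handled for columns")
            if isinstance(row, int):
                # [xxx,:] -> the sequence of one row
                return self._records[row].seq
            # [xxx:yyy:zzz,:] -> a list of sequences
            return [rec.seq for rec in self._records[row]]
        if isinstance(row, int):
            # [xxx,www] -> a single letter
            return self._records[row].seq[col]
        # [:,www] or [xxx:yyy:zzz,www] -> the column as a string
        return "".join(rec.seq[col] for rec in self._records[row])


aln = SketchAlignment([
    Record("a", "ACGT"),
    Record("b", "AC-T"),
    Record("c", "TCGT"),
])
print(aln[:, 0])    # first column, all rows
print(aln[0:2, 2])  # third column, first two rows
print(aln[1, :])    # sequence of the second row
```

Checking for a tuple first-class in __getitem__ is what lets the one-index and two-index forms coexist, the same way Numerical Python receives [i, j] as a single tuple argument.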
--Michiel.
    
    