[Biopython-dev] Simple __getitem__ for Alignments

Peter biopython at maubp.freeserve.co.uk
Wed Jul 9 13:03:16 UTC 2008


Now that the latest release is out (Biopython 1.47), Bio.AlignIO
should start to get used more.  I anticipate more people getting
frustrated with the current Alignment object, and would like to make
another baby-step in improving it.

I'd like to add a minimal __getitem__ method, as described in Bug 1944
comment 15,
http://bugzilla.open-bio.org/show_bug.cgi?id=1944#c15

>     def __getitem__(self, index) :
>         """Access part of the alignment.
>
>         You can access a row of the alignment as a SeqRecord using an integer
>         index (think of the alignment as a list of SeqRecord objects here):
>
>         first_record = my_alignment[0]
>         last_record = my_alignment[-1]
>
>         Right now, this is the ONLY indexing operation supported.  The
>         use of two indices and slice notation to extract a sub-alignment,
>         row, column or letter is under discussion for a future update."""
>         if isinstance(index, int) :
>             #e.g. result = align[x]
>             #Return a SeqRecord
>             return self._records[index]
>         else :
>             raise TypeError("Not currently supported, but may be in future.")

From the discussion on Bug 1944, this doesn't seem to be contentious -
the debate is about more advanced slicing operations.

I'd also like to add a __len__ method which would return the number of
SeqRecord objects (i.e. the number of rows).  This would then let the
alignment be treated very much like a read-only list of SeqRecord
objects.  Remember, we can already iterate over the rows in the
alignment as SeqRecord objects.
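
To make the list-like behaviour concrete, here is a rough sketch of how
__len__ could sit next to the proposed __getitem__.  The class name
SimpleAlignment is hypothetical (it is not the real Alignment class), it
assumes the rows live in a private _records list as in the snippet above,
and plain strings stand in for SeqRecord objects:

```python
class SimpleAlignment(object):
    """Hypothetical stand-in for the Alignment class, for illustration only."""

    def __init__(self, records):
        # Assumption: rows are stored in a private list called _records.
        self._records = list(records)

    def __getitem__(self, index):
        # Only integer row access is supported, as in the proposal.
        if isinstance(index, int):
            return self._records[index]
        raise TypeError("Not currently supported, but may be in future.")

    def __len__(self):
        # The number of rows (records) in the alignment.
        return len(self._records)

    def __iter__(self):
        # Iterate over the rows in order.
        return iter(self._records)


# Plain strings stand in for SeqRecord objects here.
align = SimpleAlignment(["ACGT", "ACGA", "ACGC"])
number_of_rows = len(align)
last_row = align[-1]
all_rows = [row for row in align]
```

With both methods in place, len(align), align[i] and "for row in align"
would all agree with each other, which is the read-only list behaviour
described above.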

Any comments?

Peter


