[Biopython-dev] slicing in Bio.PDB.Chain.__getitem__() ?

Hongbo Zhu 朱宏博 macrozhu at gmail.com
Fri Dec 2 12:43:59 UTC 2011


Hi, Joao,

thanks for the response. When I spoke of slicing Bio.PDB.Chain, I meant
slicing it by residue id, not by list index. These two ways are
fundamentally different.

For instance, not only slicing like this:

chain.child_list[2:12]  # slice using list index

but also slicing like this:

chain[2:12]   # slice using residue sequence id, not feasible at the moment
              # NOTE: this is fundamentally different from chain.child_list[2:12]

or even:

chain[(' ', 2, ' ') : (' ', 12, ' ')]  # slice using residue full id, even better
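
To make the difference concrete, here is a small sketch that uses plain tuples as stand-ins for residue ids. No Bio.PDB objects are involved, and the numbering (starting at 5) is made up purely for illustration:

```python
# Stand-in residue ids (hetero flag, sequence number, insertion code) for a
# made-up chain whose numbering starts at 5 rather than 0.
res_ids = [(" ", n, " ") for n in range(5, 25)]

# Slicing by list index: positions 2..11, whatever their residue numbers are.
by_index = res_ids[2:12]

# Selecting by residue sequence number: only residues numbered 5..11 match,
# because residues 2..4 do not exist in this chain (stop excluded, as in
# ordinary list slicing).
by_res_id = [rid for rid in res_ids if 2 <= rid[1] < 12]

print(by_index[0][1], by_index[-1][1])    # 7 16
print(by_res_id[0][1], by_res_id[-1][1])  # 5 11
```

The two results overlap but are not the same set of residues, which is the point of the distinction above.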

Of course one can play with child_list and obtain the same outcome. But I
think it would be very convenient to implement it in the __getitem__()
function.
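
For reference, the child_list workaround could look roughly like the sketch below. FakeResidue, to_full_id and slice_by_residue_id are hypothetical names invented for this example, not part of Bio.PDB; the stand-in class only mimics the .id attribute of a residue so that the snippet runs without a PDB file:

```python
from collections import namedtuple

# Hypothetical stand-in for a Bio.PDB residue: only the .id attribute is used.
FakeResidue = namedtuple("FakeResidue", ["id", "name"])


def to_full_id(res_id):
    """Expand an int to (" ", res_id, " "); leave full ids untouched."""
    return (" ", res_id, " ") if isinstance(res_id, int) else res_id


def slice_by_residue_id(residues, start=None, stop=None):
    """Return residues from id `start` up to, but not including, id `stop`.

    `residues` is an ordered list such as chain.child_list. A ValueError is
    raised if `start` or `stop` is not the id of any residue in the list.
    """
    ids = [r.id for r in residues]
    first = 0 if start is None else ids.index(to_full_id(start))
    last = len(ids) if stop is None else ids.index(to_full_id(stop))
    return residues[first:last]


# Made-up chain with residues numbered 5..24.
residues = [FakeResidue((" ", n, " "), "GLY") for n in range(5, 25)]
domain = slice_by_residue_id(residues, 7, 12)
print([r.id[1] for r in domain])  # [7, 8, 9, 10, 11]
```

With a real chain one would pass chain.child_list as the first argument. It works, but every caller has to carry a helper like this around, which is why I would prefer chain[7:12].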

cheers,
hongbo

2011/12/2 João Rodrigues <anaryin at gmail.com>

> Hey Hongbo,
>
> Interesting idea, but couldn't it be done already with child_list in a
> more or less straightforward manner?
>
> Best,
>
> João
> On 2 Dec 2011 at 10:43, "Hongbo Zhu 朱宏博" <macrozhu at gmail.com>
> wrote:
>
>>  Hi,
>>
>> I propose to add slicing to class Bio.PDB.Chain by changing function
>> Bio.PDB.Chain.__getitem__().
>>
>> * Why is slicing necessary for Bio.PDB.Chain?
>> Protein domain definitions are usually presented as the starting and ending
>> positions of the domain in the protein primary structure, e.g. in SCOP or
>> CATH. Slicing comes in handy when extracting domains from PDB files.
>>
>> * Why is slicing not available at the moment?
>> I understand that the majority of Bio.PDB.Entity objects are not lists, and
>> there is no internal *sequential order* for the child entities in these
>> objects. For example, in Bio.PDB.Model, the child Chain entities do not
>> really have a sequential order within the Model, so slicing does not seem
>> to make sense. But Bio.PDB.Chain is an exception: Residue entities in
>> Bio.PDB.Chain have a sequence order as presented in the primary structure,
>> and slicing becomes a reasonable operation.
>>
>> * How to slice a Chain entity?
>> I think it can be realized by revising the
>> function Bio.PDB.Chain.__getitem__(). For example:
>>
>>    def __getitem__(self, id):
>>        """Return the residue with given id.
>>
>>        The id of a residue is (hetero flag, sequence identifier, insertion code).
>>        If id is an int, it is translated to (" ", id, " ") by the
>>        _translate_id method.
>>
>>        Arguments:
>>        o id - (string, int, string) or int
>>        """
>>        if isinstance(id, slice):
>>            res_id_list = [r.id for r in self.get_iterator()]
>>            # A missing start means "from the first residue".
>>            if id.start is not None:
>>                start_index = res_id_list.index(self._translate_id(id.start))
>>            else:
>>                start_index = 0
>>            # A missing stop means "up to and including the last residue".
>>            if id.stop is not None:
>>                stop_index = res_id_list.index(self._translate_id(id.stop))
>>            else:
>>                stop_index = len(res_id_list)
>>            # As with list slicing, the residue at stop itself is excluded.
>>            return self.get_list()[start_index:stop_index:id.step]
>>        else:
>>            id = self._translate_id(id)
>>            return Entity.__getitem__(self, id)
>>
>


-- 
Hongbo



