[Biopython-dev] slicing in Bio.PDB.Chain.__getitem__() ?
Hongbo Zhu 朱宏博
macrozhu at gmail.com
Fri Dec 2 12:43:59 UTC 2011
Hi, Joao,
thanks for the response. When I spoke of slicing Bio.PDB.Chain, I meant
slicing by residue id, not by list index. These two ways are
fundamentally different.
For instance:
not only slicing like this:
or
chain.child_list[2:12] # slice using list index
but also slicing like this:
chain[2:12] # slice using residue sequence id, not feasible at the moment
# NOTE: this is fundamentally different from chain.child_list[2:12]
or even:
chain[(' ', 2, ' ') : (' ', 12, ' ')] # slice using residue full id, even better
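To make the difference concrete, here is a toy sketch that uses plain tuples as stand-ins for residue ids, so it runs without Biopython. The chain and its numbering (1 to 20) are made up for illustration:

```python
# Stand-in for [r.id for r in chain]: full residue ids are
# (hetero flag, sequence id, insertion code).
# Hypothetical chain numbered 1..20, so list index and sequence id are off by one.
res_ids = [(" ", n, " ") for n in range(1, 21)]

# Slice by list index: positions 2..11, i.e. sequence ids 3..12.
by_index = res_ids[2:12]

# Slice by residue id: look up where ids 2 and 12 sit in the list first.
start = res_ids.index((" ", 2, " "))
stop = res_ids.index((" ", 12, " "))
by_resid = res_ids[start:stop]  # sequence ids 2..11 (stop excluded)

print([r[1] for r in by_index])
print([r[1] for r in by_resid])
```

With real PDB files the gap is usually larger, since numbering often does not start at 1 and may contain missing residues or insertion codes.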
Of course one can play with child_list and obtain the same outcome. But I
think it would be very convenient to implement it in the __getitem__()
function.
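For comparison, the child_list workaround could look roughly like the sketch below. The helper names slice_by_resid and _full_id are hypothetical (they are not part of Biopython), and FakeResidue is only a stand-in so the example runs on its own; the only thing assumed about a residue is that it has an .id attribute:

```python
from collections import namedtuple


def _full_id(res_id):
    """Expand a bare int to (' ', int, ' '), mirroring Chain._translate_id."""
    return (" ", res_id, " ") if isinstance(res_id, int) else res_id


def slice_by_resid(residues, start=None, stop=None):
    """Return residues from id `start` up to, but not including, id `stop`.

    Hypothetical helper: `residues` is any list of objects with an `.id`
    attribute, e.g. chain.child_list.
    """
    ids = [r.id for r in residues]
    lo = 0 if start is None else ids.index(_full_id(start))
    hi = len(ids) if stop is None else ids.index(_full_id(stop))
    return residues[lo:hi]


# Stand-in for Bio.PDB residues, numbered 1..20 (made-up data).
FakeResidue = namedtuple("FakeResidue", "id name")
residues = [FakeResidue((" ", n, " "), "ALA") for n in range(1, 21)]

domain = slice_by_resid(residues, 2, 12)
print([r.id[1] for r in domain])
```

With a real structure the call would be slice_by_resid(chain.child_list, 2, 12). It works, but every user has to write this lookup themselves, which is why I would rather have it in __getitem__().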
cheers,
hongbo
2011/12/2 João Rodrigues <anaryin at gmail.com>
> Hey Hongbo,
>
> Interesting idea, but couldn't it be done already with child_list in a
> more or less straightforward manner?
>
> Best,
>
> João
> On 2 Dec 2011 at 10:43, "Hongbo Zhu 朱宏博" <macrozhu at gmail.com>
> wrote:
>
>> Hi,
>>
>> I propose to add slicing to class Bio.PDB.Chain by changing function
>> Bio.PDB.Chain.__getitem__().
>>
>> * Why is slicing necessary for Bio.PDB.Chain?
>> Protein domain definitions are usually given as the start and end
>> positions of the domain in the protein primary structure, e.g. in SCOP or
>> CATH. Slicing comes in handy when extracting domains from PDB files.
>>
>> * Why is slicing not available at the moment?
>> I understand that the majority of Bio.PDB.Entity objects are not lists,
>> and there is no internal *sequential order* for the child entities in
>> these objects. For example, in Bio.PDB.Model, the child Chain entities do
>> not really have a sequential order within the Model, so slicing does not
>> seem to make sense. But Bio.PDB.Chain is an exception: Residue entities in
>> a Bio.PDB.Chain have a sequence order, as presented in the primary
>> structure, so slicing becomes a reasonable operation.
>>
>> * How to slice a Chain entity?
>> I think it can be realized by revising the function
>> Bio.PDB.Chain.__getitem__(). For example:
>>
>> def __getitem__(self, id):
>>     """Return the residue with given id.
>>
>>     The id of a residue is (hetero flag, sequence identifier, insertion
>>     code). If id is an int, it is translated to (" ", id, " ") by the
>>     _translate_id method. If id is a slice, start and stop are residue
>>     ids and a list of residues is returned (stop is excluded).
>>
>>     Arguments:
>>     o id - (string, int, string), int, or a slice of these
>>     """
>>     if isinstance(id, slice):
>>         res_id_list = [r.id for r in self.get_iterator()]
>>         # Map residue ids to list positions; a missing bound means
>>         # "from the first residue" or "through the last residue".
>>         if id.start is not None:
>>             start_index = res_id_list.index(self._translate_id(id.start))
>>         else:
>>             start_index = 0
>>         if id.stop is not None:
>>             stop_index = res_id_list.index(self._translate_id(id.stop))
>>         else:
>>             stop_index = len(res_id_list)
>>         return self.get_list()[start_index:stop_index:id.step]
>>     else:
>>         id = self._translate_id(id)
>>         return Entity.__getitem__(self, id)
--
Hongbo