[Biopython-dev] private methods/vars
adalke at mindspring.com
Mon Jun 23 10:26:32 EDT 2003
[Ingo, asking about Python's "private" name mangling.]
There are three types of privacy in Python, and all are consensual,
meaning they are not strictly enforced by the language.
The first is "public". These are names meant for anyone to use
as part of the API, and they are written without a leading
underscore.
[Note: in some cases, like the UserDict thread recently, data which
is meant to be private is still stored in a variable without a leading
underscore. Often it is hard to tell what is really meant to be
public and what is meant to be private.]
The second level is variables where the name starts with a single
leading underscore. These are meant to be part of the private API
and in nearly all cases should not be referenced from external code.
These are usually helper methods.
[Note: the exception is that a few APIs, most notably the win32
ones from Mark Hammond, use both a single leading and a single
trailing underscore (eg, "_reg_clsctx_") to indicate an
architecture-specific special variable.]
The third level is a Python-supported level of obfuscation. Variables
starting with two leading underscores and not ending with two trailing
underscores undergo a name change. The new variable name is the
same as the old variable name but with "_" + class name added to
the front of the variable. For example, if the class "Spam" has a
variable or method named "__private_var" then the obfuscated name
is "_Spam__private_var".
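As a rough sketch of which names get renamed, here is a small class whose name ("Eggs") and attribute names are made up purely for illustration. It only lists what ends up in the class namespace.

```python
class Eggs:
    __mangled = 1          # two leading underscores: renamed
    __also_mangled_ = 2    # one trailing underscore does not prevent renaming
    __not_mangled__ = 3    # two trailing underscores: left alone


# Collect the three illustrative names as they are actually stored.
stored_names = sorted(name for name in vars(Eggs) if "mangled" in name)
print(stored_names)
```

The two renamed attributes should show up with an "_Eggs" prefix, while the name with trailing double underscores is stored unchanged.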
This form is not truly private in that other code can easily access it
by using the obfuscated name, and indeed can even introspect it
using functions like dir() or looking at the class's __dict__. Hence,
it is not like what C++ or Java programs might term "private."
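A minimal sketch of what "not truly private" means in practice, using the "Spam" class from the example above. The method name "reveal" and the helper variables are hypothetical, added only for this illustration.

```python
class Spam:
    def __init__(self):
        self.__private_var = 3   # stored on the instance as _Spam__private_var

    def reveal(self):
        # Inside the class body the short name works, because the
        # compiler rewrites it to _Spam__private_var.
        return self.__private_var


spam = Spam()

# Code outside the class can still reach the value through the renamed attribute.
via_mangled_name = spam._Spam__private_var

# The short name is not rewritten outside a class body, so the lookup fails.
try:
    spam.__private_var
    plain_name_found = True
except AttributeError:
    plain_name_found = False

# Introspection shows the renamed attribute as well.
mangled_names = [name for name in dir(spam) if name.startswith("_Spam__")]
```

So the renaming protects against accidents, not against anyone who goes looking.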
It's most useful for base classes which need an internal
method or flag and want to make it hard for derived classes to
accidentally replace that name. Even then, I've rarely needed it.
For example, suppose you have a UserDict-like replacement which
counts the number of get or __getitem__ calls. It could look like
from UserDict import UserDict  # Python 2 module; Python 3 moved it to collections

class CountGetDict(UserDict):  # or derive from dict now-adays
    def __init__(self, data = None):
        UserDict.__init__(self, data)
        self.__count = 0  # name-mangled, so subclasses won't clobber it by accident
    def getCounter(self):
        return self.__count
    def get(self, name, default = None):
        self.__count += 1
        return UserDict.get(self, name, default)
    def __getitem__(self, name):
        self.__count += 1
        return UserDict.__getitem__(self, name)
In this example, the __count becomes "_CountGetDict__count", and it's
unlikely that a derived class would use that name by accident. (More
worrisome is that a highly derived class might use the same class
name and then have a "__count" member, but deep trees like that are
rare, so this is mostly theoretical.)
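To make both points concrete, here is a small sketch using plain classes rather than UserDict. The names "Counter", "Derived", "touch", and "clobber" are hypothetical and chosen only for this illustration. The first half shows a derived class keeping its own "__count" without disturbing the base class; the second half shows the theoretical collision when a subclass reuses its base's class name.

```python
class Counter:
    def __init__(self):
        self.__count = 0            # stored as _Counter__count

    def touch(self):
        self.__count += 1
        return self.__count


class Derived(Counter):
    def __init__(self):
        super().__init__()
        self.__count = 100          # stored as _Derived__count, a separate attribute

    def derived_count(self):
        return self.__count


d = Derived()
d.touch()
d.touch()
attribute_names = sorted(vars(d))   # both counters live side by side


# The theoretical collision: a subclass that reuses its base's class name.
_OriginalCounter = Counter

class Counter(_OriginalCounter):
    def clobber(self):
        self.__count = -1           # also mangles to _Counter__count


c = Counter()
c.touch()
c.clobber()                         # overwrites the base class's counter
after_clobber = c.touch()
```

In the first half the instance carries two distinct attributes, one per class. In the second half the renaming gives no protection, because both classes produce the same prefix.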
In some sense there is one more level of naming. Names which
both start and end with two underscores are reserved for Python.
You can use them, but you shouldn't.
To show this at work,
>>> class Spam:
...     public_var = 1
...     _semiprivate_var = 2
...     __private_var = 3
...     __python_reserved__ = 4
...     def __init__(self):
...         self.instance_var = 9
...         self._instance_semiprivate = 8
...         self.__instance_private = 7
...
>>> spam = Spam()
>>> dir(spam)
['_Spam__instance_private', '_Spam__private_var', '__doc__', '__init__', '__module__', '__python_reserved__', '_instance_semiprivate', '_semiprivate_var', 'instance_var', 'public_var']
>>> dir(Spam)
['_Spam__private_var', '__doc__', '__init__', '__module__', '__python_reserved__', '_semiprivate_var', 'public_var']
>>>
>>> spam.__dict__
{'instance_var': 9, '_Spam__instance_private': 7, '_instance_semiprivate': 8}
>>> Spam.__dict__
{'_semiprivate_var': 2, '__module__': '__main__', 'public_var': 1, '_Spam__private_var': 3, '__python_reserved__': 4, '__doc__': None, '__init__': <function __init__ at 0x5829b0>}
>>>
>>> spam._Spam__private_var
3
>>>
Andrew
dalke at dalkescientific.com