[BioPython] Re: UnitTests

Fri, 14 Apr 2000 01:13:42 -0700

>
> ----- Original Message -----
> From: Andrew Dalke <dalke@acm.org>
> To: Cayte <katel@worldpath.net>
> Sent: Thursday, April 13, 2000 2:36 AM
> Subject: UnitTests
>
>
 > Hello,
 >
 > I've been going through your UnitTest code.  I can see we have
 > different style of writing regression suites.  From what I've seen, it
> looks like the XP style, with a strong Smalltalk feel, although the
 > last time I looked at either style of regression tests was about a
 > year, so my memory is vauge.
 >
   Yes, I based it on the Xtreme conceps.
>
 > I still don't fully understand the code, so I'm working at
 > understanding bits.  I have several comments, observations and
 > questions about those bits.  I do not mean this in any way to be an
 > attack on your code or source of flammage.  If I have misspoken,
 > please tell me.  (I make this comment because I don't know you and
 > don't know how you react to my way of describing things, especially
 > over nuance-poor email.)
 >
    No problem.  The only code that is never criticized is shelfware:)
 >
 > There are some idiomatic changes I can suggest in the code:
 >
 >  - UnitTestCase  has a "failure = 0" instead of "failure = None"
 >
 >  - you don't need parens around the return type
 >        ("return( failure )" is the same as "return failure")
 >
 >  - you don't need parens around if expressions
 >

    Should be easy to fix.

 >  - != is supported, so the 6 expressions like
 >      "not (name[ :5 ] == 'test_')"
 >     can be written as
 >      "name[ :5 ] != 'test_'"

   I guess I'm paranoid about precedence.

 >  - you have checks against type() which raise BadParameter.  The
 >     standard Python way is to raise a TypeError.  Also, I think
 >     some of your checks are excessive, like testing against string.
 >     This comparison will break for unicode strings.
 >
 >  - in UnitTestExceptions, you may be interested to know that "Exception"
 >     is a built-in, so the "import exceptions" is never needed.
 >
 >  - you have try/except blocks where you use a generic "except:".
 >     Interrupts in Python (ie, "^C"), are handled as a
 >     KeyboardInterrupt exception.  If someone tries to stop the
 >     program inside one of these blocks, it will not succeed.
 >     Instead, you should at least allow
 >
 >       except KeyboardInterrupt:
 >         raise
 >
    I'd be happy to to fix this.

 >  - I think you are overly cautious on checking your input parameters
 >     in that you do some checks too often.  By that I point out
 >     "add_test" of UnitTestSuite.  It checks the caller's name to
 >     ensure that it's a string and starts with 'test_'.  But the
 >     input is a UnitTestCaller which has already had that test done.
 >     If you want to ensure the types, then doing
 >
 >      def add_test( self, caller ):
 >        try:
 >            if not isinstance(caller, UnitTestCaller.UnitTestCaller):
 >                raise ....BadParameter( "method must be a TestCaller" )
 >         ...
 >
 >     I'll add that in general I don't encourage explicit type checks.
 >     The Python way is to use "objects which act this way" rather than
 >     "objects which are this way".
 >
 >
 > Here are some parts of your code I find confusing:
 >
 >  ===
 >
 > Why is there a setup/teardown in addition to __init__/__del__ ?  I
 > assume it's so you can run the same suite multiple times.  If so, why
 > not recreate the test cases?
 >
 >  ===
 >
   Its to set up global data that can be used by all of the test cases.
I'll
 investigate to see if init can do this.

 > In UnitTestCaller, I expected the __init__ to look more like:
 >
 > class UnitTestCaller( UnitTestCase.UnitTestCase ):
 >   def __init__(self, name, method, test_object):
 >     UnitTestCase.UnitTestCase.__init__(name)
 >     if not callable(method):
 >       raise TypeError, "method must be callable"
 >     if name[:5] != "test_":
 >       raise TypeError, "method name must begin with 'test'"
 >
 >     self.method = method
 >     self.test_object = test_object
 >
 >
 > That is, call the parent's constructor to handle checking and setting
 > self.name .  There's also a function called "callable", which is
 > perhaps better than checking if the object is a MethodType.  (Checking
 > for MethodType won't work for built-in methods, which have a different
 > type.  But then, I doubt test suite code will be written as C
 > extensions.)
 >
 >
 > I don't follow the need for 'self.method = method.im_func' in your
 > code.  Methods in Python are bound to the object, so unlike C++ you
 > don't need to pass around the object and a pointer-to-method.  What
 > you are doing is unbinding the object.
 >
 > It looks from your code you never call the method other than with the
 > original test_object.  That being the case, you don't even need to
 > track the test_object, since the bound method keeps that reference.
 >
 > Otherwise, you should try mucking around with __class__, as in
 >
 > class F:
 >   def __init__(self, i):
 >     self.i = i
 >   def p(self):
 >     print self.i
 >
 > f = F(5)
 > g = F(10)
 > f.__class__.p(g)
 >
 > The last line prints `10'.  Or do F.p(g) to get the same result.
>
   I need to investigate this more.
 >  ===
 >
 > Your UnitTestSuite.build_suite is:
 >
 >   def build_suite( self, modname, test_object ):
 >
 >     print 'build_suite'
 >     module = sys.__dict__[ 'modules' ][ modname ]
 >     for entry in module.__dict__[ modname ].__dict__.keys():
 >       if( entry[ :5 ] == 'test_' ):
 >         method = eval( 'test_object.' + entry )
 >         caller = UnitTestCaller.UnitTestCaller( entry, method,
> test_object )
 >         self.add_test( caller )
 >
 > What I *think* you want is:
 >
 > # This is stolen from John Aycock's SPARK, which uses "t_" to indicate
 > # methods defining tokens and "p_" to indicate grammer rules.
 > def _namelist(instance):
 >         namedict, classlist = {}, [instance.__class__]
 >         for c in classlist:
 >                 for b in c.__bases__:
 >                         classlist.append(b)
 >                 namedict.update(c.__dict__)
 >         return namedict.keys()
 >
 >
 >   def build_suite( self, test_object ):
 >
 >     print 'build_suite'
 >     for entry in _namelist(test_object):
 >       if( entry[ :5 ] == 'test_' ):
 >         method = getattr(test_object, entry)
 >         caller = UnitTestCaller.UnitTestCaller( entry, method )
 >         self.add_test( caller )
 >
 >
 > That is, in your code, modname appears only to be used to get the
 > methods of the test_object, which is found here using "_namelist()".
 > I also used getattr instead of eval(), and assumed the method would be
 > kept bound.
 >
 >   ==
 >
 > Your UnitTestResults class is more cleanly written as:
 >
 > class UnitTestResults:
 >   def __init__(self):
 >     self.failures = []
 >     self.errors = []
 >     self.warnings = []
 >   def was_successful(self):
 >     return not(self.failures or self.errors)
 >   def print_*(self):
 >     # as you had them
 >
> > This removes most of the methods, and makes the interface more Python
 > like.  Instead of callers using "results.count_errors()", they just do
 > "len(results.errors)".  Instead of "results.add_failure(failure)" it's
 > "results.failure.append(failure)".
 >
 > C++ has gotten most people to try to wrap things up inside of methods.
 > That allows implementation details to be hidden.  Python works "by
 > protocol" rather than by type signature, so doesn't need the same
 > precautions.  Even if the implementation was to change completely, the
 > failure/error/warning lists can be created via a __getattr__ hook.
 >
 > Oh, and to be one step cleaner, make the three lists derive from
 > UserList, then override __str__ to implement the appropriate print_*
 > functionality.
 >
 >   ==
 >
 > Why is "invoke_suite" a method of the TestCase?  I could easily be
 > written as a function, since it uses no instance or class
 > data/methods.  Methods which never need "self.something" to me usually
 > suggest they should just be functions.
 >
 >   ==
 >
 > Okay, now for the program as a whole.  You create an instance of a
 > TestCase, then invoke with the module name (BTW, you can pass modules
 > around as an object instead of going through
 > sys.modules[module_name]).
 >
 > Once created, the methods are scanned to make a list of
 > UnitTestCallers.  When the test suite is run, the TestCase is
 > initialized using "setup", then the individual test callers are
 > executed.  They return with either None (to indicate success), or an
 > object (to indicate an error) or via an exception (to indicate a
 > failure).
 >
 > After the test is run, the test case is "tear_down"ed (potentially to
 > allow the tests to be run again?).
 >
 > The comparison of test results with the gold standard is done through
 > methods named "assert_*".
 >
 >
 > If that's how it is supposed to work, here are some questions:
 >
 >   1) is the order of method execution guaranteed?  I have test cases
 >   which build on previous results.  It appears either those cases will
 >   have to all be in a single method, or I need a guarantee of order so
 >   I can store results in the instance.
 >
   No.  Actually, one of the things I like about the Xtreme approach is that
 each test is isolated and you don't have to hunt for hidden dependencies.
 IMHO, tests with dependencies should be grouped together.
>
>   2) how do I test exceptions?  Is it
 >    try:
 >      obj.meth()
 >    except TypeError, obj:
 >      self.assert_equals(obj.__class__, TypeError)
 >
 >   or simply
 >
 >    try:
 >      obj.meth()
 >    except TypeError, obj:
 >      pass
 >
 >   and let the caller catch any other exceptions (as a failure rather
 >   than an error).  (Should there be the difference between those two
 >   types of errors?)
 >
    Yes, I think so.

 >   3) this is a big one ...  What advantage does this framework give me
 >   over the, perhaps, naive version I've been working with?
 >
   1.  Each test is isolated, so I only have to look for dependencies in
 setup and the local test function.

 2.  This isolation also makes it easier for someone reading the code.  You
 may remember the sequence but s/he doesn't.

 3. I can create suites of just a few test cases.  If only three tests fail,
 I don't have to rerun everything.

 4.  I can separate the design of test cases from the mechanics of
 implementing.  Because, I can use a rote mechanical method to implement
test
 cases, almost like filling out a form, at least for me, it frees me to
think
 more about how to test.
 .

 5. If there's a failure, I know exactly where to look in the code.

 > Have you taken a look at the one I used?  It drives a set of test_*.py
 > scripts, captures text written to stdout, and compares the text to a
 > golden copy.  Errors are signified either by a difference in the test,
 > or finishing with an exception.
 >

 I've looked superficially, but I plan to look closer this weekend.

 > This corresponds roughly to each test_*.py file is a TestCase, where
 > the setup() does nothing and the tear_down() deletes all variables.
 > The assert_* function is an assert_equals between the generated output
 > and the expected output.
 >
 > The main difference I see is my code fails a test at the first time
 > the two outputs differ, while yours can keep on going.  However,
 > that's not quite true if attributes are stored in a TestCase for
 > other, future test cases.  In fact, if things aren't stored, that's
 > equivalent to spliting up a test_*.py file into several different
 > ones.
 >
 > Mine has the advantage that if something changes, then once you've
 > verified the new regression output, it can be made into the new gold
 > standard.  With your code, you have to modify scatters assert_equals
 > and assert_condition lines.
 >
    I use the diff function when it comes to comparing output files, but I'm
 not sure it's appropriate for every situation.  I think a list of passed
and
 failed test cases provides a useful summary and if mnemonic names are used,
 give you an idea of what was covered.
 >
>
 >
 >
                                                            Cayte
>