[Bioperl-l] Tests involving remote databases

Fri Sep 29 16:44:15 UTC 2006

Sendu, all,

We're running into several problems with tests being skipped based on URL
failure but passing as 'ok'.  Two incidents involving XEMBL_DB.t and
Biblio_biofetch.t come to mind, and recent problems have now surfaced with
proxies (thanks Torsten!).  

As one option, I would like to propose using the following SKIP block format
(or something similar) and Test::More to run checks on remote DBs in the
various tests where remote DB access is required.  There are probably
similar ways to accomplish the same thing using Test, but I believe
Test::More makes it easier.

SKIP: {
  skip('Set BIOPERLDEBUG=1 to run tests which require remote DB access', 5)
      if !$DEBUG;
  my $db = Bio::DB::GenBank->new();
  my $seq;
  # a failed fetch is reported as a test failure, not silently skipped
  eval { $seq = $db->get_Seq_by_acc('ABC123') };
  ok(!$@, 'Bio::DB::GenBank URL test');
  # only the tests that depend on $seq are skipped after a failure
  skip('Bio::DB::GenBank URL failure', 4) if $@;
  ... # four more tests based on $seq
}

Most of us do not run each set of tests individually unless we are
developing code that relies on a specific set.  Most often, when we run all
of the tests we use 'make test' and Test::Harness, which treats skipped
tests as 'ok.'  
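
To illustrate the point, here is a minimal sketch of the current behaviour. The fetch_remote() sub is a hypothetical stand-in for a remote lookup that always fails; it is not part of BioPerl, and the test names are made up.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for a remote DB lookup whose URL is dead.
sub fetch_remote { die "URL failure\n" }

SKIP: {
    my $seq = eval { fetch_remote() };
    # The failure is swallowed here: both tests are skipped, none fail.
    skip('remote URL failure', 2) if $@;
    ok(defined $seq, 'got a sequence');
    is($seq, 'ACGT', 'sequence matches');
}
```

Run directly, this should report both tests as 'ok 1 # skip remote URL failure' and 'ok 2 # skip remote URL failure', so the harness summary shows nothing wrong.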

So, in effect, we never see any URL failure, just that all tests pass.  This
practice isn't used by other CPAN modules.  WWW::Shorten, for instance, runs
tests on all URLs and fails if they are invalid.  If we did the same thing,
we would have picked up on the following almost immediately when running
full tests:

1)  The XEMBL server has been down for over six months.  
2)  The Biblio_biofetch.t tests probably never worked correctly, judging by
recent fixes (thanks Brian).  

However, we would never have known that if we relied strictly on the summary
results from running 'make test.'  Blindly skipping these doesn't inform us
when the URL is invalid, and so we never manage to address the issue when it
pops up.  Explicit and consistent test failures let us know when things go
wrong (i.e. when the URL is no longer valid).  

Furthermore, the debugging output that accompanies the tests when using
BIOPERLDEBUG=1 obscures any potential error messages, so we tend to miss
warning flags that pop up (such as the Biblio_biofetch.t warning message
about the bad URL).  Torsten has a proposal that we use a different variable
for running remote DB tests, which IMHO we should consider and which should
take care of this.
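
A rough sketch of how a separate variable might look follows. The name BIOPERL_NETWORK_TESTS is purely hypothetical (no name has been agreed on), and the remote test is only a placeholder.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical variable name; assumed here only for illustration.
my $run_remote = $ENV{BIOPERL_NETWORK_TESTS};
# BIOPERLDEBUG would then control debugging output only.
my $verbose = $ENV{BIOPERLDEBUG} || 0;

ok(1, 'local test always runs');

SKIP: {
    skip('Set BIOPERL_NETWORK_TESTS=1 to run tests which require remote DB access', 1)
        unless $run_remote;
    # Placeholder for a real remote check, e.g. a Bio::DB::GenBank lookup.
    ok(1, 'remote test placeholder');
}
```

With this split, remote tests could be run without the extra debugging output, so a bad-URL failure would not get buried.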

Running these only when BIOPERLDEBUG=1 is set also means that most users
skip the remote tests, so we shouldn't have a problem with spamming the
servers.

Thoughts?  Flames?

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign