[Bioperl-l] Re: installing downloaded BioPerl Bundle locally in a linux farm

chris dagdigian dag@sonsorol.org
Thu, 17 Oct 2002 18:18:21 -0400


Hi Bonnie,

{CC'ing this reply back to the list so it gets archived and others can 
idiot check what I've written :) }

Among other things I build bioinformatics linux farms to pay the rent 
(www.bioteam.net) and have been in your situation many times. Most of my 
clients want local BioPerl available on every cluster node. Here are 
some options that I have used:

(1) If your cluster head node is multi-homed onto both the private 
network and a network with internet access then you can set up the head 
node to do NAT. This will allow your slave nodes to have 
mediated access to the internet. This is actually very useful for 
downloading patches, kernels, software and OS updates as well as for 
cluster apps that may need to open ODBC connections to an external SQL 
database somewhere within the organization.
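
A rough sketch of what the NAT setup in (1) could look like with iptables 
on the head node. The interface names are assumptions (eth0 facing the 
outside network, eth1 facing the private cluster network), and the script 
only prints the commands unless APPLY=1 is set, since they need root:

```shell
#!/bin/sh
# Sketch: turn a multi-homed head node into a NAT gateway for private nodes.
# ASSUMPTIONS: eth0 faces the outside network, eth1 the private cluster net.
PUBLIC_IF=${PUBLIC_IF:-eth0}
PRIVATE_IF=${PRIVATE_IF:-eth1}

# Print each command; execute it only when APPLY=1 (run as root).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

enable_nat() {
    # Let the kernel forward packets between interfaces.
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite outbound packets so they appear to come from the head node.
    run iptables -t nat -A POSTROUTING -o "$PUBLIC_IF" -j MASQUERADE
    # Forward traffic from the private side, plus the replies coming back.
    run iptables -A FORWARD -i "$PRIVATE_IF" -o "$PUBLIC_IF" -j ACCEPT
    run iptables -A FORWARD -i "$PUBLIC_IF" -o "$PRIVATE_IF" \
        -m state --state ESTABLISHED,RELATED -j ACCEPT
}

enable_nat
```

Each slave node also needs its default route pointed at the head node's 
private address before any of this does it any good.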

(2) I'm not sure how you manage the OS on your slave nodes and how you 
deal with unattended installations and incremental updates. All of the 
clusters I build use SystemImager (www.systemimager.org) to manage 
slave OS versions. With SystemImager I can install the CPAN modules on 
_ONE_ slave node and then 'push' those files out to all the remaining 
slave nodes while they are online or during the next OS install. This is 
great because I get the advantage of having the modules be on local disk 
as part of my default compute element system image without the hassle of 
doing N installs by hand.

(3) (I have not done this myself) The shared directory option. Perl 
allows you to install and invoke perl modules from just about anywhere. 
You could create a cluster-wide share called "/n/cluster/perl5modules/" 
or whatever. You can set a custom target directory for your modules to 
install into by appending an additional option on to your "perl 
Makefile.PL" command. Using the share I named above your initial command 
to install the module (after you unpacked it) would be:
  "perl Makefile.PL PREFIX=/n/cluster/perl5modules/"

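You can install all your BioPerl and any other modules into that 
directory and then your cluster nodes (or any other system that can see 
the share) can make use of them as long as the following 'use' directive 
is inserted into the perl script:
  use lib "/n/cluster/perl5modules/";

One caveat: with PREFIX the modules usually end up in a subdirectory 
below that path (something like lib/perl5/site_perl/, depending on your 
perl build), so 'use lib' may need to point at that subdirectory instead.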

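A minimal sketch of the shared-directory install, assuming you pass 
MakeMaker's LIB option in place of PREFIX so the module files land 
directly in the directory you name. The SHARE path is the example from 
above; the run/APPLY dry-run wrapper and the function name are made up for 
illustration. PERL5LIB is shown as an alternative to adding 'use lib' to 
every script:

```shell
#!/bin/sh
# Sketch: install one unpacked CPAN module into the shared directory and
# make scripts find it. SHARE is the example path from the text above.
SHARE=${SHARE:-/n/cluster/perl5modules}

# Print each command; execute it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Build and install the module whose unpacked source is in directory $1.
install_module() {
    (
        cd "$1" || exit 1
        # LIB= puts the .pm files directly under $SHARE, so the same path
        # works in "use lib" or PERL5LIB.
        run perl Makefile.PL LIB="$SHARE" &&
        run make &&
        run make test &&
        run make install
    )
}

# Alternative to a "use lib" line in every script: set PERL5LIB once,
# for example in a login profile on each node.
PERL5LIB="$SHARE${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB
```

Set APPLY=1 and call install_module on an unpacked module directory to 
actually run the build; without it the commands are only echoed.
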
Note that this involves unpacking each CPAN module and doing the "perl 
Makefile.PL; make; make test; make install" dance. I'm sure that if you 
wanted to you could configure CPAN.pm such that your automatic downloads 
were installed into a custom target directory. Google and the perl 
documentation in general do a good job of explaining how to install and 
use perl modules from nonstandard locations.
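
The unpack-and-build dance could be scripted roughly like this. It is a 
sketch only: TARBALL_DIR is a hypothetical directory holding the *.tar.gz 
files fetched with CPAN's "get", SHARE is the example share from above, 
and the function names are made up for illustration:

```shell
#!/bin/sh
# Sketch: build every downloaded module tarball into the shared directory.
# ASSUMPTIONS: TARBALL_DIR (hypothetical) holds the downloaded *.tar.gz
# files; SHARE is the example share path from the text.
TARBALL_DIR=${TARBALL_DIR:-/n/cluster/cpan-tarballs}
SHARE=${SHARE:-/n/cluster/perl5modules}

# Unpack one tarball into a scratch directory and run the usual sequence.
build_one() {
    tarball=$1
    work=$(mktemp -d) || return 1
    tar -xzf "$tarball" -C "$work" || return 1
    (
        # CPAN tarballs normally unpack into a single top-level directory.
        cd "$work"/* || exit 1
        perl Makefile.PL LIB="$SHARE" && make && make test && make install
    )
    status=$?
    rm -rf "$work"
    return $status
}

# Build all tarballs, stopping at the first failure. Plain glob order
# ignores dependencies, so prerequisites may need building first by hand.
build_all() {
    for t in "$TARBALL_DIR"/*.tar.gz; do
        [ -e "$t" ] || continue
        echo "building $t"
        build_one "$t" || { echo "failed: $t" >&2; return 1; }
    done
}
```

For the CPAN.pm route, the makepl_arg configuration variable holds the 
arguments passed to every "perl Makefile.PL" run, so something like 
'o conf makepl_arg "LIB=/n/cluster/perl5modules"' in the CPAN shell 
should send installs to the same place.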

Another option that does not require an NFS share would be to install 
these modules into a custom directory on one slave node. You could 
then use rsync or a similar tool to mirror the directory out cluster-wide 
so that all the slave nodes have local copies in known locations.
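
The rsync mirroring might look something like the sketch below. The 
module directory and node names are hypothetical, and it assumes rsync 
over ssh with key-based logins already works between the nodes. Commands 
are only printed unless APPLY=1 is set:

```shell
#!/bin/sh
# Sketch: mirror a locally installed module directory to every slave node.
# ASSUMPTIONS: MODULE_DIR and the node names are hypothetical; rsync over
# ssh with key-based logins is already set up.
MODULE_DIR=${MODULE_DIR:-/usr/local/perl5modules}
NODES=${NODES:-"node01 node02 node03"}

# Print the rsync command for one node; run it only when APPLY=1.
mirror_to() {
    node=$1
    # -a preserves permissions and times; --delete removes files on the
    # target that no longer exist in the source. The trailing slashes copy
    # the directory contents rather than nesting a new directory.
    set -- rsync -a --delete -e ssh "$MODULE_DIR/" "$node:$MODULE_DIR/"
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Push to every node, reporting failures without stopping the loop.
mirror_all() {
    for n in $NODES; do
        mirror_to "$n" || echo "rsync to $n failed" >&2
    done
}

mirror_all
```

Because every node ends up with the same local path, scripts can use one 
fixed 'use lib' line no matter which node they run on.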

Personally I find that going through the trouble of setting up NAT for 
my cluster compute elements is worth it. It allows me to have a 'golden 
client' slave node download whatever it needs from the internet 
including OS updates and perl modules. Once I do that, SystemImager makes 
it a trivial process to propagate changes cluster-wide. Nice and neat 
with very little administrative overhead.

Regards,
Chris





BHurwitz@twt.com wrote:
> Hi Chris,
> 
> I need to install all of the external CPAN modules used by BioPerl on all
> of the nodes on my linux farm.  These nodes are on a private network and
> therefore cannot see the FTP site for CPAN.  I downloaded all of the
> modules included in the current Bundle::BioPerl by using the "get" command
> rather than the "install" command in CPAN.pm on the head node of my cluster
> (which is connected to the rest of the world), but I am not sure how to
> install from these downloaded modules on my slave nodes.  Is there a quick
> way of doing this if I move them into a shared directory using CPAN or do I
> have to install each one separately?  I have trolled through the
> documentation and I can't seem to find anything that works.  The beauty of
> CPAN is downloading the latest and greatest version from the FTP site, and
> there doesn't seem to be anything on installing these locally.  If you
> happen to know this, I would greatly appreciate your advice!  I am starting
> to pull out my hair.
> 
> 
> Thanks :)
> Bonnie

-- 
Chris Dagdigian, <dag@sonsorol.org>
Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net