[Bioperl-l] Public Release of the Xobjects gene expression package

Nathan O. Siemers siemersn@bms.com
Sun, 29 Jul 2001 17:40:55 -0400


Folks,

        As discussed with some people at BOSC, we are releasing
        our core package for manipulating gene expression information,
        dubbed "Xobjects".  The current tarball for this package can
        be downloaded from my personal low-bandwidth server:

        http://www.fiveprime.com

        I can imagine uploading this to CPAN, but I thought I would
        let you chew on it a bit first and let us know what you think.

        Xobjects is a fairly simple system that relies on hashes of
        gene expression information, produced by "loaders" for the
        particular technology (see AffyCHP.pm and ArrayVS.pm in the
        distribution for example loading modules).  Once these data
        are loaded, almost any combination of aggregation,
        normalization/scaling, and ratio-taking may be performed.  The
        objects themselves (hashes of data with associated methods)
        may be built up into a tree of relationships that can
        accurately and flexibly reflect the particular biological
        design of the experiment in question.  There is a simple
        example in the Xobject.pm POD.
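
        As a rough illustration of the hash-based approach described
        above, here is a minimal sketch in plain Perl.  It does not use
        the Xobjects API: load_sample, scale_to_mean, and ratio are
        hypothetical names, and the probe IDs and intensities are
        made-up values.  A real loader would parse instrument output
        rather than take values directly.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a loader: returns { probe_id => intensity }.
sub load_sample {
    my %values = @_;
    return \%values;
}

# Scale every value so that the sample mean equals $target.
sub scale_to_mean {
    my ($sample, $target) = @_;
    my @v   = values %$sample;
    my $sum = 0;
    $sum += $_ for @v;
    my $mean   = $sum / scalar @v;
    my %scaled = map { ($_ => $sample->{$_} * $target / $mean) } keys %$sample;
    return \%scaled;
}

# Per-probe ratio of treated over control, restricted to probes that
# are present in both samples and have a nonzero control value.
sub ratio {
    my ($treated, $control) = @_;
    my %ratio =
        map  { ($_ => $treated->{$_} / $control->{$_}) }
        grep { exists $control->{$_} && $control->{$_} != 0 }
        keys %$treated;
    return \%ratio;
}

# Made-up intensities for two probes in two samples.
my $control = scale_to_mean(load_sample(geneA => 100, geneB => 300), 100);
my $treated = scale_to_mean(load_sample(geneA => 400, geneB => 400), 100);
my $ratios  = ratio($treated, $control);

printf "%s\t%.2f\n", $_, $ratios->{$_} for sort keys %$ratios;
```

        With these inputs the scaled ratios come out as 2.00 for geneA
        and 0.67 for geneB.  Keeping each step as a function from hash
        to hash is what lets the steps be combined in any order.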

        Once the objects are created and transformed, various output
        methods allow you to deliver the processed data into Excel,
        GeneCluster, and various other formats for further analysis,
        plus some primitive web outputs.

        
        Xobjects is released under the terms of the LGPL.  If there
        are any serious issues with the LGPL and BioPerl, let me know.

        Have fun, and please bear with us - I wish I could say that
        this was a "perfect" release, but there are of course loose
        ends.  Here is the list of caveats I can think of:

        This is *not* integrated into the BioPerl Object Models.

        Some of the POD about data structures may be slightly stale,
        but the truth can be gleaned from the code easily.

        To our chagrin, Chart::GnuPlot no longer seems to exist on
        CPAN.  Two methods we wrote for creating scatter charts and
        histograms of data depend on it.  For now we have commented
        out the relevant code so that the software will build on
        systems without Chart::GnuPlot.  A replacement for these
        methods should be put in place, and they should be separated
        from the Stats.pm statistical module where they currently
        reside.

        Stats.pm works, but is ugly (my first Perl module, years ago
        now), and needs to be replaced by the newer stat module that
        has finally appeared on CPAN.  When Xobjects was first
        written, the public Statistics::Descriptive was not sufficient
        to get our work done.

        We had to rip out some database-specific code that lets us
        load Xobjects from our home-brewed gatc relational database.
        We would be happy to share the methods, but because of
        customizations they may not be functional out of the box.

        We also removed our calls to our internal databases in
        TieGene.pm, the module that retrieves descriptive information
        about probes on the arrays.  All you need to do is supply the
        appropriate hashes (from databases, flat files, etc.) for your
        data and you should be up and running.
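
        For instance, a probe-annotation hash could be built from a
        flat file along the following lines.  This is only a sketch
        under assumptions: the tab-delimited layout, the
        read_annotation helper, and the probe IDs are all hypothetical,
        and the exact hash structure TieGene.pm expects should be
        checked against its code.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited annotation: probe ID, then description.
# The IDs and descriptions here are placeholders, not real probes.
my $flat_file = join '',
    "# probe\tdescription\n",
    "probe_1\texample description one\n",
    "probe_2\texample description two\n";

# Read "probe<TAB>description" lines into { probe_id => description }.
sub read_annotation {
    my ($fh) = @_;
    my %description;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($probe, $text) = split /\t/, $line, 2;
        $description{$probe} = defined $text ? $text : '';
    }
    return \%description;
}

# An in-memory filehandle keeps the sketch self-contained; a real
# script would open the annotation file by name instead.
open my $fh, '<', \$flat_file or die "cannot open in-memory file: $!";
my $annotation = read_annotation($fh);
close $fh;

print "$_ => $annotation->{$_}\n" for sort keys %$annotation;
```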



        Let us know of any problems!

        Thanks,

        nathan


        Nathan Siemers
        Xin Huang
        Donald Jackson

-- 
Nathan Siemers, Group Leader, Bioinformatics-Applied Genomics
Bristol-Myers Squibb Pharmaceutical Research Institute, Hopewell 3-0.07
P.O. Box 5400, Princeton, NJ 08543-5400, (609)818-6568
nathan.siemers@bms.com