[Bioperl-l] question about the nature of bioperl

Tue, 20 Aug 2002 22:08:31 +0100 (BST)

On Tue, 20 Aug 2002, nkuipers wrote:

> I mean no offense to anyone, especially being new to this mailing list, but I 
> am starting to wonder just what people expect from this project.  Is this 
> supposed to be a do-it-all bioinformatics kit or a set of basic tools that 
> people are free to use, fleshing it out with their own code as per their 
> specific application?  It just seems to me that, if not already, the project 
> is on its way to being what is referred to as a "bloated monster" in computer 
> science classes.  Everyone has their "bug" catches, specific formats, and 
> there are multiple versions flying around with varying degrees of 
> documentation and testedness.  Whoa horse.  Stop.  Trying to account for every 
> single format or user-defined case is in my opinion folly and impossible, 
> especially given the nature of bioinformatics.  Define a broad but simple 
> suite that is flexible to specifics and leave the rest to the users.  That's 
> what Perl was made for by definition: TIMTOWTDI.  This was (is?) probably the 
> idea with bioperl also, but in browsing the hierarchy diagrams and reading the 
> emails, it sounds like a big confused mess that several(?) people are trying 
> so hard to keep in order but the task is too big.  Simplify simplify.  I think 
> there comes a point where too many "bugs" (real or not) means more than 
> debugging.  Easier said than done I know.  Heh.  Pay me no mind.

Hmmm. Depending on your perspective the glass is either half empty (lots
of bugs, people running around, lots of weird use cases tripping people
up) or half full (lots of people using it, lots of reuse of objects etc).

Remember on the mailing list people are massively biased towards emailing
in things when it doesn't work. When it does work they don't send a quick
email "boy... did that parser work out ok just as it said it would do!"

So - do we know whether people are using it or not? To be honest, I know
that *I* use it a lot and it *works for me*. That's certainly good
enough. I also have a certain amount of pride that I know that a fair
whack - indeed an embarrassing number of tests pass which are pretty
relevant to - at the very least - what I do.

Then I put alongside two projects I have personally worked on which reuse
bioperl code - Pfam database at Sanger and the Ensembl project. In both
cases in fact the bioperl parts are somewhat deep and probably ossified
(or in the process of ossification), but that's ok - they still work.

Then I can go on and think about GBrowse stuff from Lincoln and Elia's
pipeline and in fact Richard Copley - not a man to lightly jump into
packages and use stuff - admitted to me that he "doesn't write his own
parsers". And I have no idea what else is out there, but I am pretty sure
*lots* of people use it. Or is everyone on this list watching Jason fix
Blast parsing bugs just for the hell of it? (And if so ... let's tell him
now, because I am sure he would happily leave it be!)

I think we are doing fine. The problem about open source development is
you only hear people bitch. No one really praises you.

(PS - I have also been knee deep in other open source projects, eg Gnome,
and we are no better and no worse than them in keeping software
sanity. Large software with a lifetime of years is *hard to write* and
there will always be crufty areas. You just have to push those out into
the corners of the space. I think this is true of all large projects...)