Bioperl: article for Dr. Dobb's Journal

Ewan Birney birney@sanger.ac.uk
Fri, 9 Oct 1998 17:34:55 +0100 (BST)


On Fri, 9 Oct 1998, Lincoln Stein wrote:

> Ewan Birney writes:
>  > It's a very nice article. How much do you need cut out? Here are some
>  > suggestions:
> 
> The article needs to be cut by about 50% (I'd actually asked to make
> it a two-parter originally, but got turned down).  If I cut out the
> alignment stuff there will still need to be some substantial trimming
> in the rest.  Alternatively, I could focus on the alignment algorithm
> entirely, and this is what the editor has suggested.  I hate leaving
> out all the OO stuff, however.
> 

To be honest, I think the OOP stuff is more important than the algorithm,
and the fact that Perl is the *ideal* language to glue and provide a
development 'framework' is very important. But the algorithm might look
more sexy to people. I'd go with OOP-Perl, to show that Perl is more than
a web/systems glue language.


[snip]

>  > I think your biggest saving would be to drop the alignment class stuff
>  > altogether. It's sad because that's where this stops being simple
>  > data structures and starts getting interesting (and of course, I find
>  > alignments v. interesting), but I think trying to explain OOP-Perl,
>  > bioinformatics and dynamic programming all in one small article is taking
>  > on quite a job.
> 
> Do you think the alignment part is strong enough to stand on its own?
> The code actually runs pretty slowly and uses a lot of memory (and
> uses a horrible trick in which strings are turned into numbers
> automagically).  Maybe I should focus on the algorithm and then show
> how it can be turned into an XS module.
> 

Perhaps. Does DDJ really want an explanation of dynamic programming? It
isn't very 'perly' then, and a lot of people have written a lot about
dynamic programming (i.e. you'd have to watch out that you didn't tick off
some computer science types with your explanation - I tend to do this a lot
<shucks>).

I think it is foolish to write dynamic programming in Perl if it is a
serious thing to be used in anger. DP is a very CPU-intensive algorithm
which is almost perfect for a RISC chip + a good C optimiser. I think the
algorithm -> C implementation + C API -> bioperl integration via XS is a
much more realistic example of this... It makes the article much more
'here is a complex algorithm that we want to provide sensibly for non-C
users to use'.
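
As a rough illustration of the 'algorithm -> C implementation' step, here
is a minimal sketch of a Smith-Waterman local alignment score in C. It is
not the bioperl code: the name sw_score, the match/mismatch/gap values and
the linear gap penalty are all assumptions made for the example (a protein
implementation would normally use a substitution matrix instead), and it
returns only the best score, not the alignment itself. An XS wrapper would
sit on top of a C function of roughly this shape.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scoring parameters, chosen only for illustration. */
#define SW_MATCH      2
#define SW_MISMATCH (-1)
#define SW_GAP      (-2)

static int sw_max(int a, int b) { return a > b ? a : b; }

/*
 * Best local alignment score of strings a and b (Smith-Waterman with a
 * linear gap penalty). Keeps only two rows of the DP matrix, so memory
 * is O(strlen(b)). Returns -1 if memory allocation fails.
 */
int sw_score(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t n = strlen(a), m = strlen(b);
    int *prev = calloc(m + 1, sizeof *prev);  /* row i-1, starts all zero */
    int *curr = calloc(m + 1, sizeof *curr);  /* row i */
    int best = 0;

    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    for (size_t i = 1; i <= n; i++) {
        curr[0] = 0;
        for (size_t j = 1; j <= m; j++) {
            int sub = (a[i - 1] == b[j - 1]) ? SW_MATCH : SW_MISMATCH;
            int s = sw_max(0, prev[j - 1] + sub);  /* (mis)match, floor 0 */
            s = sw_max(s, prev[j] + SW_GAP);       /* gap in b */
            s = sw_max(s, curr[j - 1] + SW_GAP);   /* gap in a */
            curr[j] = s;
            if (s > best)
                best = s;
        }
        /* The row just filled becomes the previous row. */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    free(prev);
    free(curr);
    return best;
}
```

With these assumed scores, sw_score("ACGTT", "ACTT") gives 6: AC matches,
one gap is opened, then TT matches. The inner loop is nothing but integer
compares and adds, which is the kind of code a C optimiser handles well.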

I might point out of course that the current dump from the bioperl CVS
directory has a protein Smith-Waterman implementation written in C and
stuck in via XS - it produces a Bio::SimpleAlign object, which is a pure
Perl object. Quite an interesting starting point if you are looking for
pre-cooked implementations... (guess who wrote it <grin>)



> 
>  > b) I think the point about Perl is that not only does it give a rapid
>  > development cycle, but existing command-line-based solutions can be
>  > worked into it, as can C-based APIs (a la AcePerl and the bioperl
>  > alignment routines).
> 
> Very good point.  I'll add that to the intro.
> 

I've been claiming that Perl (not Java) is the ideal driver language for
'components' of code that you want to put together - some components
written in Perl, some in C/C++, some CORBA'ized. (I got some odd looks at
Objects in Bioinformatics when I said that...).

There are lots of things you can focus on in this article. I guess you're
going to have to weigh up 'readability', 'sexiness' and 'importance'.

I'm happy to reread anything if you like. Have fun!

> Lincoln
> 
> -- 
> ========================================================================
> Lincoln D. Stein                           Cold Spring Harbor Laboratory
> lstein@cshl.org                                   Cold Spring Harbor, NY
> ========================================================================
> 

Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================