back in action / preSeq.pm

Steven E. Brenner brenner@akamail.com
Fri, 27 Jun 1997 23:39:10 +0900 (JST)


Sorry for the previous, useless, mailing.

> My makefiles for PreSeq and Parse work fine, the only thing I need to do is write some type of 
> robust shell script called 'configure' that will do some quick in-place editing on PreSeq.pm and 
> Parse.pm that will be able to turn off/on the use of Parse.pm.

One comment: to aid compatibility with non-Unix platforms, the default
(i.e., if the configure program isn't run) should be for Parse.pm to be
off.  This will probably also be appropriate for the majority of Unix
installations. 


> Also- inside GI we have finally gotten around to replacing our numerous scripts that have been 
> lying around for Blast/Fasta search result parsing. We now have a really nice object oriented 
> Blast.pm module and the Fasta.pm will be done shortly. Both of these modules are quite 
> standalone, capable of firing up a local or networked search and returning all kinds of 
> different info raw or in various sorted ways. I was thinking that although these modules do not 
> quite fit into our schema, they might be quite useful to the individual researcher.

Sounds interesting, and I think that we should release them under the Bio
hierarchy if GI is willing.  I'm not sure if we should have a Bio::Misc
to lump all such things under, or if it is better just to put them all at the
root under Bio::.  What do people think?  (I have an opinion, but I'd like
to hear other thoughts.) 


BTW, like everyone else, I also have a set of modules for parsing BLAST
and FASTA output.  My parsers are really designed for speed, since (when
I'm doing this sort of work) I need to deal with the output of hundreds of
thousands of runs a day.  The internal code (especially for the FASTA
parser) is pretty clean, but the interface is a bit clunky.



> One problem that was totally glossed over at the OIB meeting is that a majority of researchers 
> out there do not have the support of bioinformatics groups containing dozens of computer 
> scientists.

A good point, and the sort of issue frequently missed at these sorts of
gatherings.


> While perl might not scale to super-industrial bioinformatics applications,

Actually, Perl has been shown to scale to very large projects; it is
much of what holds together the results at the MIT genome lab, and I think
that much of what goes on at TIGR and Sanger also relies on it.  Perl is
replacing duct tape for holding the universe together.


>  there is 
> a huge need for simple, robust tools that are useable by individuals. I really think that this 
> is one area where BioPerl can be of tremendous help. Any thoughts?

But, as you say, small-scale work is where Perl is essential, because no
individual can afford the time to invest in complicated tools with a huge
learning curve.  I agree that BioPerl can really provide an accessible
means of using powerful resources.  Excellent point, and something we
should stress in the Perl Journal article -- and in our code development!

Cheers,

  Steve