[Bioperl-l] want to bring tools together to help small labs

Fernan Aguero fernan@iib.unsam.edu.ar
Fri, 2 Nov 2001 10:47:45 -0300


+----[ T.D. Houfek (tdhoufek@unity.ncsu.edu) wrote regarding "[Bioperl-l] want to bring tools together to help small labs":
|
| Hi all,
| 
| Recently Jason Stajich visited our lab and gave us a lot of good information
| as well as encouragement to participate here.  But I'm new to this forum,
| so please excuse me (yet still tell me) if I stray too far from its proper subject
| matter.  Besides whatever my lab puts on our plate at any given moment,
| we're chiefly interested in working on freely available open-source software
| geared towards the needs of small-to-medium size laboratories doing
| sequencing.  Smaller labs, with their correspondingly small computer hardware
| and bioinformatics salary budgets, have an extremely daunting task on
| their hands even if their ambitions for analysis are modest.  Ultimately
| there is no cure for this problem, but we'd like to do something to ease
| the pain... and I'd greatly appreciate any help anyone can give us.
| 
| Since small labs do more EST sequencing than large genomic assemblies, I'd
| like to develop a distributable Linux/UNIX web application package that:
| 	a) facilitates batching of various analyses for ESTs
| 	b) allows specification of different processing "pipelines" for
| 	   different sets of incoming data.
| 	c) stores sequence data, quality data, meta-data, analysis
| 	   results, etc in a relational database.
| 	d) gives easy web browsing access to this data, allowing specification of
| 	   different levels of access permissions for different data sets.
| 	e) seriously eases data management burdens, including:
| 	   	1) file organization
| 		2) sequence data quality control
| 		3) data backups
| 		4) logging of analysis histories
| 	f) installs easily
| 	g) allows almost all ongoing administration to be done by
| 	   researchers or technicians  (non-power-users) through CGI.
| 	h) requires only one fairly decent ( <=$5,000 ) computer, but
| 	   allows a number of ways to distribute the system over more
| 	   machines (so that a lab can separate the workhorse and the
| 	   web server, or grow a small compute farm).
| 
| There being no point to reinventing the wheel, I'd like to use BioPerl /
| BioJava / etc wherever I can.  If anyone has any thoughts about how these
| tools might (or might not) fit into such a scheme, or has helpful information
| about what smaller labs they have known might want or need, I'd be most
| grateful!
| 
| T.D. Houfek
| 
|
+----]

Dear T.D. and bioperlers:

Have you had any positive responses and/or suggestions? Are there
other people interested in this?

Some time ago we needed the same thing and looked into how Ensembl was
doing things, since we also wanted to use a relational database backend
to store info and then generate the web pages on the fly through CGI
scripts. For us it was too complex and also had a bias toward higher
eukaryotes, which we did not need (we work with bacteria and
protozoa).
In the end we developed our own db schema, scripts and so on ... but
as a first attempt I know it is far from being _the_right_thing_.
Above all, it is too customized to our own projects and way of
working.

So we would now like to have something more generic, more modularized
and ... simple. So if we agree on the goals (I would also like to add
EST clustering to the list) perhaps we could join efforts.

I haven't looked into bioperl lately, but I thought there was a
bioperl-db or something like this ... I can't seem to find it right
now. I had the idea (perhaps misguided) that it was a generic db
schema and modules to store sequence info and annotation. Am I right?
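
As an illustration of the kind of generic schema for sequence info and
annotation discussed above, here is a minimal sketch. It is NOT the
bioperl-db schema: every table, column and function name below is
hypothetical, and SQLite through Python is assumed only to keep the
example self-contained.

```python
import sqlite3

# Hypothetical minimal schema (an assumption for illustration only,
# not the bioperl-db layout): one table for sequences, one for
# free-form tag/value annotation attached to a sequence.
SCHEMA = """
CREATE TABLE sequence (
    seq_id   INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    organism TEXT,
    residues TEXT NOT NULL
);
CREATE TABLE annotation (
    ann_id INTEGER PRIMARY KEY,
    seq_id INTEGER NOT NULL REFERENCES sequence(seq_id),
    tag    TEXT NOT NULL,
    value  TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open a database (in memory by default) and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_sequence(conn, name, residues, organism=None):
    """Store one sequence and return its numeric id."""
    cur = conn.execute(
        "INSERT INTO sequence (name, organism, residues) VALUES (?, ?, ?)",
        (name, organism, residues),
    )
    return cur.lastrowid


def annotate(conn, seq_id, tag, value):
    """Attach one tag/value annotation to a stored sequence."""
    conn.execute(
        "INSERT INTO annotation (seq_id, tag, value) VALUES (?, ?, ?)",
        (seq_id, tag, value),
    )


def annotations_for(conn, name):
    """Return (tag, value) pairs for a sequence name, in insertion order."""
    rows = conn.execute(
        "SELECT a.tag, a.value FROM annotation a "
        "JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE s.name = ? ORDER BY a.ann_id",
        (name,),
    )
    return rows.fetchall()
```

Keeping annotation as tag/value rows rather than fixed columns is one
way to avoid tying the schema to a single project or group of
organisms; whether that trade-off suits a given lab is an open
question.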

If this is of no interest to the list we can discuss it in private.

Regards,

Fernan

-- 

|  F e r n a n   A g u e r o  |  B i o i n f o r m a t i c s  |
|   fernan@iib.unsam.edu.ar   |      genoma.unsam.edu.ar      |