[BioSQL-l] Exploring data in database

Kris Boulez kris.boulez at algonomics.com
Thu Mar 27 10:58:07 EST 2003


Quoting Hilmar Lapp (hlapp at gnf.org):
> I've been through this before. For what it's worth, here's what I 
> experienced and learned.
> 
Thanks for the elaborate answer.

[ ... ]
> 
> 	- They will want to run simple queries first, but very quickly will 
> want to drill down and run complex and unpredictable queries. Writing 
> tools to support hard-coded queries is not going to cut it, and writing 
> tools to flexibly and visually build complex queries is a reinvention 
> of a well-invented (though commercial) wheel, and IMHO hence a total 
> waste of the Bioinformaticists' time.
> 
This exactly describe the situation I'm facing.

> 	- These days most bench scientists are at least moderately 
> computer-savvy. Given willingness and curiosity for their data, I 
> haven't encountered lab scientists who could not be taught to build 
> queries visually in MS Access on 1 table or joining 2 tables. It also 
> didn't take them long to even run aggregate queries.
> 
> So, for what it's worth, here's what I would do.
> 
> 1) Scrap MySQL, use PostgreSQL or any other serious RDBMS that honors 
> the fact that there are (normalized) models and (denormalized) views.
> 
 [ I do not want to start a flame war, just asking ]

>From different sides I here good stuff about PostgreSQL. I've been using
MySQL for years, just because it was there and did the job.
Is there some documentation that describes the differences between MySQL
and PostgreSQL ? I found multiple manuals on the web site, but I would
rarther not have to wade through all these pages to find the
syntax/concepts that make PostgreSQL special.




> 2) Design a proper relational data model (that is nice to you as the 
> data maintainer, not nice to the scientists as data viewers or miners), 
> then write views on top of it that most closely resemble the data types 
> in which your lab scientists think.
> 
> 3) Hand out plain descriptions of your views (just name and column 
> names) and ask your scientists whether it makes sense to them. If not, 
> you need to improve or rewrite the views until it does. The lab 
> scientists have actually a very good idea what data they're working 
> with. You just need to match it. Your views will be denormalized. 
> Provide one view for each data entity (not relational entity), and 
> provide views that pre-join commonly joined datatypes.
> 
That is mostly the way I was working now. I found out very quickly that
a scientist isn't intrested in subtleties like 1-1, 1-n or n-n relations
:).


Kris,


More information about the BioSQL-l mailing list