[BioSQL-l] Affymetrix SQL for PostgreSQL
Hilmar Lapp
hlapp at gnf.org
Thu May 1 15:20:08 EDT 2003
Sounds great. Here are a few comments, my $0.02 ...
There are probably as many expression data schemas out there as labs
hosting expression data. Not many of the big efforts attempt to
generalize, but there are some (GEO, ArrayExpress, GeneX, RAD, SMD,
and I'm sure a couple more).
If gene expression tables are to go into the 'official' BioSQL
(everyone can - and many will - have their own build, extended or
whatever), a design that attempts to be generic and technology-agnostic
would be most attractive to me.
Since gene expression has never been within the scope of BioSQL so far,
I'd prefer to take as much advantage of existing open-source schemas as
possible, since then the reality check has already happened and the
software support may come with it.
Lately GMOD/Chado faced a similar situation, and Allen, who I believe
took the lead on that project, settled on integrating the respective
parts of GUS/RAD.
Allen, how did that work out? Could we just build on your work and RAD?
Marc, what made you decide to disregard the big expression schemas? (No
offense whatsoever, I'm just curious.)
The way I could envision a different design of a gene expression model
in BioSQL is as a warehouse star schema, where there'd be essentially
one (or very few) analytical data tables, with all the rest hosted by
the existing biosql tables (i.e., mostly the term table). It would be
understood then that people would host their expression data in another
schema, and the biosql table(s) would be used as a warehouse only.
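
To make the star-schema idea a bit more concrete, here is a rough sketch
of how it might look. Everything in it is an assumption for illustration
only: the `expression_fact` table and its columns are hypothetical, the
`term` and `bioentry` definitions are heavily simplified stand-ins for
the real biosql tables, and SQLite (via Python's standard library) is
used only so the example is self-contained, not because anyone proposed
it here.

```python
import sqlite3

# Hypothetical, simplified stand-ins for existing biosql tables; the
# real tables carry more columns than shown here.
SCHEMA = """
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL UNIQUE
);
-- The single analytical (fact) table. Its name and columns are
-- assumptions; the dimensions (sample, kind of measurement) are
-- described by rows in the term table.
CREATE TABLE expression_fact (
    bioentry_id     INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    sample_term_id  INTEGER NOT NULL REFERENCES term(term_id),
    measure_term_id INTEGER NOT NULL REFERENCES term(term_id),
    value           REAL NOT NULL,
    PRIMARY KEY (bioentry_id, sample_term_id, measure_term_id)
);
"""


def build_warehouse():
    """Create an in-memory warehouse and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO term (term_id, name) VALUES (?, ?)",
        [(1, "liver"), (2, "brain"), (3, "signal intensity")],
    )
    conn.executemany(
        "INSERT INTO bioentry (bioentry_id, accession) VALUES (?, ?)",
        [(1, "GENE_A"), (2, "GENE_B")],
    )
    # Made-up values, purely to have something to aggregate.
    conn.executemany(
        "INSERT INTO expression_fact VALUES (?, ?, ?, ?)",
        [(1, 1, 3, 120.5), (1, 2, 3, 15.0),
         (2, 1, 3, 48.25), (2, 2, 3, 300.0)],
    )
    conn.commit()
    return conn


def mean_by_sample(conn, measure="signal intensity"):
    """Average the fact values per sample term for one kind of measure."""
    rows = conn.execute(
        """
        SELECT s.name, AVG(f.value)
        FROM expression_fact f
        JOIN term s ON s.term_id = f.sample_term_id
        JOIN term m ON m.term_id = f.measure_term_id
        WHERE m.name = ?
        GROUP BY s.name
        ORDER BY s.name
        """,
        (measure,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(mean_by_sample(build_warehouse()))
```

The point of the layout is that the fact table holds nothing but keys
and numbers; what a sample or a measurement type means lives in the term
table, so a new array technology would add terms rather than tables.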
-hilmar
On Thursday, May 1, 2003, at 12:08 PM, Marc Colosimo wrote:
>
> Since I couldn't easily find a good schema, I made my own based on
> Affymetrix's GATC schema. My hope is that as I develop it, it will
> use parts of BioSQL to handle the non-array stuff (taxon, sequence
> databases, etc.). I only have a few tables made and they are not
> normalized (one, I think, is actually best left denormalized). Oh, I
> am keeping MIAME in mind.
>
> I have one script that is almost finished that loads in CEL files. I
> just have a few complex regexes to write/debug, and support for bulk
> loading on a local machine (piping it to psql) to add. Now that I have
> played around with DBI, loading CDF files is next.
>
> If people are interested in the code to try it out, let me know.
>
> Marc
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------