From gwu at molbio.mgh.harvard.edu  Tue Jan 27 17:39:57 2009
From: gwu at molbio.mgh.harvard.edu (gwu)
Date: Tue, 27 Jan 2009 17:39:57 -0500
Subject: [BioSQL-l] Genbank loading time
Message-ID: <497F8D3D.5060907@molbio.mgh.harvard.edu>

Hi Everyone,

I recently visited the BioWarehouse web site and the document shows 
loading the whole Genbank into their database takes the data loader 68 
hours for MySQL, and 27.5 hours for Oracle. So I wonder if there is a 
similar test done with BioSQL?

Gang Wu

From holland at eaglegenomics.com  Tue Jan 27 17:57:59 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Tue, 27 Jan 2009 22:57:59 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <497F8D3D.5060907@molbio.mgh.harvard.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
Message-ID: <497F9177.7040309@eaglegenomics.com>

It would depend on the toolkit you use. BioWarehouse is a complete API,
whereas BioSQL is just a schema and the way in which it is populated
(and therefore how long that takes) depends on your toolkit.

Currently I'm aware of loaders existing for BioJava, BioPerl, and
possibly also BioPython. However, each of them loads the same data in
subtly different ways, so they can't be directly compared in terms of
which one is faster.

I vaguely remember seeing some performance figures for the
BioJava/Genbank/BioSQL combination somewhere, but it's been a while! I'm
not sure where they were documented though - I certainly haven't got
them written down anywhere. Mark Schreiber might know as he definitely
did some testing of this - Mark, can you remember what the figures were
for BioJava?

As for BioPerl/BioPython/etc. I expect their respective project authors
will respond to this thread accordingly with the figures from their own
domains!

cheers,
Richard

gwu wrote:
> Hi Everyone,
> 
> I recently visited the BioWarehouse web site and the document shows
> loading the whole Genbank into their database takes the data loader 68
> hours for MySQL, and 27.5 hours for Oracle. So I wonder if there is a
> similar test done with BioSQL?
> 
> Gang Wu
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
> 

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From hlapp at gmx.net  Wed Jan 28 00:09:04 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 28 Jan 2009 00:09:04 -0500
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <497F9177.7040309@eaglegenomics.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
Message-ID: <72E5157F-02BC-40F6-A59D-3E887A5207C8@gmx.net>

The loader for BioPerl is load_seqdatabase.pl, which is part of  
bioperl-db. With machines current as of 3-4 years ago, I saw upload  
speeds of between 5 and 15 sequences per second for richly annotated  
sequences (human/mouse RefSeqs).
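
To put that rate in perspective, here is a back-of-the-envelope sketch.
The record count is a made-up placeholder, not a GenBank statistic, and
the rate is assumed to stay constant over the whole load.

```python
def load_hours(n_records, seqs_per_second):
    """Hours needed to load n_records at a constant rate."""
    return n_records / seqs_per_second / 3600.0

# Hypothetical record count, chosen only for illustration.
n_records = 1_000_000
slow = load_hours(n_records, 5)    # lower end of the quoted rate
fast = load_hours(n_records, 15)   # upper end of the quoted rate
print(f"{n_records} records: {fast:.1f} to {slow:.1f} hours")
```

At these rates a million richly annotated records would take roughly
18.5 to 55.6 hours on a single loader process.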

If you are talking about all of GenBank, the vast majority of that will
be ESTs and sequencing reads (do you really want to load those?),  
which are typically sparsely annotated if at all, and so should be  
faster. mRNA and cDNA sequences will be more in the above range.

I have never loaded all of GenBank into a database (and I'm not sure  
why anyone would want to do this) and so don't have a comparison  
figure for the total for that.

Finally, several instances of load_seqdatabase.pl can be nicely run in  
parallel on multi-core machines.
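
One way to set this up is to split the input files between the loader
instances. A minimal sketch, assuming the files are independent of each
other; the file names are hypothetical, and "<options>" stands for
whatever connection and format options you normally pass to the script.

```python
def partition(files, n_workers):
    """Deal files round-robin into n_workers lists."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return [files[i::n_workers] for i in range(n_workers)]

# Hypothetical input file names, for illustration only.
files = ["part1.seq", "part2.seq", "part3.seq", "part4.seq", "part5.seq"]

for worker, chunk in enumerate(partition(files, 2)):
    # Each printed line is one loader instance to start in its own process.
    print(f"worker {worker}: load_seqdatabase.pl <options> {' '.join(chunk)}")
```

Round-robin dealing ignores file size, so sorting the files by size
first may give a more even split.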

	-hilmar

On Jan 27, 2009, at 5:57 PM, Richard Holland wrote:

> It would depend on the toolkit you use. BioWarehouse is a complete  
> API,
> whereas BioSQL is just a schema and the way in which it is populated
> (and therefore how long that takes) depends on your toolkit.
>
> Currently I'm aware of loaders existing for BioJava, BioPerl, and
> possibly also BioPython. However, each of them loads the same data in
> subtly different ways, so they can't be directly compared in terms of
> which one is faster.
>
> I vaguely remember seeing some performance figures for the
> BioJava/Genbank/BioSQL combination somewhere, but it's been a while!  
> I'm
> not sure where they were documented though - I certainly haven't got
> them written down anywhere. Mark Schreiber might know as he definitely
> did some testing of this - Mark, can you remember what the figures  
> were
> for BioJava?
>
> As for BioPerl/BioPython/etc. I expect their respective project  
> authors
> will respond to this thread accordingly with the figures from their  
> own
> domains!
>
> cheers,
> Richard
>
> gwu wrote:
>> Hi Everyone,
>>
>> I recently visited the BioWarehouse web site and the document shows
>> loading the whole Genbank into their database takes the data loader  
>> 68
>> hours for MySQL, and 27.5 hours for Oracle. So I wonder if there is a
>> similar test done with BioSQL?
>>
>> Gang Wu
>
> -- 
> Richard Holland, BSc MBCS
> Finance Director, Eagle Genomics Ltd
> M: +44 7500 438846 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Wed Jan 28 06:50:50 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 11:50:50 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <497F9177.7040309@eaglegenomics.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
Message-ID: <320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>

On Tue, Jan 27, 2009 at 10:57 PM, Richard Holland wrote:
>
> As for BioPerl/BioPython/etc. I expect their respective project authors
> will respond to this thread accordingly with the figures from their own
> domains!

I can tell you importing GenBank files into BioSQL with Biopython is
faster than BioPerl, sometimes several times faster, but this will
depend on the nature of the files (e.g. genomes versus ESTs).
http://lists.open-bio.org/pipermail/biosql-l/2008-August/001320.html
http://lists.open-bio.org/pipermail/biopython-dev/2008-April/003625.html
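
Since the result depends so much on the input, the simplest way to get
figures for your own files is to time the load yourself. A minimal
sketch: `load_records` is a hypothetical stand-in for whatever toolkit
call does the actual loading, and is assumed to return the number of
records it loaded.

```python
import time

def records_per_second(load_records, source):
    """Time one loader call and return (records_loaded, records_per_second)."""
    start = time.perf_counter()
    count = load_records(source)
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate

def fake_loader(records):
    # Stand-in that only counts records; replace with a real load call.
    return sum(1 for _ in records)

count, rate = records_per_second(fake_loader, range(1000))
print(f"{count} records at {rate:.0f} records/s")
```

In Biopython, as far as I know, the BioSQL database object's load()
method returns the number of records loaded, so it could be passed in
place of `fake_loader`.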

I don't have any BioJava comparison figures.  In any case, as Richard
points out, there will be slight differences between the Bio* tools in
exactly how the data is parsed and stored.

I've never tried to import the whole of GenBank, so I don't have any
numbers for you there.

Peter
(Biopython)

From biopython at maubp.freeserve.co.uk  Wed Jan 28 11:40:55 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 16:40:55 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
Message-ID: <320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>

On Wed, Jan 28, 2009 at 4:29 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I don't think sequence loading via load_seqdatabase.pl uses BioPerl.  If one
> uses BioPerl and bioperl-db the following can explain at least some of the
> reason why loading is slow:
> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
> We also go through the extra hand-wringing with Bio::Species objects
> (something I don't think the other Bio* worry about).

Looking at the source code for the load_seqdatabase.pl script included
with bioperl-db, my impression is it uses Bio::DB::BioDB to talk to
the database, and Bio::SeqIO to parse the input sequence files (in
this case, Bio::SeqIO::genbank is used).  See:

http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-db/trunk/scripts/biosql/load_seqdatabase.pl

> Regardless, it's not an easy problem to work around.  There are such things
> as Moose, and Perl6 is now in alpha...

I'll take your word for it - I'm in no position to improve anyone's Perl code ;)

Peter

From cjfields at illinois.edu  Wed Jan 28 11:29:50 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 10:29:50 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
Message-ID: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>


On Jan 28, 2009, at 5:50 AM, Peter wrote:

> On Tue, Jan 27, 2009 at 10:57 PM, Richard Holland wrote:
>>
>> As for BioPerl/BioPython/etc. I expect their respective project  
>> authors
>> will respond to this thread accordingly with the figures from their  
>> own
>> domains!
>
> I can tell you importing GenBank files into BioSQL with Biopython is
> faster than BioPerl, sometimes several times faster, but this will
> depend on the nature of the files (e.g. genomes versus ESTs).
> http://lists.open-bio.org/pipermail/biosql-l/2008-August/001320.html
> http://lists.open-bio.org/pipermail/biopython-dev/2008-April/003625.html

I don't think sequence loading via load_seqdatabase.pl uses BioPerl.   
If one uses BioPerl and bioperl-db the following can explain at least  
some of the reason why loading is slow:

http://www.bioperl.org/wiki/Why_BioPerl_is_slow

We also go through the extra hand-wringing with Bio::Species objects  
(something I don't think the other Bio* worry about).

Regardless, it's not an easy problem to work around.  There are such  
things as Moose, and Perl6 is now in alpha...

chris

> I don't have any BioJava comparison figures.  In any case, as Richard
> points out, there will be slight differences between the Bio* tools in
> exactly how the data is parsed and stored.
>
> I've never tried to import the whole of GenBank, so I don't have any
> numbers for you there.
>
> Peter
> (Biopython)

From cjfields at illinois.edu  Wed Jan 28 11:53:49 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 10:53:49 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
Message-ID: <37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>


On Jan 28, 2009, at 10:40 AM, Peter wrote:

> On Wed, Jan 28, 2009 at 4:29 PM, Chris Fields  
> <cjfields at illinois.edu> wrote:
>>
>> I don't think sequence loading via load_seqdatabase.pl uses  
>> BioPerl.  If one
>> uses BioPerl and bioperl-db the following can explain at least some  
>> of the
>> reason why loading is slow:
>> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
>> We also go through the extra hand-wringing with Bio::Species objects
>> (something I don't think the other Bio* worry about).
>
> Looking at the source code for the load_seqdatabase.pl script included
> with bioperl-db, my impression is it uses Bio::DB::BioDB to talk to
> the database, and Bio::SeqIO to parse the input sequence files (in
> this case, Bio::SeqIO::genbank is used).  See:
>
> http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-db/trunk/scripts/biosql/load_seqdatabase.pl

My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm  
wondering, though, whether it would be feasible to have a direct  
loader for the most common database formats (GenBank/EMBL/Swiss),  
something similar to the taxonomy loader that doesn't rely on any  
specific Bio* package.

>> Regardless, it's not an easy problem to work around.  There are  
>> such things
>> as Moose, and Perl6 is now in alpha...
>
> I'll take your word for it - I'm in no position to improve anyone's  
> Perl code ;)
>
> Peter

Well, the problem lies with perl5's welded-on OO which isn't easy to  
work around, particularly inheritance issues.  Supposedly Moose helps  
speed things up a bit; it doesn't hurt that it is based somewhat on  
perl6's Objects:

http://feather.perl6.nl/syn/S12.html

chris


From hlapp at gmx.net  Wed Jan 28 12:06:01 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 28 Jan 2009 12:06:01 -0500
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
Message-ID: <0BD2B914-3E57-4266-AE4E-EA8B2F1DD307@gmx.net>


On Jan 28, 2009, at 11:29 AM, Chris Fields wrote:

> I don't think sequence loading via load_seqdatabase.pl uses BioPerl.


It does, actually. All the input parsing is done by BioPerl. Bioperl- 
db only does the persistence, and the script itself handles all the  
command line options, opens files, yadda yadda ...

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Wed Jan 28 12:17:57 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 17:17:57 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
Message-ID: <320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>

On 1/28/09, Chris Fields <cjfields at illinois.edu> wrote:
>
> My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm
> wondering, though, whether it would be feasible to have a direct loader for
> the most common database formats (GenBank/EMBL/Swiss), something
> similar to the taxonomy loader that doesn't rely on any specific Bio* package.
>

You could re-invent the wheel, and write yet another
GenBank/EMBL/Swiss parser in standalone perl for use within
load_seqdatabase.pl but I really don't see any point to this.  Reusing
the BioPerl parser seems most sensible, especially given that
bioperl-db is an extension to bioperl in the first place - and the
BioPerl parsers already exist and are well tested.

Peter

From cjfields at illinois.edu  Wed Jan 28 12:47:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 11:47:20 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
Message-ID: <D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>

On Jan 28, 2009, at 11:17 AM, Peter wrote:

> On 1/28/09, Chris Fields <cjfields at illinois.edu> wrote:
>>
>> My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm
>> wondering, though, whether it would be feasible to have a direct  
>> loader for
>> the most common database formats (GenBank/EMBL/Swiss), something
>> similar to the taxonomy loader that doesn't rely on any specific  
>> Bio* package.
>>
>
> You could re-invent the wheel, and write yet another
> GenBank/EMBL/Swiss parser in standalone perl for use within
> load_seqdatabase.pl but I really don't see any point to this.  Reusing
> the BioPerl parser seems most sensible, especially given that
> bioperl-db is an extension to bioperl in the first place - and the
> BioPerl parsers already exist and are well tested.
>
> Peter

My point is, instead of first mapping record data to a specific object/ 
class then mapping the object data to the database, bypass the object  
completely and generically map relevant data directly in the database  
according to the BioSQL schema.
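
A rough sketch of that idea, under loud assumptions: the three tables
below are a heavily simplified subset of the BioSQL schema (the real one
has more columns, constraints and tables), SQLite is used only so the
example is self-contained, and the parser and the record are toys made
up for illustration.

```python
import sqlite3

# Simplified subset of three BioSQL tables; not the real DDL.
SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase(biodatabase_id),
    name TEXT NOT NULL,
    accession TEXT NOT NULL,
    description TEXT
);
CREATE TABLE biosequence (
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry(bioentry_id),
    length INTEGER,
    seq TEXT
);
"""

def parse_minimal(text):
    """Pull LOCUS name, ACCESSION, DEFINITION and sequence from one record.

    Toy parser: ignores features, references and multi-line fields.
    """
    fields = {"name": None, "accession": None, "description": None}
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):
            break
        if in_origin:
            # Sequence lines look like "        1 gattacagat ta"
            seq_parts.extend(line.split()[1:])
        elif line.startswith("LOCUS"):
            fields["name"] = line.split()[1]
        elif line.startswith("ACCESSION"):
            fields["accession"] = line.split()[1]
        elif line.startswith("DEFINITION"):
            fields["description"] = line[len("DEFINITION"):].strip()
        elif line.startswith("ORIGIN"):
            in_origin = True
    fields["seq"] = "".join(seq_parts)
    return fields

def load_record(conn, namespace, text):
    """Insert one parsed record straight into the tables, no objects involved."""
    rec = parse_minimal(text)
    conn.execute("INSERT OR IGNORE INTO biodatabase (name) VALUES (?)", (namespace,))
    (db_id,) = conn.execute(
        "SELECT biodatabase_id FROM biodatabase WHERE name = ?", (namespace,)
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO bioentry (biodatabase_id, name, accession, description) "
        "VALUES (?, ?, ?, ?)",
        (db_id, rec["name"], rec["accession"], rec["description"]),
    )
    conn.execute(
        "INSERT INTO biosequence (bioentry_id, length, seq) VALUES (?, ?, ?)",
        (cur.lastrowid, len(rec["seq"]), rec["seq"]),
    )
    return cur.lastrowid

# Made-up record, not real GenBank data.
RECORD = """LOCUS       TOY0001                   12 bp    DNA     linear   SYN 01-JAN-2009
DEFINITION  Made-up example record.
ACCESSION   TOY0001
ORIGIN
        1 gattacagat ta
//
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
entry_id = load_record(conn, "toy", RECORD)
row = conn.execute(
    "SELECT accession, length, seq FROM bioentry JOIN biosequence USING (bioentry_id)"
).fetchone()
print(row)
```

The record goes from text to rows through a plain dict, with no
sequence, feature or species objects built along the way. A usable
loader would still need the feature table, annotations and taxonomy,
which is where most of the parsing effort lies.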

If anything this may force some consistency between the various Bio*  
languages.

chris


From biopython at maubp.freeserve.co.uk  Wed Jan 28 13:18:03 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 18:18:03 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
	<D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
Message-ID: <320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>

>> You could re-invent the wheel, and write yet another
>> GenBank/EMBL/Swiss parser in standalone perl for use within
>> load_seqdatabase.pl but I really don't see any point to this.  Reusing
>> the BioPerl parser seems most sensible, especially given that
>> bioperl-db is an extension to bioperl in the first place - and the
>> BioPerl parsers already exist and are well tested.
>>
>> Peter
>
> My point is, instead of first mapping record data to a specific object/class
> then mapping the object data to the database, bypass the object completely
> and generically map relevant data directly in the database according to the
> BioSQL schema.
>
> If anything this may force some consistency between the various Bio*
> languages.
>
> chris

Ah - so rather than using BioPerl/Biopython/BioJava to import your
sequence files into a BioSQL database, you'd like BioSQL to come with
its own script that does the job?  It would "solve" any
inconsistencies for getting files of data into the database if this
were the only sanctioned way to add records to the database.
However, there are a number of downsides - in addition to the
considerable extra effort needed to write and support another set of
parsers just for BioSQL (without reusing BioPerl/Biopython/BioJava).

What about BioPerl/Biopython/BioJava users who have sequence-record
objects in memory they want to record in the database?  These could
have been loaded from GenBank files originally and then manipulated
(e.g. adding additional crude annotation from running BLAST).  How
would they get them into the database - write them to a GenBank file
and then invoke the project neutral BioSQL provided script?

I think each project needs its own ORM bindings for loading data both
into and out of the database.  Improving any inconsistencies in how
each ends up storing sequence files (e.g. GenBank files) can be worked
on gradually.

[Perhaps I have read more into your comment than you intended - if I
have got the wrong end of the stick, please clarify - thanks]

Still, a project neutral BioSQL bundled script (not depending on any
of BioPerl/Biopython/BioJava) for importing a GenBank file into a
database could serve as a "reference implementation" (the role I
currently assign to BioPerl's load_seqdatabase.pl).  And if this
proves faster than load_seqdatabase.pl that's a nice bonus.

Peter

From cjfields at illinois.edu  Wed Jan 28 13:57:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 12:57:25 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
	<D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
	<320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>
Message-ID: <770D510F-C6EA-455E-B017-766587E1B23F@illinois.edu>


On Jan 28, 2009, at 12:18 PM, Peter wrote:

>>> You could re-invent the wheel, and write yet another
>>> GenBank/EMBL/Swiss parser in standalone perl for use within
>>> load_seqdatabase.pl but I really don't see any point to this.   
>>> Reusing
>>> the BioPerl parser seems most sensible, especially given that
>>> bioperl-db is an extension to bioperl in the first place - and the
>>> BioPerl parsers already exist and are well tested.
>>>
>>> Peter
>>
>> My point is, instead of first mapping record data to a specific  
>> object/class
>> then mapping the object data to the database, bypass the object  
>> completely
>> and generically map relevant data directly in the database  
>> according to the
>> BioSQL schema.
>>
>> If anything this may force some consistency between the various Bio*
>> languages.
>>
>> chris
>
> Ah - so rather than using BioPerl/Biopython/BioJava to import your
> sequence files into a BioSQL database, you'd like BioSQL to come with
> its own script that does the job?  It would "solve" any
> inconsistencies for getting files of data into the database if this
> were the only sanctioned way to add records to the database.
> However, there are a number of downsides - in addition to the
> considerable extra effort needed to write and support another set of
> parsers just for BioSQL (without reusing BioPerl/Biopython/BioJava).
>
> What about BioPerl/Biopython/BioJava users who have sequence-record
> objects in memory they want to record in the database?  These could
> have been loaded from GenBank files originally and then manipulated
> (e.g. adding additional crude annotation from running BLAST).  How
> would they get them into the database - write them to a GenBank file
> and then invoke the project neutral BioSQL provided script?

No, one would use the same adaptors as before (bioperl-db for BioPerl,  
for instance).

> I think each project needs their own ORM bindings for both loading
> data into and from the database.  Improving any inconsistencies in how
> each ends up storing sequence files (e.g. GenBank files) can be worked
> on gradually.
>
> [Perhaps I have read more into your comment than you intended - if I
> have got the wrong end of the stick, please clarify - thanks]
>
> Still, a project neutral BioSQL bundled script (not depending on any
> of BioPerl/Biopython/BioJava) for importing a GenBank file into a
> database could serve as a "reference implementation" (the role I
> currently assign to BioPerl's load_seqdatabase.pl).  And if this
> proves faster than load_seqdatabase.pl that's a nice bonus.
>
> Peter

That's what I'm thinking, essentially; something that is Bio*-neutral  
that can be tested against.  And it should be faster at least from the  
standpoint of not having to generate tons of objects.
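
Testing a toolkit's loader against such a reference could be as simple
as comparing what each one stored for the same input record. A sketch,
assuming the stored values have already been fetched into plain dicts;
the field names and values below are hypothetical.

```python
def diff_fields(reference, candidate):
    """Return {field: (reference_value, candidate_value)} for every mismatch."""
    keys = set(reference) | set(candidate)
    return {
        k: (reference.get(k), candidate.get(k))
        for k in sorted(keys)
        if reference.get(k) != candidate.get(k)
    }

# Made-up values standing in for rows read back from two databases.
reference = {"accession": "TOY0001", "version": 1, "division": "SYN"}
candidate = {"accession": "TOY0001", "version": 0}

for field, (expected, got) in diff_fields(reference, candidate).items():
    print(f"{field}: reference stored {expected!r}, loader stored {got!r}")
```

A field missing on one side shows up as None, so omissions are reported
alongside outright disagreements.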

It's icing if it evolves past the point of a simple reference  
implementation into something that is useful as a fast BioSQL loader.

chris


From cjfields at illinois.edu  Thu Jan 29 08:37:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 07:37:31 -0600
Subject: [BioSQL-l] [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of
	BioPerl-db
In-Reply-To: <AC794CC5-EC13-4BAF-84D5-9CD1CB34220B@inserm.fr>
References: <AC794CC5-EC13-4BAF-84D5-9CD1CB34220B@inserm.fr>
Message-ID: <C07BB428-099F-4492-8591-66F3F6988C64@illinois.edu>

That one may be database-dependent; it passes for mysql 5.1.26-rc.   
What is your db (mysql, Pg, oracle) and version?

Hilmar, any ideas?

chris

On Jan 29, 2009, at 6:28 AM, Johann PELLET wrote:

> Dear Chris,
>
> I have the following error on my Mac machine: (BioPerl 1.6, BioPerl- 
> run
> 1.6) when I try to install Bioperl-db ( biosql-1.0.1):
>
> t/01dbadaptor.....1/23
> #   Failed test in t/01dbadaptor.t at line 44.
> #          got: undef
> #     expected: ''
> # Looks like you failed 1 test of 23.
> t/01dbadaptor..... Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/23 subtests
> t/02species.......ok
> t/03simpleseq.....ok
> t/04swiss.........ok
> t/05seqfeature....ok
> t/06comment.......ok
> t/07dblink........ok
> t/08genbank.......ok
> t/09fuzzy2........5/23
> #   Failed (TODO) test in t/09fuzzy2.t at line 64.
> #          got: undef
> #     expected: 'Q9QYG8'
> t/09fuzzy2........ok
> t/10ensembl.......ok
> t/11locuslink.....ok
> t/12ontology......ok
> t/13remove........ok
> t/14query.........ok
> t/15cluster.......ok
> t/16obda..........ok
>
> Test Summary Report
> -------------------
> t/01dbadaptor (Wstat: 256 Tests: 23 Failed: 1)
>  Failed test:  16
>  Non-zero exit status: 1
> Files=16, Tests=1479, 15 wallclock secs ( 0.27 usr  0.10 sys + 11.15  
> cusr  1.11 csys = 12.63 CPU)
> Result: FAIL
> Failed 1/16 test programs. 1/1479 subtests failed.
>
> -- --
>
> Johann Pellet
> IE Bioinformatique
> INSERM U851, I-MAP CERVI
> 21, Avenue Tony Garnier
> 69365 Lyon cedex 07 France
>


From michael.watson at bbsrc.ac.uk  Thu Jan 29 09:41:05 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 29 Jan 2009 14:41:05 -0000
Subject: [BioSQL-l] Web front-ends to BioSQL
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>

Hi

I am thinking about a project involving storage of sequences in a
relational DB and of course thought of BioSQL - but I wondered if anyone
has written a very quick and simple front end to the database
(submission and searching) in something like CGI, mod_perl or PHP?

Thanks
Mick

Head of Informatics
Institute for Animal Health
Compton
Berks
RG20 7NN
01635 578411 

http://www.iah.ac.uk/research/bioinformatics/bioinf.shtml

The information contained in this message may be confidential or legally
privileged and is intended solely for the addressee. 
If you have received this message in error please delete it & notify the
originator immediately.
Unauthorised use, disclosure, copying or alteration of this message is
forbidden & may be unlawful. 
The contents of this e-mail are the views of the sender and do not
necessarily represent the views of the Institute. 
This email and associated attachments has been checked locally for
viruses but we can accept no responsibility once it has left our
systems.
Communications on Institute computers are monitored to secure the
effective operation of the systems and for other lawful purposes. 


From cjfields at illinois.edu  Thu Jan 29 09:54:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 08:54:46 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>

Gbrowse, maybe?  There is a BioSQL plugin for it (Bio::DB::Das::BioSQL):

http://gmod.org/wiki/GBrowse#About_Databases

chris

On Jan 29, 2009, at 8:41 AM, michael watson (IAH-C) wrote:

> Hi
>
> I am thinking about a project involving storage of sequences in a
> relational DB and of course thought of BioSQL - but I wondered if  
> anyone
> has written a very quick and simple front end to the database
> (submission and searching) in something like CGI, mod_perl or PHP?
>
> Thanks
> Mick
>
> Head of Informatics
> Institute for Animal Health
> Compton
> Berks
> RG20 7NN
> 01635 578411
>
> http://www.iah.ac.uk/research/bioinformatics/bioinf.shtml
>


From holland at eaglegenomics.com  Thu Jan 29 11:10:42 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Thu, 29 Jan 2009 16:10:42 +0000
Subject: [BioSQL-l] Eagle Genomics is hiring
Message-ID: <4981D502.1000905@eaglegenomics.com>

Hi all,

Apologies if this is inappropriate for the list, but I thought it would
be a good way to reach the kind of people we're looking for.

Richard

=====

Senior Bioinformatics Software Developer
Eagle Genomics Ltd., Cambridge, UK
http://www.eaglegenomics.com/

We are a young and exciting bioinformatics company looking to
revolutionise the way in which industry and academia work together. We
are based at the heart of Europe's largest biotech cluster in Cambridge,
UK. As we expand our client base, we're looking to build a talented and
committed team of experts. We are currently looking for a software
developer to work on a wide range of complex projects, and who is happy
to work face-to-face with our customers. Ideally you will have had
substantial prior experience working in a life science company or
research institute, however we will also consider graduates with a track
record in bioinformatics.

In addition to your superb technical skills, you will also:
* have the ability to quickly translate scientific problems into real
software solutions,
* be able to put technical concepts into simple language for end users
to understand,
* be able to pick up new skills and techniques in record time,
* work well in a collaborative team environment,
* be creative, innovative, and forward-thinking.

You will have hands-on experience in some of the following:
* Java,
* Perl,
* SQL query design,
* Relational database schema design,
* Open-source bioinformatics toolkits such as BioJava, BioPerl, BioSQL,
etc.,
* Ensembl,
* BioMart,
* DAS,
* Taverna,
* Oracle Life Sciences Platform,
* Oracle database administration,
* MySQL database administration,
* VMware virtual machines,
* Grid computing and parallelisation.

The preferred candidate will be able to work from our offices in
Cambridge, but we would also consider telecommuting arrangements.

We offer a competitive salary and a range of company benefits.

To apply, please send your CV and cover letter as PDF documents to
jobs at eaglegenomics.com. If you have any questions about the position or
would like to discuss it further before applying, please use the same
email address. We are only able to offer positions to EEA citizens and
permanent residents, or Tier 1 migrants under the new UK points-based
immigration scheme.

Individual contracting arrangements could be considered but we will
prefer those candidates who can work with us as employees. No agencies
please.

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From jimp at compbio.dundee.ac.uk  Thu Jan 29 12:44:12 2009
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Thu, 29 Jan 2009 17:44:12 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
Message-ID: <4981EAEC.4070508@compbio.dundee.ac.uk>


Chris Fields wrote:
> Gbrowse, maybe?  There is a BioSQL plugin for it (Bio::DB::Das::BioSQL):
> 
> http://gmod.org/wiki/GBrowse#About_Databases
I'm also in the market for a quick and easy front end - from what I've
heard from a colleague, GBrowse can be tricky to install. Also - for my
application we'd like to easily gather sets of proteins and then explore
their annotation. This is a little out of the scope of GBrowse.

I think there might be a niche needing filling here - would anyone be
interested in pooling code/resources ?

Jim.

-- 
-------------------------------------------------------------------
J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.

From raoul.bonnal at itb.cnr.it  Thu Jan 29 10:06:37 2009
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Thu, 29 Jan 2009 16:06:37 +0100
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <200901291606.37472.raoul.bonnal@itb.cnr.it>

On Thursday 29 January 2009 15:41:05 michael watson (IAH-C) wrote:
> Hi
>
> I am thinking about a project involving storage of sequences in a
> relational DB and of course thought of BioSQL - but I wondered if anyone
> has written a very quick and simple front end to the database
> (submission and searching) in something like CGI, mod_perl or PHP?

I did some tests with ActiveRecord + Rails, and DataMapper + Merb, using
Ruby. The difficulty with those ORMs is that the BioSQL schema doesn't
follow their naming conventions.
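
To make the mismatch concrete, here is a rough sketch (in Python rather than Ruby, purely for illustration). The function names are made up for this example, and the pluralisation rule is a deliberately naive stand-in for what a convention-driven ORM does. The BioSQL side reflects the schema's singular table names and `<table>_id` primary keys for the two tables shown.

```python
# Hypothetical helpers, not part of any ORM or of BioSQL itself.

def rails_style_defaults(model_name: str) -> dict:
    """Guess table and key names the way a convention-driven ORM might.

    Uses a naive pluralisation rule (y -> ies, otherwise append s).
    """
    table = model_name.lower()
    if table.endswith("y"):
        table = table[:-1] + "ies"
    else:
        table = table + "s"
    return {"table": table, "primary_key": "id"}


def biosql_names(model_name: str) -> dict:
    """Names used by BioSQL for these tables: singular table, <table>_id key."""
    table = model_name.lower()
    return {"table": table, "primary_key": f"{table}_id"}


for model in ("Bioentry", "Biodatabase"):
    guessed = rails_style_defaults(model)
    actual = biosql_names(model)
    print(f"{model}: ORM expects {guessed}, schema has {actual}")
```

In practice this means every model needs its table name and primary key overridden by hand, which removes much of the convenience such ORMs normally offer.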

--
Ra


From gthorisson at gmail.com  Thu Jan 29 13:29:08 2009
From: gthorisson at gmail.com (Gudmundur A. Thorisson)
Date: Thu, 29 Jan 2009 18:29:08 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <4981EAEC.4070508@compbio.dundee.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
Message-ID: <50326857-0614-4B43-909A-466403669E52@gmail.com>

Jim. If a Java web-app would be acceptable as the platform for this,
there is something called Molgenis developed by a group in the
Netherlands that we are collaborating with. It's a Java-based
code-generation framework used by several mouse genomics groups for
microarray data and the like, and is under consideration by ourselves
for use in our project:

http://molgenis.sourceforge.net

We were thinking of mixing this in with BioSQL/BioJava for certain  
management & curation tasks. Here's a couple of papers if you care to  
have a closer look:

Smedley et al. Solutions for data integration in functional genomics:  
a critical assessment and case study. Brief Bioinformatics (2008) vol.  
9 (6) pp. 532-44
Swertz et al. Beyond standardization: dynamic software infrastructures  
for systems biology. Nat Rev Genet (2007) vol. 8 (3) pp. 235-43

Best regards ,


              Mummi, Leicester
-----------------------------------------------------------
  Gudmundur A. Thorisson, PhD student,  Brookes lab
  Department of Genetics
  University of Leicester
  University Road
  Leicester, LE1 7RH, UK
  E-mail: gthorisson at gmail.com
  Tel: +44 (0)116 229 7273


On 29 Jan 2009, at 17:44, James Procter wrote:

>
> Chris Fields wrote:
>> Gbrowse, maybe?  There is a BioSQL plugin for it  
>> (Bio::DB::Das::BioSQL):
>>
>> http://gmod.org/wiki/GBrowse#About_Databases
> I'm also in the market for a quick and easy front end - from what I've
> heard from a colleague, GBrowse can be tricky to install. Also - for  
> my
> application we'd like to easily gather sets of proteins and then  
> explore
> their annotation. This is a little out of the scope of GBrowse.
>
> I think there might be a niche needing filling here - would anyone be
> interested in pooling code/resources ?
>
> Jim.
>
> -- 
> -------------------------------------------------------------------
> J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
> Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
> The University of Dundee is a Scottish Registered Charity, No.  
> SC015096.
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l


From cjfields at illinois.edu  Thu Jan 29 13:45:05 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 12:45:05 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <4981EAEC.4070508@compbio.dundee.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
Message-ID: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>


On Jan 29, 2009, at 11:44 AM, James Procter wrote:

>
> Chris Fields wrote:
>> Gbrowse, maybe?  There is a BioSQL plugin for it  
>> (Bio::DB::Das::BioSQL):
>>
>> http://gmod.org/wiki/GBrowse#About_Databases
> I'm also in the market for a quick and easy front end - from what I've
> heard from a colleague, GBrowse can be tricky to install. Also - for  
> my
> application we'd like to easily gather sets of proteins and then  
> explore
> their annotation. This is a little out of the scope of GBrowse.

I don't find Gbrowse itself tricky as much as getting BioPerl  
installed.  One can use Gbrowse for what you want but there are  
probably better resources (Ensembl, maybe).

chris

> I think there might be a niche needing filling here - would anyone be
> interested in pooling code/resources ?
>
> Jim.
>
> -- 
> -------------------------------------------------------------------
> J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
> Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
> The University of Dundee is a Scottish Registered Charity, No.  
> SC015096.


From mark.schreiber at novartis.com  Thu Jan 29 21:51:34 2009
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 30 Jan 2009 10:51:34 +0800
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
Message-ID: <OF64095EEE.D0ADAFA1-ON4825754E.000EE8F6-4825754E.000FB4C3@ah.novartis.com>

Hi -

I have generated an EJB 3 binding to BioSQL, partly automatically and
partly by hand, that can be used with JPA. Notably this uses the new EJB
model, not the nasty old one, so it is very easy to use. As all EJBs are
now plain old Java beans, it is also very easy to use these objects in web
services and JSP pages (maybe PHP too??).

Also, because EJBs and JPA are now more flexible, you don't need a full
Java app container (JBoss, Glassfish) but can instead use them in
standalone programs, although with a container you do get other benefits
such as transaction control, security and load balancing for free.  Also,
if you do use a web interface, the web front end will probably be in Tomcat
and you can use this as a light container for talking to the BioSQL entity
beans. If you think there will be more than a few users I would probably
advocate using Glassfish or a similar app server, because there are many
advantages that outweigh the relatively small overhead.

The EJB binding is not part of BioJava but is a candidate for inclusion in
BioJava3.  I can provide you with code if you are interested. I would also
be keen to see this get some use.

Best regards,

- Mark

biosql-l-bounces at lists.open-bio.org wrote on 01/30/2009 02:45:05 AM:

> 
> On Jan 29, 2009, at 11:44 AM, James Procter wrote:
> 
> >
> > Chris Fields wrote:
> >> Gbrowse, maybe?  There is a BioSQL plugin for it 
> >> (Bio::DB::Das::BioSQL):
> >>
> >> http://gmod.org/wiki/GBrowse#About_Databases
> > I'm also in the market for a quick and easy front end - from what I've
> > heard from a colleague, GBrowse can be tricky to install. Also - for 
> > my
> > application we'd like to easily gather sets of proteins and then 
> > explore
> > their annotation. This is a little out of the scope of GBrowse.
> 
> I don't find Gbrowse itself tricky as much as getting BioPerl 
> installed.  One can use Gbrowse for what you want but there are 
> probably better resources (Ensembl, maybe).
> 
> chris
> 
> > I think there might be a niche needing filling here - would anyone be
> > interested in pooling code/resources ?
> >
> > Jim.
> >
> > -- 
> > -------------------------------------------------------------------
> > J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
> > Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
> > The University of Dundee is a Scottish Registered Charity, No. 
> > SC015096.


From michael.watson at bbsrc.ac.uk  Fri Jan 30 06:03:12 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 30 Jan 2009 11:03:12 -0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>

Dear All

Thank you for the responses.  I think it is clear there is a need: all
over the world there are groups of various sizes who try to collate and
curate sequences for their organism of choice, from fish virus databases
with 200 records to flu databases with many thousands.  I'm in contact
with a tiny percentage of these groups, and there is a clear need for:

- common DB schema (tick, we can use BioSQL)
- Web app for:
	- submitting new sequences
	- curating and editing sequences
	- comparing sequences - align, draw trees etc
	- showing sequences on maps (i.e. location of sample)
	- submitting sequences to GenBank
	- retrieving sequences from GenBank

With all of the Bio* projects, this shouldn't be too hard to do, but as
ever it needs bodies to do it... I took a quick look at Galaxy but that
isn't really what was needed.
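
As a very rough illustration of the "submit" and "search" items above, here is a minimal Python sketch using SQLite. The three tables are a heavily cut-down subset of BioSQL (the real schema has more columns and constraints), and the function names, namespace and accession are placeholders invented for this example.

```python
import sqlite3

# Simplified subset of BioSQL: namespace, entry and sequence tables only.
SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name           TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id    INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase(biodatabase_id),
    name           TEXT NOT NULL,
    accession      TEXT NOT NULL,
    description    TEXT,
    version        INTEGER NOT NULL DEFAULT 0,
    UNIQUE (accession, biodatabase_id, version)
);
CREATE TABLE biosequence (
    bioentry_id INTEGER PRIMARY KEY REFERENCES bioentry(bioentry_id),
    length      INTEGER,
    alphabet    TEXT,
    seq         TEXT
);
"""


def open_store():
    """Create an in-memory database holding the simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def submit_sequence(conn, namespace, accession, name, description, seq,
                    alphabet="dna"):
    """Insert one sequence record and return its bioentry_id."""
    with conn:  # commit on success, roll back on error
        conn.execute("INSERT OR IGNORE INTO biodatabase (name) VALUES (?)",
                     (namespace,))
        (db_id,) = conn.execute(
            "SELECT biodatabase_id FROM biodatabase WHERE name = ?",
            (namespace,)).fetchone()
        cur = conn.execute(
            "INSERT INTO bioentry (biodatabase_id, name, accession, description) "
            "VALUES (?, ?, ?, ?)",
            (db_id, name, accession, description))
        entry_id = cur.lastrowid
        conn.execute(
            "INSERT INTO biosequence (bioentry_id, length, alphabet, seq) "
            "VALUES (?, ?, ?, ?)",
            (entry_id, len(seq), alphabet, seq))
    return entry_id


def search(conn, term):
    """Return (accession, name, length) rows matching accession or description."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT e.accession, e.name, s.length "
        "FROM bioentry e JOIN biosequence s USING (bioentry_id) "
        "WHERE e.accession LIKE ? OR e.description LIKE ? "
        "ORDER BY e.accession",
        (pattern, pattern)).fetchall()
```

A web form handler would call `submit_sequence` for new records and `search` for the lookup page. Curation, alignment, maps and GenBank exchange would each need considerably more than this.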

Thanks again

Mick

-----Original Message-----
From: biosql-l-bounces at lists.open-bio.org
[mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: 29 January 2009 18:45
To: James Procter
Cc: biosql-l at lists.open-bio.org
Subject: Re: [BioSQL-l] Web front-ends to BioSQL


On Jan 29, 2009, at 11:44 AM, James Procter wrote:

>
> Chris Fields wrote:
>> Gbrowse, maybe?  There is a BioSQL plugin for it  
>> (Bio::DB::Das::BioSQL):
>>
>> http://gmod.org/wiki/GBrowse#About_Databases
> I'm also in the market for a quick and easy front end - from what I've
> heard from a colleague, GBrowse can be tricky to install. Also - for  
> my
> application we'd like to easily gather sets of proteins and then  
> explore
> their annotation. This is a little out of the scope of GBrowse.

I don't find Gbrowse itself tricky as much as getting BioPerl  
installed.  One can use Gbrowse for what you want but there are  
probably better resources (Ensembl, maybe).

chris

> I think there might be a niche needing filling here - would anyone be
> interested in pooling code/resources ?
>
> Jim.
>
> -- 
> -------------------------------------------------------------------
> J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
> Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
> The University of Dundee is a Scottish Registered Charity, No.  
> SC015096.


From hlapp at gmx.net  Fri Jan 30 10:23:24 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 30 Jan 2009 10:23:24 -0500
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>

Having such a webapp would be pretty cool, and I agree with the  
argument below that there are numerous small groups or individuals  
with this need. (we have some ourselves here ...)

One word of caution: a place to look for lessons, I think, is the
infamous GMOD gene page and standard web front-end, which has been
labored on in various incarnations for more than half a decade
without producing a compelling and broadly adopted result. People's
needs and technology obsessions vary from place to place.

One possibly hugely complicating factor for the GMOD web front-end was  
that the target audience were model organism websites, which  
themselves have a large and diverse stakeholder community, so  
flexibility and configurability became overriding requirements  
resulting in bloat of code stacks and features.

My personal take is that for this to be broadly useful, the primary  
target audience should probably be programmers, or programming-savvy  
scientists, who can extend and customize a core application at will.  
In other words, much in line with the philosophy behind the Bio*  
libraries.

Other than that, keep it simple so I don't have to learn yet another
language (namely your templating or clever XML configuration scheme)
to extend it. I sat next to Mark when he generated a bare-bones
BioSQL binding in EJB literally in minutes, which I thought was cool.
People rave about Ruby and RoR too for ease of getting started. By far
most people out there will be familiar with Perl, but I'm not sure
what the web application framework would be there that would put me at
ease. In the end what may count more than anything else is critical
mass, even if it's not everyone's darling language.

My $0.02, and I'd be keen to see what comes out of this. If there's
something I can do to tip the balance towards something tangible
happening, let me know.

	-hilmar

On Jan 30, 2009, at 6:03 AM, michael watson (IAH-C) wrote:

> Dear All
>
> Thank you for the responses.  I think it is clear there is a need -  
> all
> over the World there are groups of various sizes who try to collate  
> and
> curate sequences for their organism of choice, from fish virus  
> databases
> with 200 records, to flu databases with many thousands.  I'm in  
> contact
> with a tiny percentage of these groups, and there is a clear need for:
>
> - common DB schema (tick, we can use BioSQL)
> - Web app for:
> 	- submitting new sequences
> 	- curating and editing sequences
> 	- comparing sequences - align, draw trees etc
> 	- showing sequences on maps (i.e. location of sample)
> 	- submitting sequences to GenBank
> 	- retrieving sequences from GenBank
>
> With all of the Bio* projects, this shouldn't be too hard to do, but  
> as
> ever it needs bodies to do it... I took a quick look at Galaxy but  
> that
> isn't really what was needed.
>
> Thanks again
>
> Mick
>
> -----Original Message-----
> From: biosql-l-bounces at lists.open-bio.org
> [mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: 29 January 2009 18:45
> To: James Procter
> Cc: biosql-l at lists.open-bio.org
> Subject: Re: [BioSQL-l] Web front-ends to BioSQL
>
>
> On Jan 29, 2009, at 11:44 AM, James Procter wrote:
>
>>
>> Chris Fields wrote:
>>> Gbrowse, maybe?  There is a BioSQL plugin for it
>>> (Bio::DB::Das::BioSQL):
>>>
>>> http://gmod.org/wiki/GBrowse#About_Databases
>> I'm also in the market for a quick and easy front end - from what  
>> I've
>> heard from a colleague, GBrowse can be tricky to install. Also - for
>> my
>> application we'd like to easily gather sets of proteins and then
>> explore
>> their annotation. This is a little out of the scope of GBrowse.
>
> I don't find Gbrowse itself tricky as much as getting BioPerl
> installed.  One can use Gbrowse for what you want but there are
> probably better resources (Ensembl, maybe).
>
> chris
>
>> I think there might be a niche needing filling here - would anyone be
>> interested in pooling code/resources ?
>>
>> Jim.
>>
>> -- 
>> -------------------------------------------------------------------
>> J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
>> Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
>> The University of Dundee is a Scottish Registered Charity, No.
>> SC015096.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Fri Jan 30 14:45:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 30 Jan 2009 13:45:30 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
Message-ID: <5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>


On Jan 30, 2009, at 9:23 AM, Hilmar Lapp wrote:

> Having such a webapp would be pretty cool, and I agree with the  
> argument below that there are numerous small groups or individuals  
> with this need. (we have some ourselves here ...)
>
> One word of caution as to where to look for lessons I think is the  
> infamous GMOD gene page and standard web front-end, which has been  
> labored on in various incarnations for more than half a decade,  
> without producing a compelling and broadly adopted result. People's  
> needs and technology obsessions vary from place to place.
>
> One possibly hugely complicating factor for the GMOD web front-end  
> was that the target audience were model organism websites, which  
> themselves have a large and diverse stakeholder community, so  
> flexibility and configurability became overriding requirements  
> resulting in bloat of code stacks and features.
>
> My personal take is that for this to be broadly useful, the primary  
> target audience should probably be programmers, or programming-savvy  
> scientists, who can extend and customize a core application at will.  
> In other words, much in line with the philosophy behind the Bio*  
> libraries.
>
> Other than that, keep it simple so I don't have to learn yet another  
> (namely your templating or clever XML configuration scheme) language  
> to extend it. I sat next to Mark when he generated a bare-bones  
> BioSQL-binding in EJB literally in minutes, which I thought was  
> cool. People rave about Ruby and RoR too as for ease of getting  
> started. By far the most people out there will be familiar with  
> Perl, but I'm not sure what the web application framework would be  
> there that would put me at ease. In the end what may count more than  
> anything else is critical mass even if it's not everyone's darling  
> language.

Perl web application frameworks: Catalyst and Jifty (I have not tried
them myself).  RoR gets a lot of press, but I understand the RoR devs
tend not to listen to the core Ruby devs and (as a consequence)
recently ran into issues with the Ruby 1.8.7 release, detailed by the
always-entertaining chromatic here:

http://use.perl.org/~chromatic/journal/37125

chris

> My $0.02, and I'd be keen so see what comes out of this. If there's  
> something I can do to tip the balance towards something tangible  
> happening, let me know.
>
> 	-hilmar

From gthorisson at gmail.com  Fri Jan 30 14:57:42 2009
From: gthorisson at gmail.com (Gudmundur A. Thorisson)
Date: Fri, 30 Jan 2009 19:57:42 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
Message-ID: <D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>

We use the Catalyst MVC framework for our project
(http://www.hgvbaseg2p.org). Very good stuff; we combine it with the
DBIx::Class ORM and Template Toolkit as the templating engine. Totally
recommended.


                 Mummi

On 30 Jan 2009, at 19:45, Chris Fields wrote:
>>
>
> Perl web application framework: Catalyst and Jifty (have not tried  
> them myself).  RoR gets a lot of press, but I understand the RoR  
> devs tend not to listen to the core ruby devs and (as a consequence)  
> had recently run into issues with the 1.8.7 ruby release, detailed  
> by the always-entertaining chromatic here:
>
> http://use.perl.org/~chromatic/journal/37125
>
> chris
>
>> My $0.02, and I'd be keen so see what comes out of this. If there's  
>> something I can do to tip the balance towards something tangible  
>> happening, let me know.
>>
>> 	-hilmar


From cjfields at illinois.edu  Fri Jan 30 15:08:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 30 Jan 2009 14:08:11 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
	<D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
Message-ID: <99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>

Another article (as pointed out by Heikki on bioperl-l):

http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0

The last section is all on MVC-oriented frameworks.

chris

On Jan 30, 2009, at 1:57 PM, Gudmundur A. Thorisson wrote:

> We use Catalyst MVC framework for our project (http://www.hgvbaseg2p.org).
> Very good stuff, we combine it with the DBIx::Class ORM and
> Template Toolkit as the templating engine. Totally recommended.
>
>
>                Mummi
>
> On 30 Jan 2009, at 19:45, Chris Fields wrote:
>>>
>>
>> Perl web application framework: Catalyst and Jifty (have not tried  
>> them myself).  RoR gets a lot of press, but I understand the RoR  
>> devs tend not to listen to the core ruby devs and (as a  
>> consequence) had recently run into issues with the 1.8.7 ruby  
>> release, detailed by the always-entertaining chromatic here:
>>
>> http://use.perl.org/~chromatic/journal/37125
>>
>> chris
>>
>>> My $0.02, and I'd be keen so see what comes out of this. If  
>>> there's something I can do to tip the balance towards something  
>>> tangible happening, let me know.
>>>
>>> 	-hilmar


From markjschreiber at gmail.com  Sat Jan 31 06:03:53 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sat, 31 Jan 2009 19:03:53 +0800
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
	<D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
	<99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>
Message-ID: <93b45ca50901310303t37905e8ak3819c05f4b94c287@mail.gmail.com>

Hi -

My feeling is that the diversity of languages, and of frameworks within
languages, means that a generic web front end to BioSQL will not, and
should not, materialize. What would be a lot more sensible is a
generic API in the form of a webservice or collection of webservices
that could be used by (theoretically) any web framework to generate a
website.

User preferences and requirements will be far too diverse for a
generic web front end.
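
To sketch what such a webservice might look like, here is a small Python example using only the standard library. It is an illustration under assumptions, not a proposed design: the `/sequence` resource, the JSON field names, the helper names and the placeholder record are all invented, and the two tables are a minimal stand-in for the BioSQL `bioentry` and `biosequence` tables.

```python
import json
import sqlite3
from urllib.parse import parse_qs


def _respond(start_response, status, payload):
    """Send a JSON payload with the given HTTP status line."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def make_app(conn):
    """Return a WSGI app answering GET /sequence?accession=... with JSON."""
    def app(environ, start_response):
        if environ.get("PATH_INFO") != "/sequence":
            return _respond(start_response, "404 Not Found",
                            {"error": "unknown resource"})
        params = parse_qs(environ.get("QUERY_STRING", ""))
        accession = params.get("accession", [""])[0]
        row = conn.execute(
            "SELECT e.accession, e.description, s.seq "
            "FROM bioentry e JOIN biosequence s USING (bioentry_id) "
            "WHERE e.accession = ?", (accession,)).fetchone()
        if row is None:
            return _respond(start_response, "404 Not Found",
                            {"error": "no such accession"})
        return _respond(start_response, "200 OK",
                        {"accession": row[0], "description": row[1],
                         "seq": row[2]})
    return app


def demo_connection():
    """In-memory stand-in for a BioSQL database, with one placeholder record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY,
                               accession TEXT, description TEXT);
        CREATE TABLE biosequence (bioentry_id INTEGER PRIMARY KEY, seq TEXT);
        INSERT INTO bioentry VALUES (1, 'TEST001', 'placeholder record');
        INSERT INTO biosequence VALUES (1, 'ACGT');
    """)
    return conn
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server("localhost", 8000, make_app(demo_connection())).serve_forever()`. Any front end, in whatever language or framework, then only needs to speak HTTP and JSON.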

- Mark

On 1/31/09, Chris Fields <cjfields at illinois.edu> wrote:
> Another article (as pointed out by Heikki on bioperl-l):
>
> http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0
>
> The last section is all on MVC-oriented frameworks.
>
> chris
>
> On Jan 30, 2009, at 1:57 PM, Gudmundur A. Thorisson wrote:
>
>> We use Catalyst MVC framework for our project (http://www.hgvbaseg2p.org).
>> Very good stuff, we combine it with the DBIx::Class ORM and
>> Template Toolkit as the templating engine. Totally recommended.
>>
>>
>>                Mummi
>>
>> On 30 Jan 2009, at 19:45, Chris Fields wrote:
>>>>
>>>
>>> Perl web application framework: Catalyst and Jifty (have not tried
>>> them myself).  RoR gets a lot of press, but I understand the RoR
>>> devs tend not to listen to the core ruby devs and (as a
>>> consequence) had recently run into issues with the 1.8.7 ruby
>>> release, detailed by the always-entertaining chromatic here:
>>>
>>> http://use.perl.org/~chromatic/journal/37125
>>>
>>> chris
>>>
>>>> My $0.02, and I'd be keen so see what comes out of this. If
>>>> there's something I can do to tip the balance towards something
>>>> tangible happening, let me know.
>>>>
>>>> 	-hilmar

From hlapp at gmx.net  Wed Jan 28 05:09:04 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 28 Jan 2009 00:09:04 -0500
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <497F9177.7040309@eaglegenomics.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
Message-ID: <72E5157F-02BC-40F6-A59D-3E887A5207C8@gmx.net>

The loader for BioPerl is load_seqdatabase.pl, which is part of  
bioperl-db. With machines current as of 3-4 years ago, I saw upload  
speeds of between 5 and 15 sequences per second for richly annotated  
sequences (human/mouse RefSeqs).
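
To put those rates in perspective, a rough back-of-the-envelope estimate
can be sketched as below. The record count is a hypothetical round
number picked for easy arithmetic, not a measured GenBank or RefSeq
size, and the function name is made up for this sketch.

```python
def estimated_load_hours(n_records, records_per_second):
    """Return the wall-clock hours needed to load n_records at a steady rate."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return n_records / records_per_second / 3600.0


# Hypothetical collection size, chosen only to make the arithmetic easy to follow.
N_RECORDS = 1_000_000

for rate in (5, 15):  # the observed range, in sequences per second
    hours = estimated_load_hours(N_RECORDS, rate)
    print(f"{rate:>2} seq/s -> {hours:.1f} hours")
```

The estimate assumes a constant rate on a single process; sparsely
annotated records would likely load faster than this range suggests.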

If you are talking about all of GenBank, the vast majority of that will  
be ESTs and sequencing reads (do you really want to load those?),  
which are typically sparsely annotated if at all, and so should be  
faster. mRNA and cDNA sequences will be more in the above range.

I have never loaded all of GenBank into a database (and I'm not sure  
why anyone would want to do this) and so don't have a comparison  
figure for the total for that.

Finally, several instances of load_seqdatabase.pl can be nicely run in  
parallel on multi-core machines.
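
One way to set up such parallel runs is to split the input files into
one batch per core and start one loader process per batch. The sketch
below shows only the batching and command construction. The file names
are hypothetical, it assumes the loader takes its input files as
trailing arguments, and the connection options are left for the caller
to supply rather than guessed here.

```python
import subprocess


def partition(files, n_workers):
    """Deal files round-robin into n_workers batches, dropping empty batches."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    batches = [files[i::n_workers] for i in range(n_workers)]
    return [batch for batch in batches if batch]


def build_commands(base_command, files, n_workers):
    """Return one argument list per worker: base_command followed by its batch."""
    return [list(base_command) + batch for batch in partition(files, n_workers)]


def run_in_parallel(commands):
    """Start every command at once, wait for all, and return their exit codes."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in processes]


if __name__ == "__main__":
    # Hypothetical input files; replace with real GenBank flat files.
    files = [f"chunk{i}.gbk" for i in range(1, 6)]
    # base_command is an assumption: add your own connection options here.
    base_command = ["perl", "load_seqdatabase.pl"]
    for cmd in build_commands(base_command, files, n_workers=2):
        print(" ".join(cmd))
```

Round-robin batching keeps the batches similar in size when the files
are similar; very uneven file sizes would call for a work queue instead.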

	-hilmar

On Jan 27, 2009, at 5:57 PM, Richard Holland wrote:

> It would depend on the toolkit you use. BioWarehouse is a complete  
> API,
> whereas BioSQL is just a schema and the way in which it is populated
> (and therefore how long that takes) depends on your toolkit.
>
> Currently I'm aware of loaders existing for BioJava, BioPerl, and
> possibly also BioPython. However each of them load the same data in
> subtly different ways, so can't be directly compared in terms of which
> one is faster than the other.
>
> I vaguely remember seeing some performance figures for the
> BioJava/Genbank/BioSQL combination somewhere, but it's been a while!  
> I'm
> not sure where they were documented though - I certainly haven't got
> them written down anywhere. Mark Schreiber might know as he definitely
> did some testing of this - Mark, can you remember what the figures  
> were
> for BioJava?
>
> As for BioPerl/BioPython/etc. I expect their respective project  
> authors
> will respond to this thread accordingly with the figures from their  
> own
> domains!
>
> cheers,
> Richard
>
> gwu wrote:
>> Hi Everyone,
>>
>> I recently visited the BioWarehouse web site and the document shows
>> loading the whole Genbank into their database takes the data loader  
>> 68
>> hours for MySQL, and 27.5 hours for Oracle. So I wonder if there is a
>> similar test done with BioSQL?
>>
>> Gang Wu
>
> -- 
> Richard Holland, BSc MBCS
> Finance Director, Eagle Genomics Ltd
> M: +44 7500 438846 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Wed Jan 28 11:50:50 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 11:50:50 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <497F9177.7040309@eaglegenomics.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
Message-ID: <320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>

On Tue, Jan 27, 2009 at 10:57 PM, Richard Holland wrote:
>
> As for BioPerl/BioPython/etc. I expect their respective project authors
> will respond to this thread accordingly with the figures from their own
> domains!

I can tell you importing GenBank files into BioSQL with Biopython is
faster than BioPerl, sometimes several times faster, but this will
depend on the nature of the files (e.g. genomes versus ESTs).
http://lists.open-bio.org/pipermail/biosql-l/2008-August/001320.html
http://lists.open-bio.org/pipermail/biopython-dev/2008-April/003625.html

I don't have any BioJava comparison figures.  In any case, as Richard
points out, there will be slight differences between the different Bio*
tools in exactly how the data is parsed and stored.

I've never tried to import the whole of GenBank, so I don't have any
numbers for you there.

Peter
(Biopython)


From biopython at maubp.freeserve.co.uk  Wed Jan 28 16:40:55 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 16:40:55 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
Message-ID: <320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>

On Wed, Jan 28, 2009 at 4:29 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I don't think sequence loading via load_seqdatabase.pl uses BioPerl.  If one
> uses BioPerl and bioperl-db the following can explain at least some of the
> reason why loading is slow:
> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
> We also go through the extra hand-wringing with Bio::Species objects
> (something I don't think the other Bio* worry about).

Looking at the source code for the load_seqdatabase.pl script included
with bioperl-db, my impression is it uses Bio::DB::BioDB to talk to
the database, and Bio::SeqIO to parse the input sequence files (in
this case, Bio::SeqIO::genbank is used).  See:

http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-db/trunk/scripts/biosql/load_seqdatabase.pl

> Regardless, it's not an easy problem to work around.  There are such things
> as Moose, and Perl6 is now in alpha...

I'll take your word for it - I'm in no position to improve anyone's Perl code ;)

Peter


From cjfields at illinois.edu  Wed Jan 28 16:29:50 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 10:29:50 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
Message-ID: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>


On Jan 28, 2009, at 5:50 AM, Peter wrote:

> On Tue, Jan 27, 2009 at 10:57 PM, Richard Holland wrote:
>>
>> As for BioPerl/BioPython/etc. I expect their respective project  
>> authors
>> will respond to this thread accordingly with the figures from their  
>> own
>> domains!
>
> I can tell you importing GenBank files into BioSQL with Biopython is
> faster than BioPerl, sometimes several times faster, but this will
> depend on the nature of the files (e.g. genomes versus ESTs).
> http://lists.open-bio.org/pipermail/biosql-l/2008-August/001320.html
> http://lists.open-bio.org/pipermail/biopython-dev/2008-April/003625.html

I don't think sequence loading via load_seqdatabase.pl uses BioPerl.   
If one uses BioPerl and bioperl-db the following can explain at least  
some of the reason why loading is slow:

http://www.bioperl.org/wiki/Why_BioPerl_is_slow

We also go through the extra hand-wringing with Bio::Species objects  
(something I don't think the other Bio* worry about).

Regardless, it's not an easy problem to work around.  There are such  
things as Moose, and Perl6 is now in alpha...

chris

> I don't have any BioJava comparison figures.  In any case, as Richard
> points out, there will be slight differences between the different Bio*
> tools in exactly how the data is parsed and stored.
>
> I've never tried to import the whole of GenBank, so I don't have any
> numbers for you there.
>
> Peter
> (Biopython)


From cjfields at illinois.edu  Wed Jan 28 16:53:49 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 10:53:49 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
Message-ID: <37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>


On Jan 28, 2009, at 10:40 AM, Peter wrote:

> On Wed, Jan 28, 2009 at 4:29 PM, Chris Fields  
> <cjfields at illinois.edu> wrote:
>>
>> I don't think sequence loading via load_seqdatabase.pl uses  
>> BioPerl.  If one
>> uses BioPerl and bioperl-db the following can explain at least some  
>> of the
>> reason why loading is slow:
>> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
>> We also go through the extra hand-wringing with Bio::Species objects
>> (something I don't think the other Bio* worry about).
>
> Looking at the source code for the load_seqdatabase.pl script included
> with bioperl-db, my impression is it uses Bio::DB::BioDB to talk to
> the database, and Bio::SeqIO to parse the input sequence files (in
> this case, Bio::SeqIO::genbank is used).  See:
>
> http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-db/trunk/scripts/biosql/load_seqdatabase.pl

My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm  
wondering, though, whether it would be feasible to have a direct  
loader for the most common database formats (GenBank/EMBL/Swiss),  
something similar to the taxonomy loader that doesn't rely on any  
specific Bio* package.

>> Regardless, it's not an easy problem to work around.  There are  
>> such things
>> as Moose, and Perl6 is now in alpha...
>
> I'll take your word for it - I'm in no position to improve anyone's  
> Perl code ;)
>
> Peter

Well, the problem lies with perl5's welded-on OO which isn't easy to  
work around, particularly inheritance issues.  Supposedly Moose helps  
speed things up a bit; it doesn't hurt that it is based somewhat on  
perl6's Objects:

http://feather.perl6.nl/syn/S12.html

chris


From hlapp at gmx.net  Wed Jan 28 17:06:01 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 28 Jan 2009 12:06:01 -0500
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
Message-ID: <0BD2B914-3E57-4266-AE4E-EA8B2F1DD307@gmx.net>


On Jan 28, 2009, at 11:29 AM, Chris Fields wrote:

> I don't think sequence loading via load_seqdatabase.pl uses BioPerl.


It does, actually. All the input parsing is done by BioPerl.
Bioperl-db only does the persistence, and the script itself handles
all the command line options, opens files, yadda yadda ...

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Wed Jan 28 17:17:57 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 17:17:57 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
Message-ID: <320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>

On 1/28/09, Chris Fields <cjfields at illinois.edu> wrote:
>
> My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm
> wondering, though, whether it would be feasible to have a direct loader for
> the most common database formats (GenBank/EMBL/Swiss), something
> similar to the taxonomy loader that doesn't rely on any specific Bio* package.
>

You could re-invent the wheel, and write yet another
GenBank/EMBL/Swiss parser in standalone perl for use within
load_seqdatabase.pl but I really don't see any point to this.  Reusing
the BioPerl parser seems most sensible, especially given that
bioperl-db is an extension to bioperl in the first place - and the
BioPerl parsers already exist and are well tested.

Peter


From cjfields at illinois.edu  Wed Jan 28 17:47:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 11:47:20 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
Message-ID: <D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>

On Jan 28, 2009, at 11:17 AM, Peter wrote:

> On 1/28/09, Chris Fields <cjfields at illinois.edu> wrote:
>>
>> My bad, I'm thinking of the taxonomy loader (need more coffee).  I'm
>> wondering, though, whether it would be feasible to have a direct  
>> loader for
>> the most common database formats (GenBank/EMBL/Swiss), something
>> similar to the taxonomy loader that doesn't rely on any specific  
>> Bio* package.
>>
>
> You could re-invent the wheel, and write yet another
> GenBank/EMBL/Swiss parser in standalone perl for use within
> load_seqdatabase.pl but I really don't see any point to this.  Reusing
> the BioPerl parser seems most sensible, especially given that
> bioperl-db is an extension to bioperl in the first place - and the
> BioPerl parsers already exist and are well tested.
>
> Peter

My point is, instead of first mapping record data to a specific object/ 
class then mapping the object data to the database, bypass the object  
completely and generically map relevant data directly in the database  
according to the BioSQL schema.
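
As a rough illustration of that idea, here is a minimal sketch that
reads a few header fields from a GenBank-style flat file and writes
them straight into tables, with no sequence object in between. The two
tables are simplified stand-ins for BioSQL's biodatabase and bioentry
(the real schema has more columns and constraints), SQLite is used only
to keep the example self-contained, and the function names and sample
record are invented.

```python
import sqlite3

# Simplified stand-ins for two BioSQL tables; the real schema has more
# columns and constraints than this sketch reproduces.
SCHEMA = """
CREATE TABLE biodatabase (
    biodatabase_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL REFERENCES biodatabase(biodatabase_id),
    name TEXT NOT NULL,
    accession TEXT NOT NULL,
    description TEXT
);
"""


def parse_headers(text):
    """Yield a dict of top-level fields for each '//'-terminated record."""
    record, key = {}, None
    for line in text.splitlines():
        if line.startswith("//"):
            if record:
                yield record
            record, key = {}, None
        elif line[:1].isspace() and key:
            # Continuation line: append to the field being read.
            record[key] += " " + line.strip()
        elif line.strip():
            key, _, value = line.partition(" ")
            record[key] = value.strip()


def load(conn, namespace, text):
    """Insert one bioentry row per record, straight from the parsed fields."""
    cur = conn.execute("INSERT INTO biodatabase (name) VALUES (?)", (namespace,))
    db_id = cur.lastrowid
    count = 0
    for rec in parse_headers(text):
        conn.execute(
            "INSERT INTO bioentry (biodatabase_id, name, accession, description)"
            " VALUES (?, ?, ?, ?)",
            (db_id, rec["LOCUS"].split()[0], rec["ACCESSION"].split()[0],
             rec.get("DEFINITION")),
        )
        count += 1
    conn.commit()
    return count


# Made-up record, present only so the sketch can be run as-is.
SAMPLE = """\
LOCUS       TEST0001     12 bp    DNA     linear   SYN 01-JAN-2009
DEFINITION  Made-up record used only
            for this sketch.
ACCESSION   TEST0001
//
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
print(load(conn, "demo", SAMPLE))
print(conn.execute("SELECT name, accession, description FROM bioentry").fetchall())
```

Features, references and taxonomy are where most of the real work
would be; this sketch deliberately stops at the header fields.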

If anything this may force some consistency between the various Bio*  
languages.

chris


From biopython at maubp.freeserve.co.uk  Wed Jan 28 18:18:03 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 28 Jan 2009 18:18:03 +0000
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
	<D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
Message-ID: <320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>

>> You could re-invent the wheel, and write yet another
>> GenBank/EMBL/Swiss parser in standalone perl for use within
>> load_seqdatabase.pl but I really don't see any point to this.  Reusing
>> the BioPerl parser seems most sensible, especially given that
>> bioperl-db is an extension to bioperl in the first place - and the
>> BioPerl parsers already exist and are well tested.
>>
>> Peter
>
> My point is, instead of first mapping record data to a specific object/class
> then mapping the object data to the database, bypass the object completely
> and generically map relevant data directly in the database according to the
> BioSQL schema.
>
> If anything this may force some consistency between the various Bio*
> languages.
>
> chris

Ah - so rather than using BioPerl/Biopython/BioJava to import your
sequence files into a BioSQL database, you'd like BioSQL to come with
its own script that does the job?  It would "solve" any
inconsistencies for getting files of data into the database if this
were the only sanctioned way to add records to the database.
However, there are a number of downsides - in addition to the
considerable extra effort needed to write and support another set of
parsers just for BioSQL (without reusing BioPerl/Biopython/BioJava).

What about BioPerl/Biopython/BioJava users who have sequence-record
objects in memory they want to record in the database?  These could
have been loaded from GenBank files originally and then manipulated
(e.g. adding additional crude annotation from running BLAST).  How
would they get them into the database - write them to a GenBank file
and then invoke the project neutral BioSQL provided script?

I think each project needs their own ORM bindings for both loading
data into and from the database.  Improving any inconsistencies in how
each ends up storing sequence files (e.g. GenBank files) can be worked
on gradually.

[Perhaps I have read more into your comment than you intended - if I
have got the wrong end of the stick, please clarify - thanks]

Still, a project neutral BioSQL bundled script (not depending on any
of BioPerl/Biopython/BioJava) for importing a GenBank file into a
database could serve as a "reference implementation" (the role I
currently assign to BioPerl's load_seqdatabase.pl).  And if this
proves faster than load_seqdatabase.pl that's a nice bonus.

Peter


From cjfields at illinois.edu  Wed Jan 28 18:57:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 28 Jan 2009 12:57:25 -0600
Subject: [BioSQL-l] Genbank loading time
In-Reply-To: <320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>
References: <497F8D3D.5060907@molbio.mgh.harvard.edu>
	<497F9177.7040309@eaglegenomics.com>
	<320fb6e00901280350g363aa41ai7edc8181c606e26e@mail.gmail.com>
	<556F8B66-D407-46C1-A4AF-79469D9814FA@illinois.edu>
	<320fb6e00901280840q796bf5cawf085ad3a7c18bbdd@mail.gmail.com>
	<37CEB7ED-ECD6-4186-BF84-72B704B3A5E8@illinois.edu>
	<320fb6e00901280917q42c39590jf54e0144c0e6bc28@mail.gmail.com>
	<D1285537-8703-49D2-A35C-D51A60839310@illinois.edu>
	<320fb6e00901281018t3148af9exda473c101c15bcc8@mail.gmail.com>
Message-ID: <770D510F-C6EA-455E-B017-766587E1B23F@illinois.edu>


On Jan 28, 2009, at 12:18 PM, Peter wrote:

>>> You could re-invent the wheel, and write yet another
>>> GenBank/EMBL/Swiss parser in standalone perl for use within
>>> load_seqdatabase.pl but I really don't see any point to this.   
>>> Reusing
>>> the BioPerl parser seems most sensible, especially given that
>>> bioperl-db is an extension to bioperl in the first place - and the
>>> BioPerl parsers already exist and are well tested.
>>>
>>> Peter
>>
>> My point is, instead of first mapping record data to a specific  
>> object/class
>> then mapping the object data to the database, bypass the object  
>> completely
>> and generically map relevant data directly in the database  
>> according to the
>> BioSQL schema.
>>
>> If anything this may force some consistency between the various Bio*
>> languages.
>>
>> chris
>
> Ah - so rather than using BioPerl/Biopython/BioJava to import your
> sequence files into a BioSQL database, you'd like BioSQL to come with
> its own script that does the job?  It would "solve" any
> inconsistencies for getting files of data into the database if this
> were the only sanctioned way to add records to the database.
> However, there are a number of downsides - in addition to the
> considerable extra effort needed to write and support another set of
> parsers just for BioSQL (without reusing BioPerl/Biopython/BioJava).
>
> What about BioPerl/Biopython/BioJava users who have sequence-record
> objects in memory they want to record in the database?  These could
> have been loaded from GenBank files originally and then manipulated
> (e.g. adding additional crude annotation from running BLAST).  How
> would they get them into the database - write them to a GenBank file
> and then invoke the project neutral BioSQL provided script?

No, one would use the same adaptors as before (bioperl-db for BioPerl,  
for instance).

> I think each project needs their own ORM bindings for both loading
> data into and from the database.  Improving any inconsistencies in how
> each ends up storing sequence files (e.g. GenBank files) can be worked
> on gradually.
>
> [Perhaps I have read more into your comment than you intended - if I
> have got the wrong end of the stick, please clarify - thanks]
>
> Still, a project neutral BioSQL bundled script (not depending on any
> of BioPerl/Biopython/BioJava) for importing a GenBank file into a
> database could serve as a "reference implementation" (the role I
> currently assign to BioPerl's load_seqdatabase.pl).  And if this
> proves faster than load_seqdatabase.pl that's a nice bonus.
>
> Peter

That's what I'm thinking, essentially; something that is Bio*-neutral  
that can be tested against.  And it should be faster at least from the  
standpoint of not having to generate tons of objects.

It's icing if it evolves past the point of a simple reference  
implementation into something that is useful as a fast BioSQL loader.
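
One simple way to test a Bio* loader against such a reference would be
to load the same file with both and compare what ends up in each table.
The sketch below compares row counts only, uses two in-memory SQLite
databases as stand-ins for real BioSQL instances, and the helper names
are invented for illustration.

```python
import sqlite3


def table_counts(conn, tables):
    """Return {table: row count} for the named tables."""
    # Table names cannot be bound as parameters, so only pass trusted names.
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in tables}


def count_differences(reference, candidate, tables):
    """Return {table: (reference count, candidate count)} where they disagree."""
    ref = table_counts(reference, tables)
    cand = table_counts(candidate, tables)
    return {t: (ref[t], cand[t]) for t in tables if ref[t] != cand[t]}


def make_db(n_entries, n_features):
    """Build a toy database with the given number of rows in two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE seqfeature (seqfeature_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO bioentry (bioentry_id) VALUES (?)",
                     [(i,) for i in range(n_entries)])
    conn.executemany("INSERT INTO seqfeature (seqfeature_id) VALUES (?)",
                     [(i,) for i in range(n_features)])
    return conn


# Toy data: the candidate loader stored two fewer features than the reference.
reference = make_db(n_entries=3, n_features=10)
candidate = make_db(n_entries=3, n_features=8)
print(count_differences(reference, candidate, ["bioentry", "seqfeature"]))
```

Matching counts would not prove the stored values agree, so a fuller
comparison would also need to check the contents of the rows.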

chris


From cjfields at illinois.edu  Thu Jan 29 13:37:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 07:37:31 -0600
Subject: [BioSQL-l] [Bioperl-l] [ANNOUNCEMENT] Alpha 1.6 releases of
	BioPerl-db
In-Reply-To: <AC794CC5-EC13-4BAF-84D5-9CD1CB34220B@inserm.fr>
References: <AC794CC5-EC13-4BAF-84D5-9CD1CB34220B@inserm.fr>
Message-ID: <C07BB428-099F-4492-8591-66F3F6988C64@illinois.edu>

That one may be database-dependent; it passes for mysql 5.1.26-rc.   
What is your db (mysql, Pg, oracle) and version?

Hilmar, any ideas?

chris

On Jan 29, 2009, at 6:28 AM, Johann PELLET wrote:

> Dear Chris,
>
> I have the following error on my Mac machine: (BioPerl 1.6, BioPerl- 
> run
> 1.6) when I try to install Bioperl-db ( biosql-1.0.1):
>
> t/01dbadaptor.....1/23
> #   Failed test in t/01dbadaptor.t at line 44.
> #          got: undef
> #     expected: ''
> # Looks like you failed 1 test of 23.
> t/01dbadaptor..... Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/23 subtests
> t/02species.......ok
> t/03simpleseq.....ok
> t/04swiss.........ok
> t/05seqfeature....ok
> t/06comment.......ok
> t/07dblink........ok
> t/08genbank.......ok
> t/09fuzzy2........5/23
> #   Failed (TODO) test in t/09fuzzy2.t at line 64.
> #          got: undef
> #     expected: 'Q9QYG8'
> t/09fuzzy2........ok
> t/10ensembl.......ok
> t/11locuslink.....ok
> t/12ontology......ok
> t/13remove........ok
> t/14query.........ok
> t/15cluster.......ok
> t/16obda..........ok
>
> Test Summary Report
> -------------------
> t/01dbadaptor (Wstat: 256 Tests: 23 Failed: 1)
>  Failed test:  16
>  Non-zero exit status: 1
> Files=16, Tests=1479, 15 wallclock secs ( 0.27 usr  0.10 sys + 11.15  
> cusr  1.11 csys = 12.63 CPU)
> Result: FAIL
> Failed 1/16 test programs. 1/1479 subtests failed.
>
> -- --
>
> Johann Pellet
> IE Bioinformatique
> INSERM U851, I-MAP CERVI
> 21, Avenue Tony Garnier
> 69365 Lyon cedex 07 France
>
>


From michael.watson at bbsrc.ac.uk  Thu Jan 29 14:41:05 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 29 Jan 2009 14:41:05 -0000
Subject: [BioSQL-l] Web front-ends to BioSQL
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>

Hi

I am thinking about a project involving storage of sequences in a
relational DB and of course thought of BioSQL - but I wondered if anyone
has written a very quick and simple front end to the database
(submission and searching) in something like CGI, mod_perl or PHP?

Thanks
Mick

Head of Informatics
Institute for Animal Health
Compton
Berks
RG20 7NN
01635 578411 

http://www.iah.ac.uk/research/bioinformatics/bioinf.shtml

The information contained in this message may be confidential or legally
privileged and is intended solely for the addressee. 
If you have received this message in error please delete it & notify the
originator immediately.
Unauthorised use, disclosure, copying or alteration of this message is
forbidden & may be unlawful. 
The contents of this e-mail are the views of the sender and do not
necessarily represent the views of the Institute. 
This email and associated attachments has been checked locally for
viruses but we can accept no responsibility once it has left our
systems.
Communications on Institute computers are monitored to secure the
effective operation of the systems and for other lawful purposes. 


From cjfields at illinois.edu  Thu Jan 29 14:54:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 08:54:46 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>

Gbrowse, maybe?  There is a BioSQL plugin for it (Bio::DB::Das::BioSQL):

http://gmod.org/wiki/GBrowse#About_Databases

chris

On Jan 29, 2009, at 8:41 AM, michael watson (IAH-C) wrote:

> Hi
>
> I am thinking about a project involving storage of sequences in a
> relational DB and of course thought of BioSQL - but I wondered if  
> anyone
> has written a very quick and simple front end to the database
> (submission and searching) in something like CGI, mod_perl or PHP?
>
> Thanks
> Mick
>
> Head of Informatics
> Institute for Animal Health
> Compton
> Berks
> RG20 7NN
> 01635 578411
>
> http://www.iah.ac.uk/research/bioinformatics/bioinf.shtml
>


From holland at eaglegenomics.com  Thu Jan 29 16:10:42 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Thu, 29 Jan 2009 16:10:42 +0000
Subject: [BioSQL-l] Eagle Genomics is hiring
Message-ID: <4981D502.1000905@eaglegenomics.com>

Hi all,

Apologies if this is inappropriate for the list, but I thought it would
be a good way to reach the kind of people we're looking for.

Richard

=====

Senior Bioinformatics Software Developer
Eagle Genomics Ltd., Cambridge, UK
http://www.eaglegenomics.com/

We are a young and exciting bioinformatics company looking to
revolutionise the way in which industry and academia work together. We
are based at the heart of Europe's largest biotech cluster in Cambridge,
UK. As we expand our client base, we're looking to build a talented and
committed team of experts. We are currently looking for a software
developer to work on a wide range of complex projects, and who is happy
to work face-to-face with our customers. Ideally you will have had
substantial prior experience working in a life science company or
research institute, however we will also consider graduates with a track
record in bioinformatics.

In addition to your superb technical skills, you will also:
* have the ability to quickly translate scientific problems into real
software solutions,
* be able to put technical concepts into simple language for end users
to understand,
* be able to pick up new skills and techniques in record time,
* work well in a collaborative team environment,
* be creative, innovative, and forward-thinking.

You will have hands-on experience in some of the following:
* Java,
* Perl,
* SQL query design,
* Relational database schema design,
* Open-source bioinformatics toolkits such as BioJava, BioPerl, BioSQL,
etc.,
* Ensembl,
* BioMart,
* DAS,
* Taverna,
* Oracle Life Sciences Platform,
* Oracle database administration,
* MySQL database administration,
* VMware virtual machines,
* Grid computing and parallelisation.

The preferred candidate will be able to work from our offices in
Cambridge, but we would also consider telecommuting arrangements.

We offer a competitive salary and a range of company benefits.

To apply, please send your CV and cover letter as PDF documents to
jobs at eaglegenomics.com. If you have any questions about the position or
would like to discuss it further before applying, please use the same
email address. We are only able to offer positions to EEA citizens and
permanent residents, or Tier 1 migrants under the new UK points-based
immigration scheme.

Individual contracting arrangements could be considered but we will
prefer those candidates who can work with us as employees. No agencies
please.

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


From jimp at compbio.dundee.ac.uk  Thu Jan 29 17:44:12 2009
From: jimp at compbio.dundee.ac.uk (James Procter)
Date: Thu, 29 Jan 2009 17:44:12 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
Message-ID: <4981EAEC.4070508@compbio.dundee.ac.uk>


Chris Fields wrote:
> Gbrowse, maybe?  There is a BioSQL plugin for it (Bio::DB::Das::BioSQL):
> 
> http://gmod.org/wiki/GBrowse#About_Databases
I'm also in the market for a quick and easy front end - from what I've
heard from a colleague, GBrowse can be tricky to install. Also - for my
application we'd like to easily gather sets of proteins and then explore
their annotation. This is a little out of the scope of GBrowse.

I think there might be a niche needing filling here - would anyone be
interested in pooling code/resources ?

Jim.

-- 
-------------------------------------------------------------------
J. B. Procter  (ENFIN/VAMSAS)  Barton Bioinformatics Research Group
Phone/Fax:+44(0)1382 388734/345764  http://www.compbio.dundee.ac.uk
The University of Dundee is a Scottish Registered Charity, No. SC015096.


From raoul.bonnal at itb.cnr.it  Thu Jan 29 15:06:37 2009
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Thu, 29 Jan 2009 16:06:37 +0100
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <200901291606.37472.raoul.bonnal@itb.cnr.it>

On Thursday 29 January 2009 15:41:05 michael watson (IAH-C) wrote:
> Hi
>
> I am thinking about a project involving storage of sequences in a
> relational DB and of course thought of BioSQL - but I wondered if anyone
> has written a very quick and simple front end to the database
> (submission and searching) in something like CGI, mod_perl or PHP?

I did some tests with ActiveRecord + Rails, and DataMapper + Merb, using
Ruby. With those ORMs the difficulty is that the schema doesn't follow
their naming conventions.

--
Ra


From gthorisson at gmail.com  Thu Jan 29 18:29:08 2009
From: gthorisson at gmail.com (Gudmundur A. Thorisson)
Date: Thu, 29 Jan 2009 18:29:08 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <4981EAEC.4070508@compbio.dundee.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
Message-ID: <50326857-0614-4B43-909A-466403669E52@gmail.com>

Jim, if a Java web app would be acceptable as the platform for this,
there is something called Molgenis, developed by a group in the
Netherlands that we are collaborating with. It's a Java-based
code-generation framework used by several mouse genomics groups for
microarray data and the like, and is under consideration by ourselves
for use in our project:

http://molgenis.sourceforge.net

We were thinking of mixing this in with BioSQL/BioJava for certain  
management & curation tasks. Here's a couple of papers if you care to  
have a closer look:

Smedley et al. Solutions for data integration in functional genomics:  
a critical assessment and case study. Brief Bioinformatics (2008) vol.  
9 (6) pp. 532-44
Swertz et al. Beyond standardization: dynamic software infrastructures  
for systems biology. Nat Rev Genet (2007) vol. 8 (3) pp. 235-43

Best regards,


              Mummi, Leicester
-----------------------------------------------------------
  Gudmundur A. Thorisson, PhD student,  Brookes lab
  Department of Genetics
  University of Leicester
  University Road
  Leicester, LE1 7RH, UK
  E-mail: gthorisson at gmail.com
  Tel: +44 (0)116 229 7273




From cjfields at illinois.edu  Thu Jan 29 18:45:05 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 29 Jan 2009 12:45:05 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <4981EAEC.4070508@compbio.dundee.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
Message-ID: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>


On Jan 29, 2009, at 11:44 AM, James Procter wrote:

>
> Chris Fields wrote:
>> Gbrowse, maybe?  There is a BioSQL plugin for it  
>> (Bio::DB::Das::BioSQL):
>>
>> http://gmod.org/wiki/GBrowse#About_Databases
> I'm also in the market for a quick and easy front end - from what I've
> heard from a colleague, GBrowse can be tricky to install. Also - for  
> my
> application we'd like to easily gather sets of proteins and then  
> explore
> their annotation. This is a little out of the scope of GBrowse.

For me the tricky part is not GBrowse itself so much as getting BioPerl
installed.  One can use GBrowse for what you want, but there are
probably better resources (Ensembl, maybe).

chris



From mark.schreiber at novartis.com  Fri Jan 30 02:51:34 2009
From: mark.schreiber at novartis.com (mark.schreiber at novartis.com)
Date: Fri, 30 Jan 2009 10:51:34 +0800
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
Message-ID: <OF64095EEE.D0ADAFA1-ON4825754E.000EE8F6-4825754E.000FB4C3@ah.novartis.com>

Hi -

I have generated, partly automatically and partly by hand, an EJB 3 binding
to BioSQL that can be used with JPA. Notably this uses the new EJB model, not
the nasty old one, so it is very easy to use. As all EJBs are now plain
old Java objects, it is also very easy to use these objects in web services
and JSP pages (maybe PHP too??).

Also, because EJBs and JPA are now more flexible, you don't need a full
Java app container (JBoss, GlassFish) but can instead use them in
standalone programs, although with a container you do get other benefits
(transaction control, security, load balancing, etc.) for free.  Also, if you
do use a web interface, the web front end will probably be in Tomcat, and you
can use this as a light container for talking to the BioSQL entity beans.
If you think there will be more than a few users I would probably advocate
using GlassFish or a similar app server, because there are many advantages
that outweigh the relatively small overhead.

The EJB binding is not part of BioJava but is a candidate for inclusion in
BioJava3.  I can provide you with code if you are interested. I would also
be keen to see this get some use.

Best regards,

- Mark




From michael.watson at bbsrc.ac.uk  Fri Jan 30 11:03:12 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 30 Jan 2009 11:03:12 -0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>

Dear All

Thank you for the responses.  I think it is clear there is a need - all
over the world there are groups of various sizes who try to collate and
curate sequences for their organism of choice, from fish virus databases
with 200 records to flu databases with many thousands.  I'm in contact
with a tiny percentage of these groups, and there is a clear need for:

- common DB schema (tick, we can use BioSQL)
- Web app for:
	- submitting new sequences
	- curating and editing sequences
	- comparing sequences - align, draw trees etc
	- showing sequences on maps (i.e. location of sample)
	- submitting sequences to GenBank
	- retrieving sequences from GenBank

With all of the Bio* projects, this shouldn't be too hard to do, but as
ever it needs bodies to do it... I took a quick look at Galaxy but that
isn't really what was needed.

Thanks again

Mick


_______________________________________________
BioSQL-l mailing list
BioSQL-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biosql-l


From hlapp at gmx.net  Fri Jan 30 15:23:24 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 30 Jan 2009 10:23:24 -0500
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>

Having such a webapp would be pretty cool, and I agree with the  
argument below that there are numerous small groups or individuals  
with this need. (we have some ourselves here ...)

One word of caution: a place to look for lessons, I think, is the
infamous GMOD gene page and standard web front-end, which has been
labored on in various incarnations for more than half a decade
without producing a compelling and broadly adopted result. People's
needs and technology obsessions vary from place to place.

One possibly hugely complicating factor for the GMOD web front-end was
that the target audience was model organism websites, which
themselves have a large and diverse stakeholder community, so
flexibility and configurability became overriding requirements,
resulting in bloat of code stacks and features.

My personal take is that for this to be broadly useful, the primary  
target audience should probably be programmers, or programming-savvy  
scientists, who can extend and customize a core application at will.  
In other words, much in line with the philosophy behind the Bio*  
libraries.

Other than that, keep it simple so I don't have to learn yet another
language (namely your templating or clever XML configuration scheme)
to extend it. I sat next to Mark when he generated a bare-bones
BioSQL binding in EJB literally in minutes, which I thought was cool.
People rave about Ruby and RoR too for ease of getting started. By far
most people out there will be familiar with Perl, but I'm not sure
which web application framework there would put me at
ease. In the end what may count more than anything else is critical
mass, even if it's not everyone's darling language.

My $0.02, and I'd be keen to see what comes out of this. If there's
something I can do to tip the balance towards something tangible
happening, let me know.

	-hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Fri Jan 30 19:45:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 30 Jan 2009 13:45:30 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
Message-ID: <5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>


On Jan 30, 2009, at 9:23 AM, Hilmar Lapp wrote:

> Other than that, keep it simple so I don't have to learn yet another  
> (namely your templating or clever XML configuration scheme) language  
> to extend it. I sat next to Mark when he generated a bare-bones  
> BioSQL-binding in EJB literally in minutes, which I thought was  
> cool. People rave about Ruby and RoR too as for ease of getting  
> started. By far the most people out there will be familiar with  
> Perl, but I'm not sure what the web application framework would be  
> there that would put me at ease. In the end what may count more than  
> anything else is critical mass even if it's not everyone's darling  
> language.

Perl web application frameworks: Catalyst and Jifty (I have not tried
them myself).  RoR gets a lot of press, but I understand the RoR devs
tend not to listen to the core Ruby devs and (as a consequence)
recently ran into issues with the Ruby 1.8.7 release, detailed by the
always-entertaining chromatic here:

http://use.perl.org/~chromatic/journal/37125

chris



From gthorisson at gmail.com  Fri Jan 30 19:57:42 2009
From: gthorisson at gmail.com (Gudmundur A. Thorisson)
Date: Fri, 30 Jan 2009 19:57:42 +0000
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
Message-ID: <D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>

We use the Catalyst MVC framework for our project
(http://www.hgvbaseg2p.org). Very good stuff; we combine it with the
DBIx::Class ORM and Template Toolkit as the templating engine. Totally
recommended.


                 Mummi



From cjfields at illinois.edu  Fri Jan 30 20:08:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 30 Jan 2009 14:08:11 -0600
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk><49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu><4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
	<D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
Message-ID: <99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>

Another article (as pointed out by Heikki on bioperl-l):

http://www.heise-online.co.uk/open/Healthcheck-Perl-The-Perl-Future--/features/112388/0

The last section is all on MVC-oriented frameworks.

chris



From markjschreiber at gmail.com  Sat Jan 31 11:03:53 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sat, 31 Jan 2009 19:03:53 +0800
Subject: [BioSQL-l] Web front-ends to BioSQL
In-Reply-To: <99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>
References: <8975119BCD0AC5419D61A9CF1A923E9507E270EF@iahce2ksrv1.iah.bbsrc.ac.uk>
	<49DFF09F-8169-4D40-94FB-CDCDFC330E82@illinois.edu>
	<4981EAEC.4070508@compbio.dundee.ac.uk>
	<982A9E86-4CEA-428C-AF0E-5065C2036C91@illinois.edu>
	<8975119BCD0AC5419D61A9CF1A923E9507E2711C@iahce2ksrv1.iah.bbsrc.ac.uk>
	<903901EE-777B-43A8-9CDC-ED400B3E60BB@gmx.net>
	<5B046A75-AFD3-4CEB-B190-A27106828E9C@illinois.edu>
	<D4BACF35-0C7B-417A-9812-F0C0E77921CF@gmail.com>
	<99475964-CFB3-4A27-8024-8A14876533E0@illinois.edu>
Message-ID: <93b45ca50901310303t37905e8ak3819c05f4b94c287@mail.gmail.com>

Hi -

My feeling is that the diversity of languages, and of frameworks within
each language, means that a generic web front end to BioSQL will not and
should not materialize. What would be a lot more sensible is a
generic API in the form of a web service or collection of web services
that could be used by (theoretically) any web framework to generate a
website.

User preferences and requirements will be far too diverse for a
generic web front end.
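
As a rough illustration of the kind of web service meant here, the sketch
below (Python, standard library only) exposes one lookup as JSON through a
WSGI callable. Everything in it is an assumption made for the example: the
in-memory SQLite table is a cut-down stand-in for the real BioSQL bioentry
table, the record is made up, and the function names and the "accession"
query parameter are hypothetical.

```python
import json
import sqlite3
from urllib.parse import parse_qs

def make_demo_db():
    """In-memory stand-in; the real BioSQL bioentry table has more columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bioentry ("
        " bioentry_id INTEGER PRIMARY KEY,"
        " name TEXT, accession TEXT, description TEXT)"
    )
    conn.execute(
        "INSERT INTO bioentry (name, accession, description) VALUES (?, ?, ?)",
        ("DEMO1", "X00001", "made-up record for illustration"),
    )
    return conn

def find_bioentry(conn, accession):
    """Return one bioentry as a plain dict, or None if it is absent."""
    row = conn.execute(
        "SELECT bioentry_id, name, accession, description"
        " FROM bioentry WHERE accession = ?",
        (accession,),
    ).fetchone()
    if row is None:
        return None
    keys = ("bioentry_id", "name", "accession", "description")
    return dict(zip(keys, row))

def make_app(conn):
    """Build a WSGI app that answers ?accession=... with a JSON document."""
    def app(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        accession = query.get("accession", [""])[0]
        entry = find_bioentry(conn, accession)
        status = "200 OK" if entry else "404 Not Found"
        body = json.dumps(entry or {"error": "not found"}).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]
    return app
```

Any front end, in any language, could then consume the JSON. For a quick
local try, wsgiref.simple_server.make_server("", 8000, make_app(make_demo_db()))
followed by serve_forever() should be enough.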

- Mark
