From hlapp at gnf.org  Wed Nov  2 02:55:49 2005
From: hlapp at gnf.org (Hilmar Lapp)
Date: Wed Nov  2 03:37:46 2005
Subject: [BioSQL-l] BioJavaX ready for testing
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D560265652E@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D560265652E@BIONIC.biopolis.one-north.com>
Message-ID: <c050eaa262dc8ec3500f0331b9ef4a53@gnf.org>

Sounds pretty cool! -hilmar

On Oct 31, 2005, at 1:28 AM, Richard HOLLAND wrote:

> Hello people!
>
> Mark is away so I'm taking the liberty of sneaking this one out... :)
>
> I've cross-posted this to both BioJava and BioSQL as much of what is 
> new in BioJavaX will probably be of interest to BioSQL users too.
>
> We've been doing a lot of work recently on creating some extensions to 
> BioJava called BioJavaX. Primarily the purpose of these extensions is 
> to provide better interaction with BioSQL databases, which has been 
> achieved using Hibernate (www.hibernate.org). You can now fully 
> interact with every column of every table in BioSQL, using Hibernate's 
> own HQL language to construct queries that result in sets of BioJavaX 
> objects. Selects, inserts, updates, primary key assignment, foreign 
> key relations, and deletes are all handled transparently by Hibernate, 
> removing the need for any SQL at all to be included in BioJavaX.
>
> As a side effect of constructing a Hibernate-compatible extension to 
> the BioJava object model, we were required to define objects that hold 
> much more detailed information about themselves. For instance, a 
> Sequence object cannot tell you what namespace it lives in in the 
> BioSQL database, but our extension to it, RichSequence, can. As 
> RichSequence extends Sequence and doesn't replace it, this means you 
> can use the new objects with your existing code without any hassle 
> casting them.
>
> To be able to load information from files into these new RichSequence 
> objects in a meaningful way, we had to create a more detailed 
> SeqIOListener, called RichSeqIOListener. Then, we had to create new 
> file parsers for the common file formats which were able to extract 
> more detailed information than before in order to satisfy the 
> RichSeqIOListener.
>
> It's pretty safe to say that the file parsers in BioJavaX are leagues 
> ahead of the existing ones in BioJava, even if I do say so myself. :P 
> The downside of this extra detail though is that the parsers are much 
> more sensitive and will not play well at all with incomplete or 
> incorrectly formed files. If someone can edit them to be less 
> sensitive whilst still retaining the level of detail required, that'd 
> be great.
>
> We've included parsers for FASTA, GenBank, EMBL, UniProt, INSDseq, 
> EMBLxml, UniProtXML, and an extra one for parsing NCBI Taxonomy data.
>
> Do note that BioJavaX cannot fully convert sequences created using the 
> old BioJava model into the new BioJavaX model. It'll do its best, but 
> the RichSequence object you'll end up with will have lots of 
> properties set to null and a tonne of annotations instead, pretty much 
> the same as the original Sequence object I suppose. So it's best to try 
> to avoid conversions and deal with RichSequence objects from the 
> ground up. This is particularly important to consider when converting 
> a BioSQL database previously used with BioJava into one for use with 
> BioJavaX. You'll also find that if you pass a converted old-style 
> Sequence object to one of the new file parsers for writing it may fail 
> or produce output with lots of missing fields, as it will not find the 
> information it is looking for in the places it expects.
>
> The whole lot is specifically designed to mimic and be compatible with 
> BioSQL, but you don't need to have a BioSQL database to use it. 
> Everything is standalone and will work just fine without a backing 
> data source. Also there is no reason why you couldn't create a new set 
> of Hibernate mappings that map the BioJavaX object model to some other 
> relational database schema of your choice.
>
> The upshot of it all is the org.biojavax package, which you can find 
> in the biojava-live branch on CVS. Development is pretty much complete, 
> and it now needs some serious testing.
>
> We need volunteers to:
>
> 	a) test the BioSQL interaction via Hibernate with the various 
> database flavours supported (HSQL, Oracle, MySQL, PostgreSQL)
> 	b) test the various file formats, particularly looking for 
> special-case exceptions which the parsers may not be aware of yet
> 	c) do some load-testing and help us find ways to improve it if it 
> turns out to be too slow when under pressure
>
> Documentation of the new features can be found in DocBook XML format 
> in docs/docbook/BioJavaX.xml in the biojava-live branch of CVS. It's 
> as detailed as I could make it without getting bored to death writing 
> it. I've never been the world's best documentation writer, so if 
> anyone would like to help improve it you're more than welcome.
>
> Our plan is to make all this an official part of BioJava come the 1.5 
> release, whenever that may be. For now though it is very very much a 
> testing-stage thing, not even an alpha release.
>
> Questions on a postcard to either Mark or myself. Feedback most 
> welcome.
>
> cheers,
> Richard
>
>
> Richard Holland
> Bioinformatics Specialist
> Genome Institute of Singapore
> 60 Biopolis Street, #02-01 Genome, Singapore 138672
> Tel: (65) 6478 8000   DID: (65) 6478 8199
> Email: hollandr@gis.a-star.edu.sg
> ---------------------------------------------
> This email is confidential and may be privileged. If you are not the 
> intended recipient, please delete it and notify us immediately. Please 
> do not copy or use it for any purpose, or disclose its content to any 
> other person. Thank you.
> ---------------------------------------------
>
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l@open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hlapp at gmx.net  Thu Nov  3 11:53:55 2005
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu Nov  3 11:59:04 2005
Subject: [BioSQL-l] biosql usage/user survey
Message-ID: <9692f0e9a791c7d0bf942e497668fdce@gmx.net>

Hi all,

I am writing up a paper on BioSQL and would like to include some 
current usage figures to support its utility.

Therefore, if you are using BioSQL I'd be glad if you could drop me an 
email; if you can include a word or two (not more than 1 sentence) on 
what you use it for that'd be great too.

Thanks in advance,

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From muratem at eng.uah.edu  Tue Nov 15 14:57:29 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Tue Nov 15 15:10:46 2005
Subject: [BioSQL-l] biosql & bioperl-db on Solaris
Message-ID: <Pine.GSO.4.60.0511151339320.13978@zeus>

Greetings

This is not strictly a bioperl or biosql issue, but is related. In the 
process of trying to install bioperl-db on a Solaris 9 system, I had to go 
back and install the latest bioperl, DBI, AND DBD modules. The DBD install 
via CPAN is failing for an ELF64 incompatibility when it tries to link to 
the mysql client libs because I also installed the 
latest 64bit mysql. According to the mysql docs, 64bit is 4% slower 
than the 32bit version, but you get more threads and memory (which would 
seem to be faster in the long run).

Although I have installed gcc and gnu make, CPAN is picking up the native 
Sun compiler (and why do they spread things out over so many 
subdirectories?) and linker.

Has anyone else come up against this before (I didn't find anything in the 
archives)? Is the best course to install a 32bit mysql? Should I forgo 
CPAN and try to do a manual install where I might have some control over 
Makefile.PL? Has anybody tried biosql on a 64bit system and does it make a 
difference?

Thanks

Mike

From hollandr at gis.a-star.edu.sg  Tue Nov 15 20:47:07 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Tue Nov 15 20:58:06 2005
Subject: [BioSQL-l] biosql & bioperl-db on Solaris
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com>

Hi,

This may seem a bit basic but it's worth a go.

CPAN looks for "cc" - which on a Sun machine is the Sun compiler. If you
have "gcc" installed then you may need to create a soft-link to it
called "cc" and ensure that soft-link appears earlier in your path than
the location of the Sun compiler. You must then export the path or add
it to your profile (as CPAN starts new sub-shells internally).

e.g. if your gcc binary is /usr/bin/gcc, and assuming a bash-like shell:

	mkdir ~/bin
	ln -s /usr/bin/gcc ~/bin/cc
	PATH=~/bin:$PATH
	export PATH
	
Then run CPAN and see what happens.
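
Before running CPAN, it may help to confirm which "cc" the shell will actually pick up. The following is only a sketch of that check: it uses a throwaway directory and a stand-in script rather than a real link to gcc (both are assumptions for illustration), and it relies on mktemp being available.

```shell
#!/bin/sh
# Demonstration in a throwaway directory: a "cc" placed early in PATH
# shadows whatever "cc" the system would otherwise find.
demo_dir=$(mktemp -d)

# Stand-in for the real link (ln -s /usr/bin/gcc ~/bin/cc); a tiny script
# is used here so the demo does not depend on gcc being installed.
printf '#!/bin/sh\necho "stand-in cc"\n' > "$demo_dir/cc"
chmod +x "$demo_dir/cc"

PATH="$demo_dir:$PATH"
export PATH

# command -v reports the path the shell will actually execute.
resolved_cc=$(command -v cc)
echo "cc resolves to: $resolved_cc"

# Remove the throwaway directory again.
rm -rf "$demo_dir"
```

With the real ~/bin/cc link in place, running "command -v cc" in the same shell that starts CPAN should print the path of the link rather than the Sun compiler.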

Hope it helps.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199



From hlapp at gnf.org  Tue Nov 15 21:21:45 2005
From: hlapp at gnf.org (Hilmar Lapp)
Date: Tue Nov 15 21:38:32 2005
Subject: [BioSQL-l] biosql & bioperl-db on Solaris
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com>
Message-ID: <127e679dc3afe7d59f62ed4bd1719450@gnf.org>

I believe the DBI docs make a pretty strong statement about perl, your 
database client library, and DBI/DBD having to be compiled by the same 
compiler using the same settings.

This may be a bit too strong of a statement, but I'd definitely make 
sure that if you want 64-bit in the DBD driver, perl is also compiled 
as 64-bit, and that they are all compiled by the same brand of compiler.

I.e., you may have to compile perl and the mysql client lib yourself, 
using either the Sun native compiler, or gcc, for all of them 
consistently. Also, I'd definitely forgo CPAN for DBI and DBD::mysql; 
building and installing manually is fairly easy if you don't have 
dependencies (perl Makefile.PL; make; make test; make install), and the 
INSTALL or README document will tell you how to control the choice of 
compiler (often it's like 'make CC=gcc' or some such).
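
One way to see what perl itself was built with, before choosing a compiler for DBI and DBD::mysql, is to ask perl's own Config module. This is a minimal sketch, assuming perl is on the PATH; it only reports the settings and does not change anything.

```shell
#!/bin/sh
# Print the compiler and pointer size perl was built with, so they can be
# compared against the compiler used for DBI, DBD::mysql and the mysql
# client library.
if command -v perl >/dev/null 2>&1; then
    perl_cc=$(perl -MConfig -e 'print $Config{cc}')
    perl_ptrsize=$(perl -MConfig -e 'print $Config{ptrsize}')
    echo "perl was compiled with: $perl_cc"
    echo "pointer size in bytes:  $perl_ptrsize (8 means a 64-bit perl)"
else
    echo "perl not found in PATH" >&2
fi
```

If the reported compiler or pointer size does not match the mysql client library, a link failure like the ELF64 one described in this thread is a plausible outcome.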

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hrundi.barkshi at googlemail.com  Tue Nov 15 17:52:24 2005
From: hrundi.barkshi at googlemail.com (Hrundi Barkshi)
Date: Wed Nov 16 07:46:12 2005
Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL?
Message-ID: <dd6fa38d0511151452r54e6fbb8l@mail.gmail.com>

Hi,

I am a beginner to BioSQL and I uploaded (using load_seqdatabase.pl) the
example under the test folder ./t/data

perl ~/tmp/biosql/load_seqdatabase.pl --dbname BioDB --namespace
uniprot --debug --format swiss  swiss.da

Now I would like to reconstruct the sequence by querying the db

I have tried using bioentry2flat but I get the following error:


$ perl bioentry2flat.pl BioSQL3 Q07021
Connecting with mysql:BioSQL3:root:dbpass

------------- EXCEPTION  -------------
MSG: Attempting to write with no seq!
STACK Bio::SeqIO::embl::write_seq
/usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO/embl.pm:394
STACK Bio::SeqIO::PRINT /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO.pm:661
STACK toplevel bioentry2flat.pl:69

-----------------------------------

Any advice?
Thanks in advance
Hrundi

From osborne1 at optonline.net  Tue Nov 15 17:37:02 2005
From: osborne1 at optonline.net (Brian Osborne)
Date: Wed Nov 16 07:46:23 2005
Subject: [BioSQL-l] Re: [Bioperl-l] biosql & bioperl-db on Solaris
In-Reply-To: <Pine.GSO.4.60.0511151339320.13978@zeus>
Message-ID: <BF9FCD3E.5F3D%osborne1@optonline.net>

Mike,

If I'm understanding this correctly you'd like the build to use gcc. Did you
try things like:

setenv CC gcc

?

The equivalent of the command above in the CPAN shell would be:

cpan>o conf makepl_arg CC=gcc


Brian O.



From hlapp at gnf.org  Wed Nov 16 11:40:40 2005
From: hlapp at gnf.org (Hilmar Lapp)
Date: Wed Nov 16 11:40:00 2005
Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL?
In-Reply-To: <dd6fa38d0511151452r54e6fbb8l@mail.gmail.com>
References: <dd6fa38d0511151452r54e6fbb8l@mail.gmail.com>
Message-ID: <005be200d2d71eccb72fcfd32c5e9d8a@gnf.org>

It doesn't find the sequence. Note that bioentry2flat.pl is mostly for 
demonstration purposes and not necessarily a production quality script.

That said, the script hard-codes the namespace to sprot_hum, which is 
not the one you used when loading. It should be easy to either modify 
the script to use a different namespace, make it accept the namespace 
as a command-line argument, or alternatively use the same namespace 
when loading.

Let me know if the script still doesn't work if you use the same 
namespace by any of the options above.
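
To check which namespace an entry was actually loaded under, a query along the following lines may help. It is only a sketch: the table and column names follow the BioSQL schema (biodatabase, bioentry), the accession is the one from the original report, and the mysql user and database name in the comment are taken from the connection string in that report.

```shell
#!/bin/sh
# Build a query that lists the namespace (biodatabase.name) under which a
# given accession is stored.
accession="Q07021"   # accession from the original report

query="SELECT d.name AS namespace, b.accession
FROM bioentry b
JOIN biodatabase d ON d.biodatabase_id = b.biodatabase_id
WHERE b.accession = '$accession';"

echo "$query"

# To run it, pipe the query to the mysql client, for example:
#   echo "$query" | mysql -u root -p BioSQL3
# (adjust user and database name for your setup)
```

If the namespace returned differs from the one hard-coded in the script, that mismatch would explain the missing sequence.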

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hrundi.barkshi at googlemail.com  Wed Nov 16 13:13:01 2005
From: hrundi.barkshi at googlemail.com (Hrundi Barkshi)
Date: Thu Dec  1 11:52:18 2005
Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL?
In-Reply-To: <005be200d2d71eccb72fcfd32c5e9d8a@gnf.org>
References: <dd6fa38d0511151452r54e6fbb8l@mail.gmail.com>
	<005be200d2d71eccb72fcfd32c5e9d8a@gnf.org>
Message-ID: <dd6fa38d0511161013n13e3776q@mail.gmail.com>

I changed that and set
 $biodbname = 'bioperl';
according to the entry in the database:
 mysql> select * from biodatabase;
+----------------+---------+-----------+-------------+
| biodatabase_id | name    | authority | description |
+----------------+---------+-----------+-------------+
|              3 | bioperl | NULL      | NULL        |
+----------------+---------+-----------+-------------+
1 row in set (0.01 sec)

It now works perfectly.

Thanks
Hrundi

