From hlapp at gnf.org Wed Nov 2 02:55:49 2005 From: hlapp at gnf.org (Hilmar Lapp) Date: Wed Nov 2 03:37:46 2005 Subject: [BioSQL-l] BioJavaX ready for testing In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D560265652E@BIONIC.biopolis.one-north.com> References: <6D9E9B9DF347EF4385F6271C64FB8D560265652E@BIONIC.biopolis.one-north.com> Message-ID: Sounds pretty cool! -hilmar On Oct 31, 2005, at 1:28 AM, Richard HOLLAND wrote: > Hello people! > > Mark is away so I'm taking the liberty of sneaking this one out... :) > > I've cross-posted this to both BioJava and BioSQL as much of what is > new in BioJavaX will probably be of interest to BioSQL users too. > > We've been doing a lot of work recently on creating some extensions to > BioJava called BioJavaX. Primarily the purpose of these extensions is > to provide better interaction with BioSQL databases, which has been > achieved using Hibernate (www.hibernate.org). You can now fully > interact with every column of every table in BioSQL, using Hibernate's > own HQL language to construct queries that result in sets of BioJavaX > objects. Selects, inserts, updates, primary key assignment, foreign > key relations, and deletes are all handled transparently by Hibernate, > removing the need for any SQL at all to be included in BioJavaX. > > As a side effect of constructing a Hibernate-compatible extension to > the BioJava object model, we were required to define objects that hold > much more detailed information about themselves. For instance, a > Sequence object cannot tell you what namespace it lives in in the > BioSQL database, but our extension to it, RichSequence, can. As > RichSequence extends Sequence and doesn't replace it, this means you > can use the new objects with your existing code without any hassle > casting them. > > To be able to load information from files into these new RichSequence > objects in a meaningful way, we had to create a more detailed > SeqIOListener, called RichSeqIOListener. 
Then, we had to create new > file parsers for the common file formats which were able to extract > more detailed information than before in order to satisfy the > RichSeqIOListener. > > It's pretty safe to say that the file parsers in BioJavaX are leagues > ahead of the existing ones in BioJava, even if I do say so myself. :P > The downside of this extra detail though is that the parsers are much > more sensitive and will not play well at all with incomplete or > incorrectly formed files. If someone can edit them to be less > sensitive whilst still retaining the level of detail required, that'd > be great. > > We've included parsers for FASTA, GenBank, EMBL, UniProt, INSDseq, > EMBLxml, UniProtXML, and an extra one for parsing NCBI Taxonomy data. > > Do note that BioJavaX cannot fully convert sequences created using the > old BioJava model into the new BioJavaX model. It'll do its best, but > the RichSequence object you'll end up with will have lots of > properties set to null and a tonne of annotations instead, pretty much > the same as the original Sequence object I suppose. So its best to try > to avoid conversions and deal with RichSequence objects from the > ground up. This is particularly important to consider when converting > a BioSQL database previously used with BioJava into one for use with > BioJavaX. You'll also find that if you pass a converted old-style > Sequence object to one of the new file parsers for writing it may fail > or produce output with lots of missing fields, as it will not find the > information it is looking for in the places it expects. > > The whole lot is specifically designed to mimic and be compatible with > BioSQL, but you don't need to have a BioSQL database to use it. > Everything is standalone and will work just fine without a backing > data source. Also there is no reason why you couldn't create a new set > of Hibernate mappings that map the BioJavaX object model to some other > relational database schema of your choice. 
> > The upshot of it all is the org.biojavax package, which you can find > in biojava-live branch on CVS. Development is pretty much complete, > and it now needs some serious testing. > > We need volunteers to: > > a) test the BioSQL interaction via Hibernate with the various > database flavours supported (HSQL, Oracle, MySQL, PostGreSQL) > b) test the various file formats, particularly looking for > special-case exceptions which the parsers may not be aware of yet > c) do some load-testing and help us find ways to improve it if it > turns out to be too slow when under pressure > > Documentation of the new features can be found in DocBook XML format > in docs/docbook/BioJavaX.xml in the biojava-live branch of CVS. It's > as detailed as I could make it without getting bored to death writing > it. I've never been the world's best documentation writer, so if > anyone would like to help improve it you're more than welcome. > > Our plan is to make all this an official part of BioJava come the 1.5 > release, whenever that may be. For now though it is very very much a > testing-stage thing, not even an alpha release. > > Questions on a postcard to either Mark or myself. Feedback most > welcome. > > cheers, > Richard > > > Richard Holland > Bioinformatics Specialist > Genome Institute of Singapore > 60 Biopolis Street, #02-01 Genome, Singapore 138672 > Tel: (65) 6478 8000 DID: (65) 6478 8199 > Email: hollandr@gis.a-star.edu.sg > --------------------------------------------- > This email is confidential and may be privileged. If you are not the > intended recipient, please delete it and notify us immediately. Please > do not copy or use it for any purpose, or disclose its content to any > other person. Thank you. 
> --------------------------------------------- > > > _______________________________________________ > BioSQL-l mailing list > BioSQL-l@open-bio.org > http://open-bio.org/mailman/listinfo/biosql-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hlapp at gmx.net Thu Nov 3 11:53:55 2005 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu Nov 3 11:59:04 2005 Subject: [BioSQL-l] biosql usage/user survey Message-ID: <9692f0e9a791c7d0bf942e497668fdce@gmx.net> Hi all, I am writing up a paper on BioSQL and would like to include some current usage figures to support its utility. Therefore, if you are using BioSQL I'd be glad if you could drop me an email; if you can include a word or two (not more than 1 sentence) on what you use it for that'd be great too. Thanks in advance, -hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From muratem at eng.uah.edu Tue Nov 15 14:57:29 2005 From: muratem at eng.uah.edu (Mike Muratet) Date: Tue Nov 15 15:10:46 2005 Subject: [BioSQL-l] biosql & bioperl-db on Solaris Message-ID: Greetings This is not strictly a bioperl or biosql issue, but is related. In the process of trying to install bioperl-db on a Solaris 9 system, I had to go back and install the latest bioperl, DBI, AND DBD modules. The DBD install via CPAN is failing for an ELF64 incompatibility when it tries to link to the mysql client libs because I also installed the latest 64bit mysql. According to the mysql docs, 64bit is 4% slower than the 32bit version, but you get more threads and memory (which would seem to be faster in the long run). 
Although I have installed gcc and gnu make, CPAN is picking up the native
Sun compiler (and why do they spread things out over so many
subdirectories?) and linker.

Has anyone else come up against this before (I didn't find anything in the
archives)? Is the best course to install a 32bit mysql? Should I forgo
CPAN and try to do a manual install where I might have some control over
Makefile.PL? Has anybody tried biosql on a 64bit system and does it make a
difference?

Thanks

Mike

From hollandr at gis.a-star.edu.sg Tue Nov 15 20:47:07 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Tue Nov 15 20:58:06 2005
Subject: [BioSQL-l] biosql & bioperl-db on Solaris
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com>

Hi,

This may seem a bit basic but it's worth a go.

CPAN looks for "cc", which on a Sun machine is the Sun compiler. If you
have "gcc" installed then you may need to create a soft-link to it called
"cc" and ensure that the soft-link appears earlier in your path than the
location of the Sun compiler. You must then export the path or add it to
your profile (as CPAN starts new sub-shells internally).

E.g., if your gcc binary is /usr/bin/gcc, and assuming a bash-like shell:

    mkdir ~/bin
    ln -s /usr/bin/gcc ~/bin/cc
    PATH=~/bin:$PATH
    export PATH

Then run CPAN and see what happens.

Hope it helps.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the
intended recipient, please delete it and notify us immediately. Please
do not copy or use it for any purpose, or disclose its content to any
other person. Thank you.
--------------------------------------------- > -----Original Message----- > From: biosql-l-bounces@portal.open-bio.org > [mailto:biosql-l-bounces@portal.open-bio.org] On Behalf Of > Mike Muratet > Sent: Wednesday, November 16, 2005 3:57 AM > To: biosql-l@open-bio.org > Cc: bioperl-l@bioperl.org > Subject: [BioSQL-l] biosql & bioperl-db on Solaris > > > Greetings > > This is not strictly a bioperl or biosql issue, but is > related. In the > process of trying to install bioperl-db on a Solaris 9 > system, I had to go > back and install the latest bioperl, DBI, AND DBD modules. > The DBD install > via CPAN is failing for an ELF64 incompatibility when it > tries to link to > the mysql client libs because I also installed the > latest 64bit mysql. According to the mysql docs, 64bit is 4% slower > than the 32bit version, but you get more threads and memory > (which would > seem to be faster in the long run). > > Although I have installed gcc and gnu make, CPAN is picking > up the native > Sun compiler (and why do they spread things out over so many > subdirectories?) and linker. > > Has anyone else come up against this before (I didn't find > anything in the > archives)? Is the best course to install a 32bit mysql? > Should I forgo > CPAN and try to do a manual install where I might have some > control over > Makefile.PL? Has anybody tried biosql on a 64bit system and > does it make a > difference? 
> > Thanks > > Mike > _______________________________________________ > BioSQL-l mailing list > BioSQL-l@open-bio.org > http://open-bio.org/mailman/listinfo/biosql-l > From hlapp at gnf.org Tue Nov 15 21:21:45 2005 From: hlapp at gnf.org (Hilmar Lapp) Date: Tue Nov 15 21:38:32 2005 Subject: [BioSQL-l] biosql & bioperl-db on Solaris In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com> References: <6D9E9B9DF347EF4385F6271C64FB8D5602656B0D@BIONIC.biopolis.one-north.com> Message-ID: <127e679dc3afe7d59f62ed4bd1719450@gnf.org> I believe the DBI docs make a pretty strong statement about perl, your database client library, and DBI/DBD having to be compiled by the same compiler using the same settings. This may be a bit too strong of a statement, but I'd definitely make sure that if you want 64bit in the DBD driver perl is also 64-bit compiled, and that they are all compiled by the same brand of compiler. I.e., you may have to compile perl and the mysql client lib yourself, using either the Sun native compiler, or gcc, for all of them consistently. Also, I'd definitely forgo CPAN for DBI and DBD::mysql; building and installing manually is fairly easy if you don't have dependencies (perl Makefile.PL; make; make test; make install), and the INSTALL or README document will tell you how to control the choice of compiler (often it's like 'make CC=gcc' or some such). -hilmar On Nov 15, 2005, at 5:47 PM, Richard HOLLAND wrote: > Hi, > > This may seem a bit basic but it's worth a go. > > CPAN looks for "cc" - which on a Sun machine is the Sun compiler. If > you > have "gcc" installed then you may need to create a soft-link to it > called "cc" and ensure that softlink appears earlier in your path than > the location of the Sun compiler. You must then export the path or add > it to your profile as CPAN starts new sub-shells internally). > > eg. 
if your gcc binary is /usr/bin/gcc, and assuming a bash-like shell: > > mkdir ~/bin > ln -s /usr/bin/gcc ~/bin/cc > PATH=~/bin:$PATH > export PATH > > Then run CPAN and see what happens. > > Hope it helps. > > cheers, > Richard > > Richard Holland > Bioinformatics Specialist > GIS extension 8199 > --------------------------------------------- > This email is confidential and may be privileged. If you are not the > intended recipient, please delete it and notify us immediately. Please > do not copy or use it for any purpose, or disclose its content to any > other person. Thank you. > --------------------------------------------- > > >> -----Original Message----- >> From: biosql-l-bounces@portal.open-bio.org >> [mailto:biosql-l-bounces@portal.open-bio.org] On Behalf Of >> Mike Muratet >> Sent: Wednesday, November 16, 2005 3:57 AM >> To: biosql-l@open-bio.org >> Cc: bioperl-l@bioperl.org >> Subject: [BioSQL-l] biosql & bioperl-db on Solaris >> >> >> Greetings >> >> This is not strictly a bioperl or biosql issue, but is >> related. In the >> process of trying to install bioperl-db on a Solaris 9 >> system, I had to go >> back and install the latest bioperl, DBI, AND DBD modules. >> The DBD install >> via CPAN is failing for an ELF64 incompatibility when it >> tries to link to >> the mysql client libs because I also installed the >> latest 64bit mysql. According to the mysql docs, 64bit is 4% slower >> than the 32bit version, but you get more threads and memory >> (which would >> seem to be faster in the long run). >> >> Although I have installed gcc and gnu make, CPAN is picking >> up the native >> Sun compiler (and why do they spread things out over so many >> subdirectories?) and linker. >> >> Has anyone else come up against this before (I didn't find >> anything in the >> archives)? Is the best course to install a 32bit mysql? >> Should I forgo >> CPAN and try to do a manual install where I might have some >> control over >> Makefile.PL? 
>> Has anybody tried biosql on a 64bit system and
>> does it make a
>> difference?
>>
>> Thanks
>>
>> Mike
>> _______________________________________________
>> BioSQL-l mailing list
>> BioSQL-l@open-bio.org
>> http://open-bio.org/mailman/listinfo/biosql-l
>>
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l@open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From hrundi.barkshi at googlemail.com Tue Nov 15 17:52:24 2005
From: hrundi.barkshi at googlemail.com (Hrundi Barkshi)
Date: Wed Nov 16 07:46:12 2005
Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL?
Message-ID:

Hi,

I am a beginner to BioSQL and I uploaded (using load_seqdatabase.pl) the
example under the test folder ./t/data:

    perl ~/tmp/biosql/load_seqdatabase.pl --dbname BioDB --namespace uniprot --debug --format swiss swiss.da

Now I would like to reconstruct the sequence by querying the db.

I have tried using bioentry2flat but I get the following error:

    $ perl bioentry2flat.pl BioSQL3 Q07021
    Connecting with mysql:BioSQL3:root:dbpass

    ------------- EXCEPTION -------------
    MSG: Attempting to write with no seq!
    STACK Bio::SeqIO::embl::write_seq /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO/embl.pm:394
    STACK Bio::SeqIO::PRINT /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO.pm:661
    STACK toplevel bioentry2flat.pl:69
    -----------------------------------

Any advice?
Thanks in advance
Hrundi

From osborne1 at optonline.net Tue Nov 15 17:37:02 2005
From: osborne1 at optonline.net (Brian Osborne)
Date: Wed Nov 16 07:46:23 2005
Subject: [BioSQL-l] Re: [Bioperl-l] biosql & bioperl-db on Solaris
In-Reply-To:
Message-ID:

Mike,

If I'm understanding this correctly you'd like the build to use gcc. Did
you try things like:

    setenv CC gcc

?
The equivalent of the command above in the CPAN shell would be: cpan>o conf makepl_arg CC=gcc Brian O. > > This is not strictly a bioperl or biosql issue, but is related. In the > process of trying to install bioperl-db on a Solaris 9 system, I had to go > back and install the latest bioperl, DBI, AND DBD modules. The DBD install > via CPAN is failing for an ELF64 incompatibility when it tries to link to > the mysql client libs because I also installed the > latest 64bit mysql. According to the mysql docs, 64bit is 4% slower > than the 32bit version, but you get more threads and memory (which would > seem to be faster in the long run). > > Although I have installed gcc and gnu make, CPAN is picking up the native > Sun compiler (and why do they spread things out over so many > subdirectories?) and linker. > > Has anyone else come up against this before (I didn't find anything in the > archives)? Is the best course to install a 32bit mysql? Should I forgo > CPAN and try to do a manual install where I might have some control over > Makefile.PL? Has anybody tried biosql on a 64bit system and does it make a > difference? > > Thanks > > Mike > _______________________________________________ > Bioperl-l mailing list > Bioperl-l@portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gnf.org Wed Nov 16 11:40:40 2005 From: hlapp at gnf.org (Hilmar Lapp) Date: Wed Nov 16 11:40:00 2005 Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL? In-Reply-To: References: Message-ID: <005be200d2d71eccb72fcfd32c5e9d8a@gnf.org> It doesn't find the sequence. Note that bioentry2flat.pl is mostly for demonstration purposes and not necessarily a production quality script. 
Having said that and observing that it hard-codes the namespace to sprot_hum, which is not the one you used when loading, it should be easy to either modify the script to use a different namespace, or to accept it as a command line, or alternatively to use the same namespace when loading. Let me know if the script still doesn't work if you use the same namespace by any of the options above. -hilmar On Nov 15, 2005, at 2:52 PM, Hrundi Barkshi wrote: > Hi, > > I am a beginner to BioSQL and I upload (Using load_seqdatabase.pl) the > example under the test folder ./t/data > > perl ~/tmp/biosql/load_seqdatabase.pl --dbname BioDB --namespace > uniprot --debug --format swiss swiss.da > > Now I would like to revert the sequence querying the db > > I have tried using bioentry2flat but I get the following error: > > > $ perl bioentry2flat.pl BioSQL3 Q07021Connecting with > mysql:BioSQL3:root:dbpass > > ------------- EXCEPTION ------------- > MSG: Attempting to write with no seq! > STACK Bio::SeqIO::embl::write_seq > /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO/embl.pm:394 > STACK Bio::SeqIO::PRINT /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO.pm:661 > STACK toplevel bioentry2flat.pl:69 > > ----------------------------------- > > Any advise? > Thanks in advance > Hrundi > > _______________________________________________ > BioSQL-l mailing list > BioSQL-l@open-bio.org > http://open-bio.org/mailman/listinfo/biosql-l > -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 ------------------------------------------------------------- From hrundi.barkshi at googlemail.com Wed Nov 16 13:13:01 2005 From: hrundi.barkshi at googlemail.com (Hrundi Barkshi) Date: Thu Dec 1 11:52:18 2005 Subject: [BioSQL-l] how to reconstruct the sequences placed on bioSQL? 
In-Reply-To: <005be200d2d71eccb72fcfd32c5e9d8a@gnf.org>
References: <005be200d2d71eccb72fcfd32c5e9d8a@gnf.org>
Message-ID:

I changed that and set

    $biodbname = 'bioperl';

to match the entry in the database:

    mysql> select * from biodatabase;
    +----------------+---------+-----------+-------------+
    | biodatabase_id | name    | authority | description |
    +----------------+---------+-----------+-------------+
    |              3 | bioperl | NULL      | NULL        |
    +----------------+---------+-----------+-------------+
    1 row in set (0.01 sec)

Now it works perfectly.

Thanks
Hrundi

On 16/11/05, Hilmar Lapp wrote:
> It doesn't find the sequence. Note that bioentry2flat.pl is mostly for
> demonstration purposes and not necessarily a production quality script.
>
> Having said that and observing that it hard-codes the namespace to
> sprot_hum, which is not the one you used when loading, it should be
> easy to either modify the script to use a different namespace, or to
> accept it as a command line, or alternatively to use the same namespace
> when loading.
>
> Let me know if the script still doesn't work if you use the same
> namespace by any of the options above.
>
> -hilmar
>
> On Nov 15, 2005, at 2:52 PM, Hrundi Barkshi wrote:
>
> > Hi,
> >
> > I am a beginner to BioSQL and I upload (Using load_seqdatabase.pl) the
> > example under the test folder ./t/data
> >
> > perl ~/tmp/biosql/load_seqdatabase.pl --dbname BioDB --namespace
> > uniprot --debug --format swiss swiss.da
> >
> > Now I would like to revert the sequence querying the db
> >
> > I have tried using bioentry2flat but I get the following error:
> >
> >
> > $ perl bioentry2flat.pl BioSQL3 Q07021Connecting with
> > mysql:BioSQL3:root:dbpass
> >
> > ------------- EXCEPTION -------------
> > MSG: Attempting to write with no seq!
> > STACK Bio::SeqIO::embl::write_seq > > /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO/embl.pm:394 > > STACK Bio::SeqIO::PRINT /usr/lib/perl5/site_perl/5.8.6/Bio/SeqIO.pm:661 > > STACK toplevel bioentry2flat.pl:69 > > > > ----------------------------------- > > > > Any advise? > > Thanks in advance > > Hrundi > > > > _______________________________________________ > > BioSQL-l mailing list > > BioSQL-l@open-bio.org > > http://open-bio.org/mailman/listinfo/biosql-l > > > -- > ------------------------------------------------------------- > Hilmar Lapp email: lapp at gnf.org > GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 > ------------------------------------------------------------- > > >
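
The problem in this thread came down to the script looking up a namespace (sprot_hum) that did not match any name in the biodatabase table (bioperl). A minimal sketch of that check follows, written for a POSIX shell. Note that namespace_exists is a hypothetical helper, not part of BioSQL or bioperl-db, and the sample names are simply the two that appear in the messages above.

```shell
#!/bin/sh
# namespace_exists is a hypothetical helper, not part of BioSQL or bioperl-db.
# It reads namespace names from stdin, one per line, and succeeds only if
# the name given as $1 matches one of them exactly.
namespace_exists() {
    # -q: no output, -x: whole-line match, -F: fixed string (no regex)
    grep -qxF -- "$1"
}

# Sample data mirroring the thread: the table holds only "bioperl",
# while the script was hard-coded to look in "sprot_hum".
stored_names='bioperl'

for ns in sprot_hum bioperl; do
    if printf '%s\n' "$stored_names" | namespace_exists "$ns"; then
        echo "$ns: found in biodatabase"
    else
        echo "$ns: not in biodatabase"
    fi
done
```

Against a live database, the names could probably be piped in from the mysql client instead of the sample variable, for example: mysql -N -B -e 'SELECT name FROM biodatabase' BioSQL3 | namespace_exists bioperl (connection options omitted; the database name BioSQL3 is taken from the thread).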