From mad at biol.unlp.edu.ar Fri Apr 6 15:01:50 2001
From: mad at biol.unlp.edu.ar (Sarachu Martin)
Date: Fri, 6 Apr 2001 15:01:50 -0400
Subject: wEMBOSS - web interface for EMBOSS
Message-ID: <200104061901.f36J1oD02783@nahuel.biol.unlp.edu.ar>

Hi,

I'm writing to announce wEMBOSS, a new web interface for
EMBOSS. This interface was completely developed at the
Argentinian EMBNet node
Homepage at http://sol.biol.unlp.edu.ar/wEMBOSS
You can test the interface at
http://sol.biol.unlp.edu.ar/cgi-bin/embnet/wEMBOSS

Martín

From dmartin at davasg4.msiwtb.dundee.ac.uk Sat Apr 7 12:20:31 2001
From: dmartin at davasg4.msiwtb.dundee.ac.uk (David Martin)
Date: Sat, 7 Apr 2001 17:20:31 +0100
Subject: wEMBOSS - web interface for EMBOSS
In-Reply-To: <200104061901.f36J1oD02783@nahuel.biol.unlp.edu.ar>
Message-ID:

On Fri, 6 Apr 2001, Sarachu Martin wrote:

> Hi,
>
> I'm writing to announce wEMBOSS, a new web interface for
> EMBOSS. This interface was completely developed at the
> Argentinian EMBNet node
> Homepage at http://sol.biol.unlp.edu.ar/wEMBOSS
> You can test the interface at
> http://sol.biol.unlp.edu.ar/cgi-bin/embnet/wEMBOSS

I get a 'not found' error.. I like the idea though..

..d

>
> Martín
>

From dmartin at davasg4.msiwtb.dundee.ac.uk Thu Apr 12 14:56:11 2001
From: dmartin at davasg4.msiwtb.dundee.ac.uk (David Martin)
Date: Thu, 12 Apr 2001 19:56:11 +0100
Subject: EMBOSS on OS X
In-Reply-To:
Message-ID:

On Wed, 11 Apr 2001, Jean-Christophe Ame wrote:

> I have installed EMBOSS on MacOSX since the release of the public
> beta and it works like a charm. There is no problem installing it,
> one needs just to remove -lnsl everywhere in the configure file.

Could you give me a detailed description and I will plug it in to the next
edition of the admin guide, i.e. a step-by-step process. I don't have a Mac
yet (though I should be getting one soon).

Maybe the EMBOSS developers could add something to the configure script
to deal with OS X?
..d

> Jean-Christophe
>
>
> >David Martin wrote (to the EMBOSS list):
> >>
> >> Has anyone tried compiling and running EMBOSS on MacOS X yet? I can
> >> imagine this being very nice with the X interfaces (Colimate etc).
> >
> >And Richard Phelps wrote (to me personally):
> >>
> >> How might I install emboss on Mac Os X??
> >
> >Well, personally, I think I've touched the keyboard of a Mac maybe five times
> >in my life (they're not common here at all), so it's highly unlikely I'll
> >have the pleasure of using MacOS X any time soon.
> >
> >However:
> >
> >- It would be interesting to know how easily an application with a relatively
> >  GNUish installation process compiles and installs on OS X. I would
> >  personally be very interested in feedback. (Shouldn't be too difficult by
> >  all reports.)
> >
> >- Secondly, there is a project underway (though it's in its early stages yet)
> >  to unify packaging systems across a number of BSD-based operating systems,
> >  including OS X. In fact, Wilfredo Sanches, the former design lead on OS X,
> >  is a developer on this project. Should their work come to fruition, it's
> >  possible that the existing FreeBSD port could eventually serve as the basis
> >  for installing EMBOSS on OS X as well. :-) See:
> >
> >  http://www.openpackages.org/
> >
> >-- Johann
>
> --
> ________________________
> Jean-Christophe Amé, PhD
> U.P.R. 9003 du CNRS - Cancérogénèse Et Mutagénèse Moléculaire Et Structurale
> École Supérieure De Biotechnologie De Strasbourg
> Pôle API
> Boulevard Sébastien-Brant
> 67400 Illkirch
> France
>
> tel.: 03 90 24 47 05
> Fax.: 03 90 24 46 86
>

From dmartin at davasg4.msiwtb.dundee.ac.uk Thu Apr 12 19:28:03 2001
From: dmartin at davasg4.msiwtb.dundee.ac.uk (David Martin)
Date: Fri, 13 Apr 2001 00:28:03 +0100
Subject: EMBOSS on OS X
In-Reply-To:
Message-ID:

On Thu, 12 Apr 2001, Jean-Christophe Ame wrote:

> Thanks for the reply.

I'm forwarding this to emboss-dev to see if they want to pick it up.
I'll wait till my Mac arrives then try it.. and be prepared for a few more
questions.

..d

> Hi David,
>
>
> For MacOSX to compile EMBOSS you need:
>
> first in the configure file:
>
> 1. remove all the -lnsl calls from the configure file, because Mac OSX
>    doesn't recognise them (don't ask me why)
> 2. add dylib at this position: (this is for configure to recognize X
>    windows - I use Xtools from TENON at http://www.tenon.com)
>
> if (xmkmf) >/dev/null 2>/dev/null && test -f Makefile; then
>   # GNU make sometimes prints "make[1]: Entering...", which would confuse us.
>   eval `${MAKE-make} acfindx 2>/dev/null | grep -v make`
>   # Open Windows xmkmf reportedly sets LIBDIR instead of USRLIBDIR.
>   for ac_extension in a dylib so sl; do
>     if test ! -f $ac_im_usrlibdir/libX11.$ac_extension &&
>        test -f $ac_im_libdir/libX11.$ac_extension; then
>       ac_im_usrlibdir=$ac_im_libdir; break
>     fi
>   done
>   # Screen out bogus values from the imake configuration. They are
>   # bogus both because they are the default anyway, and because
>
> and
>
> do
>   for ac_extension in a dylib so sl; do
>     if test -r $ac_dir/lib${x_direct_test_library}.$ac_extension; then
>       ac_x_libraries=$ac_dir
>       break 2
>     fi
>   done
> done
> fi
>
> Then, you need to copy some config.* files from the system for
> configure to work:
>
> $ cp /usr/libexec/config.x .
>
> I think the config.guess and config.sub must be updated in emboss.
>
> Then,
>
> $ ./configure
>
> $ make
>
> Also, MacOSX doesn't like the -lm library (it is included in the
> system). It needs to be removed in the configure files where it
> appears (I think in MSE's package). Otherwise it works very well.
>
> One time I succeeded in doing
> aclocal -I m4
> autoconf
> automake -a
> ./configure --disable-shared
> make
>
> But, it doesn't work anymore as aclocal complains ???
>
> Hope that helps.
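
Step 1 above (and the later remark about -lm) could be scripted rather than done by hand. What follows is only a rough sketch in Python, assuming the flags appear in configure as separate tokens such as LIBS="-lnsl $LIBS"; strip_link_flags is a hypothetical helper, not something shipped with EMBOSS, and the result should be checked against the real configure file before use.

```python
import re


def strip_link_flags(text, flags=("-lnsl", "-lm")):
    """Remove standalone linker flags from the text of a configure script.

    Assumption: each flag is a separate token. A flag that is merely a
    prefix of a longer one (for example -lmysql) is left untouched.
    """
    for flag in flags:
        # The lookarounds reject matches glued to other word characters or
        # dashes; one trailing space is swallowed to avoid double blanks.
        pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-]) ?"
        text = re.sub(pattern, "", text)
    return text


if __name__ == "__main__":
    sample = 'LIBS="-lnsl $LIBS"\nLIBS="-lX11 -lm"\n'
    print(strip_link_flags(sample), end="")
```

Working on a copy of configure and diffing the result seems the safer way to apply it, since -lm may appear in places where it should stay.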
> Jean-Christophe
>
> --
> ________________________
> Jean-Christophe Amé, PhD
> U.P.R. 9003 du CNRS - Cancérogénèse Et Mutagénèse Moléculaire Et Structurale
> École Supérieure De Biotechnologie De Strasbourg
> Pôle API
> Boulevard Sébastien-Brant
> 67400 Illkirch
> France
>
> tel.: 03 90 24 47 05
> Fax.: 03 90 24 46 86
>

From dmartin at gen67172.msiwtb.dundee.ac.uk Tue Apr 17 12:40:39 2001
From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin)
Date: Tue, 17 Apr 2001 17:40:39 +0100 (BST)
Subject: EMBOSS 1.12.0
Message-ID:

On Tue, 17 Apr 2001 ableasby at hgmp.mrc.ac.uk wrote:

>
> EMBOSS 1.12.0 contains the following new programs in addition to
> the usual load of library mods/fixes/etc.
>
> Distmat: Creates a distance matrix from multiple alignments of nucleotide or
> protein sequences. The sequences need to be aligned before running
> this program. The quality of the alignment is of paramount importance
> in obtaining meaningful information from this analysis.

This looks interesting. Could I suggest some refinements?

1. Add the sequence count to the top of the output.
2. Substitution matrices for proteins.

Can this matrix file be read in again by an EMBOSS library routine to
reconstruct the matrix without having to write a new parser? I might be
tempted to write an additivity/ultrametry checker.

> Charge: Simple sliding window plot of charge vs position in a protein
> sequence.
> > Cai: Codon adaptive index calculation. A measurement of the level of > usage of the 64 codons in a sequence. > And again, every application has absolutely no trace information in the output such as date and time it was run, by whom, and with which datafiles/parameters. Is there a way to get a full list of parameters used in an EMBOSS application (ie everything that could be specified on the command line.. ..d From peter.rice at uk.lionbioscience.com Wed Apr 18 12:56:32 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Wed, 18 Apr 2001 17:56:32 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADDC740.47B48CF3@uk.lionbioscience.com> David Martin wrote: > And again, every application has absolutely no trace information in the > output such as date and time it was run, by whom, and with which > datafiles/parameters. Is there a way to get a full list of parameters > used in an EMBOSS application (ie everything that could be specified on > the command line.. Interesting. It would be very useful to add this. Needs a little thought about the best time to write everything. The output file is opened during ACD processing, but before all the user prompts have been done so it would need to be some post-processing. Maybe we should post-process all output files at the end of ajAcdInit and write this kind of header. How much detail would you need? All options would probably be excessive. Maybe only those options set by the user (by answering the prompt or on the command line)? A command line qualifier for output files could toggle the level of detail (or even turn it off). ... and of course we can do the same things for feature reports when they are committed ... 
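
To make the header idea concrete, here is a rough sketch in Python of a header that records only the options the user actually set, i.e. enough to rebuild the command line. Everything in it is an assumption for illustration: trace_header, the field names and the "# " comment prefix are invented here and are not part of EMBOSS or the AJAX library, and the option names and file names in the usage example are made up.

```python
import getpass
import shlex
from datetime import datetime


def trace_header(program, user_options, when=None, user=None, prefix="# "):
    """Return comment lines describing how an output file was produced.

    user_options holds only the qualifiers the user set, in order, so the
    Commandline entry is enough to repeat the run.
    """
    when = when or datetime.now()
    user = user or getpass.getuser()
    # Quote each value so the recorded command line can be pasted into a shell.
    args = " ".join(
        f"-{name} {shlex.quote(str(value))}" for name, value in user_options.items()
    )
    lines = [
        f"Program: {program}",
        f"Rundate: {when:%Y-%m-%d %H:%M:%S}",
        f"User: {user}",
        f"Commandline: {program} {args}".rstrip(),
    ]
    return "\n".join(prefix + line for line in lines) + "\n"


if __name__ == "__main__":
    # Hypothetical run: option names and file names are illustrative only.
    print(trace_header("distmat", {"sequence": "aln.msf", "outfile": "out.distmat"}), end="")
```

Passing an empty prefix, or skipping the call altogether, would correspond to the toggle mentioned above for users who want the header shortened or turned off.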
--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From dmartin at gen67172.msiwtb.dundee.ac.uk Wed Apr 18 13:15:18 2001
From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin)
Date: Wed, 18 Apr 2001 18:15:18 +0100 (BST)
Subject: EMBOSS 1.12.0
In-Reply-To: <3ADDC740.47B48CF3@uk.lionbioscience.com>
Message-ID:

On Wed, 18 Apr 2001, Peter Rice wrote:

> David Martin wrote:
> > And again, every application has absolutely no trace information in the
> > output such as date and time it was run, by whom, and with which
> > datafiles/parameters. Is there a way to get a full list of parameters
> > used in an EMBOSS application (ie everything that could be specified on
> > the command line..
>
> Interesting. It would be very useful to add this. Needs a little thought
> about the best time to write everything. The output file is opened during
> ACD processing, but before all the user prompts have been done so it would
> need to be some post-processing.
>
> Maybe we should post-process all output files at the end of ajAcdInit and
> write this kind of header.
>
> How much detail would you need? All options would probably be excessive.
> Maybe only those options set by the user (by answering the prompt or on the
> command line)?

I would take all options specified in the ACD file plus any specific
associated parameters so that essentially everything is stored. Some sort
of XML would be nice and easy..

Eangles.dat no ...

For the meantime it could be put behind # to hide it from post-processing
programs, but when EMBOSS outputs its data in a standard object model that
could be XML, it has ready-made data management info built in. (cheap XML
output as unparsed character data ... <![CDATA[ ..output here ]]>)

>
> A command line qualifier for output files could toggle the level of detail
> (or even turn it off).

or even an environment variable

>
> ...
and of course we can do the same things for feature reports when they > are committed ... indeed.. XML output as a standard option for all programs would be a very nice thing. It requires some reworking of how results are handled though to actually think 'object' rather than 'essay'. ..d From peter.rice at uk.lionbioscience.com Wed Apr 18 13:21:58 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Wed, 18 Apr 2001 18:21:58 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADDCD36.21861BAD@uk.lionbioscience.com> David Martin wrote: > I would take all options specified in the ACD file plus any specific > associated parameters so that essentially everything is stored. Very messy for applications with a long list of (possibly hidden) options. Could maybe cover all the required options (the ones that get prompted for). I would prefer just the ones that were set (i.e. enough to build a command line to repeat the run) > XML output as a standard option for all programs would be a very nice > thing. It requires some reworking of how results are handled though to > actually think 'object' rather than 'essay'. Do you have some standard XML in mind? Whenever I look into this I see a forest of DTDs and no clear standard. Of course, we could invent one... -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 03:55:34 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 08:55:34 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADDCD36.21861BAD@uk.lionbioscience.com> Message-ID: On Wed, 18 Apr 2001, Peter Rice wrote: > David Martin wrote: > > I would take all options specified in the ACD file plus any specific > > associated parameters so that essentially everything is stored. > > Very messy for applications with a long list of (possibly hidden) options. 
> Could maybe cover all the required options (the ones that get prompted > for). I would prefer just the ones that were set (i.e. enough to build a > command line to repeat the run) But what if the defaults have been modified in the ACD? It is about repeatability of experiments. Of course one could have an option to have either a minimal header or (as is now) no header at all. I think the current situation is poor scientific practice (ie the data isn't self documenting.. We were always taught to label and date everything, photos, spec traces, column traces, tubes in racks etc. so you _know_ what it is because it knows what it is.) Keeping just the command line and prompted options would be a minimal header (along with program name and date.) For full record keeping a full record should be kept. This can easily be parsed out by any vaguely competent unix hack and allows for proper traceability. > > > XML output as a standard option for all programs would be a very nice > > thing. It requires some reworking of how results are handled though to > > actually think 'object' rather than 'essay'. > > Do you have some standard XML in mind? Whenever I look into this I see a > forest of DTDs and no clear standard. Of course, we could invent one... Most of the DTD's out there are crap. I would go with one of our own to start with. If designed well this can easily be transformed into any format desired (HTML etc.) thus removing from the application writer the need to cope with every possible format. A post-process transformation can allow the user to specify an XSL stylesheet of choice (or DSSSL if that route is chosen). 
If EMBOSS follows the following scheme: ACD parse V Do the Science V Generate output in XML (full record) V Transform to the users desired form Then you can have the 'classical' form of output ie plain text, HTML, short XML output, full (untransformed) output, SQL, MS Word (only kidding), graphical output in whatever form (How about moving EMBOSS graphics from PL_PLOT to SVG?) It adds a layer of output independence to EMBOSS, provides a standard interface for external apps to pick up emboss output (think of all the fun you are having in dealing with unstandardised output models for the various programs and the joy of James Bonfield getting EMBOSS into SPIN.) This is not really very different to what happens with sequences which have their own internal representation and are transformed to the required format on output. ..d From peter.rice at uk.lionbioscience.com Thu Apr 19 04:02:11 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Thu, 19 Apr 2001 09:02:11 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADE9B83.B2922583@uk.lionbioscience.com> David Martin wrote: >For full record keeping a full record should be kept. This can easily be >parsed out by any vaguely competent unix hack and allows for proper >traceability. Although we do also need to consider the user who will want to read the output. An option to skip (or to limit the size of) the header, definable by an environment variable (or .embossrc configuration) can keep them happy though. > Most of the DTD's out there are crap. I would go with one of our own to > start with. If designed well this can easily be transformed into any > format desired (HTML etc.) thus removing from the application writer the > need to cope with every possible format. A post-process transformation > can allow the user to specify an XSL stylesheet of choice (or DSSSL if > that route is chosen). 
> > If EMBOSS follows the following scheme: > > ACD parse > V > Do the Science > V > Generate output in XML (full record) > V > Transform to the users desired form > > Then you can have the 'classical' form of output ie plain text, > HTML, short XML output, full (untransformed) output, SQL, MS Word (only > kidding), graphical output in whatever form (How about moving EMBOSS > graphics from PL_PLOT to SVG?) MS Word? That was originally planned - output files were going to have titles and headers and paragraphs so they could be written as plain text, HTML or RTF but the print functions were never implemented. Would be easy to do though :-) > This is not really very different to what happens with sequences which > have their own internal representation and are transformed to the required > format on output. Yup, that's what output reports will be for. A limited number of output formats for features, alignments, and so on - with the chance to choose the format you like best. -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 04:15:50 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 09:15:50 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADE9B83.B2922583@uk.lionbioscience.com> Message-ID: On Thu, 19 Apr 2001, Peter Rice wrote: > David Martin wrote: > >For full record keeping a full record should be kept. This can easily be > >parsed out by any vaguely competent unix hack and allows for proper > >traceability. > > Although we do also need to consider the user who will want to read the > output. An option to skip (or to limit the size of) the header, definable > by an environment variable (or .embossrc configuration) can keep them happy > though. EMBOSS_FORMAT fasta does this for sequences. 
There is no reason why the default output cannot be 'classic' (or for
die-hard MS shops, 'powerpoint' or 'msword97'). XSLT looks to be very easy
to do. Just write a suitable style sheet and add it to the local
implementation. It would even allow the data to be checksummed and
watermarked if required.. (but the end admin can implement this quite
easily if given a standard output to work with). For an automated run where
the data needs to be warehoused and traceable the system can add another
command line option for post processing. It also allows remote operation of
EMBOSS programs with the end format being determined at the user's point of
contact.

>
> > Most of the DTD's out there are crap. I would go with one of our own to
> > start with. If designed well this can easily be transformed into any
> > format desired (HTML etc.) thus removing from the application writer the
> > need to cope with every possible format. A post-process transformation
> > can allow the user to specify an XSL stylesheet of choice (or DSSSL if
> > that route is chosen).
> >
> > If EMBOSS follows the following scheme:
> >
> >   ACD parse
> >       V
> >   Do the Science
> >       V
> >   Generate output in XML (full record)
> >       V
> >   Transform to the users desired form
> >
> > Then you can have the 'classical' form of output ie plain text,
> > HTML, short XML output, full (untransformed) output, SQL, MS Word (only
> > kidding), graphical output in whatever form (How about moving EMBOSS
> > graphics from PL_PLOT to SVG?)
>
> MS Word? That was originally planned - output files were going to have
> titles and headers and paragraphs so they could be written as plain text,
> HTML or RTF but the print functions were never implemented. Would be easy
> to do though :-)

Then dump it to XML, use XSLT to generate the various formats.

If EMBOSS can generate premade powerpoint slides of the output then it
really is on to a winner (because scientists are lazy beasts..)
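
The "full record in XML, then transform" scheme can be illustrated with a small Python sketch. Note the assumptions: the element names (emboss_run, parameter, result) are invented for this example and are not a proposed DTD, the values are made up, and because the Python standard library has no XSLT processor a plain function stands in for the stylesheet step.

```python
import xml.etree.ElementTree as ET


def build_record(program, parameters, results):
    """Build the full XML record for one run (element names are hypothetical)."""
    run = ET.Element("emboss_run", program=program)
    params = ET.SubElement(run, "parameters")
    for name, value in parameters.items():
        ET.SubElement(params, "parameter", name=name).text = str(value)
    out = ET.SubElement(run, "results")
    for name, value in results.items():
        ET.SubElement(out, "result", name=name).text = str(value)
    return run


def to_classic_text(run):
    """Stand-in for a stylesheet: render the record as 'classic' plain text."""
    lines = [f"# {run.get('program')}"]
    lines += [f"# {p.get('name')} = {p.text}" for p in run.iter("parameter")]
    lines += [f"{r.get('name')}\t{r.text}" for r in run.iter("result")]
    return "\n".join(lines)


# Made-up run: the program writes XML once, every other format derives from it.
record = build_record("cai", {"seqall": "gene.fasta"}, {"CAI": 0.71})
xml_text = ET.tostring(record, encoding="unicode")
classic = to_classic_text(ET.fromstring(xml_text))

if __name__ == "__main__":
    print(xml_text)
    print(classic)
```

The application only ever writes the record; HTML, SQL or any local format would be further functions (or real stylesheets) applied to the same XML.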
>
> > This is not really very different to what happens with sequences which
> > have their own internal representation and are transformed to the required
> > format on output.
>
> Yup, that's what output reports will be for. A limited number of output
> formats for features, alignments, and so on - with the chance to choose the
> format you like best.
>

Indeed, with extensibility for local customisation so I can generate an MS
Word document with local crest, font, format etc should I so desire, or a
web page, or LaTeX output, or have it automatically generate a tutorial
script that shows the command line used and all the prompted options as it
would appear when driven from the command line, or SQL to go straight into
the LIMS system, or email to the user, or ...

All down to the creativity of the admin who is working with a documented
output format for which there are plenty of tools available to do the work.

(EMBOSS still doesn't support BoulderIO format for sequences a la
primer3.)

..d

From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 05:57:58 2001
From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin)
Date: Thu, 19 Apr 2001 10:57:58 +0100 (BST)
Subject: EMBOSS 1.12.0
In-Reply-To: <3ADEB335.C61A08C6@uk.lionbioscience.com>
Message-ID:

On Thu, 19 Apr 2001, Peter Rice wrote:

> David,
>
> >Then dump it to XML, use XSLT to generate the various formats.
>
> Is there a DTD for this? XSLT is still new to me, but I guess I should
> learn...

XSLT is a description of rules for transforming an XML document into an
arbitrary format/mess, including running external programs on part of the
data. Have a look at the XML.com or w3c.org sites for more info. It looks
quite amenable.

>
> >If EMBOSS can generate premade powerpoint slides of the output then it
> >really is on to a winner (because scientists are lazy beasts..)
>
> Again, how?
> I guess it should be relatively easy, apart from designing the
> background image :-)
>

If you can generate word documents then surely powerpoint is not too hard..

> >EMBOSS still doesn't support BoulderIO format for sequences a la
> >primer3.
>
> True. Nor GFF with sequences in the header (though I am just implementing
> that one to make reading GFF files easier)
>
> Does anyone have the BoulderIO format description?
>

Lincoln Stein? Have a look also at
www.no.embnet.org/Programs/SAL/primer05.php3 for primer 0.5

..d

>

From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 08:27:09 2001
From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin)
Date: Thu, 19 Apr 2001 13:27:09 +0100 (BST)
Subject: Project meeting minutes?
Message-ID:

Are you still having meetings or have people given up talking to one
another? The last minutes are from Feb 26th..

..d

From David.Lapointe at umassmed.edu Thu Apr 19 09:23:17 2001
From: David.Lapointe at umassmed.edu (Lapointe, David)
Date: Thu, 19 Apr 2001 09:23:17 -0400
Subject: EMBOSS 1.12.0
Message-ID:

The BoulderIO info is at http://stein.cshl.org/software/boulder/

Also an alternative to DTD is XML Schema though that is still a work in
progress.

David

From jkb at mrc-lmb.cam.ac.uk Fri Apr 27 12:52:44 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Fri, 27 Apr 2001 17:52:44 +0100
Subject: EMBOSS program groups
Message-ID: <20010427175244.A15923@arran.mrc-lmb.cam.ac.uk>

Hi all,

I'm looking at restructuring the program groups so that there are fewer
groups, or at least making some groups subgroups (cascading menus in my
interface). There seems to be a huge amount of redundancy, with the average
program being in 1.7 groups (ish ;-)). Some entire groups have huge
overlap, such as Motifs and Pattern matching.

What are people's thoughts on this? I'm willing to submit my changes back
to the emboss team, but how many people are likely to be affected by this?
We need this ourselves as the menus are simply too long at present, and
it's virtually impossible to find anything.

On a different note, I see that there's also a huge redundancy in programs.
How should users choose between them? E.g., as a novice user of emboss, how
would I know to use stretcher over needle? And similarly for water vs
matcher.

The documentation for needle implies that needle is for short sequences and
stretcher is for longer sequences.
However stretcher uses the Myers and Miller algorithm, which claims to be a
full Needleman-Wunsch alignment algorithm anyway (but with linear-space
memory requirements). Similarly matcher vs needle - both claim to be
rigorous algorithms, but matcher uses less memory.

It seems that so far the reason for redundancy is simply that multiple
authors have submitted programs to do identical tasks, with different
names. Would it be treading on people's toes too much to suggest that some
of this redundancy should be removed? In the cases outlined above, assuming
the results really are comparable, then it seems clear that the more
memory-hungry versions should go.

James

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 10:44:37 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Mon, 30 Apr 2001 15:44:37 +0100
Subject: Proposal: new menu layout, with examples
Message-ID: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk>

Now that several interfaces to EMBOSS are arriving I feel that it is
important to have a well structured menu system (as indicated in my last
email).

I now have a proposed layout, included as an attachment. It may be hard to
see the problem with the existing setup, so I've produced a tcl/tk app to
display a menu tree and created menu specs for how it is now (1.11.0, so
nearly now) and my new proposed setup. My new layout still has quite a few
questionable choices, and perhaps I could duplicate more often (although
personally I dislike that) and split into more cascading menus. However
it's a start[1].
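
For anyone who would rather script against the menu specs than use the tcl/tk viewer, here is a small Python sketch of a parser. It assumes the syntax seen in the attachment to this message ("+" opens a group, "=" names a program, "-" closes the current group) and assumes one such token per line, which the archive does not confirm; parse_menu and the dictionary layout are hypothetical and not part of EMBOSS or Spin.

```python
def parse_menu(spec):
    """Parse '+ Group' / '= program' / '-' lines into a nested menu tree."""
    root = {"name": None, "groups": [], "programs": []}
    stack = [root]  # the innermost open group is stack[-1]
    for raw in spec.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("+"):
            group = {"name": line[1:].strip(), "groups": [], "programs": []}
            stack[-1]["groups"].append(group)
            stack.append(group)
        elif line.startswith("="):
            stack[-1]["programs"].append(line[1:].strip())
        elif line == "-":
            if len(stack) == 1:
                raise ValueError("unbalanced '-' in menu spec")
            stack.pop()
        else:
            raise ValueError(f"unrecognised line: {raw!r}")
    if len(stack) != 1:
        raise ValueError("menu spec ends with an unclosed group")
    return root


if __name__ == "__main__":
    demo = "+Comparison\n + Global alignment\n = stretcher\n = needle\n -\n-\n"
    tree = parse_menu(demo)
    print(tree["groups"][0]["groups"][0]["programs"])
```

With the tree in hand, counting how many groups each program appears in is a short walk over the nested "programs" lists.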
So if you're interested in developing interfaces to EMBOSS, or you are one
of the developers, please download the following:

ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/private/emboss_dev/em.tar.gz

See the README file for information on how to run the tcl/tk app. It ought
to work OK even on pretty old tcl/tk installations.

James

[1] I should also point out that in "Spin" the menu labels are actually the
program doc lines with the program command name in brackets (eg "Finds DNA
inverted repeats (einverted)"). However I haven't addressed the issue of a
common language style for doc names yet.

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
-------------- next part --------------
+Comparison
  + Global alignment
    = stretcher
    = needle
  -
  + Local alignment
    = simplesw
    = water
    = matcher
    = supermatcher
    = est2genome
  -
  + Multiple alignment
    = cluster
    = emma
  -
  + Dot plots
    = dotmatcher
    = dotpath
    = dottup
    = polydot
  -
  + Other
    = megamerger
    = merger
    = diffseq
    = plotcon
    = seqmatchall
    = wordmatch
    = prettyplot
  -
-
+Translation
  = backtranseq
  = checktrans
  = coderet
  = cusp
  = getorf
  = plotorf
  = prettyseq
  = showorf
  = showseq
  = transeq
-
+Databases
  + Indexing
    = dbiblast
    = dbifasta
    = dbiflat
    = dbigcg
    = showdb
  -
  + SCOP
    = nrscope
    = scope
    = stamps
  -
-
+Edit
  = cutseq
  = degapseq
  = descseq
  = extractseq
  = maskfeat
  = maskseq
  = msbar
  = newseq
  = noreturn
  = nthseq
  = pasteseq
  = revseq
  = shuffleseq
  = splitter
  = trimseq
  = vectorstrip
-
+Search
  + Motifs/pattern matching
    = dreg
    = fuzznuc
    = fuzzpro
    = fuzztran
    = helixturnhelix
    = patmatdb
    = patmatmotifs
    = preg
    = printsextract
    = prosextract
    = pscan
    = tfscan
  -
  + Repeats
    = einverted
    = equicktandem
    = etandem
    = palindrome
  -
  + Profiles
    = cons
    = profit
    = prophecy
    = prophet
  -
  + Primers
    = prima
    = primersearch
    = stssearch
  -
  + CPG islands
    = cpgplot
    = cpgreport
    = newcpgreport
    = newcpgseek
  -
  + Restriction enzymes
    = recode
    = redata
    = remap
    = restover
    = restrict
    = silent
  -
  + Enzymes
    = findkm
  -
  + Misc
    = textsearch
  -
-
+Composition
  + DNA
    = banana
    = btwisted
    = chaos
    = chips
    = codcmp
    = complex
    = compseq
    = dan
    = freak
    = geecee
    = isochore
    = marscan
    = syco
    = wobble
  -
  + Protein
    = antigenic
    = charge
    = compseq
    = digest
    = domainer
    = emowse
    = garnier
    = helixturnhelix
    = hmoment
    = iep
    = octanol
    = oddcomp
    = pepcoil
    = pepinfo
    = pepnet
    = pepstats
    = pepwheel
    = pepwindow
    = pepwindowall
    = sigcleave
    = tmap
  -
-
+Display
  = cirdna
  = entret
  = infoseq
  = lindna
  = notseq
  = prettyplot
  = seqinfo
  = seqret
  = seqretall
  = seqretallfeat
  = seqretfeat
  = seqretset
  = seqretsplit
  = seqrettype
  = seqtofeat
  = showfeat
  = showseq
-
+Utils
  = embossdata
  = entrails
  = rebaseextract
  = seealso
  = tfextract
  = tfm
  = wossname
-
+Test
  = ajbad
  = ajfeatest
  = ajtest
  = ajtest2
  = corbatest
  = demofeatures
  = demolist
  = demosequence
  = demostring
  = demotable
  = histogramtest
  = patmattest
  = plplottest
  = proteinmotifsearch
  = testplot
  = treetypedisplay
-

From peter.rice at uk.lionbioscience.com Mon Apr 30 11:00:32 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Mon, 30 Apr 2001 16:00:32 +0100
Subject: Proposal: new menu layout, with examples
References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk>
Message-ID: <3AED7E10.AE78F038@uk.lionbioscience.com>

James Bonfield wrote:
>
> Now that several interfaces to EMBOSS are arriving I feel that it is
> important to have a well structured menu system (as indicated in my
> last email).
>
> I now have a proposed layout, included as an attachment. It may be
> hard to see the problem with the existing setup, so I've produce a
> tcl/tk to display a menu tree and created menu specs for how it is
> now (1.11.0, so nearly now) and my new proposed setup. My new layout
> still has quite a few questionable choices, and perhaps I could
> duplicate more often (although personally I dislike that) and
> split into more cascading menus.
However it's a start[1]. Looks nice - much nicer than the old emboss_menus_orig version! Should we have 2 classes of groups in ACD files - one for menus with a short name or hierarchy as in this case, and a longer version (called 'keywords', perhaps, and derived from the current 'groups') for wossname to search? > [1] I should also point out that in "Spin" the menu labels are > actually the program doc lines with the program command name > in brackets (eg "Finds DNA inverted repeats (einverted)"). However > I haven't addressed the issue of a common language style for doc > names yet. The text is best for many novice users, but I would maybe prefer the name first (and sorted by name?) for expert users. But of course, it should be up to the users to say what they want :-) -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From gwilliam at hgmp.mrc.ac.uk Mon Apr 30 11:19:18 2001 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Mon, 30 Apr 2001 16:19:18 +0100 Subject: Proposal: new menu layout, with examples References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED8276.D8D4EF8@hgmp.mrc.ac.uk> James Bonfield wrote: > > Now that several interfaces to EMBOSS are arriving I feel that it is important > to have a well structured menu system (as indicated in my last email). Agreed! > I now have a proposed layout, included as an attachment. It may be hard to see > the problem with the existing setup, so I've produce a tcl/tk to display a > menu tree and created menu specs for how it is now (1.11.0, so nearly now) and > my new proposed setup. My new layout still has quite a few questionable > choices, and perhaps I could duplicate more often (although personally I > dislike that) and split into more cascading menus. However it's a start[1]. 
>
> So if you're interested in developing interfaces to EMBOSS, or you are one of
> the developers, please download the following:
>
> ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/private/emboss_dev/em.tar.gz
>
> See the README file for information on how to run the tcl/tk app. It ought to
> work OK even on pretty old tcl/tk installations.

Didn't work for me - I just got the wish window and a '%' prompt.

I looked at your 'emboss_menu*' files though to see what your
reorganisation is like.

Some comments (more minor niggles than anything major):

I like much of the reorganisation.

Firstly, I am a 'splitter', I like having several paths to a program, so
many of my comments related to terseness have been removed from this
version of this message ;-)

Some things seem to be in the wrong place:

I would like to see a 'codon usage' section with things like 'cusp',
'chips' 'syco' 'cai' in.

'nthseq' - not really an 'edit' program - more a 'database organisation'
program

'textsearch' - looks wrong in 'Search:Misc', maybe change to 'Search:Text' ?

'complex' is now deleted

'marscan' finds a MAR site, this is not really a property of DNA
composition - I would like to see a 'Features' menu as well as
'Composition'. 'Features' would hold those programs which
can/should/will write out GFF files. So this should go in 'Features:DNA'

'antigenic' - finds antigenic features so should go in 'Features:Protein'

'emowse' - not really protein composition, nor identification - should
probably go somewhere under 'Search'

'helixturnhelix' - features

'pepnet', 'pepwheel' - more of 'Display' programs than 'Composition' programs?
'garnier', 'tmap' - secondary structure

'notseq' - not a 'Display' program - more a 'database organisation'
program, like 'nthseq'

'tfm', 'wossname' - I would rather see this under 'Help' than lost in
'Utils'

> James
>
> [1] I should also point out that in "Spin" the menu labels are actually the
> program doc lines with the program command name in brackets (eg "Finds DNA
> inverted repeats (einverted)"). However I haven't addressed the issue of a
> common language style for doc names yet.

I enforce a description that has as many keywords as possible and that
fits on a 80 character line when displayed by 'wossname'.

I also try to enforce capital first letter and no ending full-stop (to
shave off an extra character).

I try to capitalise acronyms and database names.

--
Gary Williams  Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk  http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 12:05:40 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Mon, 30 Apr 2001 17:05:40 +0100
Subject: Proposal: new menu layout, with examples
Message-ID: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk>

On Mon, Apr 30, 2001 at 04:00:32PM +0100, Peter Rice wrote:
> Looks nice - much nicer than the old emboss_menus_orig version!
>
> Should we have 2 classes of groups in ACD files - one for menus with a
> short name or hierarchy as in this case, and a longer version (called
> 'keywords', perhaps, and derived from the current 'groups') for wossname to
> search?

Thanks for the comments. That makes sense as we have two specific purposes:
wossname and menus.

I like the idea of being able to re-sort based on expert vs novice. It
shouldn't be too hard to have the choice, but I'm not sure how easy it would
be to do this on-the-fly (at least with my current code).
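
A rough sketch of the novice/expert choice discussed above, in Python purely for illustration (the real app is tcl/tk). The function `menu_labels` and its `style` argument are made up for this example. Only the einverted doc line is taken from this thread; the other two doc strings are placeholders, not the real EMBOSS documentation lines.

```python
from typing import Dict, List

# Program name -> one-line description.
# Only the einverted text is quoted in the thread; the rest are placeholders.
PROGRAM_DOCS: Dict[str, str] = {
    "needle": "Global alignment of two sequences",
    "einverted": "Finds DNA inverted repeats",
    "cusp": "Codon usage table",
}


def menu_labels(docs: Dict[str, str], style: str = "novice") -> List[str]:
    """Build menu labels in one of two hypothetical styles.

    novice: description first, command name in brackets, sorted by description
    expert: command name first, sorted by command name
    """
    if style == "novice":
        labels = [f"{doc} ({name})" for name, doc in docs.items()]
        return sorted(labels, key=str.lower)
    if style == "expert":
        return [f"{name} - {docs[name]}" for name in sorted(docs)]
    raise ValueError(f"unknown style: {style!r}")
```

Because both styles are derived from the same name-to-description table, switching between them on the fly would only mean rebuilding the labels, not the menu hierarchy itself.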
On the topic of groups - just a warning that we'll also be looking at
restructuring the ACD with the 'group:' type, which is now sounding like a
staggeringly confusing choice of word! For example, take vectorstrip.acd:

  group: vecgrp [
    info: "Vector sequence"
    type: frame
  ]

  bool: vectorfile [
    param: Y
    def: Y
    prompt: "Are your vector sequences in a file?"
  ]

  infile: vectors [
    param: Y
    req: @($(vectorfile)?1:0)
    nullok: Y
    def: ""
    prompt: "Name of vectorfile"
  ]

  string: linkerA [
    req: @(!$(vectorfile))
    prompt: "5' sequence"
  ]

  string: linkerB [
    req: @(!$(vectorfile))
    prompt: "3' sequence"
  ]

  endgroup: vecgrp

In the GUI dialogue this comes out as a labelled frame. We also have the
choice of "type: page" to use tabs in a tabbed notebook. This basically
allows grouping of command line options into common areas and hence guides
the user through filling out the various dialogues. (Although mostly the
defaults will suffice.)

This is a large amount of work though so we haven't yet started this, except
where absolutely necessary due to the size of the dialogues (showseq).

James

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk)  Tel: 01223 402499  Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 12:10:12 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Mon, 30 Apr 2001 17:10:12 +0100
Subject: Proposal: new menu layout, with examples
In-Reply-To: <3AED8276.D8D4EF8@hgmp.mrc.ac.uk>; from gwilliam@hgmp.mrc.ac.uk on Mon, Apr 30, 2001 at 04:19:18PM +0100
References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> <3AED8276.D8D4EF8@hgmp.mrc.ac.uk>
Message-ID: <20010430171012.E21424@arran.mrc-lmb.cam.ac.uk>

On Mon, Apr 30, 2001 at 04:19:18PM +0100, Gary Williams, Tel 01223 494522 wrote:
> Some comments (more minor niggles than anything major):

Thanks for this list of comments.
I'll try and reorganise things further.

> Firstly, I am a 'splitter', I like having several paths to a program, so
> many of my comments related to terseness have been removed from this
> version of this message ;-)

Grin. I agree that having multiple paths can be useful for novice users, and
there are a _few_ programs in multiple places (eg compseq) in my new menu
layout. However I consider it to be a trade-off between not being able to
find something and not being able to see it due to too many programs listed.
I guess we just differ on the break-even point :)

> I would like to see a 'codon usage' section with things like 'cusp',
> 'chips' 'syco' 'cai' in.

That makes sense. I'm not really familiar with most of the EMBOSS tools, or
some of the fields it covers either. I'll notify the list when I update the
menu files again.

> I enforce a description that has as many keywords as possible and that
> fits on a 80 character line when displayed by 'wossname'.
>
> I also try to enforce capital first letter and no ending full-stop (to
> shave off an extra character).
>
> I try to capitalise acronyms and database names.

Are such things documented, so that new developers follow the same rules? Of
course it can be rather hard to get anyone to read documentation (myself
included).

James

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk)  Tel: 01223 402499  Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/ From gwilliam at hgmp.mrc.ac.uk Mon Apr 30 12:12:44 2001 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Mon, 30 Apr 2001 17:12:44 +0100 Subject: Proposal: new menu layout, with examples References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> <3AED8276.D8D4EF8@hgmp.mrc.ac.uk> <20010430171012.E21424@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED8EFC.1612B8@hgmp.mrc.ac.uk> James Bonfield wrote: > > I enforce a description that has as many keywords as possible and that > > fits on a 80 character line when displayed by 'wossname'. > > > > I also try to enforce capital first letter and no ending full-stop (to > > shave off an extra character). > > > > I try to capitalise acronyms and database names. > > Are such things documented, so that new developers follow the same rules? Of > course it can be rather hard to get anyone to read documentation (myself > included). No - I just change the documentation after consultation with the author. I've not had any complaints yet. :-) -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From peter.rice at uk.lionbioscience.com Mon Apr 30 13:17:46 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Mon, 30 Apr 2001 18:17:46 +0100 Subject: Proposal: new menu layout, with examples References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED9E3A.383A1649@uk.lionbioscience.com> Hi James, > On the topic of groups - just a warning that we'll also be looking > at restructuring the ACD with the 'group:' type, which is now > sounding like a staggeringly confusing choice of word! How about 'section' and 'endsection' ? 

Peter

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 13:29:43 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Mon, 30 Apr 2001 18:29:43 +0100
Subject: Proposal: new menu layout, with examples
In-Reply-To: <3AED9E3A.383A1649@uk.lionbioscience.com>; from peter.rice@uk.lionbioscience.com on Mon, Apr 30, 2001 at 06:17:46PM +0100
References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> <3AED9E3A.383A1649@uk.lionbioscience.com>
Message-ID: <20010430182943.A21399@arran.mrc-lmb.cam.ac.uk>

On Mon, Apr 30, 2001 at 06:17:46PM +0100, Peter Rice wrote:
> Hi James,
>
> > On the topic of groups - just a warning that we'll also be looking
> > at restructuring the ACD with the 'group:' type, which is now
> > sounding like a staggeringly confusing choice of word!
>
> How about 'section' and 'endsection' ?

Agreed.

What about the attributes? At present I've got:

type    A choice between "frame" and "page". Defaults to frame.
        This controls whether the section should be a labelled frame or a
        page in a notebook. The distinction only really matters for GUI or
        WWW based systems. In Tcl the choice is pretty clear. In WWW I'd
        expect a frame to be a table; a page is trickier to implement (and
        may require javascript), but I've seen such things done regularly.

info    A heading for the frame or page. Defaults to ""
        A page without a heading would look rather odd, but a frame
        without a heading is OK - it's just a bordered block.

book    Only needed for type:frame.
        The notebook to associate this page to. Defaults to "", which
        implies all pages are part of the same notebook. This is only
        needed if a program wishes to make use of more than one tabbed
        notebook, in which case this is used to determine which book a
        page is within.

border  Only needed for type:frame. Defaults to 1.
        The border width of the frame.

side    Only needed for type:frame. Defaults to top.
        This is used for 'frame packing'. It is only needed if we wish
        to express complex layout designs. Eg:

        +----------------------------------+
        | Section1                         |
        | blah                             |
        | blah                             |
        | +-------------+ +--------------+ |
        | | Section2    | | Section3     | |
        | | blah        | | blah         | |
        | +-------------+ +--------------+ |
        +----------------------------------+

I think perhaps the border and side attributes are overkill, and I haven't
yet used them except in test ACD files. Naturally all of these are really
just "hints" to the interface, which may ultimately do whatever it prefers.

James

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk)  Tel: 01223 402499  Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

From peter.rice at uk.lionbioscience.com Mon Apr 30 13:39:23 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Mon, 30 Apr 2001 18:39:23 +0100
Subject: Proposal: new menu layout, with examples
References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> <3AED9E3A.383A1649@uk.lionbioscience.com> <20010430182943.A21399@arran.mrc-lmb.cam.ac.uk>
Message-ID: <3AEDA34B.BF3D0374@uk.lionbioscience.com>

James Bonfield wrote:
> On Mon, Apr 30, 2001 at 06:17:46PM +0100, Peter Rice wrote:
> > How about 'section' and 'endsection' ?
>
> Agreed.
>
> What about the attributes?

(We also, of course, have the name which follows the 'section:')

> At present I've got:
>
> type    A choice between "frame" and "page".

Tricky to test a list of values. frame:Y would be simpler, but we should add
controlled vocabulary types some time (see 'side' below).

> info    A heading for the frame or page. Defaults to ""
>         A page without a heading would look rather odd, but a frame
>         without a heading is OK - it's just a bordered block.

OK. Can also be used to prompt interactive users (e.g. "Gap penalties:" before
prompting for the gap penalty values).
Or we could have a separate 'prompt' for this, and use 'info' in the -help output (tricky though - qualifiers will be in sections but also in required/optional/advanced groupings so maybe -help should be left alone) > book Only needed for type:frame. > The notebook to associate this page to. Defaults to "", which > implies all pages are part of the same notebook. This is only > needed if a program wishes to make use of more than one tabbed > notebook, in which case this is used to determine which book a > page is within. Can we pick a more general name for this? > border Only needed for type:frame. Defaults to 1. > The border width of the frame. OK. > side Only needed for type:frame. Defaults to top. > This is used for 'frame packing'. It is only needed if we wish > to express complex layout designs. Eg: Looks like another controlled vocabulary. Also, looks like I can think of uses for it. Peter -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723
From dmartin at davasg4.msiwtb.dundee.ac.uk Thu Apr 12 23:28:03 2001 From: dmartin at davasg4.msiwtb.dundee.ac.uk (David Martin) Date: Fri, 13 Apr 2001 00:28:03 +0100 Subject: EMBOSS on OS X In-Reply-To: Message-ID: On Thu, 12 Apr 2001, Jean-Christophe Ame wrote: > Thanks for the reply. I'm forwarding this to emboss-dev to see if they want to pick it up. I'll wait till my Mac arrives then try it.. and be prepared for a few more questions. ..d > Hi David, > > > For MacOSX to compile EMBOSS you need: > > first in the configure file: > > 1. remove from the configure file all the -lnsl call, because Mac OSX > doesn't recognise it (don't ask me why) > 2. add dylib at this position: (this is for configure to recognize X > windows - I use Xtools from TENON at http:// www.tenon.com) > > if (xmkmf) >/dev/null 2>/dev/null && test -f Makefile; then > # GNU make sometimes prints "make[1]: Entering...", which would confuse us. > eval `${MAKE-make} acfindx 2>/dev/null | grep -v make` > # Open Windows xmkmf reportedly sets LIBDIR instead of USRLIBDIR. > for ac_extension in a dylib so sl; do > if test !
-f $ac_im_usrlibdir/libX11.$ac_extension &&
>          test -f $ac_im_libdir/libX11.$ac_extension; then
>         ac_im_usrlibdir=$ac_im_libdir; break
>       fi
>     done
>     # Screen out bogus values from the imake configuration. They are
>     # bogus both because they are the default anyway, and because
>
> and
>
>   do
>     for ac_extension in a dylib so sl; do
>       if test -r $ac_dir/lib${x_direct_test_library}.$ac_extension; then
>         ac_x_libraries=$ac_dir
>         break 2
>       fi
>     done
>   done
> fi
>
> Then, you need to copy some config.* files from the system for
> configure to work:
>
> $ cp /usr/libexec/config.x .
>
> I think the config.guess and config.sub must be updated in emboss.
>
> Then,
>
> $ ./configure
>
> $ make
>
> Also, MacOSX doesn't like the -lm library (it is included in the
> system). It needs to be removed in the configure files where it
> appears (I think in MSE's package). Otherwise it works very well.
>
> One time I succeeded in doing
> aclocal -I m4
> autoconf
> automake -a
> ./configure --disable-shared
> make
>
> But, it doesn't work anymore as aclocal complains ???
>
> Hope that will help.
> Jean-Christophe
>
>
>
> >On Wed, 11 Apr 2001, Jean-Christophe Ame wrote:
> >
> >> I have installed EMBOSS on MacOSX since the release of the public
> >> beta and it works like a charm. There is no problem installing it,
> >> one needs just to remove -lnsl everywhere in the configure file.
> >
> >
> >Could you give me a detailed description and I will plug it in to the next
> >edition of the admin guide. ie. a step by step process. I don't have a Mac
> >yet (though I should be getting one soon).
> >
> >Maybe the EMBOSS developers could add something to the configure script
> >to deal with OS X?
> >
> >..d
> >
> >
> >> Jean-Christophe
> >>
> >>
> >> >David Martin wrote (to the EMBOSS list):
> >> >>
> >> >> Has anyone tried compiling and running EMBOSS on MacOS X yet? I can
> >> >> imagine this being very nice with the X interfaces (Colimate etc).
> >> >
> >> >And Richard Phelps wrote (to me personally):
> >> >>
> >> >> How might I install emboss on Mac Os X??
> >> >
> >> >Well, personally, I think I've touched the keyboard of a Mac maybe five times
> >> >in my life (they're not common here at all), so it's highly unlikely I'll
> >> >have the pleasure of using MacOS X any time soon.
> >> >
> >> >However:
> >> >
> >> >- It would be interesting to know how easily an application with a relatively
> >> >  GNUish installation process compiles and installs on OS X. I would
> >> >  personally be very interested in feedback. (Shouldn't be too difficult by
> >> >  all reports.)
> >> >
> >> >- Secondly, there is a project underway (though it's in its early stages yet)
> >> >  to unify packaging systems across a number of BSD-based operating systems,
> >> >  including OS X. In fact, Wilfredo Sanches, the former design lead on OS X,
> >> >  is a developer on this project. Should their work come to fruition, it's
> >> >  possible that the existing FreeBSD port could eventually serve as the basis
> >> >  for installing EMBOSS on OS X as well. :-) See:
> >> >
> >> >  http://www.openpackages.org/
> >> >
> >> >-- Johann
> >>
> >> --
> >> ________________________
> >> Jean-Christophe Amé, PhD
> >> U.P.R. 9003 du CNRS - Cancérogénèse Et Mutagénèse Moléculaire Et Structurale
> >> École Supérieure De Biotechnologie De Strasbourg
> >> Pôle API
> >> Boulevard Sébastien-Brant
> >> 67400 Illkirch
> >> France
> >>
> >> tel.: 03 90 24 47 05
> >> Fax.: 03 90 24 46 86
> >
>
> --
> ________________________
> Jean-Christophe Amé, PhD
> U.P.R.
9003 du CNRS - Cancérogénèse Et Mutagénèse Moléculaire Et Structurale
> École Supérieure De Biotechnologie De Strasbourg
> Pôle API
> Boulevard Sébastien-Brant
> 67400 Illkirch
> France
>
> tel.: 03 90 24 47 05
> Fax.: 03 90 24 46 86
>

From dmartin at gen67172.msiwtb.dundee.ac.uk Tue Apr 17 16:40:39 2001
From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin)
Date: Tue, 17 Apr 2001 17:40:39 +0100 (BST)
Subject: EMBOSS 1.12.0
Message-ID:

On Tue, 17 Apr 2001 ableasby at hgmp.mrc.ac.uk wrote:

>
> EMBOSS 1.12.0 contains the following new programs in addition to
> the usual load of library mods/fixes/etc.
>
> Distmat: Creates a distance matrix from multiple alignments of nucleotide or
>          protein sequences. The sequences need to be aligned before running
>          this program. The quality of the alignment is of paramount importance
>          in obtaining meaningful information from this analysis.

This looks interesting. Could I suggest some refinements?

1. Add the sequence count to the top of the output.
2. Substitution matrices for proteins.

Can this matrix file be read in again by an EMBOSS library routine to
reconstruct the matrix without having to write a new parser? I might be
tempted to write an additivity/ultrametry checker.

> Charge: Simple sliding window plot of charge vs position in a protein
>         sequence.
>
> Cai: Codon adaptive index calculation. A measurement of the level of
>      usage of the 64 codons in a sequence.
>

And again, every application has absolutely no trace information in the
output such as date and time it was run, by whom, and with which
datafiles/parameters. Is there a way to get a full list of parameters
used in an EMBOSS application (ie everything that could be specified on
the command line)?
..d From peter.rice at uk.lionbioscience.com Wed Apr 18 16:56:32 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Wed, 18 Apr 2001 17:56:32 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADDC740.47B48CF3@uk.lionbioscience.com> David Martin wrote: > And again, every application has absolutely no trace information in the > output such as date and time it was run, by whom, and with which > datafiles/parameters. Is there a way to get a full list of parameters > used in an EMBOSS application (ie everything that could be specified on > the command line.. Interesting. It would be very useful to add this. Needs a little thought about the best time to write everything. The output file is opened during ACD processing, but before all the user prompts have been done so it would need to be some post-processing. Maybe we should post-process all output files at the end of ajAcdInit and write this kind of header. How much detail would you need? All options would probably be excessive. Maybe only those options set by the user (by answering the prompt or on the command line)? A command line qualifier for output files could toggle the level of detail (or even turn it off). ... and of course we can do the same things for feature reports when they are committed ... -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From dmartin at gen67172.msiwtb.dundee.ac.uk Wed Apr 18 17:15:18 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Wed, 18 Apr 2001 18:15:18 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADDC740.47B48CF3@uk.lionbioscience.com> Message-ID: On Wed, 18 Apr 2001, Peter Rice wrote: > David Martin wrote: > > And again, every application has absolutely no trace information in the > > output such as date and time it was run, by whom, and with which > > datafiles/parameters. 
Is there a way to get a full list of parameters
> > used in an EMBOSS application (ie everything that could be specified on
> > the command line..
>
> Interesting. It would be very useful to add this. Needs a little thought
> about the best time to write everything. The output file is opened during
> ACD processing, but before all the user prompts have been done so it would
> need to be some post-processing.
>
> Maybe we should post-process all output files at the end of ajAcdInit and
> write this kind of header.
>
> How much detail would you need? All options would probably be excessive.
> Maybe only those options set by the user (by answering the prompt or on the
> command line)?

I would take all options specified in the ACD file plus any specific
associated parameters so that essentially everything is stored.

Some sort of XML would be nice and easy..

Eangles.dat no ...

For the meantime it could be put behind # to hide it from post-processing
programs but when EMBOSS outputs its data in a standard object model that
could be XML, it has a ready made data management info built in.

(cheap XML output as unparsed character data ...
<![CDATA[ ..output here ]]>)

>
> A command line qualifier for output files could toggle the level of detail
> (or even turn it off).

or even an environment variable

>
> ... and of course we can do the same things for feature reports when they
> are committed ...

indeed..

XML output as a standard option for all programs would be a very nice
thing. It requires some reworking of how results are handled though to
actually think 'object' rather than 'essay'.
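
A minimal sketch of what such a '#'-hidden XML header could look like, in Python purely for illustration (EMBOSS itself is C, and nothing like this exists in the library as far as this thread says). The element names `run` and `param`, the function `provenance_header`, and the parameter names used in the usage note are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Dict
from xml.dom import minidom


def provenance_header(program: str, user: str, when: datetime,
                      params: Dict[str, str]) -> str:
    """Return a '#'-prefixed XML block recording who ran what, when, and how."""
    # Hypothetical layout: one <run> element with one <param> child per option.
    run = ET.Element("run", {
        "program": program,
        "user": user,
        "date": when.isoformat(),
    })
    for name, value in params.items():
        ET.SubElement(run, "param", {"name": name}).text = value

    # Pretty-print so each parameter lands on its own line.
    pretty = minidom.parseString(ET.tostring(run)).toprettyxml(indent="  ")

    # Prefix every line with '# ' so programs that skip comments never see it.
    return "\n".join("# " + line for line in pretty.splitlines() if line.strip())
```

Called as `provenance_header("exampleapp", "dmartin", datetime.now(), {"datafile": "Eangles.dat", "plot": "no"})`, this gives a block that a downstream parser can skip as comments, while stripping the leading "# " from each line recovers well-formed XML.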
..d From peter.rice at uk.lionbioscience.com Wed Apr 18 17:21:58 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Wed, 18 Apr 2001 18:21:58 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADDCD36.21861BAD@uk.lionbioscience.com> David Martin wrote: > I would take all options specified in the ACD file plus any specific > associated parameters so that essentially everything is stored. Very messy for applications with a long list of (possibly hidden) options. Could maybe cover all the required options (the ones that get prompted for). I would prefer just the ones that were set (i.e. enough to build a command line to repeat the run) > XML output as a standard option for all programs would be a very nice > thing. It requires some reworking of how results are handled though to > actually think 'object' rather than 'essay'. Do you have some standard XML in mind? Whenever I look into this I see a forest of DTDs and no clear standard. Of course, we could invent one... -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 07:55:34 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 08:55:34 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADDCD36.21861BAD@uk.lionbioscience.com> Message-ID: On Wed, 18 Apr 2001, Peter Rice wrote: > David Martin wrote: > > I would take all options specified in the ACD file plus any specific > > associated parameters so that essentially everything is stored. > > Very messy for applications with a long list of (possibly hidden) options. > Could maybe cover all the required options (the ones that get prompted > for). I would prefer just the ones that were set (i.e. enough to build a > command line to repeat the run) But what if the defaults have been modified in the ACD? It is about repeatability of experiments. 
Of course one could have an option to have either a minimal header or (as is now) no header at all. I think the current situation is poor scientific practice (ie the data isn't self documenting.. We were always taught to label and date everything, photos, spec traces, column traces, tubes in racks etc. so you _know_ what it is because it knows what it is.) Keeping just the command line and prompted options would be a minimal header (along with program name and date.) For full record keeping a full record should be kept. This can easily be parsed out by any vaguely competent unix hack and allows for proper traceability. > > > XML output as a standard option for all programs would be a very nice > > thing. It requires some reworking of how results are handled though to > > actually think 'object' rather than 'essay'. > > Do you have some standard XML in mind? Whenever I look into this I see a > forest of DTDs and no clear standard. Of course, we could invent one... Most of the DTDs out there are crap. I would go with one of our own to start with. If designed well this can easily be transformed into any format desired (HTML etc.) thus removing from the application writer the need to cope with every possible format. A post-process transformation can allow the user to specify an XSL stylesheet of choice (or DSSSL if that route is chosen). If EMBOSS follows the following scheme:

    ACD parse
        V
    Do the Science
        V
    Generate output in XML (full record)
        V
    Transform to the user's desired form

Then you can have the 'classical' form of output ie plain text, HTML, short XML output, full (untransformed) output, SQL, MS Word (only kidding), graphical output in whatever form (How about moving EMBOSS graphics from PL_PLOT to SVG?)
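The last two steps of the scheme above (generate a full XML record, then transform it) can be sketched in a few lines of Python. The element names `run`, `parameter` and `result` are invented for this example, and the transform step is written as ordinary Python functions rather than as XSLT stylesheets, purely to keep the sketch free of external dependencies.

```python
import xml.etree.ElementTree as ET


def build_record(program, params, results):
    """Build a full XML record of one run (element names are made up)."""
    run = ET.Element("run", program=program)
    plist = ET.SubElement(run, "parameters")
    for name, value in params.items():
        ET.SubElement(plist, "parameter", name=name).text = str(value)
    rlist = ET.SubElement(run, "results")
    for item in results:
        ET.SubElement(rlist, "result").text = str(item)
    return run


def to_plain_text(run):
    """Transform the record into a 'classic' plain-text report."""
    lines = [f"Program: {run.get('program')}"]
    for p in run.iter("parameter"):
        lines.append(f"  -{p.get('name')} {p.text}")
    lines += [r.text for r in run.iter("result")]
    return "\n".join(lines)


# The output format is chosen after the science is done, not by the application.
TRANSFORMS = {
    "text": to_plain_text,
    "xml": lambda run: ET.tostring(run, encoding="unicode"),
}


def render(run, fmt="text"):
    """Apply the transform registered for the requested output format."""
    return TRANSFORMS[fmt](run)


if __name__ == "__main__":
    record = build_record("needle", {"gapopen": 10.0}, ["score 42"])
    print(render(record, "text"))
    print(render(record, "xml"))
```

Adding a new output format then means registering one more entry in `TRANSFORMS` (a hypothetical name), without touching the code that produced the record.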
It adds a layer of output independence to EMBOSS, provides a standard interface for external apps to pick up emboss output (think of all the fun you are having in dealing with unstandardised output models for the various programs and the joy of James Bonfield getting EMBOSS into SPIN.) This is not really very different to what happens with sequences which have their own internal representation and are transformed to the required format on output. ..d From peter.rice at uk.lionbioscience.com Thu Apr 19 08:02:11 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Thu, 19 Apr 2001 09:02:11 +0100 Subject: EMBOSS 1.12.0 References: Message-ID: <3ADE9B83.B2922583@uk.lionbioscience.com> David Martin wrote: >For full record keeping a full record should be kept. This can easily be >parsed out by any vaguely competent unix hack and allows for proper >traceability. Although we do also need to consider the user who will want to read the output. An option to skip (or to limit the size of) the header, definable by an environment variable (or .embossrc configuration) can keep them happy though. > Most of the DTD's out there are crap. I would go with one of our own to > start with. If designed well this can easily be transformed into any > format desired (HTML etc.) thus removing from the application writer the > need to cope with every possible format. A post-process transformation > can allow the user to specify an XSL stylesheet of choice (or DSSSL if > that route is chosen). > > If EMBOSS follows the following scheme: > > ACD parse > V > Do the Science > V > Generate output in XML (full record) > V > Transform to the users desired form > > Then you can have the 'classical' form of output ie plain text, > HTML, short XML output, full (untransformed) output, SQL, MS Word (only > kidding), graphical output in whatever form (How about moving EMBOSS > graphics from PL_PLOT to SVG?) MS Word? 
That was originally planned - output files were going to have titles and headers and paragraphs so they could be written as plain text, HTML or RTF but the print functions were never implemented. Would be easy to do though :-) > This is not really very different to what happens with sequences which > have their own internal representation and are transformed to the required > format on output. Yup, that's what output reports will be for. A limited number of output formats for features, alignments, and so on - with the chance to choose the format you like best. -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 08:15:50 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 09:15:50 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADE9B83.B2922583@uk.lionbioscience.com> Message-ID: On Thu, 19 Apr 2001, Peter Rice wrote: > David Martin wrote: > >For full record keeping a full record should be kept. This can easily be > >parsed out by any vaguely competent unix hack and allows for proper > >traceability. > > Although we do also need to consider the user who will want to read the > output. An option to skip (or to limit the size of) the header, definable > by an environment variable (or .embossrc configuration) can keep them happy > though. EMBOSS_FORMAT fasta does this for sequences. There is no reason why the default output cannot be 'classic' (or for die-hard MS shops, 'powerpoint' or 'msword97'). XSLT looks to be very easy to do. Just write a suitable style sheet and add it to the local implementation. It would even allow the data to be checksummed and watermarked if required.. (but the end admin can implement this quite easily if given a standard output to work with).
For an automated run where the data needs to be warehoused and traceable the system can add another command line option for post processing. It also allows remote operation of EMBOSS programs with the end format being determined at the users point of contact. > > > Most of the DTD's out there are crap. I would go with one of our own to > > start with. If designed well this can easily be transformed into any > > format desired (HTML etc.) thus removing from the application writer the > > need to cope with every possible format. A post-process transformation > > can allow the user to specify an XSL stylesheet of choice (or DSSSL if > > that route is chosen). > > > > If EMBOSS follows the following scheme: > > > > ACD parse > > V > > Do the Science > > V > > Generate output in XML (full record) > > V > > Transform to the users desired form > > > > Then you can have the 'classical' form of output ie plain text, > > HTML, short XML output, full (untransformed) output, SQL, MS Word (only > > kidding), graphical output in whatever form (How about moving EMBOSS > > graphics from PL_PLOT to SVG?) > > MS Word? That was originally planned - output files were going to have > titles and headers and paragraphs so they could be written as plain text, > HTML or RTF but the print functions were never implemented. Would be easy > to do though :-) Then dump it to XML, use XSLT to generate the various formats. If EMBOSS can generate premade powerpoint slides of the output then it really is on to a winner (because scientists are lazy beasts..) > > > This is not really very different to what happens with sequences which > > have their own internal representation and are transformed to the required > > format on output. > > Yup, that's what output reports will be for. A limited number of output > formats for features, alignments, and so on - with the chance to choose the > format you like best. 
> Indeed, with extensibility for local customisation so I can generate an MS Word document with local crest, font, format etc should I so desire, or a web page, or LaTeX output, or have it automatically generate a tutorial script that shows the command line used and all the prompted options as it would appear when driven from the command line, or SQL to go straight into the LIMS system, or email to the user, or ... All down to the creativity of the admin who is working with a documented output format for which there are plenty of tools available to do the work. (EMBOSS still doesn't support BoulderIO format for sequences a la primer3.) ..d From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 09:57:58 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 10:57:58 +0100 (BST) Subject: EMBOSS 1.12.0 In-Reply-To: <3ADEB335.C61A08C6@uk.lionbioscience.com> Message-ID: On Thu, 19 Apr 2001, Peter Rice wrote: > David, > > >Then dump it to XML, use XSLT to generate the various formats. > > Is there a DTD for this? XSLT is still new to me, but I guess I should > learn... XSLT is a description of rules for transforming an XML document into an arbitrary format/mess, including running external programs on part of the data. Have a look at the XML.com or w3c.org sites for more info. It looks quite amenable. > > >If EMBOSS can generate premade powerpoint slides of the output then it > >really is on to a winner (because scientists are lazy beasts..) > > Again, how? I guess it should be relatively easy, apart from designing the > background image :-) > If you can generate word documents then surely powerpoint is not too hard.. > >EMBOSS still doesn't support BoulderIO format for sequences a la > >primer3. > > True. Nor GFF with sequences in the header (though I am just implementing > that one to make reading GFF files easier) > > Does anyone have the BoulderIO format description? > Lincoln Stein?
have a look also at www.no.embnet.org/Programs/SAL/primer05.php3 for primer 0.5 ..d > From dmartin at gen67172.msiwtb.dundee.ac.uk Thu Apr 19 12:27:09 2001 From: dmartin at gen67172.msiwtb.dundee.ac.uk (David Martin) Date: Thu, 19 Apr 2001 13:27:09 +0100 (BST) Subject: Project meeting minutes? Message-ID: Are you still having meetings or have people given up talking to one another? The last minutes are from Feb 26th.. ..d From David.Lapointe at umassmed.edu Thu Apr 19 13:23:17 2001 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Thu, 19 Apr 2001 09:23:17 -0400 Subject: EMBOSS 1.12.0 Message-ID: The BoulderIO info is at http://stein.cshl.org/software/boulder/ Also an alternative to DTD is XML Schema though that is still a work in progress. David > -----Original Message----- > From: David Martin [mailto:dmartin at gen67172.msiwtb.dundee.ac.uk] > Sent: Thursday, April 19, 2001 5:58 AM > To: Peter Rice > Cc: emboss-dev at embnet.org > Subject: Re: EMBOSS 1.12.0 > > > On Thu, 19 Apr 2001, Peter Rice wrote: > > > David, > > > > >Then dump it to XML, use XSLT to generate the various formats. > > > > Is there a DTD for this? XSLT is still new to me, but I > guess I should > > learn... > > XSLT is a description of rules for transforming an XML > document into an > arbitrary format/mess, including running external programs on > part of the > data. Have a look at the XML.com or w3c.org sites for more > info. It looks > quite amenable. > > > > > >If EMBOSS can generate premade powerpoint slides of the > output then it > > >really is on to a winner (because scientists are lazy beasts..) > > > > Again, how? I guess it should be relatively easy, apart > from designing the > > background image :-) > > > > If you can generate word documents then surely powerpoint is not too > hard.. > > > >EMBOSS still doesn't support BoulderIO format for sequences a la > > >primer3. > > > > True. 
Nor GFF with sequences in the header (though I am > just implementing > > that one to make reading GFF files easier) > > > > Does anyone have the BoulderIO format description? > > > Lincoln Stein? have a look also at > www.no.embnet.org/Programs/SAL/primer05.php3 for primer 0.5 > > ..d > > > > > > > From jkb at mrc-lmb.cam.ac.uk Fri Apr 27 16:52:44 2001 From: jkb at mrc-lmb.cam.ac.uk (James Bonfield) Date: Fri, 27 Apr 2001 17:52:44 +0100 Subject: EMBOSS program groups Message-ID: <20010427175244.A15923@arran.mrc-lmb.cam.ac.uk> Hi all, I'm looking at restructuring the program groups so that there are fewer groups, or at least make some groups into subgroups (cascading menus in my interface). There seems to be a huge amount of redundancy with the average program being in 1.7 groups (ish ;-)). Some entire groups have huge overlap, such as Motifs and Pattern matching. What are people's thoughts on this? I'm willing to submit my changes back to the emboss team, but how many people are likely to be affected by this? We need this ourselves as the menus are simply too long at present, and it's virtually impossible to find anything. On a different note, I see that there's also a huge redundancy in programs. How should users choose between them? Eg, as a novice user of emboss, how would I know to use stretcher over needle? And similarly for water vs matcher. The documentation for needle implies that needle is for short sequences and stretcher is for longer sequences. However stretcher uses the Myers and Miller algorithm, which claims to be a full Needleman-Wunsch alignment algorithm anyway (but with linear-space memory requirements). Similarly matcher vs needle - both claim to be rigorous algorithms, but matcher uses less memory. It seems that so far the reason for redundancy is simply that multiple authors have submitted programs to do identical tasks, with different names.
Would it be treading on people's toes too much to suggest that some of this redundancy should be removed? In the cases outlined above, assuming the results really are comparable, then it seems clear that the more memory hungry versions should go. James -- James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556 Medical Research Council - Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, England. Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/ From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 14:44:37 2001 From: jkb at mrc-lmb.cam.ac.uk (James Bonfield) Date: Mon, 30 Apr 2001 15:44:37 +0100 Subject: Proposal: new menu layout, with examples Message-ID: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> Now that several interfaces to EMBOSS are arriving I feel that it is important to have a well structured menu system (as indicated in my last email). I now have a proposed layout, included as an attachment. It may be hard to see the problem with the existing setup, so I've produced a tcl/tk app to display a menu tree and created menu specs for how it is now (1.11.0, so nearly now) and my new proposed setup. My new layout still has quite a few questionable choices, and perhaps I could duplicate more often (although personally I dislike that) and split into more cascading menus. However it's a start[1]. So if you're interested in developing interfaces to EMBOSS, or you are one of the developers, please download the following: ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/private/emboss_dev/em.tar.gz See the README file for information on how to run the tcl/tk app. It ought to work OK even on pretty old tcl/tk installations. James [1] I should also point out that in "Spin" the menu labels are actually the program doc lines with the program command name in brackets (eg "Finds DNA inverted repeats (einverted)"). However I haven't addressed the issue of a common language style for doc names yet.
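The menu specification attached to this message marks a group with '+', a program with '=' and the end of a group with '-'. A minimal Python sketch of reading that notation into a tree, and of counting how many groups each program appears in, might look like the following. It assumes one entry per line, which is a guess about the original file layout, and the function names are made up for the example.

```python
from collections import Counter


def parse_menu(lines):
    """Parse '+ group', '= program' and '-' lines into a nested tree."""
    root = {"name": None, "groups": [], "programs": []}
    stack = [root]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        tag, rest = line[0], line[1:].strip()
        if tag == "+":  # open a new (sub)group under the current one
            group = {"name": rest, "groups": [], "programs": []}
            stack[-1]["groups"].append(group)
            stack.append(group)
        elif tag == "=":  # a program belonging to the current group
            stack[-1]["programs"].append(rest)
        elif tag == "-":  # close the current group
            if len(stack) == 1:
                raise ValueError("'-' without a matching '+'")
            stack.pop()
        else:
            raise ValueError(f"unrecognised line: {raw!r}")
    if len(stack) != 1:
        raise ValueError("unclosed group at end of input")
    return root["groups"]


def count_memberships(groups, counts=None):
    """Count how many groups each program is listed in."""
    counts = Counter() if counts is None else counts
    for group in groups:
        counts.update(group["programs"])
        count_memberships(group["groups"], counts)
    return counts


if __name__ == "__main__":
    sample = [
        "+Composition",
        "  + DNA",
        "    = compseq",
        "    = geecee",
        "  -",
        "  + Protein",
        "    = compseq",
        "    = pepstats",
        "  -",
        "-",
    ]
    counts = count_memberships(parse_menu(sample))
    print(counts["compseq"])  # compseq is listed under both DNA and Protein
```

A count like this gives a quick way to measure the duplication between groups for any proposed layout.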
-- James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556 Medical Research Council - Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, England. Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
-------------- next part --------------
+Comparison
  + Global alignment
    = stretcher
    = needle
  -
  + Local alignment
    = simplesw
    = water
    = matcher
    = supermatcher
    = est2genome
  -
  + Multiple alignment
    = cluster
    = emma
  -
  + Dot plots
    = dotmatcher
    = dotpath
    = dottup
    = polydot
  -
  + Other
    = megamerger
    = merger
    = diffseq
    = plotcon
    = seqmatchall
    = wordmatch
    = prettyplot
  -
-
+Translation
  = backtranseq
  = checktrans
  = coderet
  = cusp
  = getorf
  = plotorf
  = prettyseq
  = showorf
  = showseq
  = transeq
-
+Databases
  + Indexing
    = dbiblast
    = dbifasta
    = dbiflat
    = dbigcg
    = showdb
  -
  + SCOP
    = nrscope
    = scope
    = stamps
  -
-
+Edit
  = cutseq
  = degapseq
  = descseq
  = extractseq
  = maskfeat
  = maskseq
  = msbar
  = newseq
  = noreturn
  = nthseq
  = pasteseq
  = revseq
  = shuffleseq
  = splitter
  = trimseq
  = vectorstrip
-
+Search
  + Motifs/pattern matching
    = dreg
    = fuzznuc
    = fuzzpro
    = fuzztran
    = helixturnhelix
    = patmatdb
    = patmatmotifs
    = preg
    = printsextract
    = prosextract
    = pscan
    = tfscan
  -
  + Repeats
    = einverted
    = equicktandem
    = etandem
    = palindrome
  -
  + Profiles
    = cons
    = profit
    = prophecy
    = prophet
  -
  + Primers
    = prima
    = primersearch
    = stssearch
  -
  + CPG islands
    = cpgplot
    = cpgreport
    = newcpgreport
    = newcpgseek
  -
  + Restriction enzymes
    = recode
    = redata
    = remap
    = restover
    = restrict
    = silent
  -
  + Enzymes
    = findkm
  -
  + Misc
    = textsearch
  -
-
+Composition
  + DNA
    = banana
    = btwisted
    = chaos
    = chips
    = codcmp
    = complex
    = compseq
    = dan
    = freak
    = geecee
    = isochore
    = marscan
    = syco
    = wobble
  -
  + Protein
    = antigenic
    = charge
    = compseq
    = digest
    = domainer
    = emowse
    = garnier
    = helixturnhelix
    = hmoment
    = iep
    = octanol
    = oddcomp
    = pepcoil
    = pepinfo
    = pepnet
    = pepstats
    = pepwheel
    = pepwindow
    = pepwindowall
    = sigcleave
    = tmap
  -
-
+Display
  = cirdna
  = entret
  = infoseq
  = lindna
  = notseq
  = prettyplot
  = seqinfo
  = seqret
  = seqretall
  = seqretallfeat
  = seqretfeat
  = seqretset
  = seqretsplit
  = seqrettype
  = seqtofeat
  = showfeat
  = showseq
-
+Utils
  = embossdata
  = entrails
  = rebaseextract
  = seealso
  = tfextract
  = tfm
  = wossname
-
+Test
  = ajbad
  = ajfeatest
  = ajtest
  = ajtest2
  = corbatest
  = demofeatures
  = demolist
  = demosequence
  = demostring
  = demotable
  = histogramtest
  = patmattest
  = plplottest
  = proteinmotifsearch
  = testplot
  = treetypedisplay
-
From peter.rice at uk.lionbioscience.com Mon Apr 30 15:00:32 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Mon, 30 Apr 2001 16:00:32 +0100 Subject: Proposal: new menu layout, with examples References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED7E10.AE78F038@uk.lionbioscience.com> James Bonfield wrote: > > Now that several interfaces to EMBOSS are arriving I feel that it is > important to have a well structured menu system (as indicated in my > last email). > > I now have a proposed layout, included as an attachment. It may be > hard to see the problem with the existing setup, so I've produce a > tcl/tk to display a menu tree and created menu specs for how it is > now (1.11.0, so nearly now) and my new proposed setup. My new layout > still has quite a few questionable choices, and perhaps I could > duplicate more often (although personally I > dislike that) and > split into more cascading menus. However it's a start[1]. Looks nice - much nicer than the old emboss_menus_orig version! Should we have 2 classes of groups in ACD files - one for menus with a short name or hierarchy as in this case, and a longer version (called 'keywords', perhaps, and derived from the current 'groups') for wossname to search? > [1] I should also point out that in "Spin" the menu labels are > actually the program doc lines with the program command name > in brackets (eg "Finds DNA inverted repeats (einverted)"). However > I haven't addressed the issue of a common language style for doc > names yet.
The text is best for many novice users, but I would maybe prefer the name first (and sorted by name?) for expert users. But of course, it should be up to the users to say what they want :-) -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From gwilliam at hgmp.mrc.ac.uk Mon Apr 30 15:19:18 2001 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Mon, 30 Apr 2001 16:19:18 +0100 Subject: Proposal: new menu layout, with examples References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED8276.D8D4EF8@hgmp.mrc.ac.uk> James Bonfield wrote: > > Now that several interfaces to EMBOSS are arriving I feel that it is important > to have a well structured menu system (as indicated in my last email). Agreed! > I now have a proposed layout, included as an attachment. It may be hard to see > the problem with the existing setup, so I've produce a tcl/tk to display a > menu tree and created menu specs for how it is now (1.11.0, so nearly now) and > my new proposed setup. My new layout still has quite a few questionable > choices, and perhaps I could duplicate more often (although personally I > dislike that) and split into more cascading menus. However it's a start[1]. > > So if you're interested in developing interfaces to EMBOSS, or you are one of > the developers, please download the following: > > ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/private/emboss_dev/em.tar.gz > > See the README file for information on how to run the tcl/tk app. It ought to > work OK even on pretty old tcl/tk installations. Didn't work for me - I just got the wish window and a '%' prompt. I looked at your 'emboss_menu*' files though to see what your reorganisation is like. Some comments (more minor niggles than anything major): I like much of the reorganisation.
Firstly, I am a 'splitter', I like having several paths to a program, so many of my comments related to terseness have been removed from this version of this message ;-)

Some things seem to be in the wrong place:

I would like to see a 'codon usage' section with things like 'cusp', 'chips' 'syco' 'cai' in.

'nthseq' - not really an 'edit' program - more a 'database organisation' program

'textsearch' - looks wrong in 'Search:Misc', maybe change to 'Search:Text' ?

'complex' is now deleted

'marscan' finds a MAR site, this is not really a property of DNA composition - I would like to see a 'Features' menu as well as 'Composition'. 'Features' would hold those programs which can/should/will write out GFF files. So this should go in 'Features:DNA'

'antigenic' - finds antigenic features so should go in 'Features:Protein'

'emowse' - not really protein composition, more identification - should probably go somewhere under 'Search'

'helixturnhelix' - features

'pepnet', 'pepwheel' - more of 'Display' programs than 'Composition' programs?

'garnier', 'tmap' - secondary structure

'notseq' - not a 'Display' program - more a 'database organisation' program, like 'nthseq'

'tfm', 'wossname' - I would rather see this under 'Help' than lost in 'Utils'

> James
>
> [1] I should also point out that in "Spin" the menu labels are actually the
> program doc lines with the program command name in brackets (eg "Finds DNA
> inverted repeats (einverted)"). However I haven't addressed the issue of a
> common language style for doc names yet.

I enforce a description that has as many keywords as possible and that fits on an 80 character line when displayed by 'wossname'.

I also try to enforce capital first letter and no ending full-stop (to shave off an extra character).

I try to capitalise acronyms and database names.
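The description rules listed above (capital first letter, no ending full-stop, fits on an 80 character line) are mechanical enough to check automatically. The Python sketch below is one possible checker. The display layout it assumes, program name then one space then the description, is a guess for illustration and not necessarily how 'wossname' lays out its lines, and `check_doc_line` is a made-up name.

```python
def check_doc_line(name, doc, width=80):
    """Return a list of style problems for one program description."""
    problems = []
    if not doc[:1].isupper():
        problems.append("should start with a capital letter")
    if doc.endswith("."):
        problems.append("should not end with a full stop")
    # Assumed display layout: program name, one space, then the description.
    if len(f"{name} {doc}") > width:
        problems.append(f"longer than {width} characters when displayed")
    return problems


if __name__ == "__main__":
    for name, doc in [
        ("einverted", "Finds DNA inverted repeats"),
        ("einverted", "finds DNA inverted repeats."),
    ]:
        print(name, repr(doc), check_doc_line(name, doc))
```

The rule about capitalising acronyms and database names is left out because it needs a word list rather than a simple test.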
-- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 16:05:40 2001 From: jkb at mrc-lmb.cam.ac.uk (James Bonfield) Date: Mon, 30 Apr 2001 17:05:40 +0100 Subject: Proposal: new menu layout, with examples Message-ID: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> On Mon, Apr 30, 2001 at 04:00:32PM +0100, Peter Rice wrote: > Looks nice - much nicer than the old emboss_menus_orig version! > > Should we have 2 classes of groups in ACD files - one for menus with a > short name or hierarchy as in this case, and a longer version (called > 'keywords', perhaps, and derived from the current 'groups') for wossname to > search? Thanks for the comments. That makes sense as we have two specific purposes: wossname and menus. I like the idea of being able to re-sort based on expert vs novice. It shouldn't be too hard to have the choice, but I'm not sure how easy it would be to do this on-the-fly (at least with my current code). On the topic of groups - just a warning that we'll also be looking at restructuring the ACD with the 'group:' type, which is now sounding like a staggeringly confusing choice of word! For example, take vectorstrip.acd:

group: vecgrp [
  info: "Vector sequence"
  type: frame
]

bool: vectorfile [
  param: Y
  def: Y
  prompt: "Are your vector sequences in a file?"
]

infile: vectors [
  param: Y
  req: @($(vectorfile)?1:0)
  nullok: Y
  def: ""
  prompt: "Name of vectorfile"
]

string: linkerA [
  req: @(!$(vectorfile))
  prompt: "5' sequence"
]

string: linkerB [
  req: @(!$(vectorfile))
  prompt: "3' sequence"
]

endgroup: vecgrp

In the GUI dialogue this comes out as a labelled frame. We also have the choice of "type: page" to use tabs in a tabbed notebook. This basically allows grouping of command line options into common areas and hence guides the user through filling out the various dialogues.
(Although mostly the defaults will suffice.) This is a large amount of work though so we haven't yet started this, except where absolutely necessary due to the size of the dialogues (showseq). James -- James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556 Medical Research Council - Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, England. Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/ From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 16:10:12 2001 From: jkb at mrc-lmb.cam.ac.uk (James Bonfield) Date: Mon, 30 Apr 2001 17:10:12 +0100 Subject: Proposal: new menu layout, with examples In-Reply-To: <3AED8276.D8D4EF8@hgmp.mrc.ac.uk>; from gwilliam@hgmp.mrc.ac.uk on Mon, Apr 30, 2001 at 04:19:18PM +0100 References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> <3AED8276.D8D4EF8@hgmp.mrc.ac.uk> Message-ID: <20010430171012.E21424@arran.mrc-lmb.cam.ac.uk> On Mon, Apr 30, 2001 at 04:19:18PM +0100, Gary Williams, Tel 01223 494522 wrote: > Some comments (more minor niggles than anything major): Thanks for this list of comments. I'll try and reorganise things further. > Firstly, I am a 'splitter', I like having several paths to a program, so > many of my comments related to terseness have been removed from this > version of this message ;-) Grin. I agree that having multiple paths can be useful for novice users, and there are a _few_ programs in multiple places (eg compseq) in my new menu layout. However I consider it to be a trade off between not being able to find something and not being able to see it due to too many programs listed. I guess we just differ on the break-even point :) > I would like to see a 'codon usage' section with things like 'cusp', > 'chips' 'syco' 'cai' in. That makes sense. I'm not really familiar with most of the EMBOSS tools, or some of the fields it covers either. I'll notify the list when I update the menu files again.
> I enforce a description that has as many keywords as possible and that > fits on a 80 character line when displayed by 'wossname'. > > I also try to enforce capital first letter and no ending full-stop (to > shave off an extra character). > > I try to capitalise acronyms and database names. Are such things documented, so that new developers follow the same rules? Of course it can be rather hard to get anyone to read documentation (myself included). James -- James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556 Medical Research Council - Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, England. Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/ From gwilliam at hgmp.mrc.ac.uk Mon Apr 30 16:12:44 2001 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Mon, 30 Apr 2001 17:12:44 +0100 Subject: Proposal: new menu layout, with examples References: <20010430154437.A20585@arran.mrc-lmb.cam.ac.uk> <3AED8276.D8D4EF8@hgmp.mrc.ac.uk> <20010430171012.E21424@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED8EFC.1612B8@hgmp.mrc.ac.uk> James Bonfield wrote: > > I enforce a description that has as many keywords as possible and that > > fits on a 80 character line when displayed by 'wossname'. > > > > I also try to enforce capital first letter and no ending full-stop (to > > shave off an extra character). > > > > I try to capitalise acronyms and database names. > > Are such things documented, so that new developers follow the same rules? Of > course it can be rather hard to get anyone to read documentation (myself > included). No - I just change the documentation after consultation with the author. I've not had any complaints yet. 
:-) -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From peter.rice at uk.lionbioscience.com Mon Apr 30 17:17:46 2001 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Mon, 30 Apr 2001 18:17:46 +0100 Subject: Proposal: new menu layout, with examples References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> Message-ID: <3AED9E3A.383A1649@uk.lionbioscience.com> Hi James, > On the topic of groups - just a warning that we'll also be looking > at restructuring the ACD with the 'group:' type, which is now > sounding like a staggeringly confusing choice of word! How about 'section' and 'endsection' ? Peter -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From jkb at mrc-lmb.cam.ac.uk Mon Apr 30 17:29:43 2001 From: jkb at mrc-lmb.cam.ac.uk (James Bonfield) Date: Mon, 30 Apr 2001 18:29:43 +0100 Subject: Proposal: new menu layout, with examples In-Reply-To: <3AED9E3A.383A1649@uk.lionbioscience.com>; from peter.rice@uk.lionbioscience.com on Mon, Apr 30, 2001 at 06:17:46PM +0100 References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> <3AED9E3A.383A1649@uk.lionbioscience.com> Message-ID: <20010430182943.A21399@arran.mrc-lmb.cam.ac.uk> On Mon, Apr 30, 2001 at 06:17:46PM +0100, Peter Rice wrote: > Hi James, > > > On the topic of groups - just a warning that we'll also be looking > > at restructuring the ACD with the 'group:' type, which is now > > sounding like a staggeringly confusing choice of word! > > How about 'section' and 'endsection' ? Agreed. What about the attributes? At present I've got:

type    A choice between "frame" and "page". Defaults to frame.
        This controls whether the section should be a labelled frame or a
        page in a notebook. The distinction only really matters for GUI or
        WWW based systems. In Tcl the choice is pretty clear. In WWW I'd
        expect a frame to be a table; a page is trickier to implement (and
        may require javascript), but I've seen such things done regularly.

info    A heading for the frame or page. Defaults to ""
        A page without a heading would look rather odd, but a frame
        without a heading is OK - it's just a bordered block.

book    Only needed for type:frame.
        The notebook to associate this page to. Defaults to "", which
        implies all pages are part of the same notebook. This is only
        needed if a program wishes to make use of more than one tabbed
        notebook, in which case this is used to determine which book a
        page is within.

border  Only needed for type:frame. Defaults to 1.
        The border width of the frame.

side    Only needed for type:frame. Defaults to top.
        This is used for 'frame packing'. It is only needed if we wish
        to express complex layout designs. Eg:

        +----------------------------------+
        | Section1                         |
        | blah                             |
        | blah                             |
        | +-------------+ +--------------+ |
        | | Section2    | | Section3     | |
        | | blah        | | blah         | |
        | +-------------+ +--------------+ |
        +----------------------------------+

I think perhaps the border and side attributes are overkill, and I haven't yet used them except in test ACD files. Naturally all of these are really just "hints" to the interface, which may ultimately do whatever it prefers. James -- James Bonfield (jkb at mrc-lmb.cam.ac.uk) Tel: 01223 402499 Fax: 01223 213556 Medical Research Council - Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

From peter.rice at uk.lionbioscience.com Mon Apr 30 17:39:23 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Mon, 30 Apr 2001 18:39:23 +0100
Subject: Proposal: new menu layout, with examples
References: <20010430170540.D21424@arran.mrc-lmb.cam.ac.uk> <3AED9E3A.383A1649@uk.lionbioscience.com> <20010430182943.A21399@arran.mrc-lmb.cam.ac.uk>
Message-ID: <3AEDA34B.BF3D0374@uk.lionbioscience.com>

James Bonfield wrote:
> On Mon, Apr 30, 2001 at 06:17:46PM +0100, Peter Rice wrote:
> > How about 'section' and 'endsection' ?
>
> Agreed.
>
> What about the attributes?

(We also, of course, have the name which follows the 'section:')

> At present I've got:
>
> type    A choice between "frame" and "page".

Tricky to test a list of values. frame:Y would be simpler, but we should
add controlled vocabulary types some time (see 'side' below).

> info    A heading for the frame or page. Defaults to ""
>         A page without a heading would look rather odd, but a frame
>         without a heading is OK - it's just a bordered block.

OK. Can also be used to prompt interactive users (e.g. "Gap penalties:"
before prompting for the gap penalty values). Or we could have a separate
'prompt' for this, and use 'info' in the -help output (tricky though -
qualifiers will be in sections but also in required/optional/advanced
groupings so maybe -help should be left alone).

> book    Only needed for type:frame.
>         The notebook to associate this page to. Defaults to "", which
>         implies all pages are part of the same notebook. This is only
>         needed if a program wishes to make use of more than one tabbed
>         notebook, in which case this is used to determine which book a
>         page is within.

Can we pick a more general name for this?

> border  Only needed for type:frame. Defaults to 1.
>         The border width of the frame.

OK.

> side    Only needed for type:frame. Defaults to top.
>         This is used for 'frame packing'.
>         It is only needed if we wish
>         to express complex layout designs. Eg:

Looks like another controlled vocabulary. Also, looks like I can think of
uses for it.

Peter

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723
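
[Editorial sketch, not part of the original thread.] The attribute list discussed above, with its stated defaults and the "controlled vocabulary" checks Peter mentions for 'type' and 'side', could be modelled roughly as follows. This is only an illustration in Python: the class name Section, the validation approach, and the set of allowed 'side' values are assumptions. The thread itself only names "top" for 'side'; the other three values are borrowed from Tk's pack geometry manager because James refers to 'frame packing' in Tcl.

```python
from dataclasses import dataclass

# Controlled vocabularies. "frame" and "page" come from the thread;
# the side values other than "top" are an assumption (Tk pack sides).
VALID_TYPES = {"frame", "page"}
VALID_SIDES = {"top", "bottom", "left", "right"}


@dataclass
class Section:
    """Hypothetical model of one proposed ACD 'section' and its hints."""
    name: str            # the name that follows 'section:'
    type: str = "frame"  # "frame" or "page"; defaults to frame
    info: str = ""       # heading for the frame or page
    book: str = ""       # notebook name; "" means the single shared notebook
    border: int = 1      # border width of the frame
    side: str = "top"    # packing side

    def __post_init__(self):
        # Reject values outside the controlled vocabularies.
        if self.type not in VALID_TYPES:
            raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
        if self.side not in VALID_SIDES:
            raise ValueError(f"side must be one of {sorted(VALID_SIDES)}")


# Usage: only the name and a heading are given; the rest fall back to defaults.
gaps = Section("gappenalties", info="Gap penalties:")
print(gaps)
```

Since all of these attributes are described as hints, a real interface would be free to ignore any of them; the validation here only illustrates what a controlled vocabulary type might check.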