From jison at ebi.ac.uk Mon Oct 3 03:16:23 2011
From: jison at ebi.ac.uk (Jon Ison)
Date: Mon, 3 Oct 2011 08:16:23 +0100 (BST)
Subject: [emboss-dev] Commandline changes in EMBOSS applications
In-Reply-To: <4E8483F8.4080908@ebi.ac.uk>
References: <4E8483F8.4080908@ebi.ac.uk>
Message-ID: <34790.84.92.187.247.1317626183.squirrel@webmail.ebi.ac.uk>

Morning

I think it depends on what's most important, maintaining the richness of the
EMBOSS command-line (dependencies in default values) or compatibility with the
Galaxy or any interface that can't handle this. That's a tough one! I'm
leaning towards the latter, but not if it makes many applications really messy.

So I would add new options and remove old ones - bearing in mind that would
need to be done for all apps.

Cheers

Jon

> A question for our developer community...
>
> I am working through the GALAXY wrappers for EMBOSS applications. GALAXY
> has a very clean way to define command line applications which is close
> to EMBOSS's ACD definitions, so most applications are easy to define.
>
> I have problems where the default values in the ACD file depend on other
> values. Two examples from prettyplot illustrate the problem. In both
> cases, the current GALAXY definitions ignore these qualifiers.
>
>   integer: residuesperline [
>     default: "50"
>     information: "Number of residues to be displayed on each line"
>   ]
>
>   integer: resbreak [
>     information: "Residues before a space"
>     default: "$(residuesperline)"
>     expected: "Same as -residuesperline to give no breaks"
>   ]
>
>
> The second qualifier defaults to the value of the first. GALAXY is
> unable to interpret this. It could be defined with a default of "50" for
> GALAXY, but I would prefer to remove this qualifier and add a new one
> "-blocksperline" with a default of 1. In this way the dependency
> disappears, and the results are cleaner.
>
> The second value is a calculation from sequence properties:
>
>   float: plurality [
>     information: "Plurality check value (totweight/2)"
>     default: "@( $(sequences.totweight) / 2)"
>     expected: "Half the total sequence weighting"
>   ]
>
> This has a long history, back to the EGCG version of prettyplot where
> the command line options were extensions of a GCG program. The "weight"
> is by default 1.0 per sequence, but GCG format had a way to adjust
> weights in the input file. Plurality is nice in that it allows a
> definition of how many of the sequences should match.
>
> In this case, it seems easier to ignore the weight-based value and
> instead to define -percent 50.0 then multiply the total weight (or
> number of sequences) by 0.50 and get the same results.
>
> I am a little nervous about removing command line options because of the
> risk of breaking some interfaces.
>
> So:
>
> 1. Should I go ahead and add the new options?
> 2. Do I remove the old options so old wrappers, scripts, etc. break with
>    "unknown qualifier -plurality"
> 3. Or, do we keep the old options, declare them obsolete, object to
>    their use but keep going
>
> As option 3 would also complicate life for wrappers - anyone making new
> wrappers would most probably include the obsolete options - I prefer 1+2
> but I would appreciate some feedback.
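
[Editorial sketch, not part of the original message.] The two replacements
proposed above amount to simple arithmetic done inside the program instead of
in the ACD default. The Python below is only an illustration under stated
assumptions: the thread does not say how -blocksperline would map onto a
break interval, so the integer division used here is a guess, and both
function names are hypothetical. The -percent rule follows the message
directly (50.0 percent of the total weight equals totweight/2).

```python
def resbreak_from_blocks(residuesperline: int, blocksperline: int = 1) -> int:
    """Residues before a space, ASSUMING a line is split into equal blocks.

    With the proposed default of 1 block per line this returns
    residuesperline, i.e. no breaks, matching the old -resbreak default.
    """
    if blocksperline < 1:
        raise ValueError("blocksperline must be at least 1")
    return residuesperline // blocksperline


def plurality_from_percent(totweight: float, percent: float = 50.0) -> float:
    """Plurality threshold as a percentage of the total sequence weight."""
    return totweight * percent / 100.0


if __name__ == "__main__":
    # Defaults reproduce the old behaviour: resbreak == residuesperline,
    # plurality == totweight / 2.
    print(resbreak_from_blocks(50))
    print(plurality_from_percent(4.0))
```

Because both defaults are now constants (1 and 50.0), a wrapper never has to
evaluate an expression that depends on another qualifier or on the input
sequences.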
>
> regards,
>
> Peter Rice
> EMBOSS Team
> _______________________________________________
> emboss-dev mailing list
> emboss-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss-dev
>

From p.j.a.cock at googlemail.com Mon Oct 3 09:55:52 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 3 Oct 2011 14:55:52 +0100
Subject: [emboss-dev] Commandline changes in EMBOSS applications
In-Reply-To: <34790.84.92.187.247.1317626183.squirrel@webmail.ebi.ac.uk>
References: <4E8483F8.4080908@ebi.ac.uk>
 <34790.84.92.187.247.1317626183.squirrel@webmail.ebi.ac.uk>
Message-ID:

On Mon, Oct 3, 2011 at 8:16 AM, Jon Ison wrote:
> Morning
>
> I think it depends on what's most important, maintaining the richness of the EMBOSS command-line
> (dependencies in default values) or compatibility with the Galaxy or any interface that can't
> handle this. That's a tough one! I'm leaning towards the latter, but not if it makes many
> applications really messy.
>
> So I would add new options and remove old ones - bearing in mind that would need to be done for
> all apps.
>
> Cheers

Peter R,

Given Galaxy can handle this (even if it is a bit awkward to do),
what are the other interfaces / wrapper developers where context
dependent defaults may be an issue?

It isn't an issue for Biopython (although our wrappers are all
hand coded for now, rather than being generated from the EMBOSS
ACD files - a possible student project one day?)

Peter C.

From ajb at ebi.ac.uk Wed Oct 5 11:06:32 2011
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 5 Oct 2011 16:06:32 +0100 (BST)
Subject: [emboss-dev] EMBOSS patch set 1-24 available. New mEMBOSS available.
Message-ID: <39228.82.26.12.214.1317827192.squirrel@imap04.ebi.ac.uk>

New bug-fix files are available for EMBOSS-6.4.0 and, for Windows
users, a new version of mEMBOSS is available. The bugs fixed include
those recently fixed (22-24), listed below, and all those fixed by
previous patches (1-21).

1) UNIX

As usual, the most convenient way of applying the bug-fixes is to apply
the patch file:

ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/patch-1-24.gz

to a freshly extracted copy of the EMBOSS-6.4.0.tar.gz source code
and recompiling/installing.
(see ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/README.patch for
instructions on using 'patch').

Alternatively, you can individually copy the patched files from the
ftp://emboss.open-bio.org/pub/EMBOSS/fixes/ directory if your system
does not support 'patch'.

2) mEMBOSS

The new version incorporates all new and previous bug-fixes.
Uninstall your previous mEMBOSS installation and download and install
the new setup file from:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/mEMBOSS-6.4.0.4-setup.exe

Alan

-----------------------------------------------------------------------

Fix 22. EMBOSS-6.4.0/emboss/diffseq.c
        EMBOSS-6.4.0/ajax/core/ajreport.c

14-Sep-2011: Diffseq reports insertions in the second sequence with a
length 2 reversed region in the first sequence instead of a length 0
empty sequence. This bug was introduced in release 6.0.0 when reversed
sequence features were updated.

Fix 23. EMBOSS-6.4.0/ajax/core/ajindex.c

04-Oct-2011: Dbx index files from earlier releases do not include a
type parameter to indicate an Identifier or Secondary index. The code
to test index field names failed to define id and acc fields as
Identifiers. This fix allows old indexes to work with EMBOSS 6.4.0.

Fix 24. EMBOSS-6.4.0/ajax/core/ajfileio.c

05-Oct-2011: Trimming carriage controls from the ends of lines in a
buffer failed when MacOSX-style characters are used and the line
buffer is a reference counted string. An example on non-MacOSX systems
was processing the data returned by the NCBI Entrez server.
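
[Editorial sketch, not part of the original message.] Fix 24 concerns
stripping line-end control characters whatever convention the data source
used. The Python below is not the ajfileio.c code and says nothing about how
EMBOSS handles reference counted strings; it only illustrates the general
idea, assuming that "carriage controls" means trailing carriage return and
line feed characters. The function name is hypothetical.

```python
def trim_line_end(line: str) -> str:
    """Remove any trailing CR and/or LF characters from one line.

    Handles LF-only, CR+LF and CR-only endings alike; characters inside
    the line are left untouched.
    """
    return line.rstrip("\r\n")


if __name__ == "__main__":
    # The same record with three different line-end conventions.
    for raw in ("ID   X12345\n", "ID   X12345\r\n", "ID   X12345\r"):
        print(repr(trim_line_end(raw)))
```

Data fetched from a remote server can carry any of these endings regardless
of the local platform, which is consistent with the Entrez example mentioned
in the fix.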

From gbottu at vub.ac.be Wed Oct 5 14:33:17 2011
From: gbottu at vub.ac.be (Guy Bottu)
Date: Wed, 05 Oct 2011 20:33:17 +0200
Subject: [emboss-dev] Commandline changes in EMBOSS applications
In-Reply-To: <34790.84.92.187.247.1317626183.squirrel@webmail.ebi.ac.uk>
References: <4E8483F8.4080908@ebi.ac.uk>
 <34790.84.92.187.247.1317626183.squirrel@webmail.ebi.ac.uk>
Message-ID: <4E8CA2ED.9080005@vub.ac.be>

Jon Ison wrote:
> I think it depends on what's most important, maintaining the richness of the EMBOSS command-line
> (dependencies in default values) or compatibility with the Galaxy or any interface that can't
> handle this. That's a tough one! I'm leaning towards the latter, but not if it makes many
> applications really messy.
>
> So I would add new options and remove old ones - bearing in mind that would need to be done for
> all apps.

I think that you should anyway not reduce the richness of ACD language.
After all, there are people that use the EMBOSS libraries to develop
applications for their local use, without using Galaxy or any other
particular tool.

As for prettyplot, making the changes suggested by Peter is OK since
the functionality is fully preserved and it does not become
significantly more difficult from the viewpoint of the user. There
might however be problems with some other programs where eliminating
dependencies cannot be done without making the interface messy...

Regards,
Guy Bottu
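
[Editorial sketch, not part of the original thread.] For wrapper authors who
want to keep dependent defaults rather than remove them, the simple case
quoted in the thread, a default of the form "$(otherqualifier)", can be
resolved before the command line is built. The Python below is an
illustration only, not an ACD parser: resolve_defaults is a hypothetical
helper, it assumes defaults are held as plain strings in a dict, and it
deliberately leaves anything it does not understand untouched, such as the
"@( ... )" calculation or the "$(sequences.totweight)" attribute reference,
which need the input data to evaluate.

```python
import re

# Matches a plain qualifier reference such as $(residuesperline).
# Dotted attribute references like $(sequences.totweight) do not match.
_REF = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def resolve_defaults(defaults, given=None):
    """Return qualifier values with simple $(name) references substituted.

    defaults: dict of qualifier name -> default string (hypothetical layout)
    given:    dict of values supplied by the user, overriding defaults
    """
    values = dict(defaults)
    values.update(given or {})
    resolved = {}

    def lookup(name, stack):
        if name in resolved:
            return resolved[name]
        if name in stack:
            raise ValueError(f"circular default for {name!r}")
        raw = str(values[name])
        # Substitute each reference with the resolved value it points to.
        out = _REF.sub(lambda m: lookup(m.group(1), stack + (name,)), raw)
        resolved[name] = out
        return out

    for name in values:
        lookup(name, ())
    return resolved


if __name__ == "__main__":
    acd_defaults = {
        "residuesperline": "50",
        "resbreak": "$(residuesperline)",
        "plurality": "@( $(sequences.totweight) / 2)",
    }
    print(resolve_defaults(acd_defaults))
    print(resolve_defaults(acd_defaults, {"residuesperline": 60}))
```

With no user input, resbreak resolves to the default of residuesperline; if
the user sets residuesperline, resbreak follows it, which is the behaviour
the ACD default describes. The plurality default is passed through unchanged
because it cannot be evaluated without reading the sequences.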