From pjotr.public78 at thebird.nl  Tue Dec  1 06:38:35 2009
From: pjotr.public78 at thebird.nl (Pjotr Prins)
Date: Tue, 1 Dec 2009 12:38:35 +0100
Subject: [emboss-dev] Mapping EMBOSS to Ruby, Perl and Python
In-Reply-To: <20091130115717.GA9235@thebird.nl>
References: <20091130115717.GA9235@thebird.nl>
Message-ID: <20091201113835.GA22908@thebird.nl>

Hello, anyone on this list?

Pj.

On Mon, Nov 30, 2009 at 12:57:17PM +0100, Pjotr Prins wrote:
> For your inspection, I have committed a patch for splitting out the
> logic of ./emboss/transeq.c. The patch is here:
> 
>   http://github.com/pjotrp/EMBOSS/commit/713800c4aa08ddf70b87f245a524c1a0b30c0942
> 
> The simplified transeq.c is here:
> 
>   http://github.com/pjotrp/EMBOSS/blob/biolib/emboss/transeq.c
> 
> The new interfaces are here:
> 
>   http://github.com/pjotrp/EMBOSS/blob/biolib/emboss/function/emboss_transeq.c
> 
> Basically I have split out the ACD logic and programming logic and
> given them new names:
> 
>   int transeq_acd(int argc, char **argv)
> 
>   AjPSeqout transeq(AjPSeqall seqall, AjPStr *framelist,
>                     AjPStr tablename, AjPRange regions,
>                     AjBool trim, AjBool clean, AjBool alternate)
> 
> so you can call either from an external program. The advantage is
> that the call interface is exactly the same, whether from the command
> line, the web interface, or directly through a shared linked library.
> 
> What do you think? I propose to (slowly) accept splitting out the
> other routines in this fashion. As it does not interfere with EMBOSS
> it can be done in small steps.
> 
> The file emboss/function/emboss_transeq.c may get some extra
> interfaces - the idea is that it contains nicely named and direct
> methods (unlike the internal 'ajCamelCase' naming conventions). A
> useful one would be a simple single-reading-frame translation with a
> pre-selected translation table (for speed). But more on that later -
> I can also weight-lift that in biolib itself.
> 
> The reason I want to do this here is to prevent duplication of
> functionality at different levels. 
> 
> Pj.

From biopython at maubp.freeserve.co.uk  Tue Dec  1 07:02:21 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Dec 2009 12:02:21 +0000
Subject: [emboss-dev] Mapping EMBOSS to Ruby, Perl and Python
In-Reply-To: <20091130115717.GA9235@thebird.nl>
References: <20091130115717.GA9235@thebird.nl>
Message-ID: <320fb6e00912010402m7ded2694ne6191c71935ddfa5@mail.gmail.com>

On Mon, Nov 30, 2009 at 11:57 AM, Pjotr Prins <pjotr.public78 at thebird.nl> wrote:
>
> The file emboss/function/emboss_transeq.c may get some extra
> interfaces - the idea is that it contains nicely named and direct
> methods (unlike the internal 'ajCamelCase' naming conventions).
>

Naming conventions and what is "nice" is a personal judgement.
Why change things? For anyone used to the EMBOSS code base,
it is very advantageous to preserve the old names in any interface.
I would also say it makes sense to keep the "aj" prefix - it acts
as a namespace, avoiding name collisions with internal names.

Peter
(an EMBOSS user and minor contributor)

From pjotr.public78 at thebird.nl  Tue Dec  1 07:35:10 2009
From: pjotr.public78 at thebird.nl (Pjotr Prins)
Date: Tue, 1 Dec 2009 13:35:10 +0100
Subject: [emboss-dev] Mapping EMBOSS to Ruby, Perl and Python
In-Reply-To: <320fb6e00912010402m7ded2694ne6191c71935ddfa5@mail.gmail.com>
References: <20091130115717.GA9235@thebird.nl>
	<320fb6e00912010402m7ded2694ne6191c71935ddfa5@mail.gmail.com>
Message-ID: <20091201123510.GA23982@thebird.nl>

On Tue, Dec 01, 2009 at 12:02:21PM +0000, Peter wrote:
> On Mon, Nov 30, 2009 at 11:57 AM, Pjotr Prins <pjotr.public78 at thebird.nl> wrote:
> >
> > The file emboss/function/emboss_transeq.c may get some extra
> > interfaces - the idea is that it contains nicely named and direct
> > methods (unlike the internal 'ajCamelCase' naming conventions).
> >
> Naming conventions and what is "nice" is a personal judgement.

Sure.

> Why change things? For anyone used to the EMBOSS code base,
> it is very advantageous to preserve the old names in any interface.

The standard naming is available - and not going to disappear. I
don't expect anyone to change his/her ways.

> I would also say it makes sense to keep the "aj" prefix - it acts
> as a namespace, avoiding name collisions with internal names.

It is poor man's namespacing. And it is fine for the backend. There is
no reason not to provide something better for users. Especially if
the front-end is C with extensions. With Biolib I can even introduce
C++, if needed.

Anyway, if EMBOSS does not like it - it does not have to go into the
main code base. I would prefer to export, at least, transeq and
transeq_acd to the external world. That would align with EMBOSS'
documentation of the binaries. It would make sense (to me) to have
that in the main code base.

The only thing I am really asking is to split code out of the main()
functions.

Pj.
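
As an illustration of the split requested above, here is a minimal
sketch in plain C. It does not use the EMBOSS or AJAX API: gc_count()
and gc_count_cli() are hypothetical names chosen for this example.
gc_count() stands in for a core routine such as transeq(), and
gc_count_cli() stands in for the argument-handling layer that
transeq_acd() provides in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical core routine: pure logic, no argument parsing and no
   output. Counts G and C residues (either case) in a sequence string. */
size_t gc_count(const char *seq)
{
    size_t n = 0;

    assert(seq != NULL);
    for (; *seq != '\0'; ++seq) {
        if (strchr("GCgc", *seq) != NULL)
            ++n;
    }
    return n;
}

/* Hypothetical command-line layer: it only turns argv into arguments
   for the core routine and prints the result. */
int gc_count_cli(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s SEQUENCE\n",
                argc > 0 ? argv[0] : "gc_count");
        return EXIT_FAILURE;
    }
    printf("%zu\n", gc_count(argv[1]));
    return EXIT_SUCCESS;
}
```

With this layout the program's main() can shrink to a single line,
"return gc_count_cli(argc, argv);", while a language binding or any
other external program links against gc_count() directly.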


From pmr at ebi.ac.uk  Thu Dec 10 08:36:14 2009
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 10 Dec 2009 13:36:14 +0000
Subject: [emboss-dev] EMBOSS 6.1.0 patch 1.3
Message-ID: <4B20F94E.4000506@ebi.ac.uk>

A patch for EMBOSS 6.1.0 is on the FTP server. This fixes problems with
extractfeat, using format names with dashes (fastq-sanger) in USAs,
scaling issues in plot outputs, and some minor bugs.

The files are on our FTP server ftp://emboss.open-bio.org/pub/EMBOSS/fixes
with a patch file and instructions in the patches subdirectory.

Fix 3. EMBOSS-6.1.0/ajax/ajfeat.c
        EMBOSS-6.1.0/ajax/ajfeat.h
        EMBOSS-6.1.0/ajax/ajgraph.c
        EMBOSS-6.1.0/ajax/ajmath.c
        EMBOSS-6.1.0/ajax/ajseq.c
        EMBOSS-6.1.0/ajax/ajseqread.c
        EMBOSS-6.1.0/ajax/ajseqwrite.c
        EMBOSS-6.1.0/nucleus/embmisc.c
        EMBOSS-6.1.0/nucleus/embmisc.h
        EMBOSS-6.1.0/nucleus/embpat.c
        EMBOSS-6.1.0/emboss/coderet.c
        EMBOSS-6.1.0/emboss/extractfeat.c
        EMBOSS-6.1.0/emboss/notseq.c
        EMBOSS-6.1.0/emboss/prettyplot.c
        EMBOSS-6.1.0/emboss/seqmatchall.c
        EMBOSS-6.1.0/emboss/showfeat.c
        EMBOSS-6.1.0/emboss/showpep.c
        EMBOSS-6.1.0/emboss/showseq.c
        EMBOSS-6.1.0/emboss/twofeat.c
        EMBOSS-6.1.0/jemboss/utils/install-jemboss-server.sh
        EMBOSS-6.1.0/jemboss/org/emboss/jemboss/server/AppendToLogFileThread.java
        EMBOSS-6.1.0/jemboss/org/emboss/jemboss/server/JembossAuthServer.java

02-Dec-2009: Fixes problems with extractfeat. The fix includes cleaner
              definitions of functions used to match feature tags and
              feature types which result in minor updates to 6 other
              applications.

              Extractfeat in previous versions used its own text parser
              to extract feature data from only a limited set of
              formats. In release 6.1.0 it was replaced by the standard
              EMBOSS feature table. With no options set, extractfeat
              rejected all features (type '*' was needed to extract
              features). Extractfeat default settings now extract all
              features from an entry.

              Features on the reverse strand were incorrectly processed
              (an effect caused by some of the old extractfeat code
              remaining). Reverse strand features are now correctly
              parsed, including both "join(complement())" and
              "complement(join())" syntax in EMBL/GenBank/DDBJ feature
              tables.

              Fixes an issue in GenBank parsing where the ORIGIN line is
              absent.

              Fixes scaling errors in prettyplot, especially in mEMBOSS
              when plotting to a window on screen (the default
              output). The plplot library does not report the true
              width and height for several devices. The assumptions in
              prettyplot depend on reasonable size estimates. Release
              6.2.0 will have further corrections to plplot device
              scaling.

              Fixes the counting of non-coding features in coderet.

              Fixes a seqmatchall error for short sequences with perfect
              matches.

              When reverse-complementing sequences, also reverses
              the quality scores.

              Allows '-' in format names in the USA syntax, to allow
              the fastq-sanger, fastq-illumina and fastq-solexa format
              names to be used.

              When reading protein sequences, a sequence with only a
              stop is now recognized as empty (zero length) after
              processing ambiguity codes and stops.

              Fixes a problem writing features in PIR format when the
              feature table is empty, for example a report file with no
              hits.

              Fixes a dependency on 'ant' to install a Jemboss server.

              Fixes a problem in logging Jemboss info/error messages.
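
To make the reverse-complement item above concrete, the following is a
minimal sketch in plain C of why the quality scores have to be reversed
together with the bases. It is not the code from the patch and does not
use the AJAX library: complement_base() and revcomp_with_quality() are
hypothetical names, and the sketch assumes a simple A/C/G/T alphabet
with one quality value per base.

```c
#include <assert.h>
#include <stddef.h>

/* Complement a single base. Anything other than A, C, G or T (for
   example N or other ambiguity codes) is left unchanged here, which
   is a simplification for this sketch. */
static char complement_base(char b)
{
    switch (b) {
    case 'A': return 'T';
    case 'T': return 'A';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'a': return 't';
    case 't': return 'a';
    case 'c': return 'g';
    case 'g': return 'c';
    default:  return b;
    }
}

/* Reverse-complement seq in place. If qual is not NULL it must hold
   len scores, one per base, and it is reversed in step so that each
   score stays attached to the base it was measured for. */
void revcomp_with_quality(char *seq, unsigned char *qual, size_t len)
{
    size_t i = 0;
    size_t j = len;

    assert(seq != NULL);
    while (i < j) {
        --j;
        /* Complement both ends, then swap them. When i == j this
           complements the middle base exactly once. */
        char left = complement_base(seq[i]);
        char right = complement_base(seq[j]);
        seq[i] = right;
        seq[j] = left;
        if (qual != NULL) {
            unsigned char q = qual[i];
            qual[i] = qual[j];
            qual[j] = q;
        }
        ++i;
    }
}
```

For example, the sequence AACG with scores 10, 20, 30, 40 becomes CGTT
with scores 40, 30, 20, 10. Reversing only the bases would leave the
score 10 attached to the first output base, which was originally the
last base and had score 40.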


regards,

Peter Rice

From biopython at maubp.freeserve.co.uk  Tue Dec 15 07:07:48 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 15 Dec 2009 12:07:48 +0000
Subject: [emboss-dev] Updating http://www.open-bio.org/wiki/SourceCode
Message-ID: <320fb6e00912150407o48fd981fn16dedaf0581e5fef@mail.gmail.com>

Hello all,

We just had a puzzling mailing list query about the Biopython CVS
repository, which turned out to be partly due to some very dated
information here: http://www.open-bio.org/wiki/SourceCode

I've made a few minor improvements, but feel the whole page
could be simplified.

Am I right in thinking that only EMBOSS is still using CVS (all the
other projects are now on SVN or github, or obsolete)? If so, since
EMBOSS has nice CVS documentation on its own webpages
(http://emboss.sourceforge.net/developers/cvs.html), could we remove
most of the CVS text from the OBF wiki?

Thanks,

Peter
(@Biopython)

From biopython at maubp.freeserve.co.uk  Tue Dec 15 08:11:55 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 15 Dec 2009 13:11:55 +0000
Subject: [emboss-dev] [Open-bio-l] Updating
	http://www.open-bio.org/wiki/SourceCode
In-Reply-To: <E02EE0D2-04F6-4B1E-98F8-B2397EB5AED1@ebi.ac.uk>
References: <320fb6e00912150407o48fd981fn16dedaf0581e5fef@mail.gmail.com>
	<E02EE0D2-04F6-4B1E-98F8-B2397EB5AED1@ebi.ac.uk>
Message-ID: <320fb6e00912150511u928e0a3s2187a11a634bda0a@mail.gmail.com>

On Tue, Dec 15, 2009 at 12:42 PM, Andy Jenkinson
<andy.jenkinson at ebi.ac.uk> wrote:
> To be honest I'm not sure which (if any) of the BioDAS project's
> components are using CVS. IIRC something was but I don't have
> access so have never looked into it. Perhaps someone else can confirm?

This wiki page suggests that BioDAS is also still using CVS:
http://www.biodas.org/wiki/DAS/2#CVS_Access

Peter
