From wshui at cse.unsw.edu.au  Wed Aug  1 12:48:15 2001
From: wshui at cse.unsw.edu.au (Bill Shui)
Date: Wed, 1 Aug 2001 16:48:15 +0000
Subject: Question on Edatas
Message-ID: <20010801164815.A5750@nimky.cse.unsw.edu.au>

Hi all,
    I'm using EMBOSS as part of my honours project at the moment.

    I'm in the process of dissecting the entire emboss package and using only
    some of the ajax and nucleus libraries.

    However, I was just wondering about the data files that were packaged
    with the release. Are they example data files or are they being 
    used by one of the emboss programs?

    I know that the file "Edayhoff.freq" is being used by pepstat so
    I assume that all of these data files are being used by other programs
    rather than just being example data files.

    Am I right on this?

    Please advise.

thanks in advance.

Bill
-- 
There are three kinds of people: men, women, and unix.
------------------------------------------------------
Bill Shui		Email: wshui at cse.unsw.edu.au


From ableasby at hgmp.mrc.ac.uk  Wed Aug  1 02:53:31 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 1 Aug 2001 07:53:31 +0100 (BST)
Subject: Question on Edatas
Message-ID: <200108010653.HAA05098@bromine.hgmp.mrc.ac.uk>

Hi,

The files in the emboss/data directory and subdirectories contain data
that is essential to the operation of either applications or indeed
to the library (e.g. the Efeatures data for feature tables). The
data in test/data are generally for quality testing or examples.

HTH

Alan


From touro at capoeirabrasil.com.au  Mon Aug  6 11:03:07 2001
From: touro at capoeirabrasil.com.au (Bill Shui)
Date: Tue, 7 Aug 2001 01:03:07 +1000
Subject: interpreting ajtranslate.* in ajax library.
Message-ID: <20010807010307.A17321@capoeira.sydney>

Hi there,
    I'm using EMBOSS as part of my honours thesis. What I am doing now is 
    breaking up all the library modules and reusing bits of them to get something
    working.


    However, I am stuck with ajtranslate or the transeq program.

    In the file ajtranslate.c, the function ajTrnReadFile uses struct
    AjSTrn to store the EGC data (well at least that's how I understood it)
    correct me if I was wrong. 


    Now, I don't understand why the variable GC and Starts in 
    AjSTrn are 15 by 15 by 15 matrices?

    I also do not understand the meaning of initialisation of
    the char arrays trnconv and trncomp.

    and why most of the arrays are 14?

your prompt reply to this is much appreciated as I really need this for my
thesis and I'm on a tight schedule.

thanks in advance.

regards.

Bill
-- 
The mark of a good party is that you wake up the next morning
wanting to change your name and start a new life in different
city.
                -- Vance Bourjaily, "Esquire"
---------------------------------------------
Bill Shui           Email: wshui at bigpond.net.au
			   wshui at cse.unsw.edu.au
			   touro at capoeirabrasil.com.au
			   bill.shui at proteomesystems.com
Bioinformatics Programmer


From gwilliam at hgmp.mrc.ac.uk  Mon Aug  6 11:24:13 2001
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 06 Aug 2001 16:24:13 +0100
Subject: interpreting ajtranslate.* in ajax library.
References: <20010807010307.A17321@capoeira.sydney>
Message-ID: <3B6EB69C.FFD741C8@hgmp.mrc.ac.uk>

Bill Shui wrote:
> 
> Hi there,
>     I'm using EMBOSS as part of my honours thesis. What I am doing now is
>     breaking up all the library modules and reusing bits of them to get something
>     working.
> 
>     However, I am stuck with ajtranslate or the transeq program.
> 
>     In the file ajtranslate.c, the function ajTrnReadFile uses struct
>     AjSTrn to store the EGC data (well at least that's how I understood it)
>     correct me if I was wrong.

Correct.

>     Now, I don't understand why the variable GC and Starts in
>     AjSTrn are 15 by 15 by 15 matrices?

Each codon has 3 bases, so we use a 3-dimensional array to convert the
codons to residues.

The size of the array could be 4x4x4 for most purposes (there are four
bases: A, C, G, T) but sometimes ambiguity codes are used in positions
where the base is uncertain, e.g. 'M' codes for 'A' or 'C'. There are 15
bases if you include these ambiguity codes (including 'N' for the
completely unknown base). So to translate codons that have ambiguity
codes in them, you really need a 15x15x15 matrix.

The same applies to the Start codons, although there are far fewer codons
that are Start codons, so this could probably have been done in a
more memory-efficient way.

>     I also do not understand the meaning of initialisation of
>     the char arrays trnconv and trncomp.

To look up an element in the 15x15x15 codon to residue matrix, you need
to convert the bases to numbers. This is what trnconv[] is for. 
trncomp[] does the same thing, but gives you the number of the code for
the complement - this is used for translating the complement of the
sequence.

>     and why most of the arrays are 14?

Most entries of the arrays trnconv[] and trncomp[] hold '14' because this
is the code I am using for 'N' (unknown). These entries belong to letters
that do not correspond to any recognised nucleotide code letter (i.e. they
are not one of: ACGTUMRWSYKVHDBN).

See:
http://www.chem.qmw.ac.uk/iupac/misc/naseq.html
for details of the ambiguity codes.
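
A minimal sketch of the lookup tables described above, not the actual
ajtranslate.c code. The only detail taken from this message is that '14'
stands for 'N'/unknown; the ordering of the other codes (A=0, C=1, G=2,
T=3, ...) and the names conv, comp, init_tables, codon_index and
revcomp_index are assumptions made for illustration.

```c
#include <string.h>

enum { NCODES = 15, UNKNOWN = 14 };

/* Assumed ordering of the 15 codes; comps[i] is the complement of codes[i]. */
static const char codes[] = "ACGTMRWSYKVHDBN";
static const char comps[] = "TGCAKYWSRMBDHVN";

static unsigned char conv[256];  /* letter -> code number            */
static unsigned char comp[256];  /* letter -> code of its complement */

/* Fill both tables; every unrecognised letter keeps the value 14. */
void init_tables(void)
{
    int i;

    memset(conv, UNKNOWN, sizeof conv);
    memset(comp, UNKNOWN, sizeof comp);

    for (i = 0; i < NCODES; i++) {
        unsigned char up = (unsigned char)codes[i];
        unsigned char lo = (unsigned char)(up + ('a' - 'A'));
        int c = (int)(strchr(codes, comps[i]) - codes);

        conv[up] = conv[lo] = (unsigned char)i;
        comp[up] = comp[lo] = (unsigned char)c;
    }

    /* 'U' is treated as 'T'. */
    conv['U'] = conv['u'] = conv['T'];
    comp['U'] = comp['u'] = comp['T'];
}

/* Indices into a [15][15][15] table for a codon on the forward strand. */
void codon_index(const char *codon, int out[3])
{
    int i;
    for (i = 0; i < 3; i++)
        out[i] = conv[(unsigned char)codon[i]];
}

/* Indices for the same three bases read on the complementary strand. */
void revcomp_index(const char *codon, int out[3])
{
    int i;
    for (i = 0; i < 3; i++)
        out[i] = comp[(unsigned char)codon[2 - i]];
}
```

With these assumed tables, a residue would be fetched as
table[out[0]][out[1]][out[2]] from a char table[15][15][15] filled from
the genetic code file.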

> your prompt reply to this is much appreciated as I really need this for my
> thesis and I'm on a tight schedule.

Is this soon enough for you?

> thanks in advance.
> 
> regards.
> 
> Bill
> --
> The mark of a good party is that you wake up the next morning
> wanting to change your name and start a new life in different
> city.
>                 -- Vance Bourjaily, "Esquire"
> ---------------------------------------------
> Bill Shui           Email: wshui at bigpond.net.au
>                            wshui at cse.unsw.edu.au
>                            touro at capoeirabrasil.com.au
>                            bill.shui at proteomesystems.com
> Bioinformatics Programmer

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


From dmartin at bioinformatics.msiwtb.dundee.ac.uk  Wed Aug  8 11:23:25 2001
From: dmartin at bioinformatics.msiwtb.dundee.ac.uk (David Martin)
Date: Wed, 8 Aug 2001 16:23:25 +0100 (BST)
Subject: Using the dynamic arrays in a random access manner
Message-ID: <Pine.LNX.4.33.0108081616410.926-100000@bioinformatics.msiwtb.dundee.ac.uk>


One problem with using the 2d and higher arrays in ajarr.c is that if one
doesn't fill them sequentially but instead uses them in a semi sparse
manner this can lead to problems.

If one wants the size of an array, the current function calls return the
maximum dimensions, but this doesn't necessarily mean that every point
within that area has been allocated. This can lead to a warning or worse
when trying to aj...Get the value and the particular one-dimensional array
doesn't stretch that far or hasn't been allocated at all.

I have a couple of patches for the Int2d arrays which add two more
function calls ajInt2dLenElem (gets the second dimension length at any
particular specified first dimension position) and ajInt2dElemExists which
checks to see whether an element has indeed been allocated yet.

The alternative is to roll some of this into ajInt2dGet so that an
attempt to access an unallocated element returns null or zero instead of
throwing an error, but I am not sure that is completely desirable.
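
A minimal sketch of the idea behind the two proposed calls, written
against a stand-alone ragged array rather than the real ajarr.c
structures. The type Int2d and the functions int2d_put, int2d_len_elem,
int2d_elem_exists and int2d_free are hypothetical names for illustration;
they are not the patches themselves.

```c
#include <stdlib.h>

/* One row of the ragged array; data is NULL until the row is first used. */
typedef struct {
    int   *data;
    size_t len;
} Row;

typedef struct {
    Row   *rows;
    size_t nrows;
} Int2d;

/* Store value at [i][j], growing the row table and the row as needed.
   Returns 1 on success, 0 if memory could not be allocated. */
int int2d_put(Int2d *a, size_t i, size_t j, int value)
{
    size_t k;
    Row *row;

    if (i >= a->nrows) {
        Row *r = realloc(a->rows, (i + 1) * sizeof *r);
        if (!r)
            return 0;
        for (k = a->nrows; k <= i; k++) {
            r[k].data = NULL;
            r[k].len = 0;
        }
        a->rows = r;
        a->nrows = i + 1;
    }

    row = &a->rows[i];
    if (j >= row->len) {
        int *d = realloc(row->data, (j + 1) * sizeof *d);
        if (!d)
            return 0;
        for (k = row->len; k <= j; k++)
            d[k] = 0;
        row->data = d;
        row->len = j + 1;
    }

    row->data[j] = value;
    return 1;
}

/* Length of the second dimension at first-dimension position i. */
size_t int2d_len_elem(const Int2d *a, size_t i)
{
    return i < a->nrows ? a->rows[i].len : 0;
}

/* 1 if [i][j] lies inside allocated storage, 0 otherwise. */
int int2d_elem_exists(const Int2d *a, size_t i, size_t j)
{
    return j < int2d_len_elem(a, i);
}

void int2d_free(Int2d *a)
{
    size_t k;
    for (k = 0; k < a->nrows; k++)
        free(a->rows[k].data);
    free(a->rows);
    a->rows = NULL;
    a->nrows = 0;
}
```

A caller filling the array sparsely can test int2d_elem_exists before
reading, instead of relying on the maximum dimensions.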

Thoughts?

..d


----------------------------------
David Martin PhD
Bioinformatics Scientific Officer
Wellcome Trust Biocentre, Dundee
----------------------------------


From ableasby at hgmp.mrc.ac.uk  Wed Aug  8 13:25:54 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 8 Aug 2001 18:25:54 +0100 (BST)
Subject: Using the dynamic arrays in a random access manner
Message-ID: <200108081725.SAA11200@bromine.hgmp.mrc.ac.uk>

The proposed functions certainly wouldn't hurt anything but
their use would really be a programming style matter.

Cheers
Alan


From gbottu at ben.vub.ac.be  Wed Aug 15 12:11:41 2001
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 15 Aug 2001 18:11:41 +0200 (MET DST)
Subject: EMBOSS whish list
Message-ID: <200108151611.SAA13664@bigben.vub.ac.be>

from : BEN

	Dear colleagues,
	
I know it is easy to say "we need this, we need that !", while it takes so much 
time and effort to program. Nevertheless, I take the liberty to make a few 
suggestions for improvements that would help EMBOSS to equal and even surpass 
GCG in userfriendliness :

- for graphic display : an X-Window graphic with "zoom" function, also tektronix 
emulation in color.
- a mechanism to easily submit programs to a "batch queue" from the command line 
(comparable to the -batch parameter of GCG and egcg)
- at the present it is difficult to find out which data file(s) a program uses 
and hence to put an alternative data file in the working directory. You have to 
read exhaustively the on-line manual. It would be nice if there was always a 
command line parameter with default value that would appear if you do
xxx -help
or alternatively maybe a new General Parameter to make
e.g.   xxx -datafile
to make a program display its data file names. The data file name should 
preferably appear in the ACD file so that the parsers that generate pages for 
the graphical user interfaces can find it.
- the possibility to input/output sequences from any program in any format 
should be extended to other kinds of data. E.g.
  base/aa symbol comparison tables : GCG, BLAST, SIM,...
  codon usage tables : GCG, CUTG,...
  3D structures : PDB, Kinemage,...   (is already being done ?)
  ...


	Sincerely,
	Guy Bottu 
 

From jkb at mrc-lmb.cam.ac.uk  Wed Aug 15 12:22:44 2001
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Wed, 15 Aug 2001 17:22:44 +0100
Subject: EMBOSS whish list
In-Reply-To: <200108151611.SAA13664@bigben.vub.ac.be>; from gbottu@ben.vub.ac.be on Wed, Aug 15, 2001 at 06:11:41PM +0200
References: <200108151611.SAA13664@bigben.vub.ac.be>
Message-ID: <20010815172244.B30470@arran.mrc-lmb.cam.ac.uk>

On Wed, Aug 15, 2001 at 06:11:41PM +0200, Guy Bottu wrote:
> - for graphic display : an X-Window graphic with "zoom" function, also tektronix 
> emulation in color.

Maybe now is the time for a quick plug (sorry). We've (finally) managed to
make an official release of "spin". This includes a graphical interface to
EMBOSS, including the ability to zoom, scroll and superimpose plots.

Spin is still in its early days as an EMBOSS interface (although it has its
own algorithms too, some of which date back to before the dawn of time).

See http://www.mrc-lmb.cam.ac.uk/pubseq/ for more details.

Alas it doesn't solve your other problems.

I've also heard via the grapevine that another graphical interface to EMBOSS
is in development; using Java. Is there any news on the progress of this?

> e.g.   xxx -datafile
> to make a program display its data file names. The data file name should 
> preferably appear in the ACD file so that the parsers that generate pages for 
> the graphical user interfaces can find it.

This is a good point. We found this horrid to do, especially for the programs
that generate an arbitrary number of output files. Before running the programs
we delete all {progname}.dat[0-9]* files. Then after running we do a filename
'glob' to see which files have been created. This works OK, but can really
catch out the user if they try to have an input file named "syco.dat" (for
example)!
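
A minimal sketch of the name test behind that clean-up step, assuming the
pattern {progname}.dat[0-9]* is read as ".dat followed by zero or more
digits" (the reading under which "syco.dat" collides). The function name
is_output_name is hypothetical and this is not the spin code.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if filename looks like {progname}.dat followed only by digits
   (possibly none), 0 otherwise. */
int is_output_name(const char *progname, const char *filename)
{
    size_t n = strlen(progname);
    const char *p;

    if (strncmp(filename, progname, n) != 0)
        return 0;

    p = filename + n;
    if (strncmp(p, ".dat", 4) != 0)
        return 0;

    for (p += 4; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}
```

Under this reading, is_output_name("syco", "syco.dat") is true, so a
user's input file with that name would be deleted along with stale
output files.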

James

-- 
James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/


From ableasby at hgmp.mrc.ac.uk  Wed Aug 15 16:04:50 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 15 Aug 2001 21:04:50 +0100 (BST)
Subject: EMBOSS whish list
Message-ID: <200108152004.VAA14659@bromine.hgmp.mrc.ac.uk>

The java interface (currently called jEMBOSS) is still under development
and hopefully will be released before the end of the year. It
doesn't have anything as fancy as zoom yet. Maybe the title should
have been swish list?

We have filled one of the programming posts (corba and soapy things)
and are hoping to fill the other soon. The intention is for the
second post to work in the graphics area (openGL etc).

Alan

PS: Other points noted.


From brooks at embl-grenoble.fr  Mon Aug 20 12:54:02 2001
From: brooks at embl-grenoble.fr (Brooks Mark)
Date: Mon, 20 Aug 2001 18:54:02 +0200
Subject: Using the ajSeqRead function....
References: <200108081725.SAA11200@bromine.hgmp.mrc.ac.uk>
Message-ID: <3B8140A9.4FF833AE@embl-grenoble.fr>

Hi all,
        I have run into a bit of a problem when trying to open a file
from a file selection dialog.  I need to parse (nucleotide) sequence
files and spit their contents into AjPSeq instances.

   Question 1:  Am I right in thinking that ajSeqRead should parse these
files in this manner?
If so:
    Question 2: Am I doing this right? Here is a simplified code
snippet:

------------->8-------------------8<------------------
int
open_ok () {
    AjPSeq seq;
    AjPSeqin seqIn;
    AjPStr seqfileInName;

    seq = ajSeqNew ();
    seqIn = ajSeqinNew ();
    seqfileInName = ajStrNewC("actin.seq");
    ajSeqinUsa ( &seqIn , seqfileInName );
    ajSeqinSetNuc (seqIn);
    ajSeqRead (seq , seqIn);
    ajSeqinDel (&seqIn);
    return 0;
}

------------->8-------------------8<------------------

Sorry if it's a daft pair of questions, I'm a bit of a newbie! (Hence my poor code too!)

Thanks in advance for any comments,

Mark

--
Mark Brooks,
EMBL Grenoble Outstation,
6, rue Jules Horowitz, BP181
38042 Grenoble Cedex 9, France.
Tel: + (0)4 76 20 72 85


From ableasby at hgmp.mrc.ac.uk  Mon Aug 20 15:06:53 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Mon, 20 Aug 2001 20:06:53 +0100 (BST)
Subject: Using the ajSeqRead function....
Message-ID: <200108201906.UAA26210@bromine.hgmp.mrc.ac.uk>

Hi Mark,

There is a function already for doing that sort of thing,
i.e. ajSeqGetFromUsa

Give it a USA and say whether the sequence is a protein
or not. It'll return a filled AjPSeq.

HTH

Alan


From dmartin at bioinformatics.msiwtb.dundee.ac.uk  Wed Aug 22 11:30:42 2001
From: dmartin at bioinformatics.msiwtb.dundee.ac.uk (David Martin)
Date: Wed, 22 Aug 2001 16:30:42 +0100 (BST)
Subject: USA extensions
Message-ID: <Pine.LNX.4.33.0108221628150.22422-100000@bioinformatics.msiwtb.dundee.ac.uk>


A long time ago on the wish list it was mooted that USAs could be
extended to include region information. Has anything come of this, and what
are the thoughts on feasibility?

In other words it would be nice to be able to write a listfile like

em:hstf[30..90]
em:hscfvii[92..103,108-120]

..d


----------------------------------
David Martin PhD
Bioinformatics Scientific Officer
Wellcome Trust Biocentre, Dundee
----------------------------------


From peter.rice at uk.lionbioscience.com  Wed Aug 22 11:34:01 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Wed, 22 Aug 2001 16:34:01 +0100
Subject: USA extensions
References: <Pine.LNX.4.33.0108221628150.22422-100000@bioinformatics.msiwtb.dundee.ac.uk>
Message-ID: <3B83D0E9.248C58D0@uk.lionbioscience.com>

David Martin wrote:
> 
> A long time ago on the wish list it was mooted that USAs could be
> extended to include region information. Has anything come of this, and what
> are the thoughts on feasibility?
> 
> In other words it would be nice to be able to write a listfile like
> 
> em:hstf[30..90]
> em:hscfvii[92..103,108-120]

Something to discuss at this week's EMBOSS meeting....

Among the possible 'report' formats for writing 'feature' data (any program
that reports start, end and score for some pattern) is a ListFile format to
write a list file that can be used to read in the subsequences.

For this we do need a USA syntax that includes start, end, reverse.

The syntax above would be a reasonable solution (with to..from for reversed
sequences)
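
A minimal sketch of how such a USA could be split into its parts, assuming
a single name[start..end] range with to..from marking a reversed region.
The syntax is only a proposal at this point, and the type UsaRegion and
the function parse_usa_region are hypothetical names, not part of the
AJAX library. Comma-separated multiple ranges are not handled.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    char name[64];  /* part before '[', e.g. "em:hstf"         */
    long start;     /* lower position, always <= end           */
    long end;       /* higher position                         */
    int  reverse;   /* 1 if the range was written as to..from  */
} UsaRegion;

/* Parse "name[a..b]" into out. Returns 1 on success, 0 if the string
   has no single well-formed range. */
int parse_usa_region(const char *usa, UsaRegion *out)
{
    const char *open = strchr(usa, '[');
    long a, b;
    char close;
    size_t n;

    if (!open)
        return 0;

    n = (size_t)(open - usa);
    if (n == 0 || n >= sizeof out->name)
        return 0;

    /* Expect "[", a number, "..", a number, then the closing bracket. */
    if (sscanf(open, "[%ld..%ld%c", &a, &b, &close) != 3 || close != ']')
        return 0;

    memcpy(out->name, usa, n);
    out->name[n] = '\0';
    out->reverse = a > b;
    out->start = a > b ? b : a;
    out->end = a > b ? a : b;
    return 1;
}
```

For "em:hstf[30..90]" this yields name "em:hstf", start 30, end 90 and
reverse 0; writing the range as [90..30] gives the same positions with
reverse set to 1.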

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

