From michaelwatson at paradigm-therapeutics.co.uk  Tue Jun 18 10:59:43 2002
From: michaelwatson at paradigm-therapeutics.co.uk (Mick Watson)
Date: Tue, 18 Jun 2002 15:59:43 +0100
Subject: Transeq
Message-ID: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk>

Hi

Can anyone tell me why when I give transeq a fasta file like:

>gnl|UG|Hs#S3220135
ATGGCAGCGCGCCCGCTGCCCGTGTCCCCCGCCCGCGCCCTCCTGCTCGCCCTGGCCGGTGCTCTGCTCGCGCCCTGCGA

I get the output

>Hs#S3220135_1
MAARPLPVSPARALLLALAGALLAPCEARGVSLWNEGRADEVVSASVRSGDLWIPVKSFD

Where has the "gnl|UG|" part of my description line gone???!!!  And
why?  Can I switch this "feature" off?

Thanks
Mick


From gwilliam at hgmp.mrc.ac.uk  Tue Jun 18 11:15:06 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Tue, 18 Jun 2002 16:15:06 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk>
Message-ID: <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk>


You get the output in 'fasta' format by default.
If you want it in 'ncbi' format, then you have to ask for it:

transeq nucleic.seq ncbi::protein.pep 

or 

transeq nucleic.seq -osf ncbi protein.pep

Gary


Mick Watson wrote:
> 
> Hi
> 
> Can anyone tell me why when I give transeq a fasta file like:
> 
> >gnl|UG|Hs#S3220135
> ATGGCAGCGCGCCCGCTGCCCGTGTCCCCCGCCCGCGCCCTCCTGCTCGCCCTGGCCGGTGCTCTGCTCGCGCCCTGCGA
> 
> I get the output
> 
> >Hs#S3220135_1
> MAARPLPVSPARALLLALAGALLAPCEARGVSLWNEGRADEVVSASVRSGDLWIPVKSFD
> 
> Where has the "gnl|UG|" part of my description line gone???!!!  And
> why?  Can I switch this "feature" off?
> 
> Thanks
> Mick

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


From peter.rice at uk.lionbioscience.com  Tue Jun 18 11:21:28 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 18 Jun 2002 16:21:28 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk>
Message-ID: <3D0F4FF8.38C867FC@uk.lionbioscience.com>

"Gary Williams, Tel 01223 494522" wrote:
> 
> You get the output in 'fasta' format by default.
> If you want it in 'ncbi' format, then you have to ask for it:
> 
> transeq nucleic.seq ncbi::protein.pep
> 
> or
> 
> transeq nucleic.seq -osf ncbi protein.pep

You still lose the "UG" database name. You will get an identifier of:

>gnl|unk|Hs#S3220135_1
MAARPLPVSPARALLLALAGALLAPCX


NCBI's "FASTA" identifiers are strange things that EMBOSS can read but not
save completely ... but this should not be a problem because "UG" is not
really the database name for the protein translation.
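
As a rough illustration of the identifier layout discussed above, the
sketch below splits the first word of a FASTA header on '|' into its
type, database and identifier parts. It is plain Python, not EMBOSS
code; the function name split_ncbi_id is hypothetical, and only the
three-field "gnl|database|identifier" form shown in this thread is
handled.

```python
def split_ncbi_id(header):
    """Split a FASTA header's first word into (type, database, identifier).

    Only the three-field 'gnl|db|id' form seen in this thread is recognised;
    anything else is returned as a bare identifier with no type or database.
    """
    words = header.lstrip(">").split()
    name = words[0] if words else ""
    parts = name.split("|")
    if len(parts) == 3 and parts[0] == "gnl":
        return (parts[0], parts[1], parts[2])
    return (None, None, name)


if __name__ == "__main__":
    for line in (">gnl|UG|Hs#S3220135", ">Hs#S3220135_1"):
        print(split_ncbi_id(line))
```

The "UG" field is exactly the part that is dropped in the default fasta
output and replaced by "unk" in the ncbi output.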

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From michaelwatson at paradigm-therapeutics.co.uk  Tue Jun 18 11:27:28 2002
From: michaelwatson at paradigm-therapeutics.co.uk (Mick Watson)
Date: Tue, 18 Jun 2002 16:27:28 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk> <3D0F4FF8.38C867FC@uk.lionbioscience.com>
Message-ID: <3D0F5160.EC913DC@paradigm-therapeutics.co.uk>

OK, thanks for the help!

In this instance I really wanted the first part of the fasta line to stay the
same - I realise that it doesn't anyway due to the "_1" which is appended -
so now as well as removing that I am also putting the "gnl|UG|" part back on
the front too!

My first instinct would be to just leave the fasta line alone other than to
simply append _# to the translation......
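
For anyone wanting the same workaround, here is a minimal sketch of that
kind of post-processing in Python. It is not the script actually used
here: the function name restore_header, the default prefix and the
keep_frame switch are all assumptions made for illustration. It also
assumes the frame suffix is always "_" followed by digits, so an
original identifier that itself ends in "_<digits>" would be clipped.

```python
import re

# Assumption: transeq's frame suffix is an underscore followed by digits.
FRAME_SUFFIX = re.compile(r"_\d+$")


def restore_header(line, prefix="gnl|UG|", keep_frame=False):
    """Put a database prefix back on a FASTA header line.

    Non-header lines (sequence data) are returned unchanged. The line is
    expected without its trailing newline.
    """
    if not line.startswith(">"):
        return line
    words = line[1:].split(None, 1)
    if not words:
        return line
    name = words[0]
    # The ncbi output format writes "gnl|unk|" in front; drop it first.
    if name.startswith("gnl|unk|"):
        name = name[len("gnl|unk|"):]
    if not keep_frame:
        name = FRAME_SUFFIX.sub("", name)
    if not name.startswith(prefix):
        name = prefix + name
    rest = " " + words[1] if len(words) > 1 else ""
    return ">" + name + rest


if __name__ == "__main__":
    for line in (">Hs#S3220135_1", "MAARPLPVSPARALLLALAGALLAPCEARGVSLWNEGRADEVV"):
        print(restore_header(line))
```

Passing keep_frame=True leaves the "_1" in place and only restores the
prefix.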

Peter Rice wrote:

> "Gary Williams, Tel 01223 494522" wrote:
> >
> > You get the output in 'fasta' format by default.
> > If you want it in 'ncbi' format, then you have to ask for it:
> >
> > transeq nucleic.seq ncbi::protein.pep
> >
> > or
> >
> > transeq nucleic.seq -osf ncbi protein.pep
>
> You still lose the "UG" database name. You wil get an identifier of:
>
> >gnl|unk|Hs#S3220135_1
> MAARPLPVSPARALLLALAGALLAPCX
>
> NCBI's "FASTA" identifiers are strange things that EMBOSS can read but not
> save completely ... but this should not be a problem because "UG" is not
> really the database name for the protein translation.
>
> Peter
>
> --
> ------------------------------------------------
> Peter Rice, LION Bioscience Ltd, Cambridge, UK
> peter.rice at uk.lionbioscience.com +44 1223 224723


From peter.rice at uk.lionbioscience.com  Tue Jun 18 11:41:38 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 18 Jun 2002 16:41:38 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk> <3D0F4FF8.38C867FC@uk.lionbioscience.com> <3D0F5160.EC913DC@paradigm-therapeutics.co.uk>
Message-ID: <3D0F54B2.F575F7EA@uk.lionbioscience.com>

Mick Watson wrote:
> 
> In this instance I really wanted the first part of the fasta line to stay the
> same - I realise that it doesn't anyway die to the "_1" which is appended -
> so now as well as removing that I am also putting the "gnl|UG|" part back on
> the front too!
> 
> My first instinct would be to just leave the fasta line alone other than to
> simply append _# to the translation......

It is a very small change to allow the following:

transeq nucleic.seq -osf ncbi protein.pep -osdb UG

This would replace the "unk" with "UG" in your NCBI output.

Probably will be available in EMBOSS 2.5

regards,

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From d.counsell at hgmp.mrc.ac.uk  Mon Jun 24 06:17:16 2002
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 24 Jun 2002 11:17:16 +0100
Subject: The EMBOSS Documentation Project
Message-ID: <20020624111716.F24581@hgmp.mrc.ac.uk>

Dear All                                                                        
                                                                                
                                                                                
There will be a meeting to discuss EMBOSS documentation on Wednesday
morning at 10:30 in the loft of the Hinxton conference centre.  There
is a real possibility of obtaining some funding for developing this so
it'd be nice to have a plan for how we could spend the money if/when
it appears.
                                                                                
What do users want from our documentation system?  What do we want
from it?  How are we going to store, implement and distribute it?  Who
should we hire to build it?  What are the prospects for making this
work a research project in itself?  Who gets the movie rights?
                                                                                
I'm soliciting opinions on all these and other questions---ideally
*other people's* opinions, so please come along if you can, and/or
circulate this message to anyone else you think might be able to
contribute.  I've booked facilities for "about half-a-dozen".

You are also encouraged to write to me with suggestions about
documentation.  I'm relatively new round here so apologies if this
message falls outside the scope of either of these lists; pretend you
didn't see it.
                                                                                
                                                                                
all the best                                                                    
                                                                                
Damian                    

-- 
Damian COUNSELL                      email:  d.counsell at hgmp.mrc.ac.uk
MRC Human Genome Mapping Project RC  phone:  +44 (0)1223 494500
Cambridge CB10 1SB                   direct: +44 (0)1223 494585
http://www.hgmp.mrc.ac.uk/~dcounsel/ fax:    +44 (0)1223 494512


From jrvalverde at cnb.uam.es  Mon Jun 24 12:02:26 2002
From: jrvalverde at cnb.uam.es (jrvalverde at cnb.uam.es)
Date: Mon, 24 Jun 2002 18:02:26 +0200 (DST)
Subject: The EMBOSS Documentation Project
In-Reply-To: <20020624111716.F24581@hgmp.mrc.ac.uk>
Message-ID: <200206241602.g5OG2Sn2015835@embnet.cnb.uam.es>

Damian Counsell <d.counsell at hgmp.mrc.ac.uk> wrote:
> 

Just my 2¢ worth

> What do users want from our documentation system?  What do we want
> from it?  How are we going to store, implement and distribute it?  Who
> should we hire to build it?  What are the prospects for making this
> work a research project in itself?  Who gets the movie rights?

1) I'd say it should be practical, readable and understandable. Readability
depends on using users' language and point of view. If they don't use it, 
it will be worthless...

2) I'd venture we want users to use, torture and exploit it. This leads
to question 5, research:

5) Make it a research project on HCI in Biosciences. What do users *in
all the various areas of Life Sciences* need to use EMBOSS? You need to
target various "markets": obviously Biologists/Molecular Biologists, which
we _believe_ we already know well, but also Medical Doctors, whose needs
are quite different but solvable with the same tools, Pharmacologists,
Population Geneticists, etc. How can one describe a program like CLUSTAL
so it is clear that it is not only useful in building phylogenies, but
also in spotting preserved/differing regions, population variability,
disease relevance, epidemiology dynamics, etc...? AND make it amenable
to all the various target users?

One may of course write looooooong encyclopedic manuals, but I fear no
one would use them (they'd be frightened). Or thousands of Howtos, but I 
fear no one can cover all possibilities. Or find out new, efficient
LifeSci HCI techniques.

3) That should be the result of the project, it might be a "documentation"
database that may be queried by examples, goals, tasks, targets, whatever.
Or a set of GUIs that embed the knowledge on using the software transparently. 
And should hide complexity so users do not get frightened by its volume, and
that appeals to various kinds of users, maybe with customisable or alternate 
UIs. I'm not aware of any field studies on HCI design in the Biosciences,
and there's a serious need for it. We are assuming that the interfaces that
work in other areas should also do here, but is it so? From my experience
I fear not.

Actually many people complain that SRS (despite being powerful) is "difficult",
and that W2H is too complex, but they won't like simple forms either. So,
what the hell do _they_ want? I'd really like to know. Even when they use the
tools, they rarely know something as basic as whom to acknowledge -or how 
to find out whom- for the tools used in citations. Or even if they should 
cite anything.

Had I to bet, documentation should probably be tightly integrated inside
the software UI itself, so that in addition to results it produces guidelines
on how to understand/interpret them, in addition to program listings it
produces "task identification" protocols, etc... Something quite different
from what we have now, may be akin to "wizards" or a mixture between 
a knowledge base, wizards, software, and whoknowswhat younameit.

4) So it should be someone who knows, or is willing to learn, about a) Life Sciences,
b) Bioinformatics, c) Statistics, d) Human-Computer Interfaces, and who is willing
to work closely with users, listen to them and translate what they _do_ or
would _like to do_ into easy, amenable instructions/protocols/interfaces.
Someone has to sit down with many users, interview them, generate
polls, collect statistics, implement _user_ drafts (not his/her/or their
advisor's at all), and then devise new polls, collect new data and find which
do actually work in each field and why, and possibly -at the very end- try
their own hand at improving user-suggested methods.

To sum it up: I'd suggest starting with a blank sheet of paper, fully from
scratch, forgetting everything done (except perhaps collecting user satisfaction
statistics to start with), and finding out what no one else has before: what
users really want in order to use the software correctly (i.e. making
_educated_ decisions, understanding analysis results from a critical
point of view, and making responsible use of tools).

Which answers 6) above: credits would go to users, for they would be the actual
source of information, technical tips, UI designs, suggestions, etc.. and
thus results should revert to them (with due recognition to the catalytic
"medium", who would have to sum up statistics and conclusions in papers). 

> You are also encouraged to write to me with suggestions about
> documentation.  I'm relatively new round here so apologies if this
> message falls outside the scope of either of these lists; pretend you
> didn't see it.
>                                                                                 

Dunno if I made myself clear, just ask for anything you want. 

	Hope this helps.

				j


From letondal at pasteur.fr  Mon Jun 24 16:04:41 2002
From: letondal at pasteur.fr (Catherine Letondal)
Date: Mon, 24 Jun 2002 22:04:41 +0200
Subject: The EMBOSS Documentation Project 
In-Reply-To: Your message of "Mon, 24 Jun 2002 11:17:16 BST."
             <20020624111716.F24581@hgmp.mrc.ac.uk> 
Message-ID: <200206242004.g5OK4fF8399392@electre.pasteur.fr>


Damian Counsell wrote:

> What do users want from our documentation system?  

Ask them, or rather look at them.

The best way to know is to make a few user studies around the current documentation
system - not that many studies are necessary IMHO (I don't fully agree with Jose in his 
previous message that you really need a HCI research project for this, although it could of
course be very interesting).
A very efficient and light technique is testing by pair (see for instance
http://www.cs.umd.edu/~zzj/Codiscov.htm) with a scenario or a precise 
task + thinking-aloud method. This is a way to learn a lot about what should be done
in a short time. It brings more information than a questionnaire.

> I'm soliciting opinions on all these and other questions---ideally
> *other people's* opinions, so please come along if you can, and/or
> circulate this message to anyone else you think might be able to
> contribute.  I've booked facilities for "about half-a-dozen".

Several levels of documentation might be a good idea, the first one being the GUI 
itself to remove accidental complexity, and at least 1 not too simplistic example of 
input and output.
A FAQ (built on actually frequently asked questions :-)) could also be useful.
The new documentation itself should be tested.

What is simple has to look simple, but what is complex must not be hidden. A simple
aspect can be an affordance to go further also...

A pragmatic approach...

--
Catherine Letondal -- Pasteur Institute Computing Center


From gbottu at ben.vub.ac.be  Tue Jun 25 05:13:49 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Tue, 25 Jun 2002 11:13:49 +0200 (CEST)
Subject: The EMBOSS Documentation Project
Message-ID: <200206250913.LAA0001109989@ben.vub.ac.be>

from : BEN

	Dear All,
	
A good manual is certainly welcome. What is available now is out of date, 
incomplete and/or only available on remote WWW sites. Of course, writing a 
manual is a big effort and programmers are notoriously not eager to spend much 
effort in writing manuals. It is nearly unavoidable to extend the EMBOSS 
development team with a person whose job it is to write the manual.

What should it look like? IMHO, the GCG manual is a good example, except that it 
contains too many unneeded repetitions.

What do we need ?
documentation for each program + a general user's guide
available both on-line and as printable text (maybe PDF), since paging through 
a printed text is still more convenient than browsing through screens
also a developer's guide with info about ACD, sequence and other formats, AJAX 
and NUCLEUS functions, etc.

	Guy Bottu 


From gwilliam at hgmp.mrc.ac.uk  Tue Jun 25 07:41:36 2002
From: gwilliam at hgmp.mrc.ac.uk (gwilliam at hgmp.mrc.ac.uk)
Date: Tue, 25 Jun 2002 12:41:36 +0100 (BST)
Subject: Current documentation
Message-ID: <200206251141.MAA15672@bromine.hgmp.mrc.ac.uk>

This is the current state of EMBOSS documentation to the best of my knowledge.


EMBOSS documentation
--------------------

The current types and locations of documentation of EMBOSS are:


Files
=====
 - these are mainly plain text files in the EMBOSS distribution


Files in the EMBOSS top level directory
---------------------------------------
 - these are all distributed and are in the CVS tree

ChangeLog
 - a brief description of the changes made to any code for the
   next and previous releases. 
 - maintained by anyone who changes code
 - up-to-date

README 
 - a brief description of how to compile and install EMBOSS
 - maintained by (?)
 - up-to-date
    
FAQ 
 - a description of questions and answers on various topics in EMBOSS
 - maintained by Gary
 - OUT OF DATE

AUTHORS 
 - a brief list of authors
 - maintained by (?)
 - OUT OF DATE

THANKS 
 - thanks to people who have contributed code
 - maintained by (?)
 - OUT OF DATE
  

NEWS
 - brief description of changes in versions, now just says, see ChangeLog
 - maintained by (?)
 - up-to-date
  

Files in the EMBOSS 'doc/manuals' directory
-------------------------------------------
 - these are all distributed and are in the CVS tree

EMBOSS-FreeBSD-HOWTO.txt
 - how to install EMBOSS under FreeBSD
 - maintained by (?)
 - up-to-date ?

admin*
 - David Martin's Admin Guide
 - maintained by Gary
 - will be replaced by Peter Rice's Admin Guide

emboss_qg*
 - David Martin's EMBOSS Quick Guide
 - maintained by Gary
 - up-to-date (?)

internals*
 - Peter Rice's descriptions of the sequence access libraries
 - maintained by Peter
 - up-to-date


Files in the EMBOSS 'doc/programs/html' directory
-------------------------------------------------
 - these are all distributed and are in the CVS tree
 - searched by wossname
 - this is a copy of the files in
   http://www.uk.embnet.org/Software/EMBOSS/Apps
   but with all HTML inline instead of using 'server-side includes'
   produced by running the script 'scripts/autodoc.pl'
 - maintained by Gary
 - up-to-date copy of http://www.uk.embnet.org/Software/EMBOSS/Apps

Files in the EMBOSS 'doc/programs/text' directory
-------------------------------------------------
 - these are all distributed and are in the CVS tree
 - searched by wossname
 - this is a text copy of the files in the directory 'doc/programs/html'
   produced by running 'lynx -dump' as part of the script
   'scripts/autodoc.pl'
 - maintained by Gary
 - up-to-date copy of the directory 'doc/programs/html'

Files in the EMBOSS 'doc/tutorials' directory
---------------------------------------------
 - these are all distributed and are in the CVS tree

emboss-gcg.ppt
 - powerpoint file of talk given by Gary/Lisa May 2001 for those
   converting from GCG to EMBOSS
 - not maintained

emboss-interfaces.ppt
 - powerpoint file of talk given by Gary/Lisa May 2001 for those
   converting from GCG to EMBOSS
 - not maintained
 - OUT OF DATE (no mention of Jemboss)

emboss_tutorial*
 - TeX and PostScript files of the 'current' EMBOSS tutorial
 - authored mainly by Val Curwen
 - not maintained
 - COULD DO WITH A SUBSTANTIAL REWRITE

Files in the EMBOSS 'emboss/acd' directory
------------------------------------------
 - these are all distributed and are in the CVS tree
 - these are the .acd files
 - there is one file per application
 - they are generally maintained by the authors of the application
 - they contain the following types of documentation:

one-line documentation line 
 - to describe what the program does
 - it is displayed in many derived documents and programs
 - it is defined in 
   http://www.uk.embnet.org/Software/EMBOSS/Acd/syntax.html#Docattr9
 - The length of the doc: string should be kept to 63 characters or
   shorter in order to allow the wossname utility to display each
   program name and its documentation on one 80 character line. 
 - The doc: string should not end with a '.' character
 - Any acronyms or capitalised abbreviations in the doc: string should
   be written in upper-case.  (e.g: SNPs, EST, DNA, ABI, SRS, ASCII,
   CDS, mRNA, B-DNA, RNA, CpG, ORFs, MAR/SAR, PCR, STS, REBASE, SCOP,
   PROSITE, PRINTS, EMBL, TRANSFAC, BLAST, GCG, EMBOSS)
 - The doc: string should start with an upper-case letter. 
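
The mechanical parts of these doc: string rules could be checked
automatically. The following is only a sketch in Python, not an existing
EMBOSS script; the function name check_doc_string is hypothetical. The
acronym rule is deliberately left out, because several of the listed
names (PRINTS, for example) are also ordinary English words and would
need a human eye.

```python
def check_doc_string(doc, max_len=63):
    """Return a list of rule violations for an ACD doc: string.

    Checks only the mechanical rules: length, trailing '.', and the
    first character. An empty list means no problem was found.
    """
    problems = []
    if len(doc) > max_len:
        problems.append(
            "longer than %d characters (%d)" % (max_len, len(doc))
        )
    if doc.endswith("."):
        problems.append("ends with '.'")
    # Only a lower-case first letter is flagged; digits are left alone.
    if doc[:1].islower():
        problems.append("does not start with an upper-case letter")
    return problems


if __name__ == "__main__":
    for text in ("Translate nucleic acid sequences", "translate dna."):
        print(repr(text), check_doc_string(text))
```

The 63-character limit comes from the rule above: it leaves room for the
program name on wossname's 80-character line.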

groups
 - to describe the general function of the program
 - it is defined in
   http://www.uk.embnet.org/Software/EMBOSS/Acd/syntax.html#Grpattr10
 - it is displayed in many derived documents and programs
 - the two-level structure and names of the groups are well-defined as a
   result of extensive negotiations with the Staden Team and other
   GUI-builders. 

prompts
 - each qualifier of the program should have a prompt
 - this information is NOT displayed anywhere when help is requested!

help
 - gives help on the qualifier
 - it is displayed in many derived documents and when running with '-help'
 - each qualifier can have an unlimited amount of text describing it.
 - THIS IS OFTEN MISSING
 - I suggest that all .acd files should be reviewed and help added as a 
   matter of urgency.


Web pages
=========
 - These are mainly HTML documents (also some text & GIF)
 - They are mainly hand-crafted
 - Top-level URL is
   http://www.uk.embnet.org/Software/EMBOSS
 - These files are NOT distributed with EMBOSS.
 - They probably should be distributed so that other sites can set up their
   own copies. 
 - They should certainly be included in the CVS tree so any author can
   edit them.

Index page
----------
 - Maintained by Gary
 - Up-to-date
 - Links to overview, applications, userdocs, interfaces, downloading,
   admin, internals, mailinglist, coordination, licencing, credits. 

Overview
--------
 - Brief description of EMBOSS
 - Maintained by Gary
 - Up-to-date


Applications
------------
 - Links to groups and EMBASSY application pages
 - Table of links to each application's page
 - Maintained by Gary 
 - Up-to-date

Userdocs
--------
 - A link to the tutorial (a copy of the tutorial in the EMBOSS
   'doc/tutorials' directory)
 - Links to descriptions of:
   - Uniform Sequence Addresses
   - Sequence Formats
   - Alignment Formats
   - Feature Formats
   - Report Formats
 - All of these are maintained by Gary
 - Up-to-date

Interfaces
----------
 - links to Jemboss and other interface sites
 - Maintained by Gary
 - Some descriptions and links may need attention

Downloading
-----------
 - Brief description of how to download and unpack EMBOSS
 - Maintained by Gary and Alan
 - Up-to-date

Admin
-----
 - Links to David Martin's Admin Guide.
 - Maintained by Gary
 - Up-to-date (Admin Guide being re-written by Peter Rice)

Internals
---------
 - Links to:
   - Developer's Introduction
   - Guide to Writing EMBOSS Applications
   - Ajax Command Definition
   - Ajax Library documentation 
   - Nucleus Library documentation
   - PLplot Graphics Library original documentation 
   - EMBOSS C programming standards
   - EMBOSS code documentation standards 
   - EMBOSS user manual standards 
   - C versus C++
   - Using the CVS Server
   - EMBOSS Interface Projects 
   - EMBOSS Database Definitions 
 - Links to EFUNC and EDATA which are SRS databases of the documentation
   derived from the comments in the headers of Ajax and Nucleus routines
   automatically created by running the script 'scripts/emboss*.pl'
 - All of these are maintained by (?)
 - Up-to-date (?)


Mailinglist
-----------
 - Brief description of the mailing lists
 - Maintained by Gary 
 - Up-to-date
 - Should we have a searchable archive of messages?

Coordination
------------
 - Links to minutes of planning meetings
 - Maintained by Jon
 - Up-to-date

Licencing
---------
 - Description of GNU Licence
 - Maintained by Gary 
 - Up-to-date

Credits
-------
 - People who have worked on EMBOSS
 - Maintained by Gary
 - Up-to-date (?)

Individual application documentation
====================================
 - The primary documentation for the programs is held in 
   http://www.uk.embnet.org/Software/EMBOSS/Apps/
 - The documentation is processed and also held in the EMBOSS directories
   'doc/programs/html'  and 'doc/programs/text'  (see above).
 - The documentation is held in an HTML file whose name is simply the
   name of the application with the extension '.html'. 
 - There are many server-side includes. The standard ones are:
   inc/*.ione - file containing the one-line description as produced by
      the application 'wossname'
   inc/*.ihelp - file containing the -help output by 
      'application -help'
   inc/*.itable - file containing the command table as output by 
      'application -help -acdtable'
   inc/*.isee - file containing the associated programs as output by the
      application 'seealso'
 - The rest of the file is hand-edited.
 - The script 'scripts/edithtml.pl' allows you to create a web page by
   filling in details of the description, usage, input and output files,
   etc. It uses the template file 'template.html.save' as a starting point.
 - The script 'scripts/autodoc.pl' will go through all of the
   applications updating any whose help details have changed and offering
   to commit them to the CVS tree for you. 
 - The example usage and input and output example files should be the
   same as in the 'test/qatest.dat' QA testing data.  Some applications
   have this, but not all.  It would be useful if this information could be
   derived from the QA tests automatically. 
 - Maintained by Gary
 - NOT UP-TO-DATE - still waiting for documentation from the authors of
   many Protein structure applications!
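
To make the difference between the web pages and the distributed
'doc/programs/html' copies concrete, here is a small Python sketch of
inlining server-side includes. It is not 'scripts/autodoc.pl' (which is
a Perl script whose internals are not shown here), and it assumes the
pages use the common Apache directive form <!--#include file="..." -->;
the function name inline_includes is hypothetical.

```python
import re
from pathlib import Path

# Assumption: includes are written in the Apache 'file=' form.
INCLUDE = re.compile(r'<!--\s*#include\s+file="([^"]+)"\s*-->')


def inline_includes(html, base_dir):
    """Replace each include directive with the content of the named file.

    Paths are resolved relative to base_dir. A directive whose file is
    missing is left untouched so that the gap stays visible.
    """
    base = Path(base_dir)

    def replace(match):
        target = base / match.group(1)
        if target.is_file():
            return target.read_text()
        return match.group(0)

    return INCLUDE.sub(replace, html)
```

With a page containing <!--#include file="inc/transeq.ione" --> and an
'inc' directory beside it, inline_includes(page_text, page_directory)
returns the page with the one-line description pasted in place.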


From jrvalverde at cnb.uam.es  Tue Jun 25 09:42:24 2002
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Tue, 25 Jun 2002 15:42:24 +0200
Subject: The EMBOSS Documentation Project
In-Reply-To: <200206242004.g5OK4fF8399392@electre.pasteur.fr>
References: <20020624111716.F24581@hgmp.mrc.ac.uk>
	<200206242004.g5OK4fF8399392@electre.pasteur.fr>
Message-ID: <20020625154224.3e4cdac6.jrvalverde@cnb.uam.es>

On Mon, 24 Jun 2002 22:04:41 +0200
"Catherine Letondal" <letondal at pasteur.fr> wrote:

> 
> Damian Counsell wrote:
> 
> > What do users want from our documentation system?  
> 
> Ask them, or rather look at them.
> 
> The best way to know is to make a few user studies around the current documentation
> system - not that many studies are necessary IMHO (I don't fully agree with Jose in his 
> previous message that you really need a HCI research project for this, although it could of
> course be very interesting).
> A very efficient and light technique is testing by pair (see for instance
> http://www.cs.umd.edu/~zzj/Codiscov.htm) with a scenario or a precise 
> task + thinking-aloud method. This is a way to learn a lot about what should be done
> in a few time. It brings more information than a questionnaire.
> 
Agreed.

At any rate, it should be users who answer. There are plenty of methods that have proved
useful in dealing with users. Any of them would do. Thinking aloud is indeed one of the
most powerful methods.

An HCI project may be overkill, but still someone with an interest or knowledge in HCI design
would be helpful. And from my very humble, lowly laying, low profile point of view, the more
open the original approach the better. We might get some surprises about our assumptions on
users.

				j
--
Jose R. Valverde
EMBnet/CNB


From d.counsell at hgmp.mrc.ac.uk  Thu Jun 27 08:42:15 2002
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Thu, 27 Jun 2002 13:42:15 +0100
Subject: documentation update
Message-ID: <20020627134215.B2564@hgmp.mrc.ac.uk>

Dear All


Yesterday we had a productive and interesting(!) meeting to discuss
documentation.  It will be the first of many.  I have set up a Web
page to keep people informed about work on the wonderful new
documentation system for EMBOSS:

     http://www.hgmp.mrc.ac.uk/~dcounsel/EDP/EMBOSS_documentation.html

.  Remember this is just a skeleton.  A "release early release often"
policy will apply to this resource and criticisms and contributions
are welcome.  Meeting minutes to follow soon.

As Gary (Williams) has pointed out, the task of updating and
converting the existing docs will be a big one.  Once the spec is
defined we'll come looking for volunteers.  (Or perhaps the
Documentation Czar will visit you to ask if you'd rather be using
a laptop from a hospital bed...)


Utility! Parseability! Reproducibility!  Vive la documentation*!


Damian


(*Yes, I know it should be "Vive les dossiers!".  Have you no poetry?)


-- 
Damian COUNSELL                      email:  d.counsell at hgmp.mrc.ac.uk
MRC Human Genome Mapping Project RC  phone:  +44 (0)1223 494500
Cambridge CB10 1SB                   direct: +44 (0)1223 494585
http://www.hgmp.mrc.ac.uk/~dcounsel/ fax:    +44 (0)1223 494512


From michaelwatson at paradigm-therapeutics.co.uk  Tue Jun 18 14:59:43 2002
From: michaelwatson at paradigm-therapeutics.co.uk (Mick Watson)
Date: Tue, 18 Jun 2002 15:59:43 +0100
Subject: Transeq
Message-ID: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk>

Hi

Can anyone tell me why when I give transeq a fasta file like:

>gnl|UG|Hs#S3220135
ATGGCAGCGCGCCCGCTGCCCGTGTCCCCCGCCCGCGCCCTCCTGCTCGCCCTGGCCGGTGCTCTGCTCGCGCCCTGCGA

I get the output

>Hs#S3220135_1
MAARPLPVSPARALLLALAGALLAPCEARGVSLWNEGRADEVVSASVRSGDLWIPVKSFD

Where has the "gnl|UG|" part of my description line gone???!!!  And
why?  Can I switch this "feature" off?

Thanks
Mick


From gwilliam at hgmp.mrc.ac.uk  Tue Jun 18 15:15:06 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Tue, 18 Jun 2002 16:15:06 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk>
Message-ID: <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk>


You get the output in 'fasta' format by default.
If you want it in 'ncbi' format, then you have to ask for it:

transeq nucleic.seq ncbi::protein.pep 

or 

transeq nucleic.seq -osf ncbi protein.pep

Gary


Mick Watson wrote:
> 
> Hi
> 
> Can anyone tell me why when I give transeq a fasta file like:
> 
> >gnl|UG|Hs#S3220135
> ATGGCAGCGCGCCCGCTGCCCGTGTCCCCCGCCCGCGCCCTCCTGCTCGCCCTGGCCGGTGCTCTGCTCGCGCCCTGCGA
> 
> I get the output
> 
> >Hs#S3220135_1
> MAARPLPVSPARALLLALAGALLAPCEARGVSLWNEGRADEVVSASVRSGDLWIPVKSFD
> 
> Where has the "gnl|UG|" part of my description line gone???!!!  And
> why?  Can I switch this "feature" off?
> 
> Thanks
> Mick

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


From peter.rice at uk.lionbioscience.com  Tue Jun 18 15:21:28 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 18 Jun 2002 16:21:28 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk>
Message-ID: <3D0F4FF8.38C867FC@uk.lionbioscience.com>

"Gary Williams, Tel 01223 494522" wrote:
> 
> You get the output in 'fasta' format by default.
> If you want it in 'ncbi' format, then you have to ask for it:
> 
> transeq nucleic.seq ncbi::protein.pep
> 
> or
> 
> transeq nucleic.seq -osf ncbi protein.pep

You still lose the "UG" database name. You wil get an identifier of:

>gnl|unk|Hs#S3220135_1
MAARPLPVSPARALLLALAGALLAPCX


NCBI's "FASTA" identifiers are strange things that EMBOSS can read but not
save completely ... but this should not be a problem because "UG" is not
really the database name for the protein translation.

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From michaelwatson at paradigm-therapeutics.co.uk  Tue Jun 18 15:27:28 2002
From: michaelwatson at paradigm-therapeutics.co.uk (Mick Watson)
Date: Tue, 18 Jun 2002 16:27:28 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk> <3D0F4FF8.38C867FC@uk.lionbioscience.com>
Message-ID: <3D0F5160.EC913DC@paradigm-therapeutics.co.uk>

OK, thanks for the help!

In this instance I really wanted the first part of the fasta line to stay the
same - I realise that it doesn't anyway due to the "_1" which is appended -
so now as well as removing that I am also putting the "gnl|UG|" part back on
the front too!

My first instinct would be to just leave the fasta line alone other than to
simply append _# to the translation......
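
The manual clean-up described above (dropping the "_1" frame suffix and
putting "gnl|UG|" back on the front) can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the helper
name restore_header and its prefix argument are hypothetical and not part
of EMBOSS, and it assumes every translated ID ends in "_" plus a frame
number.

```python
import re

# Hypothetical helper, not part of EMBOSS: rebuild the original NCBI-style
# header from the ID line that transeq wrote in plain 'fasta' format.
FRAME_SUFFIX = re.compile(r"_\d+$")  # the "_1", "_2", ... appended per frame


def restore_header(line, prefix="gnl|UG|"):
    """Return a header line with the frame suffix removed and the prefix restored.

    Lines are expected without their trailing newline.
    """
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    ident, _, description = line[1:].partition(" ")
    ident = FRAME_SUFFIX.sub("", ident)
    if not ident.startswith(prefix):
        ident = prefix + ident
    return ">" + ident + (" " + description if description else "")


if __name__ == "__main__":
    for line in (">Hs#S3220135_1", "MAARPLPVSPARALLLALAGALLAPC"):
        print(restore_header(line))
```

The prefix is an argument rather than a constant because the database
name ("UG" here) is exactly the part that is lost on output, so it has to
be supplied by whoever knows where the sequences came from.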

Peter Rice wrote:

> "Gary Williams, Tel 01223 494522" wrote:
> >
> > You get the output in 'fasta' format by default.
> > If you want it in 'ncbi' format, then you have to ask for it:
> >
> > transeq nucleic.seq ncbi::protein.pep
> >
> > or
> >
> > transeq nucleic.seq -osf ncbi protein.pep
>
> You still lose the "UG" database name. You will get an identifier of:
>
> >gnl|unk|Hs#S3220135_1
> MAARPLPVSPARALLLALAGALLAPCX
>
> NCBI's "FASTA" identifiers are strange things that EMBOSS can read but not
> save completely ... but this should not be a problem because "UG" is not
> really the database name for the protein translation.
>
> Peter
>
> --
> ------------------------------------------------
> Peter Rice, LION Bioscience Ltd, Cambridge, UK
> peter.rice at uk.lionbioscience.com +44 1223 224723


From peter.rice at uk.lionbioscience.com  Tue Jun 18 15:41:38 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 18 Jun 2002 16:41:38 +0100
Subject: Transeq
References: <3D0F4ADF.8B17BFDD@paradigm-therapeutics.co.uk> <3D0F4E7A.BA2A4735@hgmp.mrc.ac.uk> <3D0F4FF8.38C867FC@uk.lionbioscience.com> <3D0F5160.EC913DC@paradigm-therapeutics.co.uk>
Message-ID: <3D0F54B2.F575F7EA@uk.lionbioscience.com>

Mick Watson wrote:
> 
> In this instance I really wanted the first part of the fasta line to stay the
> same - I realise that it doesn't anyway due to the "_1" which is appended -
> so now as well as removing that I am also putting the "gnl|UG|" part back on
> the front too!
> 
> My first instinct would be to just leave the fasta line alone other than to
> simply append _# to the translation......

It is a very small change to allow the following:

transeq nucleic.seq -osf ncbi protein.pep -osdb UG

This would replace the "unk" with "UG" in your NCBI output.

Probably will be available in EMBOSS 2.5

regards,

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From d.counsell at hgmp.mrc.ac.uk  Mon Jun 24 10:17:16 2002
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 24 Jun 2002 11:17:16 +0100
Subject: The EMBOSS Documentation Project
Message-ID: <20020624111716.F24581@hgmp.mrc.ac.uk>

Dear All                                                                        
                                                                                
                                                                                
There will be a meeting to discuss EMBOSS documentation on Wednesday
morning at 10:30 in the loft of the Hinxton conference centre.  There
is a real possibility of obtaining some funding for developing this so
it'd be nice to have a plan for how we could spend the money if/when
it appears.
                                                                                
What do users want from our documentation system?  What do we want
from it?  How are we going to store, implement and distribute it?  Who
should we hire to build it?  What are the prospects for making this
work a research project in itself?  Who gets the movie rights?
                                                                                
I'm soliciting opinions on all these and other questions---ideally
*other people's* opinions, so please come along if you can, and/or
circulate this message to anyone else you think might be able to
contribute.  I've booked facilities for "about half-a-dozen".

You are also encouraged to write to me with suggestions about
documentation.  I'm relatively new round here so apologies if this
message falls outside the scope of either of these lists; pretend you
didn't see it.
                                                                                
                                                                                
all the best                                                                    
                                                                                
Damian                    

-- 
Damian COUNSELL                      email:  d.counsell at hgmp.mrc.ac.uk
MRC Human Genome Mapping Project RC  phone:  +44 (0)1223 494500
Cambridge CB10 1SB                   direct: +44 (0)1223 494585
http://www.hgmp.mrc.ac.uk/~dcounsel/ fax:    +44 (0)1223 494512


From jrvalverde at cnb.uam.es  Mon Jun 24 16:02:26 2002
From: jrvalverde at cnb.uam.es (jrvalverde at cnb.uam.es)
Date: Mon, 24 Jun 2002 18:02:26 +0200 (DST)
Subject: The EMBOSS Documentation Project
In-Reply-To: <20020624111716.F24581@hgmp.mrc.ac.uk>
Message-ID: <200206241602.g5OG2Sn2015835@embnet.cnb.uam.es>

Damian Counsell <d.counsell at hgmp.mrc.ac.uk> wrote:
> 

Just my two cents' worth

> What do users want from our documentation system?  What do we want
> from it?  How are we going to store, implement and distribute it?  Who
> should we hire to build it?  What are the prospects for making this
> work a research project in itself?  Who gets the movie rights?

1) I'd say they want it to be practical, readable and understandable. Readability
depends on using users' language and point of view. If they don't use it, 
it will be worthless...

2) I'd venture we want users to use, torture and exploit it. This leads
to question 5, research:

5) Make it a research project on HCI in Biosciences. What do users *in
all the various areas of Life Sciences* need to use EMBOSS? You need to
target various "markets": obviously Biologists/Molecular Biologists, which
we _believe_ we already know well, but also Medical Doctors, whose needs
are quite different but solvable with the same tools, Pharmacologists,
Population Geneticists, etc. How can one describe a program like CLUSTAL
so it is clear that it is not only useful in building phylogenies, but
also in spotting preserved/differing regions, population variability,
disease relevance, epidemiology dynamics, etc...? AND make it amenable
to all the various target users?

One may of course write looooooong encyclopedic manuals, but I fear no
one would use them (they'd be frightened). Or thousands of Howtos, but I 
fear no one can cover all possibilities. Or find out new, efficient
LifeSci HCI techniques.

3) That should be the result of the project, it might be a "documentation"
database that may be queried by examples, goals, tasks, targets, whatever.
Or a set of GUIs that embed the knowledge on using the software transparently. 
It should hide complexity so users are not frightened by its volume, and
appeal to various kinds of users, maybe with customisable or alternate
UIs. I'm not aware of any field studies on HCI design in the Biosciences,
and there's a serious need for it. We are assuming that the interfaces that
work in other areas should also do here, but is it so? From my experience
I fear not.

Actually many people complain that SRS (despite being powerful) is "difficult",
and that W2H is too complex, but they won't like simple forms either.. So,
what the hell do _they_ want? I'd really like to know. Even when they use the
tools, they rarely know something as basic as whom to acknowledge -or how 
to find out whom- for the tools used in citations. Or even if they should 
cite anything.

Had I to bet, documentation should probably be tightly integrated inside
the software UI itself, so that in addition to results it produces guidelines
on how to understand/interpret them, in addition to program listings it
produces "task identification" protocols, etc... Something quite different
from what we have now, maybe akin to "wizards" or a mixture of
a knowledge base, wizards, software, and whoknowswhat younameit.

4) So it should be someone who knows or is willing to learn about a) Life Sciences,
b) Bioinformatics, c) Statistics, d) Human-Computer Interfaces, and who is willing
to work closely with users, listen to them and translate what they _do_ or
would _like to do_ into easy, amenable instructions/protocols/interfaces.
Someone has to sit down with many users, interview them, generate
polls, collect statistics, implement _user_ drafts (not his/her own or their
advisor's), and then devise new polls, collect new data and find which
do actually work in each field and why, and possibly -at the very end- try
their own hand at improving user-suggested methods.

To sum it up: I'd suggest starting with a blank sheet of paper, fully from
scratch, forgetting everything done (except perhaps to collect user satisfaction
statistics to start up), and finding out what no one else has before: what do
users really want in order to use the software correctly (i.e. making
_educated_ decisions, understanding analysis results from a critical
point of view, and making responsible use of tools).

Which answers 6) above: credits would go to users, for they would be the actual
source of information, technical tips, UI designs, suggestions, etc.. and
thus results should revert to them (with due recognition to the catalytic
"medium", who would have to sum up statistics and conclusions in papers). 

> You are also encouraged to write to me with suggestions about
> documentation.  I'm relatively new round here so apologies if this
> message falls outside the scope of either of these lists; pretend you
> didn't see it.
>                                                                                 

Dunno if I made myself clear, just ask for anything you want. 

	Hope this helps.

				j


From letondal at pasteur.fr  Mon Jun 24 20:04:41 2002
From: letondal at pasteur.fr (Catherine Letondal)
Date: Mon, 24 Jun 2002 22:04:41 +0200
Subject: The EMBOSS Documentation Project 
In-Reply-To: Your message of "Mon, 24 Jun 2002 11:17:16 BST."
             <20020624111716.F24581@hgmp.mrc.ac.uk> 
Message-ID: <200206242004.g5OK4fF8399392@electre.pasteur.fr>


Damian Counsell wrote:

> What do users want from our documentation system?  

Ask them, or rather look at them.

The best way to know is to run a few user studies around the current documentation
system - not that many studies are necessary IMHO (I don't fully agree with Jose in his 
previous message that you really need a HCI research project for this, although it could of
course be very interesting).
A very efficient and lightweight technique is testing in pairs (see for instance
http://www.cs.umd.edu/~zzj/Codiscov.htm) with a scenario or a precise 
task + thinking-aloud method. This is a way to learn a lot about what should be done
in a short time. It brings more information than a questionnaire.

> I'm soliciting opinions on all these and other questions---ideally
> *other people's* opinions, so please come along if you can, and/or
> circulate this message to anyone else you think might be able to
> contribute.  I've booked facilities for "about half-a-dozen".

Several levels of documentation might be a good idea, the first one being the GUI 
itself to remove accidental complexity, and at least 1 not too simplistic example of 
input and output.
A FAQ (built on actually frequently asked questions :-)) could also be useful.
The new documentation itself should be tested.

What is simple has to look simple, but what is complex must not be hidden. A simple
aspect can be an affordance to go further also...

A pragmatic approach...

--
Catherine Letondal -- Pasteur Institute Computing Center


From gbottu at ben.vub.ac.be  Tue Jun 25 09:13:49 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Tue, 25 Jun 2002 11:13:49 +0200 (CEST)
Subject: The EMBOSS Documentation Project
Message-ID: <200206250913.LAA0001109989@ben.vub.ac.be>

from : BEN

	Dear All,
	
A good manual is certainly welcome. What is available now is not up-to-date,
incomplete and/or only available on remote WWW sites. Of course, writing a
manual is a big effort and programmers are notoriously not eager to spend much
effort in writing manuals. It is nearly unavoidable to expand the EMBOSS
development team with a person whose job it is to write the manual.

What should it look like? IMHO, the GCG manual is a good example, except that it
contains too many unneeded repetitions.

What do we need?
 - documentation for each program + a general user's guide
 - available both on-line and as printable text (maybe PDF), since paging through
   a printed text is still more convenient than browsing through screens
 - also a developer's guide with info about ACD, sequence and other formats, AJAX
   and NUCLEUS functions, etc.

	Guy Bottu 


From gwilliam at hgmp.mrc.ac.uk  Tue Jun 25 11:41:36 2002
From: gwilliam at hgmp.mrc.ac.uk (gwilliam at hgmp.mrc.ac.uk)
Date: Tue, 25 Jun 2002 12:41:36 +0100 (BST)
Subject: Current documentation
Message-ID: <200206251141.MAA15672@bromine.hgmp.mrc.ac.uk>

This is the current state of EMBOSS documentation to the best of my knowledge.


EMBOSS documentation
--------------------

The current types and locations of documentation of EMBOSS are:


Files
=====
 - these are mainly plain text files in the EMBOSS distribution


Files in the EMBOSS top level directory
---------------------------------------
 - these are all distributed and are in the CVS tree

ChangeLog
 - a brief description of the changes made to any code for the
   next and previous releases. 
 - maintained by anyone who changes code
 - up-to-date

README 
 - a brief description of how to compile and install EMBOSS
 - maintained by (?)
 - up-to-date
    
FAQ 
 - a description of questions and answers on various topics in EMBOSS
 - maintained by Gary
 - OUT OF DATE

AUTHORS 
 - a brief list of authors
 - maintained by (?)
 - OUT OF DATE

THANKS 
 - thanks to people who have contributed code
 - maintained by (?)
 - OUT OF DATE
  

NEWS
 - brief description of changes in versions, now just says, see ChangeLog
 - maintained by (?)
 - up-to-date
  

Files in the EMBOSS 'doc/manuals' directory
-------------------------------------------
 - these are all distributed and are in the CVS tree

EMBOSS-FreeBSD-HOWTO.txt
 - how to install EMBOSS under FreeBSD
 - maintained by (?)
 - up-to-date ?

admin*
 - David Martin's Admin Guide
 - maintained by Gary
 - will be replaced by Peter Rice's Admin Guide

emboss_qg*
 - David Martin's EMBOSS Quick Guide
 - maintained by Gary
 - up-to-date (?)

internals*
 - Peter Rice's descriptions of the sequence access libraries
 - maintained by Peter
 - up-to-date


Files in the EMBOSS 'doc/programs/html' directory
-------------------------------------------------
 - these are all distributed and are in the CVS tree
 - searched by wossname
 - this is a copy of the files in
   http://www.uk.embnet.org/Software/EMBOSS/Apps
   but with all HTML inline instead of using 'server-side includes'
   produced by running the script 'scripts/autodoc.pl'
 - maintained by Gary
 - up-to-date copy of http://www.uk.embnet.org/Software/EMBOSS/Apps

Files in the EMBOSS 'doc/programs/text' directory
-------------------------------------------------
 - these are all distributed and are in the CVS tree
 - searched by wossname
 - this is a text copy of the files in the directory 'doc/programs/html'
   produced by running 'lynx -dump' as part of the script
   'scripts/autodoc.pl'
 - maintained by Gary
 - up-to-date copy of the directory 'doc/programs/html'

Files in the EMBOSS 'doc/tutorials' directory
---------------------------------------------
 - these are all distributed and are in the CVS tree

emboss-gcg.ppt
 - powerpoint file of talk given by Gary/Lisa May 2001 for those
   converting from GCG to EMBOSS
 - not maintained

emboss-interfaces.ppt
 - powerpoint file of talk given by Gary/Lisa May 2001 for those
   converting from GCG to EMBOSS
 - not maintained
 - OUT OF DATE (no mention of Jemboss)

emboss_tutorial*
 - TeX and PostScript files of the 'current' EMBOSS tutorial
 - authored mainly by Val Curwen
 - not maintained
 - COULD DO WITH A SUBSTANTIAL REWRITE

Files in the EMBOSS 'emboss/acd' directory
------------------------------------------
 - these are all distributed and are in the CVS tree
 - these are the .acd files
 - there is one file per application
 - they are generally maintained by the authors of the application
 - they contain the following types of documentation:

one-line documentation line 
 - to describe what the program does
 - it is displayed in many derived documents and programs
 - it is defined in 
   http://www.uk.embnet.org/Software/EMBOSS/Acd/syntax.html#Docattr9
 - The length of the doc: string should be kept to 63 characters or
   shorter in order to allow the wossname utility to display each
   program name and its documentation on one 80 character line. 
 - The doc: string should not end with a '.' character
 - Any acronyms or capitalised abbreviations in the doc: string should
   be written in upper-case.  (e.g: SNPs, EST, DNA, ABI, SRS, ASCII,
   CDS, mRNA, B-DNA, RNA, CpG, ORFs, MAR/SAR, PCR, STS, REBASE, SCOP,
   PROSITE, PRINTS, EMBL, TRANSFAC, BLAST, GCG, EMBOSS)
 - The doc: string should start with an upper-case letter. 
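
The mechanical doc: string rules listed above (length, no trailing '.',
upper-case first letter) could be checked automatically.  The following
Python sketch is an illustration only: the function name check_doc_string
is hypothetical, it is not an existing EMBOSS script, and it leaves out
the acronym rule, which would need a maintained word list.

```python
MAX_DOC_LENGTH = 63  # leaves room for the program name on an 80-character line


def check_doc_string(doc):
    """Return a list of rule violations for one ACD doc: string (empty if none)."""
    problems = []
    if len(doc) > MAX_DOC_LENGTH:
        problems.append("longer than %d characters" % MAX_DOC_LENGTH)
    if doc.endswith("."):
        problems.append("ends with '.'")
    if not doc[:1].isupper():
        problems.append("does not start with an upper-case letter")
    return problems


if __name__ == "__main__":
    # Made-up doc: strings, used purely as examples.
    for doc in ("Translate nucleic acid sequences", "translate sequences."):
        print(repr(doc), check_doc_string(doc) or "ok")
```

Returning a list of problems rather than a single true/false value lets a
reviewer see every violation in one pass over the .acd files.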

groups
 - to describe the general function of the program
 - it is defined in
   http://www.uk.embnet.org/Software/EMBOSS/Acd/syntax.html#Grpattr10
 - it is displayed in many derived documents and programs
 - the two-level structure and names of the groups are well-defined as a
   result of extensive negotiations with the Staden Team and other
   GUI-builders. 

prompts
 - each qualifier of the program should have a prompt
 - this information is NOT displayed anywhere when help is requested!

help
 - gives help on the qualifier
 - it is displayed in many derived documents and when running with '-help'
 - each qualifier can have an unlimited amount of text describing it.
 - THIS IS OFTEN MISSING
 - I suggest that all .acd files should be reviewed and help added as a 
   matter of urgency.


Web pages
=========
 - These are mainly HTML documents (also some text & GIF)
 - They are mainly hand-crafted
 - Top-level URL is
   http://www.uk.embnet.org/Software/EMBOSS
 - These files are NOT distributed with EMBOSS.
 - They probably should be distributed so that other sites can set up their
   own copies. 
 - They should certainly be included in the CVS tree so any author can
   edit them.

Index page
----------
 - Maintained by Gary
 - Up-to-date
 - Links to overview, applications, userdocs, interfaces, downloading,
   admin, internals, mailinglist, coordination, licencing, credits. 

Overview
--------
 - Brief description of EMBOSS
 - Maintained by Gary
 - Up-to-date


Applications
------------
 - Links to groups and EMBASSY application pages
 - Table of links to each application's page
 - Maintained by Gary 
 - Up-to-date

Userdocs
--------
 - A link to the tutorial (a copy of the tutorial in the EMBOSS
   'doc/tutorials' directory)
 - Links to descriptions of:
   - Uniform Sequence Addresses
   - Sequence Formats
   - Alignment Formats
   - Feature Formats
   - Report Formats
 - All of these are maintained by Gary
 - Up-to-date

Interfaces
----------
 - links to Jemboss and other interface sites
 - Maintained by Gary
 - Some descriptions and links may need attention

Downloading
-----------
 - Brief description of how to download and unpack EMBOSS
 - Maintained by Gary and Alan
 - Up-to-date

Admin
-----
 - Links to David Martin's Admin Guide.
 - Maintained by Gary
 - Up-to-date (Admin Guide being re-written by Peter Rice)

Internals
---------
 - Links to:
   - Developer's Introduction
   - Guide to Writing EMBOSS Applications
   - Ajax Command Definition
   - Ajax Library documentation 
   - Nucleus Library documentation
   - PLplot Graphics Library original documentation 
   - EMBOSS C programming standards
   - EMBOSS code documentation standards 
   - EMBOSS user manual standards 
   - C versus C++
   - Using the CVS Server
   - EMBOSS Interface Projects 
   - EMBOSS Database Definitions 
 - Links to EFUNC and EDATA which are SRS databases of the documentation
   derived from the comments in the headers of Ajax and Nucleus routines
   automatically created by running the script 'scripts/emboss*.pl'
 - All of these are maintained by (?)
 - Up-to-date (?)


Mailinglist
-----------
 - Brief description of the mailing lists
 - Maintained by Gary 
 - Up-to-date
 - Should we have a searchable archive of messages?

Coordination
------------
 - Links to minutes of planning meetings
 - Maintained by Jon
 - Up-to-date

Licencing
---------
 - Description of GNU Licence
 - Maintained by Gary 
 - Up-to-date

Credits
-------
 - People who have worked on EMBOSS
 - Maintained by Gary
 - Up-to-date (?)

Individual application documentation
====================================
 - The primary documentation for the programs is held in 
   http://www.uk.embnet.org/Software/EMBOSS/Apps/
 - The documentation is processed and also held in the EMBOSS directories
   'doc/programs/html'  and 'doc/programs/text'  (see above).
 - The documentation is held in an HTML file whose name is simply the
   name of the application with the extension '.html'. 
 - There are many server-side includes. The standard ones are:
   inc/*.ione - file containing the one-line description as produced by
      the application 'wossname'
   inc/*.ihelp - file containing the -help output by 
      'application -help'
   inc/*.itable - file containing the command table as output by 
      'application -help -acdtable'
   inc/*.isee - file containing the associated programs as output by the
      application 'seealso'
 - The rest of the file is hand-edited.
 - The script 'scripts/edithtml.pl' allows you to create a web page by
   filling in details of the description, usage, input and output files,
   etc. It uses the template file 'template.html.save' as a starting point.
 - The script 'scripts/autodoc.pl' will go through all of the
   applications updating any whose help details have changed and offering
   to commit them to the CVS tree for you. 
 - The example usage and input and output example files should be the
   same as in the 'test/qatest.dat' QA testing data.  Some applications
   have this, but not all.  It would be useful if this information could be
   derived from the QA tests automatically. 
 - Maintained by Gary
 - NOT UP-TO-DATE - still waiting for documentation from the authors of
   many Protein structure applications!
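
The step that turns the web pages (with server-side includes) into the
self-contained copies in 'doc/programs/html' can be sketched as below.
This is not the code of 'scripts/autodoc.pl'; it is a minimal Python
illustration that ASSUMES the pages use the common Apache-style directive
<!--#include file="inc/NAME.ione" --> (the exact directive syntax is not
stated above), and the function name inline_includes is hypothetical.

```python
import os
import re

# ASSUMPTION: Apache-style server-side include directives, e.g.
#   <!--#include file="inc/transeq.ione" -->
INCLUDE = re.compile(r'<!--#include\s+(?:file|virtual)="([^"]+)"\s*-->')


def inline_includes(html, base_dir):
    """Replace each include directive with the contents of the named file."""

    def replace(match):
        # Include paths are taken as relative to the directory of the page.
        path = os.path.join(base_dir, match.group(1))
        with open(path, encoding="utf-8") as handle:
            return handle.read()

    return INCLUDE.sub(replace, html)
```

A page would be processed with something like
inline_includes(open("transeq.html").read(), "."), after which the result
no longer depends on the web server to assemble the one-line description,
-help output, command table and see-also sections.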


From jrvalverde at cnb.uam.es  Tue Jun 25 13:42:24 2002
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Tue, 25 Jun 2002 15:42:24 +0200
Subject: The EMBOSS Documentation Project
In-Reply-To: <200206242004.g5OK4fF8399392@electre.pasteur.fr>
References: <20020624111716.F24581@hgmp.mrc.ac.uk>
	<200206242004.g5OK4fF8399392@electre.pasteur.fr>
Message-ID: <20020625154224.3e4cdac6.jrvalverde@cnb.uam.es>

On Mon, 24 Jun 2002 22:04:41 +0200
"Catherine Letondal" <letondal at pasteur.fr> wrote:

> 
> Damian Counsell wrote:
> 
> > What do users want from our documentation system?  
> 
> Ask them, or rather look at them.
> 
> The best way to know is to run a few user studies around the current documentation
> system - not that many studies are necessary IMHO (I don't fully agree with Jose in his 
> previous message that you really need a HCI research project for this, although it could of
> course be very interesting).
> A very efficient and lightweight technique is testing in pairs (see for instance
> http://www.cs.umd.edu/~zzj/Codiscov.htm) with a scenario or a precise 
> task + thinking-aloud method. This is a way to learn a lot about what should be done
> in a short time. It brings more information than a questionnaire.
> 
Agreed.

At any rate, it should be users who answer. There are plenty of methods that have proved
useful in dealing with users. Any of them would do. Thinking aloud is indeed one of the
most powerful methods.

An HCI project may be overkill, but still someone with an interest or knowledge in HCI design
would be helpful. And from my very humble, lowly laying, low profile point of view, the more
open the original approach the better. We might get some surprises about our assumptions on
users.

				j
--
Jose R. Valverde
EMBnet/CNB


From d.counsell at hgmp.mrc.ac.uk  Thu Jun 27 12:42:15 2002
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Thu, 27 Jun 2002 13:42:15 +0100
Subject: documentation update
Message-ID: <20020627134215.B2564@hgmp.mrc.ac.uk>

Dear All


Yesterday we had a productive and interesting(!) meeting to discuss
documentation.  It will be the first of many.  I have set up a Web
page to keep people informed about work on the wonderful new
documentation system for EMBOSS:

     http://www.hgmp.mrc.ac.uk/~dcounsel/EDP/EMBOSS_documentation.html

Remember this is just a skeleton.  A "release early release often"
policy will apply to this resource and criticisms and contributions
are welcome.  Meeting minutes to follow soon.

As Gary (Williams) has pointed out, the task of updating and
converting the existing docs will be a big one.  Once the spec is
defined we'll come looking for volunteers.  (Or perhaps the
Documentation Czar will visit you to ask if you'd rather be using
a laptop from a hospital bed...)


Utility! Parseability! Reproducibility!  Vive la documentation*!


Damian


(*Yes, I know it should be "Vive les dossiers!".  Have you no poetry?)


-- 
Damian COUNSELL                      email:  d.counsell at hgmp.mrc.ac.uk
MRC Human Genome Mapping Project RC  phone:  +44 (0)1223 494500
Cambridge CB10 1SB                   direct: +44 (0)1223 494585
http://www.hgmp.mrc.ac.uk/~dcounsel/ fax:    +44 (0)1223 494512