From pmr at ebi.ac.uk  Fri Apr  1 03:33:41 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 09:33:41 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D0765.1000306@ebi.ac.uk>

Guy Bottu wrote:
> 	Dear Peter, dear all,
> 
> A few thoughts on the codon usage tables, now that you are working on 
> them.
> 
> Do you intend to drop the existing tables from the distribution in favor 
> of tables from CUTG ? CUTG has one drawback : the entries for each 
> organism/organelle are made from all the genes, without taking account of 
> the fact that there exist distinct subpopulations. E.g. in E. coli there 
> are the highly expressed genes, the lowly expressed genes and the 
> horizontally transferred genes, which have different codon usage. I think 
> that in the distribution there are at least for some organisms specific 
> files (e.g. Eeco.cut and Eeco_h.cut). The great problem with the files 
> from the current distribution is that it is hard to find out which file 
> contains what.

The file will be annotated with the species and the source database.

The _h files will be kept (the chips program needs them, for example) ... but 
if we have no documentation on which genes are highly expressed, we may have to 
keep the transterm files, which are based on only a few genes.

> There is the issue of the number of files in the face of GUI's. Some GUI's 
> for EMBOSS generate a selector from which the user can choose a codon 
> usage table. If the complete CUTG has been extracted and installed, this 
> does not work well anymore. A selector with more than 10000 entries is not 
> convenient and furthermore, in a WWW interface the HTML page takes a 
> perceptibly long time to download.

Any cutgextract modification requests? I have added species selection.

> At the BEN site I solved this the following (not necessarily satisfactory) 
> way : I modified cutgextract so that it creates files with extension .cutg 
> rather than .cut. The interface wEMBOSS only shows the *.cut files in the 
> selector. If a user wants to use a CUTG rather than a standard 
> distribution file under wEMBOSS, he must first copy it to his project 
> using embossdata (at the command line there is no problem).

I will add an option to cutgextract for the output filename extension.
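
As an illustration of the extension-based selection described above, here is a 
minimal Python sketch of how an interface might list only the tables with a 
given extension. The function name selectable_tables and the directory layout 
are assumptions made for this example; this is not wEMBOSS or cutgextract code.

```python
from pathlib import Path


def selectable_tables(codon_dir, extension=".cut"):
    """Return the sorted names of files in codon_dir with exactly this suffix.

    Hypothetical helper: a file such as Xxxx.cutg has the suffix ".cutg",
    so it is not offered when the selector asks for ".cut".
    """
    return sorted(
        path.name
        for path in Path(codon_dir).iterdir()
        if path.is_file() and path.suffix == extension
    )
```

With this convention the selector stays short, and the full CUTG set remains 
reachable by asking for the ".cutg" extension instead.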

> As formats, it would of course be nice if EMBOSS programs could read and 
> write codon usage tables (and other data) in any format, just as they do 
> for sequences. Which formats should we support besides what EMBOSS uses 
> now ? Is there such a thing as "native" CUTG format (with one entry a 
> file) ?. I know about GCG format (not useful for us, but other people 
> certainly might want it). There is Staden format. Staden format supports 
> also files with 2 tables (codon usage in genes + trinucleotide frequency 
> in noncoding DNA) ; what to do with this ? only read the first ? There is 
> also the format used by CODEHOP 
> (http://blocks.fhcrc.org/blocks/codehop.html). Does 
> someone know other formats ?

CUTG has a format used on their web pages. It also has the spsum file which 
could be used.

regards,

Peter


From pmr at ebi.ac.uk  Fri Apr  1 08:50:52 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 14:50:52 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D51BC.4030402@ebi.ac.uk>

Guy Bottu wrote:

> As formats, it would of course be nice if EMBOSS programs could read and 
> write codon usage tables (and other data) in any format, just as they do 
> for sequences. Which formats should we support besides what EMBOSS uses 
> now ? Is there such a thing as "native" CUTG format (with one entry a 
> file) ?. I know about GCG format (not useful for us, but other people 
> certainly might want it). There is Staden format. Staden format supports 
> also files with 2 tables (codon usage in genes + trinucleotide frequency 
> in noncoding DNA) ; what to do with this ? only read the first ? There is 
> also the format used by CODEHOP 
> (http://blocks.fhcrc.org/blocks/codehop.html).

CODEHOP format is minimal, but can be used. It appears to be derived from 
CUTG's "spsum" files (which I will also add as a format).

Other formats I know about (and will include):

codonusage database ftp://ftp.ebi.ac.uk/pub/databases/codonusage

transterm database ftp://ftp.ebi.ac.uk/pub/databases/transterm

GCG (with extra header comments to contain species and other information). Does 
anyone have examples from GCG, or from other sources that write "GCG format" 
files, so that we can convert U -> T and handle any other non-standard data?

CUTG website format
http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=Drosophila+melanogaster+%5Bgbinv%5D&aa=1&style=N

SPSUM format (CUTG database .spsum files)

CODEHOP format http://blocks.fhcrc.org/blocks/codehop.html

Staden format: I have no example for this apart from one in the Staden 
src/seq_utils/genetics_codes.c source file - can someone send examples please? 
I would be happy reading an optional second file for some formats, although 
EMBOSS does not currently use the data the Staden format has.
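
The U -> T conversion mentioned for GCG-style files can be sketched in Python 
as below. This is only an illustration under my own assumptions: the function 
name normalise_codon is hypothetical, and real GCG files may contain other 
non-standard data that this does not handle.

```python
def normalise_codon(codon):
    """Upper-case a codon and rewrite it in the DNA alphabet (U -> T).

    Hypothetical helper: anything that is not three letters from ACGT
    after conversion is rejected rather than guessed at.
    """
    dna = codon.strip().upper().replace("U", "T")
    if len(dna) != 3 or set(dna) - set("ACGT"):
        raise ValueError(f"not a codon: {codon!r}")
    return dna
```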

regards,

Peter Rice


From ableasby at hgmp.mrc.ac.uk  Mon Apr  4 08:44:43 2005
From: ableasby at hgmp.mrc.ac.uk (Alan Bleasby)
Date: Mon, 4 Apr 2005 13:44:43 +0100 (BST)
Subject: [EMBOSS] Re: [EMBOSS-BUG] prophecy
Message-ID: <200504041244.j34CihJm022967@bromine.hgmp.mrc.ac.uk>

The short answer is that it was intentional. Until these programs
are replaced (on the list of things to do) it ought to be
documented though.

HTH

Alan


From muratem at eng.uah.edu  Mon Apr  4 11:07:30 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:07:30 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
Message-ID: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>

Greetings

Has anyone ever tried to port einverted to a parallel machine like the SGI
Altix? Has anyone tried to build multiple threads into einverted?

Thanks

Mike


From pmr at ebi.ac.uk  Mon Apr  4 11:18:18 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 04 Apr 2005 16:18:18 +0100
Subject: [EMBOSS] Threading einverted
In-Reply-To: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>
References: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>
Message-ID: <42515ABA.4080802@ebi.ac.uk>

Dear Mike,

> Has anyone ever tried to port einverted to a parallel machine like the SGI
> altix? Has anyone tried to build multiple threads into einvertied?

We have not heard of any attempt to make einverted multi-threaded.

The algorithm is not the easiest to thread. But if you would like to try, I 
would be happy to help!

regards,

Peter Rice


From muratem at eng.uah.edu  Mon Apr  4 11:39:57 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:39:57 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
In-Reply-To: <42515ABA.4080802@ebi.ac.uk>
Message-ID: <Pine.GSO.4.05.10504041028110.6258-100000@ebs330>


On Mon, 4 Apr 2005, Peter Rice wrote:

> Dear Mike,
> 
> > Has anyone ever tried to port einverted to a parallel machine like the SGI
> > altix? Has anyone tried to build multiple threads into einvertied?
> 
> We have not heard of any attempt to make einverted multi-threaded.
> 
> The algorithm is not the easiest to thread. But if you would like to try, I 
> would be happy to help!
> 
> regards,
> 
> Peter Rice
> 

Peter

I'm willing to have a go at it. I have an immediate need (isn't that
always the case?) and the shortest path may be threading. The biggest
machine I have access to is an Altix. The Itaniums are supposed to
scream. The system is down at the moment, but when it's available again
I'll compile it and run a benchmark with the existing source.

Do you have anything that describes the algorithm? I don't recall seeing
a reference to a paper. I'll print out the source and stare at it tonight.
The Altix has 16-cpu SMP nodes. It would be nice to hit on all 16.

Cheers

Mike


From msarachu at biol.unlp.edu.ar  Mon Apr 11 11:30:54 2005
From: msarachu at biol.unlp.edu.ar (Martin Sarachu)
Date: Mon, 11 Apr 2005 12:30:54 -0300
Subject: [EMBOSS] wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2
Message-ID: <425A982E.3060206@biol.unlp.edu.ar>

This is to announce the release of wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2

Changes in wEMBOSS include:
  - Small bugfixes and improvements of the ACD parser
  - Added support for Opera browser
  - Added multiple deletion of results

Changes in wrappers4EMBOSS include:
  - Compatibility for EMBOSS 2.8, 2.9 and 2.10
  - ps_scan wrappers updated for the last version of ps_scan.pl
  - Minor ACD enhancements

wrappers4EMBOSS can be installed together with wEMBOSS and is included 
in its distribution.
wrappers4EMBOSS can also be downloaded as a single package.

You can download both packages from http://www.wemboss.org


-- 
Martin Sarachu
msarachu at biol.unlp.edu.ar
AR.EMBnet
http://www.ar.embnet.org


From pmr at ebi.ac.uk  Tue Apr 12 04:57:27 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 09:57:27 +0100
Subject: [EMBOSS] Codon usage file improvements
In-Reply-To: <424ACAB2.8090509@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk>
Message-ID: <425B8D77.7080409@ebi.ac.uk>

Peter Rice wrote:
> A quick check before I make changes to the EMBOSS codon usage files.

Done.

The codon usage files now committed to CVS (so this will happen from the next 
release) have the following changes:

1. File naming is Exxxxx, where xxxxx is the UniProt/SwissProt 5-letter name 
for the species. Some species in UniProt/SwissProt have more than one name 
(strains used for genome projects, for example AGRTU and AGRT5 for 
Agrobacterium tumefaciens). EMBOSS will use Eagrtu.cut for the codon usage 
table, although its genes come from the genome sequence.

For example:

#Species: Agrobacterium tumefaciens str. C58
#Division: gbbct
#Release: CUTG146
#CdsCount: 10705

#Coding GC 59.76%
#1st letter GC 63.11%
#2nd letter GC 44.70%
#3rd letter GC 71.47%

#Codon AA Fraction Frequency Number
GCA    A     0.132    15.154  51011
GCC    A     0.440    50.470 169886
GCG    A     0.328    37.649 126730
GCT    A     0.101    11.550  38879
TGC    C     0.783     6.486  21834


2. The old filenames will stay until release 3.0.0 for those who are used to 
them. I will add comments to their headers. They came from the CODONUSAGE and 
TRANSTERM databases, and we copied their filenames!

The attached file cut.txt lists the old file names and their species. I used 
the notes when selecting species for the new codon usage files.

3. EMBOSS will be able to read other codon usage table formats, and will 
extract the species and other information where possible.

4. Codon usage files are checked for inconsistencies - if they specify the 
number of genes, then files with too many stop codons will give a warning. 
Some formats do not include the genetic code, so for some species and formats 
the warning can be ignored. The EMBOSS and GCG formats are safe.

5. Some EMBOSS programs read a codon usage file - but only use it to read a 
genetic code. These programs will instead prompt for a genetic code in the 
next release. For example, showseq and prettyseq only need a genetic code for 
translation. Backtranseq does need a codon usage table - for back translation 
it needs to know the most used codon for each amino acid.

6. A new file Cut.index (in the data/CODONS directory) will list all the codon 
usage files and their species so that a menu of installed codon usage files 
can be used by interfaces.

A copy of Cut.index is attached as Cut_index.txt
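
For anyone who wants to read the new files from a script, here is a minimal 
Python sketch based only on the example shown under point 1. It is not the 
EMBOSS reader. The names parse_cut, most_used_codons and too_many_stops are 
hypothetical, and the stop codon rule (more stop codons than CdsCount, with 
stops marked by "*") is an assumption, since the exact threshold is not stated 
above.

```python
from collections import namedtuple

CodonRow = namedtuple("CodonRow", "codon aa fraction frequency number")


def parse_cut(text):
    """Split codon usage text into a header dict and a list of CodonRow."""
    header, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only "#Key: value" comments are kept; other comments are skipped.
            key, sep, value = line[1:].partition(":")
            if sep:
                header[key.strip()] = value.strip()
            continue
        codon, aa, fraction, frequency, number = line.split()
        rows.append(
            CodonRow(codon, aa, float(fraction), float(frequency), int(number))
        )
    return header, rows


def most_used_codons(rows):
    """Map each amino acid to its most used codon (highest Number)."""
    best = {}
    for row in rows:
        if row.aa not in best or row.number > best[row.aa].number:
            best[row.aa] = row
    return {aa: row.codon for aa, row in best.items()}


def too_many_stops(header, rows):
    """Assumed check: True when stop codons outnumber the stated CdsCount."""
    cds_count = header.get("CdsCount")
    if cds_count is None:
        return False  # nothing to compare against
    stops = sum(row.number for row in rows if row.aa == "*")
    return stops > int(cds_count)
```

Applied to the five codon rows in the example above, most_used_codons gives 
GCC for A and TGC for C, which is the kind of lookup back translation needs 
(point 5).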

Hope this helps

Peter


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cut.txt
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Cut_index.txt
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment-0001.txt 

From robin at hms.harvard.edu  Tue Apr 12 10:17:42 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 10:17:42 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <425B8D77.7080409@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk>
Message-ID: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>


Hello.

I was trying to use emma for a multiple sequence alignment of dna 
sequencing reads, but it complained that it could not find clustalw. I 
could not find any mention of clustalw on the EMBOSS page, so I got a 
copy from the clustalw homepage and -not knowing where to place it- 
tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, and 
emma gives the error:

    EMBOSS An error in ajsys.c at line 398:
cannot find program 'clustalw'

Looking in the emma.acd and ajsys.c files, I can't find any guidance.

Does anyone know how this is supposed to work?
Alternatively, is there another good way to do multiple sequence 
alignment?
Looking ahead, I do not find any obvious way to do contig assembly, a 
la Phrap, or CAP.

thanks

robin colgrove


From pmr at ebi.ac.uk  Tue Apr 12 10:26:57 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 15:26:57 +0100
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <425BDAB1.6000606@ebi.ac.uk>

Robin Colgrove wrote:
> I was trying to use emma for a multiple sequence alignment of dna 
> sequencing reads, but it complained that it could not find clustalw. I 
> could not find any mention of clustalw on the EMBOSS page, so I got a 
> copy from the clustalw homepage and -not knowing where to place it- 
> tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, and 
> emma gives the error:
> 
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
> 
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
> 
> Does anyone know how this is supposed to work?

Option 1: Install clustalw in your path so that you (and the emma program) can 
run it from the commandline. The directory where you installed EMBOSS is one 
possible place you can put it.

Option 2: Emma will look for a variable EMBOSS_CLUSTALW (an environment 
variable or a variable defined in emboss.defaults or .embossrc) that has the 
full path for clustalw.
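
To check which clustalw would be found under the two options, a small Python 
sketch follows. It is not the code in ajsys.c. The function name find_clustalw 
is hypothetical, the order (EMBOSS_CLUSTALW first, then the path) is an 
assumption, and it looks only at environment variables, not at emboss.defaults 
or .embossrc.

```python
import os
import shutil


def find_clustalw(environ=None):
    """Return a path to an executable clustalw, or None if none is found."""
    environ = os.environ if environ is None else environ

    # Option 2: an explicit full path in EMBOSS_CLUSTALW.
    explicit = environ.get("EMBOSS_CLUSTALW")
    if explicit and os.path.isfile(explicit) and os.access(explicit, os.X_OK):
        return explicit

    # Option 1: search the directories listed in PATH.
    return shutil.which("clustalw", path=environ.get("PATH"))
```

If this returns None for your own environment, then neither option is in 
place and emma can be expected to report that it cannot find the program.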

Now ... we should document this ... and perhaps update the emma documentation 
which looks rather old and has too much old clustal information in it.

Hope this helps,

Peter


From robin at hms.harvard.edu  Tue Apr 12 15:20:01 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 15:20:01 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>


Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad 
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with 
needle, then pieced together the full alignment in vi, but this is not 
going to fly as the number of sequences increases. The online tool I 
use for quick alignments ( 
http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but 
the same fasta file sent either to emma or directly to clustalw gives 
obviously wrong alignments, even though the nucleotide sequences are 
highly homologous.

thanks again

robin


On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:

>
> Hello.
>
> I was trying to use emma for a multiple sequence alignment of dna 
> sequencing reads, but it complained that it could not find clustalw. I 
> could not find any mention of clustalw on the EMBOSS page, so I got a 
> copy from the clustalw homepage and -not knowing where to place it- 
> tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, 
> and emma gives the error:
>
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?
> Alternatively, is there another good way to do multiple sequence 
> alignment?
> Looking ahead, I do not find any obvious way to do contig assembly, a 
> la Phrap, or CAP.
>
> thanks
>
> robin colgrove
>


From David.Bauer at Schering.de  Wed Apr 13 02:12:25 2005
From: David.Bauer at Schering.de (David.Bauer at Schering.de)
Date: Wed, 13 Apr 2005 08:12:25 +0200
Subject: Re: [EMBOSS] using emma: where to put clustalw
Message-ID: <OF5C412136.C434E2FD-ONC1256FE2.0021397D-C1256FE2.00221916@schering.net>


Hi Robin,

How long are your 4 sequences?
I have observed that clustalw has problems with nucleotide alignments if there
are large differences in sequence length.
For example, if a 1 kb sequence is nearly completely contained with high
homology within another 2 kb sequence, the resulting alignment can be very
far from optimal.
If you are aligning coding sequences, there is a program "tranalign" in
EMBOSS.
You can first align the protein sequences (which usually works better than
a multiple alignment of DNA) and then use this alignment with tranalign to
guide the alignment of the corresponding cDNA.
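
The idea of letting a protein alignment guide the cDNA alignment can be 
sketched in a few lines of Python. This is an illustration of the principle 
only, not tranalign itself. The function name thread_cdna is hypothetical, and 
the sketch assumes the cDNA starts in frame and does not check that it really 
translates to the protein.

```python
def thread_cdna(aligned_protein, cdna, gap="-"):
    """Place one codon under each residue and three gaps under each gap."""
    residues = sum(1 for char in aligned_protein if char != gap)
    if len(cdna) < 3 * residues:
        raise ValueError("cDNA is too short for the aligned protein")

    pieces, position = [], 0
    for char in aligned_protein:
        if char == gap:
            pieces.append(gap * 3)  # a protein gap spans a whole codon
        else:
            pieces.append(cdna[position:position + 3])
            position += 3
    return "".join(pieces)
```

For example, the aligned protein "M-K" with the cDNA "ATGAAA" becomes 
"ATG---AAA", so gaps always fall between codons rather than inside them.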

Hope this helps,
David.


Robin Colgrove <robin at hms.harvard.edu>
Sent by: owner-emboss at hgmp.mrc.ac.uk
To: emboss at embnet.org
Cc:
Subject: Re: [EMBOSS] using emma: where to put clustalw
Date: 12.04.2005 21:20

Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with
needle, then pieced together the full alignment in vi, but this is not
going to fly as the number of sequences increases. The online tool I
use for quick alignments (
http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
the same fasta file sent either to emma or directly to clustalw gives
obviously wrong alignments, even though the nucleotide sequences are
highly homologous.

thanks again

robin


On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:

>
> Hello.
>
> I was trying to use emma for a multiple sequence alignment of dna
> sequencing reads, but it complained that it could not find clustalw. I
> could not find any mention of clustalw on the EMBOSS page, so I got a
> copy from the clustalw homepage and -not knowing where to place it-
> tried the /usr/local/share/ EMBOSS/acd directory. That didn't work,
> and emma gives the error:
>
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?
> Alternatively, is there another good way to do multiple sequence
> alignment?
> Looking ahead, I do not find any obvious way to do contig assembly, a
> la Phrap, or CAP.
>
> thanks
>
> robin colgrove
>


From gwilliam at hgmp.mrc.ac.uk  Wed Apr 13 04:22:19 2005
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 13 Apr 2005 09:22:19 +0100
Subject: [EMBOSS] using emma: where to put clustalw
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>
Message-ID: <425CD6BB.68F8C486@hgmp.mrc.ac.uk>

Some alternate multiple alignment programs for nucleotide sequences on
the web are at:

http://www.hgmp.mrc.ac.uk/GenomeWeb/nuc-mult.html

I would recommend DIALIGN

Gary


Robin Colgrove wrote:
> 
> Thanks to all for suggestions.
> Just putting clustalw in /usr/local/bin did the trick.
> 
> Now, I need to figure out why emma/clustalw is giving me such bad
> alignments.
> Since I only had 4 sequences, I ended up aligning them pairwise with
> needle, then pieced together the full alignment in vi, but this is not
> going to fly as the number of sequences increases. The online tool I
> use for quick alignments (
> http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
> the same fasta file sent either to emma or directly to clustalw gives
> obviously wrong alignments, even though the nucleotide sequences are
> highly homologous.
> 
> thanks again
> 
> robin
> 
> On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:
> 
> >
> > Hello.
> >
> > I was trying to use emma for a multiple sequence alignment of dna
> > sequencing reads, but it complained that it could not find clustalw. I
> > could not find any mention of clustalw on the EMBOSS page, so I got a
> > copy from the clustalw homepage and -not knowing where to place it-
> > tried the /usr/local/share/ EMBOSS/acd directory. That didn't work,
> > and emma gives the error:
> >
> >    EMBOSS An error in ajsys.c at line 398:
> > cannot find program 'clustalw'
> >
> > Looking in the emma.acd and ajsys.c files, I can't find any guidance.
> >
> > Does anyone know how this is supposed to work?
> > Alternatively, is there another good way to do multiple sequence
> > alignment?
> > Looking ahead, I do not find any obvious way to do contig assembly, a
> > la Phrap, or CAP.
> >
> > thanks
> >
> > robin colgrove
> >

-- 
Gary Williams
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494522			Fax: +44 1223 494512
E-mail: gwilliam at rfcgr.mrc.ac.uk	Web: http://www.rfcgr.mrc.ac.uk


From jrvalverde at cnb.uam.es  Thu Apr 21 05:58:51 2005
From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. Valverde)
Date: Thu, 21 Apr 2005 11:58:51 +0200
Subject: [EMBOSS] Wiki
Message-ID: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>

I would rather welcome a Wiki for EMBOSS documentation.

I can host it at Es.EMBnet.Org/es.emboss.org, no problem at that.

The reason is that as I run into problems/tricks/tasks to do, I see
comments that might be added here and there in the documentation. I
would rather go to a single site and make the changes myself than 
go through the hassle of devising a 'diff' comment, finding out who
to mail, mailing them and waiting for a new doc release.

If there is interest, I can set it up straight away.

				j

From pmr at ebi.ac.uk  Thu Apr 21 12:20:24 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 21 Apr 2005 17:20:24 +0100
Subject: [EMBOSS] Wiki
In-Reply-To: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
Message-ID: <4267D2C8.10009@ebi.ac.uk>

José R. Valverde wrote:

> I would rather welcome a Wiki for EMBOSS documentation.

We have all the documentation (including the sourceforge web pages) in CVS. 
Any member of the development/documentation team can make updates there.

No need for a wiki for this - and a wiki would be difficult to manage as most 
of the documentation is generated automatically.

> The reason is that as I run into problems/tricks/tasks to do, I see
> comments that might be added here and there in the documentation. I
> would rather go to a single site and make the changes myself than 
> go throught he hassle of devising a 'diff' comment, finding out who
> to mail, mailing them andn waiting for a new doc release.


Just mail anything like that to emboss-bug.

After all ... there is not much point in changing a wiki version of the 
documentation if we are busy changing the application and the real 
documentation :-)

regards,

Peter


From jrvalverde at cnb.uam.es  Fri Apr 22 04:11:18 2005
From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. Valverde)
Date: Fri, 22 Apr 2005 10:11:18 +0200
Subject: [EMBOSS] Wiki (and Macs)
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
	<4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422101118.33b19892.jrvalverde@cnb.uam.es>

On Thu, 21 Apr 2005 17:20:24 +0100
Peter Rice <pmr at ebi.ac.uk> wrote:
> 
> After all ... there is not much point in changing a wiki version of the 
> documentation if we are busy changing the application and the real 
> documentation :-)
> 
> regards,
> 
> Peter

Right you are Sir. I guess it's better as it is for now. And yet...

Speaking generally, it probably boils down to the management model we
want for EMBOSS. As it is now, I tend to see it more as a Cathedral
than a Bazaar. Truly it isn't, but you must agree it is not so evident
from the docs what the procedures are for participation. At least not
at first sight.

I'm more for the Bazaar model, one where everyone is welcome and 
making changes is as trivial as possible (especially for end-users
and end-user-related material, like docs). I'd rather have that as
a 'common' to build a user community around. Game theory shows that
to be the best strategy in the long run (see e.g. 
http://encyclopedia.laborlawtalk.com/Tragedy_of_the_commons ).

In the short run, with resources as limited as the EMBOSS team's currently
are, you are right that it takes a significant effort and a significant
portion of the existing resources. It makes more sense to concentrate on
the short term now and survive long enough to bring new resources in.

But I think we should have that in sight for the long term.

				j

From jrvalverde at cnb.uam.es  Fri Apr 22 04:20:58 2005
From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. Valverde)
Date: Fri, 22 Apr 2005 10:20:58 +0200
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
	<4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>

I'm trying to find out ways to fund EMBOSS in a way that I can
justify locally.

Mac users are a growing 'market' and a promising community. I've got 
here hundreds of Macs, and they need an easy to use, install and
manage solution.

What is needed (they tell me) is a good editor, and some interactive
graphic facilities for common, simple tasks. Currently, locally, we are
going to spend a significant amount on buying a handful of licenses
for commercial software.

I've tried Erik's CD, but it has some drawbacks regarding the configuration
on non-user-managed Macs (such as those where root belongs to a central
authority): here they can install software but not make modifications.
I can't either, being on the SciComp side and not on the Offimatic
end.

I don't have the resources to do that locally, but would welcome a
sensible way to fund it (like buying 'licenses', packages, CDs or
manuals from an EMBOSS-centered company).

I for one would certainly welcome a Macintosh edition ready to run,
and easy to configure to use central databases. If I were to choose,
I'd try to add those facilities to Jemboss (a sequence editor, and
interactive drawing of clones and molecular graphics). This is the
most lacking thing in EMBOSS now that every user has or can have a 
UNIX machine at their desktop.

And, certainly, I would happily recommend locally that we buy a 
hundred+ licenses at a reasonable price if that would help
fund EMBOSS.

Most ideally, something like the LiveDVD from AT.EMBnet.Org but for
Macs would be a candy. And an easy to justify buy.

Any recommendations? Takers? Pointers?

				j

From kellert at ohsu.edu  Fri Apr 22 12:33:44 2005
From: kellert at ohsu.edu (Thomas J Keller)
Date: Fri, 22 Apr 2005 09:33:44 -0700
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
 <4267D2C8.10009@ebi.ac.uk>
 <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
Message-ID: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>

Greetings,
Have you looked at the fink installation of emboss and kaptain?
I use emboss from the command line, so I haven't tried the GUI 
application "kaptain". Here's what the fink database has to say about 
it:
#######################
kaptain-0.71-22: Universal graphical front-end
   Kaptain is a universal graphical front-end for command line programs,
   and it works wherever Qt3 is available. Someone writes a simple script
   (so called grammar) which describes the possible arguments for a
   command line program and Kaptain brings up a friendly dialog to
   the user to set up the command line. Example grammars can be
   found in /sw/share/kaptain/.
  .
  Web site: http://kaptain.sourceforge.net
  .
  Maintainer: Koen van der Drift <kvddrift at earthlink.net>
######################
The emboss grammar has been written and is available through fink.

You do need the developer tools installed on your Mac, but that's 
trivial, and comes with the OS, so no additional charge for your users.

Just a thought.

Tom Keller, Ph.D.
http://www.ohsu.edu/research/core
kellert at ohsu.edu
503-494-2442

On Apr 22, 2005, at 1:20 AM, José R. Valverde wrote:

> I'm trying to find out ways to fund EMBOSS in a way that I can
> justify locally.
>
> Mac users are a growing 'market' and a promising community. I've got
> here hundreds of Macs, and they need an easy to use, install and
> manage solution.
>
> What is needed (they tell me) is a good editor, and some interactive
> graphic facilities for common, simple tasks. Actually, locally, we are
> going to spend a significant amount into buying a handful of licenses
> for commercial software.
>
> I've tried Erik's CD, but it has some drawbacks regarding the 
> configuration
> on non-user-managed Macs (as those where root belongs to a central
> authority): Here they can install software but not make modifications.
> I can't either, being on the SciComp side and not on the Offimatic
> end.
>
> I don't have the resources to do that locally, but would welcome a
> sensible way to fund it (like buying 'licenses', packages, CDs or
> manuals from an EMBOSS-centered company).
>
> I for one would certainly welcome a Macintosh edition ready to run,
> and easy to configure to use central databases. If I were to choose,
> I'd try to add those facilities to Jemboss (a sequence editor, and
> interactive drawing of clones and molecular graphics). This is the
> most lacking thing in EMBOSS now that every user has or can have a
> UNIX machine at their desktop.
>
> And, certainly, I would happily recommend locally that we buy a
> hundred+ licenses at a reasonable price if that would help
> fund EMBOSS.
>
> Most ideally, something like the LiveDVD from AT.EMBnet.Org but for
> Macs would be a candy. And an easy to justify buy.
>
> Any recommendations? Takers? Pointers?
>
> 				j

From kvddrift at earthlink.net  Fri Apr 22 16:17:48 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Fri, 22 Apr 2005 16:17:48 -0400
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es> <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
Message-ID: <a0d17fb52c05656b4195a1ef087e69a7@earthlink.net>


On Apr 22, 2005, at 12:33 PM, Thomas J Keller wrote:

>  Web site: http://kaptain.sourceforge.net
>  .
>

Actually, the package is emboss-kaptain.


- Koen.


From pmr at ebi.ac.uk  Fri Apr  1 08:33:41 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 09:33:41 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D0765.1000306@ebi.ac.uk>

Guy Bottu wrote:
> 	Dear Peter, dear all,
> 
> A few thoughts on the codon usage tables, now that you are working on 
> them.
> 
> Do you intend to drop the existing tables from the distribution in favor 
> of tables from CUTG ? CUTG has one drawback : the entries for each 
> organism/organelle are made from all the genes, without taking account of 
> the fact that there exist distinct subpopulations. E.g. in E. coli there 
> are the highly expressed genes, the lowly expressed genes and the 
> horizontally transferred genes, which have different codon usage. I think 
> that in the distribution there are at least for some organisms specific 
> files (e.g. Eeco.cut and Eeco_h.cut). The great problem with the files 
> from the current distribution is that it is hard to find out which file 
> contains what.

The file will be annotated with the species and the source database

The _h files will be kept (the chips program needs them for example) ... but 
if we have no documentation on which genes are highly expressed we may have to 
keep the transterm files which are based on only a few genes.

> There is the issue of the number of files in the face of GUI's. Some GUI's 
> for EMBOSS generate a selector from which the user can choose a codon 
> usage table. If the complete CUTG has been extracted and installed, this 
> does not work well anymore. A selector with more than 10000 entries is not 
> convenient and furthermore, in a WWW interface the HTML page takes a 
> perceptibly long time to download.

Any cutgextract modification requests? I have added species selection.

> At the BEN site I solved this the following (not necessarily satisfactory) 
> way : I modified cutgextract so that it creates files with extension .cutg 
> rather than .cut. The interface wEMBOSS only shows the *.cut files in the 
> selector. If a user wants to use a CUTG rather than a standard 
> distribution file under wEMBOSS, he must first copy it to his project 
> using embossdata (at the command line there is no problem).

I will add an option to cutgextract for the output filename extension.

> As formats, it would of course be nice if EMBOSS programs could read and 
> write codon usage tables (and other data) in any format, just as they do 
> for sequences. Which formats should we support besides what EMBOSS uses 
> now ? Is there such a thing as "native" CUTG format (with one entry a 
> file) ?. I know about GCG format (not useful for us, but other people 
> certainly might want it). There is Staden format. Staden format supports 
> also files with 2 tables (codon usage in genes + trinucleotide frequency 
> in noncoding DNA) ; what to do with this ? only read the first ? There is 
> also the format used by CODEHOP 
> (http://blocks.fhcrc.org/blocks/codehop.html). Does 
> someone know other formats ?

CUTG has a format used on their web pages. It also has the spsum file which 
could be used.

regards,

Peter


From pmr at ebi.ac.uk  Fri Apr  1 13:50:52 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 14:50:52 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D51BC.4030402@ebi.ac.uk>

Guy Bottu wrote:

> As formats, it would of course be nice if EMBOSS programs could read and 
> write codon usage tables (and other data) in any format, just as they do 
> for sequences. Which formats should we support besides what EMBOSS uses 
> now ? Is there such a thing as "native" CUTG format (with one entry a 
> file) ?. I know about GCG format (not useful for us, but other people 
> certainly might want it). There is Staden format. Staden format supports 
> also files with 2 tables (codon usage in genes + trinucleotide frequency 
> in noncoding DNA) ; what to do with this ? only read the first ? There is 
> also the format used by CODEHOP 
> (http://blocks.fhcrc.org/blocks/codehop.html).

CODEHOP format is minimal, but can be used. It appears to be derived from 
CUTG's "spsum" files (which I will also add as a format).

Other formats I know about (and will include):

codonusage database ftp://ftp.ebi.ac.uk/pub/databases/codonusage

transterm database ftp://ftp.ebi.ac.uk/pub/databases/transterm

GCG (with extra header comments to contain species and other information). 
Does anyone have an example from GCG, or from other sources that write "GCG 
format" files, so we can convert U -> T and handle any other non-standard data?

CUTG website format
http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=Drosophila+melanogaster+%5Bgbinv%5D&aa=1&style=N

SPSUM format (CUTG database .spsum files)

CODEHOP format http://blocks.fhcrc.org/blocks/codehop.html

Staden format: I have no example for this apart from one in the Staden 
src/seq_utils/genetics_codes.c source file - can someone send examples please? 
I would be happy reading an optional second file for some formats, although 
EMBOSS does not currently use the data the Staden format has.

regards,

Peter Rice


From ableasby at hgmp.mrc.ac.uk  Mon Apr  4 12:44:43 2005
From: ableasby at hgmp.mrc.ac.uk (Alan Bleasby)
Date: Mon, 4 Apr 2005 13:44:43 +0100 (BST)
Subject: [EMBOSS] Re: [EMBOSS-BUG] prophecy
Message-ID: <200504041244.j34CihJm022967@bromine.hgmp.mrc.ac.uk>

The short answer is that it was intentional. Until these programs
are replaced (on the list of things to do) it ought to be
documented though.

HTH

Alan


From muratem at eng.uah.edu  Mon Apr  4 15:07:30 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:07:30 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
Message-ID: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>

Greetings

Has anyone ever tried to port einverted to a parallel machine like the SGI
Altix? Has anyone tried to build multiple threads into einverted?

Thanks

Mike


From pmr at ebi.ac.uk  Mon Apr  4 15:18:18 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 04 Apr 2005 16:18:18 +0100
Subject: [EMBOSS] Threading einverted
In-Reply-To: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>
References: <Pine.GSO.4.05.10504041004510.6258-100000@ebs330>
Message-ID: <42515ABA.4080802@ebi.ac.uk>

Dear Mike,

> Has anyone ever tried to port einverted to a parallel machine like the SGI
> Altix? Has anyone tried to build multiple threads into einverted?

We have not heard of any attempt to make einverted multi-threaded.

The algorithm is not the easiest to thread. But if you would like to try, I 
would be happy to help!

regards,

Peter Rice


From muratem at eng.uah.edu  Mon Apr  4 15:39:57 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:39:57 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
In-Reply-To: <42515ABA.4080802@ebi.ac.uk>
Message-ID: <Pine.GSO.4.05.10504041028110.6258-100000@ebs330>


On Mon, 4 Apr 2005, Peter Rice wrote:

> Dear Mike,
> 
> > Has anyone ever tried to port einverted to a parallel machine like the SGI
> > Altix? Has anyone tried to build multiple threads into einverted?
> 
> We have not heard of any attempt to make einverted multi-threaded.
> 
> The algorithm is not the easiest to thread. But if you would like to try, I 
> would be happy to help!
> 
> regards,
> 
> Peter Rice
> 

Peter

I'm willing to have a go at it. I have an immediate need (isn't that
always the case?) and the shortest path may be threading. The biggest
machine I have access to is an Altix. The Itaniums are supposed to
scream. The system is down at the moment, but when it's available again
I'll compile it and run a benchmark with the existing source.

Do you have anything that describes the algorithm? I don't recall seeing
a reference to a paper. I'll print out the source and stare at it tonight.
The Altix has 16-cpu SMP nodes. It would be nice to hit on all 16.

Cheers

Mike


From msarachu at biol.unlp.edu.ar  Mon Apr 11 15:30:54 2005
From: msarachu at biol.unlp.edu.ar (Martin Sarachu)
Date: Mon, 11 Apr 2005 12:30:54 -0300
Subject: [EMBOSS] wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2
Message-ID: <425A982E.3060206@biol.unlp.edu.ar>

This is to announce the release of wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2

Changes in wEMBOSS include:
  - Small bugfixes and improvements of the ACD parser
  - Added support for Opera browser
  - Added multiple deletion of results

Changes in wrappers4EMBOSS include:
  - Compatibility for EMBOSS 2.8, 2.9 and 2.10
  - ps_scan wrappers updated for the last version of ps_scan.pl
  - Minor ACD enhancements

wrappers4EMBOSS can be installed together with wEMBOSS and is included 
in its distribution.
wrappers4EMBOSS can also be downloaded as a single package.

You can download both packages from http://www.wemboss.org


-- 
Martin Sarachu
msarachu at biol.unlp.edu.ar
AR.EMBnet
http://www.ar.embnet.org


From pmr at ebi.ac.uk  Tue Apr 12 08:57:27 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 09:57:27 +0100
Subject: [EMBOSS] Codon usage file improvements
In-Reply-To: <424ACAB2.8090509@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk>
Message-ID: <425B8D77.7080409@ebi.ac.uk>

Peter Rice wrote:
> A quick check before I make changes to the EMBOSS codon usage files.

Done.

The codon usage files now committed to CVS (so this will happen from the next 
release) have the following changes:

1. file naming is Exxxxx where xxxxx is the UniProt/SwissProt 5-letter name 
for the species. Some species in UniProt/SwissProt have more than one name 
(strains used for genome projects, for example AGRTU and AGRT5 for 
Agrobacterium tumefaciens - EMBOSS will use Eagrtu.cut for the codon usage 
table, but the table includes genes from the genome sequence).

For example:

#Species: Agrobacterium tumefaciens str. C58
#Division: gbbct
#Release: CUTG146
#CdsCount: 10705

#Coding GC 59.76%
#1st letter GC 63.11%
#2nd letter GC 44.70%
#3rd letter GC 71.47%

#Codon AA Fraction Frequency Number
GCA    A     0.132    15.154  51011
GCC    A     0.440    50.470 169886
GCG    A     0.328    37.649 126730
GCT    A     0.101    11.550  38879
TGC    C     0.783     6.486  21834


2. The old filenames will stay until release 3.0.0 for those who are used to 
them. I will add comments to their headers. They came from the CODONUSAGE and 
TRANSTERM databases, and we copied their filenames!

The attached file cut.txt lists the old file names and their species. I used 
the notes when selecting species for the new codon usage files.

3. EMBOSS will be able to read other codon usage table formats, and will 
extract the species and other information where possible

4. Codon usage files are checked for inconsistencies - if they specify the 
number of genes, then files with too many stop codons will give a warning. 
Some formats do not include the genetic code, so for some species and formats 
the warning can be ignored. The EMBOSS and GCG formats are safe.

5. Some EMBOSS programs read a codon usage file - but only use it to read a 
genetic code. These programs will instead prompt for a genetic code in the 
next release. For example, showseq and prettyseq only need a genetic code for 
translation. Backtranseq does need a codon usage table - for back translation 
it needs to know the most used codon for each amino acid.

6. A new file Cut.index (in the data/CODONS directory) will list all the codon 
usage files and their species so that a menu of installed codon usage files 
can be used by interfaces.
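
For anyone who wants to script against the new files, here is a minimal 
Python sketch of reading the annotated format shown under point 1 and picking 
the most used codon for each amino acid, which is the information point 5 
says backtranseq needs. It is an illustration only, not the EMBOSS 
implementation: the function names parse_cut and most_used_codons are 
hypothetical, and the sample is just the five rows quoted above rather than a 
complete table.

```python
# Illustrative sketch only; parse_cut and most_used_codons are hypothetical
# helper names, not part of EMBOSS.

SAMPLE = """\
#Species: Agrobacterium tumefaciens str. C58
#Division: gbbct
#Release: CUTG146
#CdsCount: 10705

#Codon AA Fraction Frequency Number
GCA    A     0.132    15.154  51011
GCC    A     0.440    50.470 169886
GCG    A     0.328    37.649 126730
GCT    A     0.101    11.550  38879
TGC    C     0.783     6.486  21834
"""


def parse_cut(text):
    """Return (header dict, list of codon rows) from .cut-style text."""
    header, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Keep "#Key: value" annotations; other comment lines are skipped.
            key, sep, value = line[1:].partition(":")
            if sep:
                header[key.strip()] = value.strip()
            continue
        codon, aa, fraction, frequency, number = line.split()
        rows.append((codon, aa, float(fraction), float(frequency), int(number)))
    return header, rows


def most_used_codons(rows):
    """Map each amino acid to the codon with the highest Number column."""
    best = {}
    for codon, aa, _fraction, _frequency, number in rows:
        if aa not in best or number > best[aa][1]:
            best[aa] = (codon, number)
    return {aa: codon for aa, (codon, _count) in best.items()}


header, rows = parse_cut(SAMPLE)
preferred = most_used_codons(rows)
# Back-translate a toy peptide using only the amino acids in the sample.
back_translated = "".join(preferred[aa] for aa in "ACA")
```

With a full table the same dictionary would cover all amino acids; with the 
truncated sample it only knows A and C.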

A copy of Cut.index is attached as Cut_index.txt

Hope this helps

Peter


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cut.txt
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Cut_index.txt
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment-0003.txt>

From robin at hms.harvard.edu  Tue Apr 12 14:17:42 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 10:17:42 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <425B8D77.7080409@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk>
Message-ID: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>


Hello.

I was trying to use emma for a multiple sequence alignment of dna 
sequencing reads, but it complained that it could not find clustalw. I 
could not find any mention of clustalw on the EMBOSS page, so I got a 
copy from the clustalw homepage and -not knowing where to place it- 
tried the /usr/local/share/EMBOSS/acd directory. That didn't work, and 
emma gives the error:

    EMBOSS An error in ajsys.c at line 398:
cannot find program 'clustalw'

Looking in the emma.acd and ajsys.c files, I can't find any guidance.

Does anyone know how this is supposed to work?
Alternatively, is there another good way to do multiple sequence 
alignment?
Looking ahead, I do not find any obvious way to do contig assembly, a 
la Phrap, or CAP.

thanks

robin colgrove


From pmr at ebi.ac.uk  Tue Apr 12 14:26:57 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 15:26:57 +0100
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <425BDAB1.6000606@ebi.ac.uk>

Robin Colgrove wrote:
> I was trying to use emma for a multiple sequence alignment of dna 
> sequencing reads, but it complained that it could not find clustalw. I 
> could not find any mention of clustalw on the EMBOSS page, so I got a 
> copy from the clustalw homepage and -not knowing where to place it- 
> tried the /usr/local/share/EMBOSS/acd directory. That didn't work, and 
> emma gives the error:
> 
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
> 
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
> 
> Does anyone know how this is supposed to work?

Option 1: Install clustalw in your path so that you (and the emma program) can 
run it from the commandline. The directory where you installed EMBOSS is one 
possible place you can put it.

Option 2: Emma will look for a variable EMBOSS_CLUSTALW (an environment 
variable or a variable defined in emboss.defaults or .embossrc) that has the 
full path for clustalw.
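
As a rough illustration of the two options, the Python sketch below checks 
for an explicit EMBOSS_CLUSTALW setting and otherwise searches the PATH. It 
is not how emma is implemented: the function name find_clustalw is 
hypothetical, the order of precedence is an assumption, and it only looks at 
environment variables, not at emboss.defaults or .embossrc.

```python
import os
import shutil


def find_clustalw(environ=None):
    """Return a path to clustalw, or None if it cannot be found.

    Hypothetical helper: the lookup order is an assumption, not emma's code.
    """
    environ = os.environ if environ is None else environ
    # Option 2: an explicit full path given in EMBOSS_CLUSTALW.
    explicit = environ.get("EMBOSS_CLUSTALW")
    if explicit:
        return explicit
    # Option 1: search the directories listed in PATH for the executable.
    return shutil.which("clustalw", path=environ.get("PATH"))


if __name__ == "__main__":
    location = find_clustalw()
    print(location if location else "clustalw not found on PATH or in EMBOSS_CLUSTALW")
```

Running it from the same shell you use for emma is a quick way to see whether 
clustalw is visible there.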

Now ... we should document this ... and perhaps update the emma documentation 
which looks rather old and has too much old clustal information in it.

Hope this helps,

Peter


From robin at hms.harvard.edu  Tue Apr 12 19:20:01 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 15:20:01 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>


Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad 
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with 
needle, then pieced together the full alignment in vi, but this is not 
going to fly as the number of sequences increases. The online tool I 
use for quick alignments ( 
http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but 
the same fasta file sent either to emma or directly to clustalw gives 
obviously wrong alignments, even though the nucleotide sequences are 
highly homologous.

thanks again

robin


On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:

>
> Hello.
>
> I was trying to use emma for a multiple sequence alignment of dna 
> sequencing reads, but it complained that it could not find clustalw. I 
> could not find any mention of clustalw on the EMBOSS page, so I got a 
> copy from the clustalw homepage and -not knowing where to place it- 
> tried the /usr/local/share/EMBOSS/acd directory. That didn't work, 
> and emma gives the error:
>
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?
> Alternatively, is there another good way to do multiple sequence 
> alignment?
> Looking ahead, I do not find any obvious way to do contig assembly, a 
> la Phrap, or CAP.
>
> thanks
>
> robin colgrove
>


From David.Bauer at Schering.de  Wed Apr 13 06:12:25 2005
From: David.Bauer at Schering.de (David.Bauer at Schering.de)
Date: Wed, 13 Apr 2005 08:12:25 +0200
Subject: Antwort: Re: [EMBOSS] using emma: where to put clustalw
Message-ID: <OF5C412136.C434E2FD-ONC1256FE2.0021397D-C1256FE2.00221916@schering.net>


Hi Robin,

how long are your 4 sequences ?
I observed that clustalw has problems with nucleotide alignments, if there
are larger differences in sequence length.
So e.g. if a 1 kb sequence is nearly completely contained with high
homology within another 2 kb sequence, the resulting alignment can be very
far from optimal.
If you try to align coding sequences there is a program "tranalign" in
EMBOSS.
You can first align the protein sequences (which usually works better than
a multiple alignment of DNA) and then use this alignment with tranalign to
guide the alignment of the corresponding cDNA.
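
To make the idea concrete, here is a small Python sketch of what "using the
protein alignment to guide the cDNA alignment" means: each gap in the aligned
protein becomes three gap characters in the nucleotide sequence, and each
residue takes the next codon. This is only an illustration of the principle,
not tranalign's code; the function name thread_codons and the toy sequences
are made up, and the sketch does not check that the cDNA really translates to
the protein.

```python
def thread_codons(aligned_protein, cdna, gap="-"):
    """Copy the gap pattern of an aligned protein onto its coding sequence."""
    codons = [cdna[i:i + 3] for i in range(0, len(cdna), 3)]
    residues = sum(1 for ch in aligned_protein if ch != gap)
    if residues > len(codons):
        raise ValueError("cDNA is too short for the aligned protein")
    out = []
    next_codon = iter(codons)
    for ch in aligned_protein:
        # A protein gap spans three nucleotide columns; a residue uses one codon.
        out.append(gap * 3 if ch == gap else next(next_codon))
    return "".join(out)


# Toy data: seq1 lacks the alanine that seq2 has at the third position.
protein_alignment = {"seq1": "MK-L", "seq2": "MKAL"}
cdnas = {"seq1": "ATGAAACTG", "seq2": "ATGAAAGCTCTG"}

dna_alignment = {
    name: thread_codons(protein_alignment[name], cdnas[name]) for name in cdnas
}
for name, row in dna_alignment.items():
    print(name, row)
```

The resulting nucleotide rows have the same length and keep codons intact,
which is why this route tends to behave better than aligning the DNA directly.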

Hope this helps,
David.


From:     Robin Colgrove <robin at hms.harvard.edu>
Sent by:  owner-emboss at hgmp.mrc.ac.uk
Date:     12.04.2005 21:20
To:       emboss at embnet.org
Cc:
Subject:  Re: [EMBOSS] using emma: where to put clustalw
                                                                                                                                 
Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with
needle, then pieced together the full alignment in vi, but this is not
going to fly as the number of sequences increases. The online tool I
use for quick alignments (
http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
the same fasta file sent either to emma or directly to clustalw gives
obviously wrong alignments, even though the nucleotide sequences are
highly homologous.

thanks again

robin


On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:

>
> Hello.
>
> I was trying to use emma for a multiple sequence alignment of dna
> sequencing reads, but it complained that it could not find clustalw. I
> could not find any mention of clustalw on the EMBOSS page, so I got a
> copy from the clustalw homepage and -not knowing where to place it-
> tried the /usr/local/share/EMBOSS/acd directory. That didn't work,
> and emma gives the error:
>
>    EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?
> Alternatively, is there another good way to do multiple sequence
> alignment?
> Looking ahead, I do not find any obvious way to do contig assembly, a
> la Phrap, or CAP.
>
> thanks
>
> robin colgrove
>


From gwilliam at hgmp.mrc.ac.uk  Wed Apr 13 08:22:19 2005
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 13 Apr 2005 09:22:19 +0100
Subject: [EMBOSS] using emma: where to put clustalw
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>
Message-ID: <425CD6BB.68F8C486@hgmp.mrc.ac.uk>

Some alternate multiple alignment programs for nucleotide sequences on
the web are at:

http://www.hgmp.mrc.ac.uk/GenomeWeb/nuc-mult.html

I would recommend DIALIGN

Gary


Robin Colgrove wrote:
> 
> Thanks to all for suggestions.
> Just putting clustalw in /usr/local/bin did the trick.
> 
> Now, I need to figure out why emma/clustalw is giving me such bad
> alignments.
> Since I only had 4 sequences, I ended up aligning them pairwise with
> needle, then pieced together the full alignment in vi, but this is not
> going to fly as the number of sequences increases. The online tool I
> use for quick alignments (
> http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
> the same fasta file sent either to emma or directly to clustalw gives
> obviously wrong alignments, even though the nucleotide sequences are
> highly homologous.
> 
> thanks again
> 
> robin
> 
> On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:
> 
> >
> > Hello.
> >
> > I was trying to use emma for a multiple sequence alignment of dna
> > sequencing reads, but it complained that it could not find clustalw. I
> > could not find any mention of clustalw on the EMBOSS page, so I got a
> > copy from the clustalw homepage and -not knowing where to place it-
> > tried the /usr/local/share/EMBOSS/acd directory. That didn't work,
> > and emma gives the error:
> >
> >    EMBOSS An error in ajsys.c at line 398:
> > cannot find program 'clustalw'
> >
> > Looking in the emma.acd and ajsys.c files, I can't find any guidance.
> >
> > Does anyone know how this is supposed to work?
> > Alternatively, is there another good way to do multiple sequence
> > alignment?
> > Looking ahead, I do not find any obvious way to do contig assembly, a
> > la Phrap, or CAP.
> >
> > thanks
> >
> > robin colgrove
> >

-- 
Gary Williams
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494522			Fax: +44 1223 494512
E-mail: gwilliam at rfcgr.mrc.ac.uk	Web: http://www.rfcgr.mrc.ac.uk


From jrvalverde at cnb.uam.es  Thu Apr 21 09:58:51 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Thu, 21 Apr 2005 11:58:51 +0200
Subject: [EMBOSS] Wiki
Message-ID: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>

I would rather welcome a Wiki for EMBOSS documentation.

I can host it at Es.EMBnet.Org/es.emboss.org, no problem at that.

The reason is that as I run into problems/tricks/tasks to do, I see
comments that might be added here and there in the documentation. I
would rather go to a single site and make the changes myself than 
go through the hassle of devising a 'diff' comment, finding out who
to mail, mailing them and waiting for a new doc release.

If there is interest, I can set it up straight away.

				j

From pmr at ebi.ac.uk  Thu Apr 21 16:20:24 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 21 Apr 2005 17:20:24 +0100
Subject: [EMBOSS] Wiki
In-Reply-To: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
Message-ID: <4267D2C8.10009@ebi.ac.uk>

José R. Valverde wrote:

> I would rather welcome a Wiki for EMBOSS documentation.

We have all the documentation (including the sourceforge web pages) in CVS. 
Any member of the development/documentation team can make updates there.

No need for a wiki for this - and a wiki would be difficult to manage as most 
of the documentation is generated automatically.

> The reason is that as I run into problems/tricks/tasks to do, I see
> comments that might be added here and there in the documentation. I
> would rather go to a single site and make the changes myself than 
> go through the hassle of devising a 'diff' comment, finding out who
> to mail, mailing them and waiting for a new doc release.


Just mail anything like that to emboss-bug.

After all ... there is not much point in changing a wiki version of the 
documentation if we are busy changing the application and the real 
documentation :-)

regards,

Peter


From jrvalverde at cnb.uam.es  Fri Apr 22 08:11:18 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Fri, 22 Apr 2005 10:11:18 +0200
Subject: [EMBOSS] Wiki (and Macs)
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
	<4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422101118.33b19892.jrvalverde@cnb.uam.es>

On Thu, 21 Apr 2005 17:20:24 +0100
Peter Rice <pmr at ebi.ac.uk> wrote:
> 
> After all ... there is not much point in changing a wiki version of the 
> documentation if we are busy changing the application and the real 
> documentation :-)
> 
> regards,
> 
> Peter

Right you are Sir. I guess it's better as it is for now. And yet...

Speaking generally, it probably boils down to the management model we
want for EMBOSS. As it is now I tend to see it more like a Cathedral
than a Bazaar. Truly it isn't, but you must agree it is not so evident
from the docs what the procedures are for participation. At least not
at first sight.

I'm more for the Bazaar model, one where everyone is welcome and 
making changes is as trivial as possible (especially for end-users
and end-user-related material, like docs). I'd rather have that as
a 'common' to build a user community around. Game theory shows that
to be the best strategy in the long run (see e.g. 
http://encyclopedia.laborlawtalk.com/Tragedy_of_the_commons ).

In the short run, with limited resources as the EMBOSS team currently
is, you are right it takes a significant effort and portion of the
existing resources. It makes more sense to concentrate on the short
term now and surviving enough to drive new resources in.

But I think we should have that in sight for the long term.

				j

From jrvalverde at cnb.uam.es  Fri Apr 22 08:20:58 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Fri, 22 Apr 2005 10:20:58 +0200
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
	<4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>

I'm trying to find out ways to fund EMBOSS in a way that I can
justify locally.

Mac users are a growing 'market' and a promising community. I've got 
here hundreds of Macs, and they need an easy to use, install and
manage solution.

What is needed (they tell me) is a good editor, and some interactive
graphic facilities for common, simple tasks. Actually, locally, we are
going to spend a significant amount on buying a handful of licenses
for commercial software.

I've tried Erik's CD, but it has some drawbacks regarding the configuration
on non-user-managed Macs (as those where root belongs to a central
authority): Here they can install software but not make modifications.
I can't either, being on the SciComp side and not on the Offimatic
end.

I don't have the resources to do that locally, but would welcome a
sensible way to fund it (like buying 'licenses', packages, CDs or
manuals from an EMBOSS-centered company).

I for one would certainly welcome a Macintosh edition ready to run,
and easy to configure to use central databases. If I were to choose,
I'd try to add those facilities to Jemboss (a sequence editor, and
interactive drawing of clones and molecular graphics). This is the
most lacking thing in EMBOSS now that every user has or can have a 
UNIX machine at their desktop.

And, certainly, I would happily recommend locally that we buy a 
hundred+ licenses at a reasonable price if that would help
fund EMBOSS.

Most ideally, something like the LiveDVD from AT.EMBnet.Org but for
Macs would be a candy. And an easy to justify buy.

Any recommendations? Takers? Pointers?

				j

From kellert at ohsu.edu  Fri Apr 22 16:33:44 2005
From: kellert at ohsu.edu (Thomas J Keller)
Date: Fri, 22 Apr 2005 09:33:44 -0700
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
 <4267D2C8.10009@ebi.ac.uk>
 <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
Message-ID: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>

Greetings,
Have you looked at the fink installation of emboss and kaptain?
I use emboss from the command line, so I haven't tried the GUI 
application "kaptain". Here's what the fink database has to say about 
it:
#######################
kaptain-0.71-22: Universal graphical front-end
   Kaptain is a universal graphical front-end for command line programs,
   and it works wherever Qt3 is available. Someone writes a simple script
   (so called grammar) which describes the possible arguments for a
   command line program and Kaptain brings up a friendly dialog to
   the user to set up the command line. Example grammars can be
   found in /sw/share/kaptain/.
  .
  Web site: http://kaptain.sourceforge.net
  .
  Maintainer: Koen van der Drift <kvddrift at earthlink.net>
######################
The emboss grammar has been written and is available through fink.

You do need the developer tools installed on your Mac, but that's
trivial: they come with the OS, so there is no additional charge for
your users.

Just a thought.

Tom Keller, Ph.D.
http://www.ohsu.edu/research/core
kellert at ohsu.edu
503-494-2442

On Apr 22, 2005, at 1:20 AM, José R. Valverde wrote:

> I'm trying to find out ways to fund EMBOSS in a way that I can
> justify locally.
>
> Mac users are a growing 'market' and a promising community. I've got
> here hundreds of Macs, and they need an easy to use, install and
> manage solution.
>
> What is needed (they tell me) is a good editor, and some interactive
> graphic facilities for common, simple tasks. Actually, locally, we are
> going to spend a significant amount on buying a handful of licenses
> for commercial software.
>
> I've tried Erik's CD, but it has some drawbacks regarding configuration
> on non-user-managed Macs (such as those where root belongs to a central
> authority): here they can install software but not make modifications.
> I can't either, being on the SciComp side and not on the office IT
> end.
>
> I don't have the resources to do that locally, but would welcome a
> sensible way to fund it (like buying 'licenses', packages, CDs or
> manuals from an EMBOSS-centered company).
>
> I for one would certainly welcome a Macintosh edition ready to run,
> and easy to configure to use central databases. If I were to choose,
> I'd try to add those facilities to Jemboss (a sequence editor, and
> interactive drawing of clones and molecular graphics). This is what
> EMBOSS lacks most, now that every user has or can have a
> UNIX machine on their desktop.
>
> And, certainly, I would happily recommend locally that we buy a
> hundred+ licenses at a reasonable price if that would help
> fund EMBOSS.
>
> Ideally, something like the LiveDVD from AT.EMBnet.Org but for
> Macs would be a treat, and an easy-to-justify purchase.
>
> Any recommendations? Takers? Pointers?
>
> 				j

From kvddrift at earthlink.net  Fri Apr 22 20:17:48 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Fri, 22 Apr 2005 16:17:48 -0400
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es> <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
Message-ID: <a0d17fb52c05656b4195a1ef087e69a7@earthlink.net>


On Apr 22, 2005, at 12:33 PM, Thomas J Keller wrote:

>  Web site: http://kaptain.sourceforge.net
>  .
>

Actually, the package is emboss-kaptain.


- Koen.