From pmr at ebi.ac.uk Fri Apr 1 03:33:41 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 09:33:41 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D0765.1000306@ebi.ac.uk>

Guy Bottu wrote:
> Dear Peter, dear all,
>
> A few thoughts on the codon usage tables, now that you are working on
> them.
>
> Do you intend to drop the existing tables from the distribution in favor
> of tables from CUTG ? CUTG has one drawback : the entries for each
> organism/organelle are made from all the genes, without taking account of
> the fact that there exist distinct subpopulations. E.g. in E. coli there
> are the highly expressed genes, the lowly expressed genes and the
> horizontally transferred genes, which have different codon usage. I think
> that in the distribution there are at least for some organisms specific
> files (e.g. Eeco.cut and Eeco_h.cut). The great problem with the files
> from the current distribution is that it is hard to find out which file
> contains what.

The file will be annotated with the species and the source database.

The _h files will be kept (the chips program needs them for example) ...
but if we have no documentation on which genes are highly expressed we may
have to keep the transterm files which are based on only a few genes.

> There is the issue of the number of files in the face of GUI's. Some GUI's
> for EMBOSS generate a selector from which the user can choose a codon
> usage table. If the complete CUTG has been extracted and installed, this
> does not work well anymore. A selector with more than 10000 entries is not
> convenient and furthermore, in a WWW interface the HTML page takes a
> perceptibly long time to download.

Any cutgextract modification requests? I have added species selection.

> At the BEN site I solved this the following (not necessarily satisfactory)
> way : I modified cutgextract so that it creates files with extension .cutg
> rather than .cut. The interface wEMBOSS only shows the *.cut files in the
> selector. If a user wants to use a CUTG rather than a standard
> distribution file under wEMBOSS, he must first copy it to his project
> using embossdata (at the command line there is no problem).

I will add an option to cutgextract for the output filename extension.

> As formats, it would of course be nice if EMBOSS programs could read and
> write codon usage tables (and other data) in any format, just as they do
> for sequences. Which formats should we support besides what EMBOSS uses
> now ? Is there such a thing as "native" CUTG format (with one entry a
> file) ?. I know about GCG format (not useful for us, but other people
> certainly might want it). There is Staden format. Staden format supports
> also files with 2 tables (codon usage in genes + trinucleotide frequency
> in noncoding DNA) ; what to do with this ? only read the first ? There is
> also the format used by CODEHOP
> (http://blocks.fhcrc.org/blocks/codehop.html). Does
> someone know other formats ?

CUTG has a format used on their web pages. It also has the spsum file
which could be used.

regards,

Peter

From pmr at ebi.ac.uk Fri Apr 1 08:50:52 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 01 Apr 2005 14:50:52 +0100
Subject: [EMBOSS] CODON USAGE TABLES
In-Reply-To: <20050330172118.GA14064@bigben.ulb.ac.be>
References: <20050330172118.GA14064@bigben.ulb.ac.be>
Message-ID: <424D51BC.4030402@ebi.ac.uk>

Guy Bottu wrote:
> As formats, it would of course be nice if EMBOSS programs could read and
> write codon usage tables (and other data) in any format, just as they do
> for sequences. Which formats should we support besides what EMBOSS uses
> now ? Is there such a thing as "native" CUTG format (with one entry a
> file) ?.
> I know about GCG format (not useful for us, but other people
> certainly might want it). There is Staden format. Staden format supports
> also files with 2 tables (codon usage in genes + trinucleotide frequency
> in noncoding DNA) ; what to do with this ? only read the first ? There is
> also the format used by CODEHOP
> (http://blocks.fhcrc.org/blocks/codehop.html).

CODEHOP format is minimal, but can be used. It appears to be derived from
CUTG's "spsum" files (which I will also add as a format).

Other formats I know about (and will include):

codonusage database
  ftp://ftp.ebi.ac.uk/pub/databases/codonusage

transterm database
  ftp://ftp.ebi.ac.uk/pub/databases/transterm

GCG (with extra header comments to contain species and other information)
  Does anyone have examples from GCG, or from other sources that write
  "GCG format" files, so we can convert U -> T and handle any other
  non-standard data?

CUTG website format
  http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=Drosophila+melanogaster+%5Bgbinv%5D&aa=1&style=N

SPSUM format (CUTG database .spsum files)

CODEHOP format
  http://blocks.fhcrc.org/blocks/codehop.html

Staden format
  I have no example for this apart from one in the Staden
  src/seq_utils/genetics_codes.c source file - can someone send examples
  please?

I would be happy reading an optional second file for some formats, although
EMBOSS does not currently use the data the Staden format has.

regards,

Peter Rice

From ableasby at hgmp.mrc.ac.uk Mon Apr 4 08:44:43 2005
From: ableasby at hgmp.mrc.ac.uk (Alan Bleasby)
Date: Mon, 4 Apr 2005 13:44:43 +0100 (BST)
Subject: [EMBOSS] Re: [EMBOSS-BUG] prophecy
Message-ID: <200504041244.j34CihJm022967@bromine.hgmp.mrc.ac.uk>

The short answer is that it was intentional. Until these programs
are replaced (on the list of things to do) it ought to be documented
though.
HTH

Alan

From muratem at eng.uah.edu Mon Apr 4 11:07:30 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:07:30 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
Message-ID:

Greetings

Has anyone ever tried to port einverted to a parallel machine like the SGI
Altix? Has anyone tried to build multiple threads into einverted?

Thanks

Mike

From pmr at ebi.ac.uk Mon Apr 4 11:18:18 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 04 Apr 2005 16:18:18 +0100
Subject: [EMBOSS] Threading einverted
In-Reply-To:
References:
Message-ID: <42515ABA.4080802@ebi.ac.uk>

Dear Mike,

> Has anyone ever tried to port einverted to a parallel machine like the SGI
> Altix? Has anyone tried to build multiple threads into einverted?

We have not heard of any attempt to make einverted multi-threaded.

The algorithm is not the easiest to thread. But if you would like to try, I
would be happy to help!

regards,

Peter Rice

From muratem at eng.uah.edu Mon Apr 4 11:39:57 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:39:57 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
In-Reply-To: <42515ABA.4080802@ebi.ac.uk>
Message-ID:

On Mon, 4 Apr 2005, Peter Rice wrote:

> Dear Mike,
>
> > Has anyone ever tried to port einverted to a parallel machine like the SGI
> > Altix? Has anyone tried to build multiple threads into einverted?
>
> We have not heard of any attempt to make einverted multi-threaded.
>
> The algorithm is not the easiest to thread. But if you would like to try, I
> would be happy to help!
>
> regards,
>
> Peter Rice
>

Peter

I'm willing to have a go at it. I have an immediate need (isn't that always
the case?) and the shortest path may be threading. The biggest machine I
have access to is an Altix. The Itaniums are supposed to scream. The system
is down at the moment, but when it's available again I'll compile it and run
a benchmark with the existing source.

Do you have anything that describes the algorithm?
I don't recall seeing a reference to a paper. I'll print out the source and
stare at it tonight.

The Altix has 16-cpu SMP nodes. It would be nice to hit on all 16.

Cheers

Mike

From msarachu at biol.unlp.edu.ar Mon Apr 11 11:30:54 2005
From: msarachu at biol.unlp.edu.ar (Martin Sarachu)
Date: Mon, 11 Apr 2005 12:30:54 -0300
Subject: [EMBOSS] wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2
Message-ID: <425A982E.3060206@biol.unlp.edu.ar>

This is to announce the release of wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2

Changes in wEMBOSS include:
- Small bugfixes and improvements of the ACD parser
- Added support for Opera browser
- Added multiple deletion of results

Changes in wrappers4EMBOSS include:
- Compatibility for EMBOSS 2.8, 2.9 and 2.10
- ps_scan wrappers updated for the last version of ps_scan.pl
- Minor ACD enhancements

wrappers4EMBOSS can be installed together with wEMBOSS and is included in
its distribution. wrappers4EMBOSS can also be downloaded as a single
package.

You can download both packages from http://www.wemboss.org

--
Martin Sarachu
msarachu at biol.unlp.edu.ar
AR.EMBnet
http://www.ar.embnet.org

From pmr at ebi.ac.uk Tue Apr 12 04:57:27 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 09:57:27 +0100
Subject: [EMBOSS] Codon usage file improvements
In-Reply-To: <424ACAB2.8090509@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk>
Message-ID: <425B8D77.7080409@ebi.ac.uk>

Peter Rice wrote:
> A quick check before I make changes to the EMBOSS codon usage files.

Done. The codon usage files now committed to CVS (so this will happen from
the next release) have the following changes:

1. File naming is Exxxxx where xxxxx is the UniProt/SwissProt 5-letter name
   for the species. Some species in UniProt/SwissProt have more than one
   name (strains used for genome projects, for example AGRTU and AGRT5 for
   Agrobacterium tumefaciens - EMBOSS will use Eagrtu.cut for the codon
   usage table, but has genes from the genome sequence).
   For example:

#Species: Agrobacterium tumefaciens str. C58
#Division: gbbct
#Release: CUTG146
#CdsCount: 10705

#Coding GC 59.76%
#1st letter GC 63.11%
#2nd letter GC 44.70%
#3rd letter GC 71.47%

#Codon AA Fraction Frequency Number
GCA    A     0.132    15.154  51011
GCC    A     0.440    50.470 169886
GCG    A     0.328    37.649 126730
GCT    A     0.101    11.550  38879
TGC    C     0.783     6.486  21834

2. The old filenames will stay until release 3.0.0 for those who are used
   to them. I will add comments to their headers. They came from the
   CODONUSAGE and TRANSTERM databases, and we copied their filenames! The
   attached file cut.txt lists the old file names and their species. I used
   the notes when selecting species for the new codon usage files.

3. EMBOSS will be able to read other codon usage table formats, and will
   extract the species and other information where possible.

4. Codon usage files are checked for inconsistencies - if they specify the
   number of genes, then files with too many stop codons will give a
   warning. Some formats do not include the genetic code, so for some
   species and formats the warning can be ignored. The EMBOSS and GCG
   formats are safe.

5. Some EMBOSS programs read a codon usage file - but only use it to read a
   genetic code. These programs will instead prompt for a genetic code in
   the next release. For example, showseq and prettyseq only need a genetic
   code for translation. Backtranseq does need a codon usage table - for
   back translation it needs to know the most used codon for each amino
   acid.

6. A new file Cut.index (in the data/CODONS directory) will list all the
   codon usage files and their species so that a menu of installed codon
   usage files can be used by interfaces. A copy of Cut.index is attached
   as Cut_index.txt

Hope this helps

Peter
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cut.txt
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment.txt
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Cut_index.txt
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20050412/d7935cf0/attachment-0001.txt

From robin at hms.harvard.edu Tue Apr 12 10:17:42 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 10:17:42 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <425B8D77.7080409@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk>
Message-ID: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>

Hello.

I was trying to use emma for a multiple sequence alignment of dna
sequencing reads, but it complained that it could not find clustalw. I
could not find any mention of clustalw on the EMBOSS page, so I got a
copy from the clustalw homepage and -not knowing where to place it-
tried the /usr/local/share/EMBOSS/acd directory. That didn't work, and
emma gives the error:

EMBOSS An error in ajsys.c at line 398:
cannot find program 'clustalw'

Looking in the emma.acd and ajsys.c files, I can't find any guidance.

Does anyone know how this is supposed to work?
Alternatively, is there another good way to do multiple sequence
alignment?
Looking ahead, I do not find any obvious way to do contig assembly, a
la Phrap, or CAP.

thanks

robin colgrove

From pmr at ebi.ac.uk Tue Apr 12 10:26:57 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 15:26:57 +0100
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <425BDAB1.6000606@ebi.ac.uk>

Robin Colgrove wrote:
> I was trying to use emma for a multiple sequence alignment of dna
> sequencing reads, but it complained that it could not find clustalw. I
> could not find any mention of clustalw on the EMBOSS page, so I got a
> copy from the clustalw homepage and -not knowing where to place it-
> tried the /usr/local/share/EMBOSS/acd directory. That didn't work, and
> emma gives the error:
>
> EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?

Option 1: Install clustalw in your path so that you (and the emma program)
can run it from the command line. The directory where you installed EMBOSS
is one possible place you can put it.

Option 2: Emma will look for a variable EMBOSS_CLUSTALW (an environment
variable or a variable defined in emboss.defaults or .embossrc) that has
the full path for clustalw.

Now ... we should document this ... and perhaps update the emma
documentation which looks rather old and has too much old clustal
information in it.

Hope this helps,

Peter

From robin at hms.harvard.edu Tue Apr 12 15:20:01 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 15:20:01 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>

Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with
needle, then pieced together the full alignment in vi, but this is not
going to fly as the number of sequences increases.
The online tool I use for quick alignments
( http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
the same fasta file sent either to emma or directly to clustalw gives
obviously wrong alignments, even though the nucleotide sequences are
highly homologous.

thanks again

robin

From David.Bauer at Schering.de Wed Apr 13 02:12:25 2005
From: David.Bauer at Schering.de (David.Bauer at Schering.de)
Date: Wed, 13 Apr 2005 08:12:25 +0200
Subject: Antwort: Re: [EMBOSS] using emma: where to put clustalw
Message-ID:

Hi Robin,

how long are your 4 sequences ? I observed that clustalw has problems with
nucleotide alignments, if there are larger differences in sequence length.
So e.g. if a 1 kb sequence is nearly completely contained with high
homology within another 2 kb sequence, the resulting alignment can be very
far from optimal.

If you try to align coding sequences there is a program "tranalign" in
EMBOSS. You can first align the protein sequences (which usually works
better than a multiple alignment of DNA) and then use this alignment with
tranalign to guide the alignment of the corresponding cDNA.

Hope this helps,
David.
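
The protein-first approach David describes can be illustrated with a short
Python sketch. This is only a rough illustration of the idea of threading
codons onto a gapped protein alignment; it is not tranalign's actual
implementation. The function names and the toy sequences are invented for
the example, and it assumes each cDNA is ungapped, in frame, and starts at
the first aligned residue.

```python
def thread_codons(aligned_protein: str, cdna: str, gap: str = "-") -> str:
    """Lay an ungapped coding sequence onto one gapped protein alignment row.

    Each residue consumes the next codon of the cDNA; each protein gap
    becomes three nucleotide gap characters.
    """
    residues = sum(1 for ch in aligned_protein if ch != gap)
    if len(cdna) < 3 * residues:
        raise ValueError("cDNA is too short for the aligned protein")
    pieces = []
    pos = 0
    for ch in aligned_protein:
        if ch == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(cdna[pos:pos + 3])
            pos += 3
    return "".join(pieces)


def protein_guided_alignment(aligned_proteins: dict, cdnas: dict) -> dict:
    """Apply thread_codons to every row, matching sequences by name."""
    return {name: thread_codons(row, cdnas[name])
            for name, row in aligned_proteins.items()}


if __name__ == "__main__":
    # Toy data (hypothetical): seq1 lacks the alanine present in seq2.
    proteins = {"seq1": "MK-L", "seq2": "MKAL"}
    cdnas = {"seq1": "ATGAAACTG", "seq2": "ATGAAAGCTCTG"}
    for name, row in protein_guided_alignment(proteins, cdnas).items():
        print(name, row)
```

Because gaps are only ever inserted in multiples of three, the reading
frame is preserved, which is the main reason this route tends to behave
better than aligning the DNA directly.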

From gwilliam at hgmp.mrc.ac.uk Wed Apr 13 04:22:19 2005
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 13 Apr 2005 09:22:19 +0100
Subject: [EMBOSS] using emma: where to put clustalw
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>
Message-ID: <425CD6BB.68F8C486@hgmp.mrc.ac.uk>

Some alternate multiple alignment programs for nucleotide sequences on the
web are at:
http://www.hgmp.mrc.ac.uk/GenomeWeb/nuc-mult.html

I would recommend DIALIGN

Gary

--
Gary Williams
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494522 Fax: +44 1223 494512
E-mail: gwilliam at rfcgr.mrc.ac.uk
Web: http://www.rfcgr.mrc.ac.uk

From jrvalverde at cnb.uam.es Thu Apr 21 05:58:51 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Thu, 21 Apr 2005 11:58:51 +0200
Subject: [EMBOSS] Wiki
Message-ID: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>

I would rather welcome a Wiki for EMBOSS documentation. I can host it
at Es.EMBnet.Org/es.emboss.org, no problem at that.

The reason is that as I run into problems/tricks/tasks to do, I see
comments that might be added here and there in the documentation. I
would rather go to a single site and make the changes myself than
go through the hassle of devising a 'diff' comment, finding out who
to mail, mailing them and waiting for a new doc release.

If there is interest, I can set it up straight away.

j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss/attachments/20050421/4620bfb1/attachment.bin

From pmr at ebi.ac.uk Thu Apr 21 12:20:24 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 21 Apr 2005 17:20:24 +0100
Subject: [EMBOSS] Wiki
In-Reply-To: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
Message-ID: <4267D2C8.10009@ebi.ac.uk>

José R. Valverde wrote:
> I would rather welcome a Wiki for EMBOSS documentation.

We have all the documentation (including the sourceforge web pages) in CVS.
Any member of the development/documentation team can make updates there.

No need for a wiki for this - and a wiki would be difficult to manage as
most of the documentation is generated automatically.

> The reason is that as I run into problems/tricks/tasks to do, I see
> comments that might be added here and there in the documentation. I
> would rather go to a single site and make the changes myself than
> go through the hassle of devising a 'diff' comment, finding out who
> to mail, mailing them and waiting for a new doc release.

Just mail anything like that to emboss-bug.

After all ... there is not much point in changing a wiki version of the
documentation if we are busy changing the application and the real
documentation :-)

regards,

Peter

From jrvalverde at cnb.uam.es Fri Apr 22 04:11:18 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Fri, 22 Apr 2005 10:11:18 +0200
Subject: [EMBOSS] Wiki (and Macs)
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422101118.33b19892.jrvalverde@cnb.uam.es>

On Thu, 21 Apr 2005 17:20:24 +0100 Peter Rice wrote:
>
> After all ... there is not much point in changing a wiki version of the
> documentation if we are busy changing the application and the real
> documentation :-)
>
> regards,
>
> Peter

Right you are Sir. I guess it's better as it is for now.

And yet... Speaking generally, it probably boils down to the management
model we want for EMBOSS. As it is now I tend to see it more like a
Cathedral than a Bazaar. Truly it isn't, but you must agree it is not so
evident from the docs what the procedures are for participation. At
least not at first sight.

I'm more for the Bazaar model, one where everyone is welcome and making
changes is as trivial as possible (especially for end-users and
end-user-related material, like docs). I'd rather have that as a 'common'
to build a user community around.
Game theory shows that to be the best strategy in the long run (see e.g.
http://encyclopedia.laborlawtalk.com/Tragedy_of_the_commons ).

In the short run, with resources as limited as the EMBOSS team's currently
are, you are right that it takes a significant effort and portion of the
existing resources. It makes more sense to concentrate on the short term
now and on surviving long enough to bring new resources in. But I think we
should have that in sight for the long term.

j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss/attachments/20050422/7a0edd92/attachment.bin

From jrvalverde at cnb.uam.es Fri Apr 22 04:20:58 2005
From: jrvalverde at cnb.uam.es (José R. Valverde)
Date: Fri, 22 Apr 2005 10:20:58 +0200
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <4267D2C8.10009@ebi.ac.uk>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk>
Message-ID: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>

I'm trying to find out ways to fund EMBOSS in a way that I can
justify locally.

Mac users are a growing 'market' and a promising community. I've got
here hundreds of Macs, and they need an easy to use, install and
manage solution.

What is needed (they tell me) is a good editor, and some interactive
graphic facilities for common, simple tasks. Actually, locally, we are
going to spend a significant amount on buying a handful of licenses
for commercial software.

I've tried Erik's CD, but it has some drawbacks regarding the configuration
on non-user-managed Macs (such as those where root belongs to a central
authority): Here they can install software but not make modifications.
I can't either, being on the SciComp side and not on the Offimatic end.
I don't have the resources to do that locally, but would welcome a
sensible way to fund it (like buying 'licenses', packages, CDs or
manuals from an EMBOSS-centered company).

I for one would certainly welcome a Macintosh edition ready to run,
and easy to configure to use central databases. If I were to choose,
I'd try to add those facilities to Jemboss (a sequence editor, and
interactive drawing of clones and molecular graphics). This is the
most lacking thing in EMBOSS now that every user has or can have a
UNIX machine at their desktop.

And, certainly, I would happily recommend locally that we buy a
hundred+ licenses at a reasonable price if that would help
fund EMBOSS.

Most ideally, something like the LiveDVD from AT.EMBnet.Org but for
Macs would be a candy. And an easy to justify buy.

Any recommendations? Takers? Pointers?

j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss/attachments/20050422/49e14ae1/attachment.bin

From kellert at ohsu.edu Fri Apr 22 12:33:44 2005
From: kellert at ohsu.edu (Thomas J Keller)
Date: Fri, 22 Apr 2005 09:33:44 -0700
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
Message-ID: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>

Greetings,

Have you looked at the fink installation of emboss and kaptain? I use
emboss from the command line, so I haven't tried the GUI application
"kaptain". Here's what the fink database has to say about it:

#######################
kaptain-0.71-22: Universal graphical front-end
 Kaptain is a universal graphical front-end for command line programs, and
 it works wherever Qt3 is available.
 Someone writes a simple script (so called grammar) which describes the
 possible arguments for a command line program and Kaptain brings up a
 friendly dialog to the user to set up the command line. Example grammars
 can be found in /sw/share/kaptain/.
 .
 Web site: http://kaptain.sourceforge.net
 .
 Maintainer: Koen van der Drift
######################

The emboss grammar has been written and is available through fink. You do
need the developer tools installed on your Mac, but that's trivial, and
comes with the OS, so no additional charge for your users.

Just a thought.

Tom Keller, Ph.D.
http://www.ohsu.edu/research/core
kellert at ohsu.edu
503-494-2442

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 2879 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss/attachments/20050422/8bbb2284/attachment.bin

From kvddrift at earthlink.net Fri Apr 22 16:17:48 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Fri, 22 Apr 2005 16:17:48 -0400
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es> <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>
Message-ID:

On Apr 22, 2005, at 12:33 PM, Thomas J Keller wrote:

> Web site: http://kaptain.sourceforge.net
> .

Actually, the package is emboss-kaptain.

- Koen.
Other formats I know about (and will include):

codonusage database
ftp://ftp.ebi.ac.uk/pub/databases/codonusage

transterm database
ftp://ftp.ebi.ac.uk/pub/databases/transterm

GCG (with extra header comments to contain species and other information)
Does anyone have examples from GCG, or from other sources that write
"GCG format" files, so that we can convert U -> T and handle any other
non-standard data?

CUTG website format
http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=Drosophila+melanogaster+%5Bgbinv%5D&aa=1&style=N

SPSUM format (CUTG database .spsum files)

CODEHOP format
http://blocks.fhcrc.org/blocks/codehop.html

Staden format: I have no example for this apart from one in the Staden
src/seq_utils/genetics_codes.c source file - can someone send examples
please?

I would be happy reading an optional second file for some formats,
although EMBOSS does not currently use the data the Staden format has.

regards,

Peter Rice

From ableasby at hgmp.mrc.ac.uk Mon Apr 4 12:44:43 2005
From: ableasby at hgmp.mrc.ac.uk (Alan Bleasby)
Date: Mon, 4 Apr 2005 13:44:43 +0100 (BST)
Subject: [EMBOSS] Re: [EMBOSS-BUG] prophecy
Message-ID: <200504041244.j34CihJm022967@bromine.hgmp.mrc.ac.uk>

The short answer is that it was intentional. Until these programs
are replaced (on the list of things to do) it ought to be documented
though.

HTH

Alan

From muratem at eng.uah.edu Mon Apr 4 15:07:30 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:07:30 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
Message-ID:

Greetings

Has anyone ever tried to port einverted to a parallel machine like the
SGI Altix? Has anyone tried to build multiple threads into einverted?
Thanks

Mike

From pmr at ebi.ac.uk Mon Apr 4 15:18:18 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 04 Apr 2005 16:18:18 +0100
Subject: [EMBOSS] Threading einverted
In-Reply-To:
References:
Message-ID: <42515ABA.4080802@ebi.ac.uk>

Dear Mike,

> Has anyone ever tried to port einverted to a parallel machine like the SGI
> Altix? Has anyone tried to build multiple threads into einverted?

We have not heard of any attempt to make einverted multi-threaded.

The algorithm is not the easiest to thread. But if you would like to try,
I would be happy to help!

regards,

Peter Rice

From muratem at eng.uah.edu Mon Apr 4 15:39:57 2005
From: muratem at eng.uah.edu (Mike Muratet)
Date: Mon, 4 Apr 2005 10:39:57 -0500 (CDT)
Subject: [EMBOSS] Threading einverted
In-Reply-To: <42515ABA.4080802@ebi.ac.uk>
Message-ID:

On Mon, 4 Apr 2005, Peter Rice wrote:

> Dear Mike,
>
> > Has anyone ever tried to port einverted to a parallel machine like the SGI
> > Altix? Has anyone tried to build multiple threads into einverted?
>
> We have not heard of any attempt to make einverted multi-threaded.
>
> The algorithm is not the easiest to thread. But if you would like to try, I
> would be happy to help!
>
> regards,
>
> Peter Rice
>

Peter

I'm willing to have a go at it. I have an immediate need (isn't that
always the case?) and the shortest path may be threading. The biggest
machine I have access to is an Altix. The Itaniums are supposed to
scream. The system is down at the moment, but when it's available again
I'll compile it and run a benchmark with the existing source.

Do you have anything that describes the algorithm? I don't recall seeing
a reference to a paper. I'll print out the source and stare at it tonight.

The Altix has 16-cpu SMP nodes. It would be nice to hit on all 16.
Cheers

Mike

From msarachu at biol.unlp.edu.ar Mon Apr 11 15:30:54 2005
From: msarachu at biol.unlp.edu.ar (Martin Sarachu)
Date: Mon, 11 Apr 2005 12:30:54 -0300
Subject: [EMBOSS] wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2
Message-ID: <425A982E.3060206@biol.unlp.edu.ar>

This is to announce the release of wEMBOSS-1.4.0 & wrappers4EMBOSS-1.2

Changes in wEMBOSS include:
- Small bugfixes and improvements of the ACD parser
- Added support for Opera browser
- Added multiple deletion of results

Changes in wrappers4EMBOSS include:
- Compatibility for EMBOSS 2.8, 2.9 and 2.10
- ps_scan wrappers updated for the latest version of ps_scan.pl
- Minor ACD enhancements

wrappers4EMBOSS can be installed together with wEMBOSS and is included
in its distribution. wrappers4EMBOSS can also be downloaded as a single
package.

You can download both packages from http://www.wemboss.org

--
Martin Sarachu
msarachu at biol.unlp.edu.ar
AR.EMBnet
http://www.ar.embnet.org

From pmr at ebi.ac.uk Tue Apr 12 08:57:27 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 12 Apr 2005 09:57:27 +0100
Subject: [EMBOSS] Codon usage file improvements
In-Reply-To: <424ACAB2.8090509@ebi.ac.uk>
References: <424ACAB2.8090509@ebi.ac.uk>
Message-ID: <425B8D77.7080409@ebi.ac.uk>

Peter Rice wrote:
> A quick check before I make changes to the EMBOSS codon usage files.

Done. The codon usage files now committed to CVS (so this will happen
from the next release) have the following changes:

1. File naming is Exxxxx where xxxxx is the UniProt/SwissProt 5-letter
   name for the species. Some species in UniProt/SwissProt have more than
   one name (strains used for genome projects, for example AGRTU and AGRT5
   for Agrobacterium tumefaciens - EMBOSS will use Eagrtu.cut for the
   codon usage table, but the table has genes from the genome sequence).
   For example:

   #Species: Agrobacterium tumefaciens str. C58
   #Division: gbbct
   #Release: CUTG146
   #CdsCount: 10705

   #Coding GC 59.76%
   #1st letter GC 63.11%
   #2nd letter GC 44.70%
   #3rd letter GC 71.47%

   #Codon AA Fraction Frequency Number
   GCA    A     0.132    15.154  51011
   GCC    A     0.440    50.470 169886
   GCG    A     0.328    37.649 126730
   GCT    A     0.101    11.550  38879
   TGC    C     0.783     6.486  21834

2. The old filenames will stay until release 3.0.0 for those who are used
   to them. I will add comments to their headers. They came from the
   CODONUSAGE and TRANSTERM databases, and we copied their filenames! The
   attached file cut.txt lists the old file names and their species. I used
   the notes when selecting species for the new codon usage files.

3. EMBOSS will be able to read other codon usage table formats, and will
   extract the species and other information where possible.

4. Codon usage files are checked for inconsistencies - if they specify the
   number of genes, then files with too many stop codons will give a
   warning. Some formats do not include the genetic code, so for some
   species and formats the warning can be ignored. The EMBOSS and GCG
   formats are safe.

5. Some EMBOSS programs read a codon usage file - but only use it to read
   a genetic code. These programs will instead prompt for a genetic code in
   the next release. For example, showseq and prettyseq only need a genetic
   code for translation. Backtranseq does need a codon usage table - for
   back translation it needs to know the most used codon for each amino
   acid.

6. A new file Cut.index (in the data/CODONS directory) will list all the
   codon usage files and their species so that a menu of installed codon
   usage files can be used by interfaces. A copy of Cut.index is attached
   as Cut_index.txt

Hope this helps

Peter
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cut.txt
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Cut_index.txt URL: From robin at hms.harvard.edu Tue Apr 12 14:17:42 2005 From: robin at hms.harvard.edu (Robin Colgrove) Date: Tue, 12 Apr 2005 10:17:42 -0400 Subject: [EMBOSS] using emma: where to put clustalw In-Reply-To: <425B8D77.7080409@ebi.ac.uk> References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> Message-ID: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> Hello. I was trying to use emma for a multiple sequence alignment of dna sequencing reads, but it complained that it could not find clustalw. I could not find any mention of clustalw on the EMBOSS page, so I got a copy from the clustalw homepage and -not knowing where to place it- tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, and emma gives the error: EMBOSS An error in ajsys.c at line 398: cannot find program 'clustalw' Looking in the emma.acd and ajsys.c files, I can't find any guidance. Does anyone know how this is supposed to work? Alternatively, is there another good way to do multiple sequence alignment? Looking ahead, I do not find any obvious way to do contig assembly, a la Phrap, or CAP. thanks robin colgrove From pmr at ebi.ac.uk Tue Apr 12 14:26:57 2005 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 12 Apr 2005 15:26:57 +0100 Subject: [EMBOSS] using emma: where to put clustalw In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> Message-ID: <425BDAB1.6000606@ebi.ac.uk> Robin Colgrove wrote: > I was trying to use emma for a multiple sequence alignment of dna > sequencing reads, but it complained that it could not find clustalw. I > could not find any mention of clustalw on the EMBOSS page, so I got a > copy from the clustalw homepage and -not knowing where to place it- > tried the /usr/local/share/ EMBOSS/acd directory. 
That didn't work, and
> emma gives the error:
>
> EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?

Option 1: Install clustalw in your path so that you (and the emma program)
can run it from the command line. The directory where you installed EMBOSS
is one possible place you can put it.

Option 2: Emma will look for a variable EMBOSS_CLUSTALW (an environment
variable or a variable defined in emboss.defaults or .embossrc) that has
the full path for clustalw.

Now ... we should document this ... and perhaps update the emma
documentation which looks rather old and has too much old clustal
information in it.

Hope this helps,

Peter

From robin at hms.harvard.edu Tue Apr 12 19:20:01 2005
From: robin at hms.harvard.edu (Robin Colgrove)
Date: Tue, 12 Apr 2005 15:20:01 -0400
Subject: [EMBOSS] using emma: where to put clustalw
In-Reply-To: <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu>
Message-ID: <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu>

Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with
needle, then pieced together the full alignment in vi, but this is not
going to fly as the number of sequences increases. The online tool I
use for quick alignments (
http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but
the same fasta file sent either to emma or directly to clustalw gives
obviously wrong alignments, even though the nucleotide sequences are
highly homologous.

thanks again

robin

On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote:

>
> Hello.
>
> I was trying to use emma for a multiple sequence alignment of dna
> sequencing reads, but it complained that it could not find clustalw. I
> could not find any mention of clustalw on the EMBOSS page, so I got a
> copy from the clustalw homepage and -not knowing where to place it-
> tried the /usr/local/share/EMBOSS/acd directory. That didn't work,
> and emma gives the error:
>
> EMBOSS An error in ajsys.c at line 398:
> cannot find program 'clustalw'
>
> Looking in the emma.acd and ajsys.c files, I can't find any guidance.
>
> Does anyone know how this is supposed to work?
> Alternatively, is there another good way to do multiple sequence
> alignment?
> Looking ahead, I do not find any obvious way to do contig assembly, a
> la Phrap, or CAP.
>
> thanks
>
> robin colgrove
>

From David.Bauer at Schering.de Wed Apr 13 06:12:25 2005
From: David.Bauer at Schering.de (David.Bauer at Schering.de)
Date: Wed, 13 Apr 2005 08:12:25 +0200
Subject: Antwort: Re: [EMBOSS] using emma: where to put clustalw
Message-ID:

Hi Robin,

How long are your 4 sequences? I have observed that clustalw has problems
with nucleotide alignments if there are larger differences in sequence
length. For example, if a 1 kb sequence is nearly completely contained,
with high homology, within another 2 kb sequence, the resulting alignment
can be very far from optimal.

If you are trying to align coding sequences, there is a program "tranalign"
in EMBOSS. You can first align the protein sequences (which usually works
better than a multiple alignment of DNA) and then use this alignment with
tranalign to guide the alignment of the corresponding cDNA.

Hope this helps,
David.

Robin Colgrove
Sent by: owner-emboss at hgmp.mrc.ac.uk
12.04.2005 21:20
To: emboss at embnet.org
Cc:
Subject: Re: [EMBOSS] using emma: where to put clustalw

Thanks to all for suggestions.
Just putting clustalw in /usr/local/bin did the trick.

Now, I need to figure out why emma/clustalw is giving me such bad
alignments.
Since I only had 4 sequences, I ended up aligning them pairwise with needle, then pieced together the full alignment in vi, but this is not going to fly as the number of sequences increases. The online tool I use for quick alignments ( http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but the same fasta file sent either to emma or directly to clustalw gives obviously wrong alignments, even though the nucleotide sequences are highly homologous. thanks again robin On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote: > > Hello. > > I was trying to use emma for a multiple sequence alignment of dna > sequencing reads, but it complained that it could not find clustalw. I > could not find any mention of clustalw on the EMBOSS page, so I got a > copy from the clustalw homepage and -not knowing where to place it- > tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, > and emma gives the error: > > EMBOSS An error in ajsys.c at line 398: > cannot find program 'clustalw' > > Looking in the emma.acd and ajsys.c files, I can't find any guidance. > > Does anyone know how this is supposed to work? > Alternatively, is there another good way to do multiple sequence > alignment? > Looking ahead, I do not find any obvious way to do contig assembly, a > la Phrap, or CAP. > > thanks > > robin colgrove > From gwilliam at hgmp.mrc.ac.uk Wed Apr 13 08:22:19 2005 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Wed, 13 Apr 2005 09:22:19 +0100 Subject: [EMBOSS] using emma: where to put clustalw References: <424ACAB2.8090509@ebi.ac.uk> <425B8D77.7080409@ebi.ac.uk> <99916265778bc955eb8a68c9430b5e47@hms.harvard.edu> <94f67cce6e9a12f4f8291185abe35c43@hms.harvard.edu> Message-ID: <425CD6BB.68F8C486@hgmp.mrc.ac.uk> Some alternate multiple alignment programs for nucleotide sequences on the web are at: http://www.hgmp.mrc.ac.uk/GenomeWeb/nuc-mult.html I would recommend DIALIGN Gary Robin Colgrove wrote: > > Thanks to all for suggestions. 
> Just putting clustalw in /usr/local/bin did the trick. > > Now, I need to figure out why emma/clustalw is giving me such bad > alignments. > Since I only had 4 sequences, I ended up aligning them pairwise with > needle, then pieced together the full alignment in vi, but this is not > going to fly as the number of sequences increases. The online tool I > use for quick alignments ( > http://prodes.toulouse.inra.fr/multalin/multalin.html ) does fine, but > the same fasta file sent either to emma or directly to clustalw gives > obviously wrong alignments, even though the nucleotide sequences are > highly homologous. > > thanks again > > robin > > On Apr 12, 2005, at 10:17 AM, Robin Colgrove wrote: > > > > > Hello. > > > > I was trying to use emma for a multiple sequence alignment of dna > > sequencing reads, but it complained that it could not find clustalw. I > > could not find any mention of clustalw on the EMBOSS page, so I got a > > copy from the clustalw homepage and -not knowing where to place it- > > tried the /usr/local/share/ EMBOSS/acd directory. That didn't work, > > and emma gives the error: > > > > EMBOSS An error in ajsys.c at line 398: > > cannot find program 'clustalw' > > > > Looking in the emma.acd and ajsys.c files, I can't find any guidance. > > > > Does anyone know how this is supposed to work? > > Alternatively, is there another good way to do multiple sequence > > alignment? > > Looking ahead, I do not find any obvious way to do contig assembly, a > > la Phrap, or CAP. > > > > thanks > > > > robin colgrove > > -- Gary Williams MRC Rosalind Franklin Centre for Genomics Research Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK Tel: +44 1223 494522 Fax: +44 1223 494512 E-mail: gwilliam at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk From jrvalverde at cnb.uam.es Thu Apr 21 09:58:51 2005 From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. 
Valverde)
Date: Thu, 21 Apr 2005 11:58:51 +0200
Subject: [EMBOSS] Wiki
Message-ID: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>

I would rather welcome a Wiki for EMBOSS documentation. I can host it
at Es.EMBnet.Org/es.emboss.org, no problem at that.

The reason is that as I run into problems/tricks/tasks to do, I see
comments that might be added here and there in the documentation. I
would rather go to a single site and make the changes myself than
go through the hassle of devising a 'diff' comment, finding out who
to mail, mailing them and waiting for a new doc release.

If there is interest, I can set it up straight away.

j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL:

From pmr at ebi.ac.uk Thu Apr 21 16:20:24 2005
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 21 Apr 2005 17:20:24 +0100
Subject: [EMBOSS] Wiki
In-Reply-To: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es>
Message-ID: <4267D2C8.10009@ebi.ac.uk>

José R. Valverde wrote:
> I would rather welcome a Wiki for EMBOSS documentation.

We have all the documentation (including the sourceforge web pages) in
CVS. Any member of the development/documentation team can make updates
there.

No need for a wiki for this - and a wiki would be difficult to manage as
most of the documentation is generated automatically.

> The reason is that as I run into problems/tricks/tasks to do, I see
> comments that might be added here and there in the documentation. I
> would rather go to a single site and make the changes myself than
> go through the hassle of devising a 'diff' comment, finding out who
> to mail, mailing them and waiting for a new doc release.

Just mail anything like that to emboss-bug.

After all ...
there is not much point in changing a wiki version of the documentation if we are busy changing the application and the real documentation :-) regards, Peter From jrvalverde at cnb.uam.es Fri Apr 22 08:11:18 2005 From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. Valverde) Date: Fri, 22 Apr 2005 10:11:18 +0200 Subject: [EMBOSS] Wiki (and Macs) In-Reply-To: <4267D2C8.10009@ebi.ac.uk> References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> Message-ID: <20050422101118.33b19892.jrvalverde@cnb.uam.es> On Thu, 21 Apr 2005 17:20:24 +0100 Peter Rice wrote: > > After all ... there is not much point in changing a wiki version of the > documentation if we are busy changing the application and the real > documentation :-) > > regards, > > Peter Right you are Sir. I guess it's better as it is for now. And yet... Speaking generally, it probably boils down to the management model we want for EMBOSS. As it is now I tend to see it much like a Cathedral than a Bazaar. Truly it isn't, but you must agree it is not so evident from the docs what the procedures are for participation. At least not at first sight. I'm more for the Bazaar model, one where everyone is welcome and making changes is as trivial as possible (specially for end-users and end-user-related material, like docs). I'd rather have that as a 'common' to build a user community around. Game theory shows that to be the best strategy in the long run (see e.g. http://encyclopedia.laborlawtalk.com/Tragedy_of_the_commons ). In the short run, with limited resources as the EMBOSS team currently is, you are right it takes a significant effort and portion of the existing resources. It makes more sense to concentrate on the short term now and surviving enough to drive new resources in. But I think we should have that in sight for the long term. j -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From jrvalverde at cnb.uam.es Fri Apr 22 08:20:58 2005 From: jrvalverde at cnb.uam.es (=?ISO-8859-15?Q?Jos=E9?= R. Valverde) Date: Fri, 22 Apr 2005 10:20:58 +0200 Subject: [EMBOSS] Macintosh EMBOSS In-Reply-To: <4267D2C8.10009@ebi.ac.uk> References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> Message-ID: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es> I'm trying to find out ways to fund EMBOSS in a way that I can justify locally. Mac users are a growing 'market' and a promising community. I've got here hundreds of Macs, and they need an easy to use, install and manage solution. What is needed (they tell me) is a good editor, and some interactive graphic facilities for common, simple tasks. Actually, locally, we are going to spend a significant amount into buying a handful of licenses for commercial software. I've tried Erik's CD, but it has some drawbacks regarding the configuration on non-user-managed Macs (as those where root belongs to a central authority): Here they can install software but not make modifications. I can't either, being on the SciComp side and not on the Offimatic end. I don't have the resources to do that locally, but would welcome a sensible way to fund it (like buying 'licenses', packages, CDs or manuals from an EMBOSS-centered company). I for one would certainly welcome a Macintosh edition ready to run, and easy to configure to use central databases. If I were to chose, I'd try to add those facilities to Jemboss (a sequence editor, and interactive drawing of clones and molecular graphics). This is the most lacking thing in EMBOSS now that every user has or can have a UNIX machine at their desktop. And, certainly, I would happily recommend locally that we buy a hundred+ licenses at a reasonable price if that would help fund EMBOSS. Most ideally, something like the LiveDVD from AT.EMBnet.Org but for Macs would be a candy. 
And an easy to justify buy.

Any recommendations? Takers? Pointers?

j
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL:

From kellert at ohsu.edu Fri Apr 22 16:33:44 2005
From: kellert at ohsu.edu (Thomas J Keller)
Date: Fri, 22 Apr 2005 09:33:44 -0700
Subject: [EMBOSS] Macintosh EMBOSS
In-Reply-To: <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es>
Message-ID: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu>

Greetings,

Have you looked at the fink installation of emboss and kaptain? I use
emboss from the command line, so I haven't tried the GUI application
"kaptain". Here's what the fink database has to say about it:

#######################
kaptain-0.71-22: Universal graphical front-end
Kaptain is a universal graphical front-end for command line programs,
and it works wherever Qt3 is available.
Someone writes a simple script (so called grammar) which describes the
possible arguments for a command line program and Kaptain brings up a
friendly dialog to the user to set up the command line. Example grammars
can be found in /sw/share/kaptain/.
.
Web site: http://kaptain.sourceforge.net
.
Maintainer: Koen van der Drift
######################

The emboss grammar has been written and is available through fink.
You do need the developer tools installed on your Mac, but that's
trivial, and comes with the OS, so no additional charge for your users.

Just a thought.

Tom Keller, Ph.D.
http://www.ohsu.edu/research/core
kellert at ohsu.edu
503-494-2442

On Apr 22, 2005, at 1:20 AM, José R. Valverde wrote:

> I'm trying to find out ways to fund EMBOSS in a way that I can
> justify locally.
>
> Mac users are a growing 'market' and a promising community.
I've got > here hundreds of Macs, and they need an easy to use, install and > manage solution. > > What is needed (they tell me) is a good editor, and some interactive > graphic facilities for common, simple tasks. Actually, locally, we are > going to spend a significant amount into buying a handful of licenses > for commercial software. > > I've tried Erik's CD, but it has some drawbacks regarding the > configuration > on non-user-managed Macs (as those where root belongs to a central > authority): Here they can install software but not make modifications. > I can't either, being on the SciComp side and not on the Offimatic > end. > > I don't have the resources to do that locally, but would welcome a > sensible way to fund it (like buying 'licenses', packages, CDs or > manuals from an EMBOSS-centered company). > > I for one would certainly welcome a Macintosh edition ready to run, > and easy to configure to use central databases. If I were to chose, > I'd try to add those facilities to Jemboss (a sequence editor, and > interactive drawing of clones and molecular graphics). This is the > most lacking thing in EMBOSS now that every user has or can have a > UNIX machine at their desktop. > > And, certainly, I would happily recommend locally that we buy a > hundred+ licenses at a reasonable price if that would help > fund EMBOSS. > > Most ideally, something like the LiveDVD from AT.EMBnet.Org but for > Macs would be a candy. And an easy to justify buy. > > Any recommendations? Takers? Pointers? > > j -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: text/enriched Size: 2879 bytes Desc: not available URL: From kvddrift at earthlink.net Fri Apr 22 20:17:48 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Fri, 22 Apr 2005 16:17:48 -0400 Subject: [EMBOSS] Macintosh EMBOSS In-Reply-To: <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu> References: <20050421115851.49380dc9.jrvalverde@cnb.uam.es> <4267D2C8.10009@ebi.ac.uk> <20050422102058.2ca36edb.jrvalverde@cnb.uam.es> <3f777be8352bb5e5cb4f3d73839c9321@ohsu.edu> Message-ID: On Apr 22, 2005, at 12:33 PM, Thomas J Keller wrote: > Web site: http://kaptain.sourceforge.net > . > Actually, the package is emboss-kaptain. - Koen.