From stefan.rensing at biologie.uni-freiburg.de Fri Feb 3 10:16:14 2006
From: stefan.rensing at biologie.uni-freiburg.de (Stefan Rensing)
Date: Fri, 03 Feb 2006 16:16:14 +0100
Subject: [EMBOSS] patmatdb
Message-ID: <43E373BE.10202@biologie.uni-freiburg.de>

Dear all,

what exactly does the flag -snucleotide1 toggle in patmatdb? From the doc
I was thinking it would enable searching against nucleic acid sequences
instead of proteins. However, execution aborts with an error message
saying that the sequence is not protein.

Cheers,
Stefan

--
Dr. Stefan Rensing, Group Leader Computational Biology
Plant Biotechnology, Faculty of Biology, University of Freiburg
Schaenzlestr. 1, D-79104 Freiburg, Fon: +49 761 203-6974, Fax: -6945
http://www.plant-biotech.net/ http://www.cosmoss.org/
stefan.rensing at biologie.uni-freiburg.de

"An old man dies. A young girl lives. A fair trade. I love you, Nancy."

From pmr at ebi.ac.uk Fri Feb 3 12:31:39 2006
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 03 Feb 2006 17:31:39 +0000
Subject: [EMBOSS] patmatdb
In-Reply-To: <43E373BE.10202@biologie.uni-freiburg.de>
References: <43E373BE.10202@biologie.uni-freiburg.de>
Message-ID: <43E3937B.607@ebi.ac.uk>

Hi Stefan,

> what exactly does the flag -snucleotide1 toggle in patmatdb?

-snucleotide (and -sprotein) are available for all sequence inputs. They
are used for programs that can read DNA or protein sequences, where you
have a sequence that can be both types (a short sequence in FASTA format,
for example).

> From the doc I was thinking it would enable searching against nucleic
> acid sequences instead of proteins. However, execution aborts with an
> error message saying that the sequence is not protein.

The sequence type tells patmatdb to accept only protein.

Thinking about this ... we can change the -help output (and the program
documentation) to describe the sequence type much better than the current
"sequence database USA".
We can, for example, say whether the sequence can be DNA or protein and
whether gaps, stops, and other characters are used. All we need is a
short description (which we have) for each sequence type. This is
something we should have added long ago :-)

The pattern syntax was defined for the PROSITE database ... but we can
also allow it to search nucleotide data. It is only a small change to the
program. Does anyone need to search a nucleotide database with patmatdb?

I suspect patmatdb is rather redundant ... you can get the same results
from fuzzpro (for protein) or fuzznuc (for nucleotide) ... -rformat
dbmotif will give you the same output format as patmatdb.

Any preferences?

Peter

From andrespinzon at gmail.com Mon Feb 6 14:18:50 2006
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Mon, 6 Feb 2006 14:18:50 -0500
Subject: [EMBOSS] seqretsplit in batch mode
In-Reply-To: <8968fc7e0602061015r551ce5dcn@mail.gmail.com>
References: <8968fc7e0602061015r551ce5dcn@mail.gmail.com>
Message-ID: <8968fc7e0602061118s2bf6fdc4l@mail.gmail.com>

2006/2/6, Andres Pinzon :
> [...]
> Does anybody know how I can tell seqretsplit not to prompt for the
> output sequence?
> Best regards,

OK, I answer my own question, just in case ;-)

============
seqretsplit -sequence mysequence.fasta -auto
============

--
---------
Andrés Pinzón [http://www.andrespinzon.com]
Bioinformatics Center, Colombia EMBnet node
Biotechnology Institute - National University of Colombia
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961
Fax +571 3165415
----------

From andrespinzon at gmail.com Mon Feb 6 13:15:15 2006
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Mon, 6 Feb 2006 13:15:15 -0500
Subject: [EMBOSS] seqretsplit in batch mode
Message-ID: <8968fc7e0602061015r551ce5dcn@mail.gmail.com>

Hi all,

I'm trying to run seqretsplit on thousands of FASTA files in a directory
at once.
But it fails (I think) because for each FASTA file one is supposed to hit
Enter, for instance:

andipin at linux:~/BYB/byDir> seqretsplit 11150.fasta
Reads and writes (returns) sequences in individual files
Output sequence [q3ybd3.fasta]:    <--- HIT ENTER!!!!

Does anybody know how I can tell seqretsplit not to prompt for the
output sequence?

Best regards,

--
---------
Andrés Pinzón [http://www.andrespinzon.com]
Bioinformatics Center, Colombia EMBnet node
Biotechnology Institute - National University of Colombia
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961
Fax +571 3165415
----------

From David.Bauer at schering.de Tue Feb 7 01:46:16 2006
From: David.Bauer at schering.de (David.Bauer at schering.de)
Date: Tue, 7 Feb 2006 07:46:16 +0100
Subject: [EMBOSS] seqretsplit in batch mode
In-Reply-To: <8968fc7e0602061118s2bf6fdc4l@mail.gmail.com>
Message-ID:

And once you become more familiar with EMBOSS you can set the environment
variable EMBOSS_AUTO=1, which causes all EMBOSS programs to run as if
-auto had been specified on the command line.

Cheers, David.

emboss-bounces at emboss.open-bio.org wrote on 06/02/2006 20:18:50:

> 2006/2/6, Andres Pinzon :
> > [...]
> > Does anybody know how I can tell seqretsplit not to prompt for the
> > output sequence?
> > Best regards,
>
> OK, I answer my own question, just in case ;-)
> ============
> seqretsplit -sequence mysequence.fasta -auto
> ============

From golharam at umdnj.edu Wed Feb 8 23:46:43 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 08 Feb 2006 23:46:43 -0500
Subject: [EMBOSS] Tool to mutate DNA sequence
Message-ID: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>

Does anyone know of a tool to mutate a DNA sequence by a specified amount?
For instance, say I have a DNA sequence 1000 bases long, and I want to
simulate mutations to make it 75% (or 80%, etc) similar to the original.
Ryan

From pmr at ebi.ac.uk Thu Feb 9 03:25:24 2006
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Thu, 9 Feb 2006 08:25:24 -0000 (GMT)
Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <2714.86.132.216.50.1139473524.squirrel@webmail.ebi.ac.uk>

Ryan Golhar writes:
> Does anyone know of a tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.

EMBOSS has the msbar program ("mutate sequence beyond all recognition")
which allows you to select the number and type of changes.

With some tuning of options to match the sequence length you should be
able to get results that match whatever your definition of 75% similar
might be (amazing how much more similarity you can get by adding gaps in
an alignment :-)

If you can specify a clear and generally useful way to define what you
need we could of course add a "percent change" option to the msbar
program for a future release.

Hope that helps,

Peter

From torsten.seemann at infotech.monash.edu.au Thu Feb 9 06:15:28 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 09 Feb 2006 22:15:28 +1100
Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <43EB2450.6000606@infotech.monash.edu.au>

Ryan,

> Does anyone know of a tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
The EMBOSS suite comes with a tool called "msbar" which can controllably
mutate sequences:

http://emboss.sourceforge.net/apps/msbar.html

--
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/

From jason.stajich at duke.edu Thu Feb 9 14:10:54 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu, 9 Feb 2006 14:10:54 -0500
Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu>

Depending on whether or not you want to use evolutionarily realistic
models...

* evolver, which comes with PAML, lets you evolve sequences on a tree
* SeqGen from Andrew Rambaut
  http://evolve.zoo.ox.ac.uk/software.html?id=seqgen
  also lets you do this

I believe there are PISE interfaces to both of these at the Pasteur
bioweb site - http://bioweb.pasteur.fr/

-jason

On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote:

> Does anyone know of a tool to mutate a DNA sequence by a specified
> amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the
> original.
>
> Ryan
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From heikki at sanbi.ac.za Thu Feb 9 06:31:20 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 9 Feb 2006 13:31:20 +0200
Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <200602091331.21690.heikki@sanbi.ac.za>

Ryan,

Instructions in pseudo code:

take the sequence string out of the object
use a hash to store changed locations
repeat
    pick a location in the string randomly
    if the location is not in the hash, i.e. not changed already,
        change it into something else
        add the changed location into the hash
    if enough locations have been changed (scalar keys hash),
        exit loop
put the sequence string back into the seq object

-Heikki

On Thursday 09 February 2006 06:46, Ryan Golhar wrote:
> Does anyone know of a tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
>
> Ryan

--
Heikki Lehvaslaiho    heikki at_sanbi _ac _za
Associate Professor   skype: heikki_lehvaslaiho
SANBI, South African National Bioinformatics Institute
University of Western Cape, South Africa
Phone: +27 21 959 2096   FAX: +27 21 959 2512

From heikki at sanbi.ac.za Thu Feb 9 09:54:30 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 9 Feb 2006 16:54:30 +0200
Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <200602091654.30890.heikki@sanbi.ac.za>

Ryan,

I should have made this very clear in my first reply:

You have to plan very carefully what rules you use when you mutate your
sequence, because they directly affect the resulting sequences. Of
course, all that depends on what you will be using the sequences for. If
you are going to draw evolutionary conclusions from those sequences, you
must mutate them in a way that simulates evolutionary principles.

My earlier pseudocode, for example, should really allow mutations at
every location, including ones that have already changed: mutations do
occur multiple times at the same site as sequences get saturated by
mutations. Also, you should decide the relative occurrence of
transversions versus transitions. Then there are indels; do you want to
take those into account?

Also, check the EMBOSS program 'msbar'.

You did not ask this, but...
I remember that during the early days of Celera, one of the tools that
enabled them to estimate the feasibility of whole genome shotgun
sequence assembly was a very complete program to 'synthesize' in silico
the whole complexity of the human genome. I have no idea if that program
is generally available now.

Yours,
-Heikki

On Thursday 09 February 2006 06:46, Ryan Golhar wrote:
> Does anyone know of a tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
>
> Ryan

--
Heikki Lehvaslaiho    heikki at_sanbi _ac _za
Associate Professor   skype: heikki_lehvaslaiho
SANBI, South African National Bioinformatics Institute
University of Western Cape, South Africa
Phone: +27 21 959 2096   FAX: +27 21 959 2512

From golharam at umdnj.edu Thu Feb 9 16:19:46 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Thu, 09 Feb 2006 16:19:46 -0500
Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <200602091654.30890.heikki@sanbi.ac.za>
Message-ID: <002801c62dbe$8d4d7e20$e6028a0a@GOLHARMOBILE1>

Thanks all. The responses I got were definitely more than helpful.

FYI - I did initially look at msbar. I glanced over the "Number of times
to perform mutation operations" option, which is what I was looking for.

I'm looking to statistically test some simple scoring matrices. I think
msbar will do.
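For the simple substitution-only case, the pseudocode Heikki posted
earlier in this thread could be sketched in Python roughly as follows.
This is only an illustration and not what msbar does: the function name
mutate, the uniform choice of replacement base, and the reading of "75%
similar" as "75% of positions unchanged, no indels" are all assumptions.

```python
import random


def mutate(seq, identity, rng=None):
    """Substitute bases so that a fraction `identity` of positions is unchanged.

    Hypothetical helper: each chosen position is changed exactly once,
    to a different base picked uniformly from A, C, G, T.
    """
    rng = rng or random.Random()
    n_changes = round(len(seq) * (1.0 - identity))
    # sample() returns distinct positions, which plays the role of the
    # "hash of changed locations" in the pseudocode.
    positions = rng.sample(range(len(seq)), n_changes)
    bases = list(seq)
    for pos in positions:
        alternatives = [b for b in "ACGT" if b != bases[pos].upper()]
        bases[pos] = rng.choice(alternatives)
    return "".join(bases)


original = "ACGT" * 250  # a 1000-base toy sequence
mutant = mutate(original, 0.75, random.Random(42))
matches = sum(a == b for a, b in zip(original, mutant))
print(len(mutant), matches)  # 1000 750
```

As Heikki points out, a model meant for evolutionary conclusions would
also need repeated hits at the same site, a transition/transversion
bias, and indels, none of which this sketch attempts.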
Ryan

From Marc.Logghe at DEVGEN.com Fri Feb 10 06:06:35 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Fri, 10 Feb 2006 12:06:35 +0100
Subject: [EMBOSS] infoseq table format issues
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com>

Hi all,

I have noticed that the output table format of infoseq is not consistent,
in the sense that the whitespace field separator is sometimes two spaces
and sometimes a tab followed by a space. This makes it impossible (at
least, if you don't want an extra processing step) to properly import the
infoseq output into a spreadsheet. Is there a way to force infoseq to use
tabs only as field separator?

Regards,
Marc

From Marc.Logghe at DEVGEN.com Fri Feb 10 06:39:26 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Fri, 10 Feb 2006 12:39:26 +0100
Subject: [EMBOSS] infoseq table format issues
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B07@ANTARESIA.be.devgen.com>

Never mind. Just found out that in Excel you have the option to define
multiple delimiters AND treat consecutive delimiters as one ... Guess I
don't use spreadsheets regularly enough ;-)

On the other hand, I think it would be 'cleaner' if infoseq would stick
to a single type of delimiter.
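For anyone who would rather do the "extra processing step" outside the
spreadsheet, here is a minimal Python sketch. It assumes the separators
are exactly as described above (two or more whitespace characters, or a
tab), and that single spaces belong to a field; the function name to_tsv
and the sample lines are made up for illustration and are not real
infoseq output.

```python
import re


def to_tsv(line):
    """Rewrite one table row so that fields are separated by single tabs."""
    # A separator is a run of two or more whitespace characters, or a tab.
    fields = re.split(r"\s{2,}|\t", line.strip())
    return "\t".join(field.strip() for field in fields)


# Hypothetical rows mixing both kinds of separator.
rows = ["seq1  100\t 45.20", "seq2\t 2500  51.75"]
for row in rows:
    print(to_tsv(row))
```

The result can then be processed with cut or sort, or imported as plain
tab-delimited text.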
Cheers,
Marc

> -----Original Message-----
> From: emboss-bounces at emboss.open-bio.org
> [mailto:emboss-bounces at emboss.open-bio.org] On Behalf Of Marc Logghe
> Sent: Friday, February 10, 2006 12:07 PM
> To: emboss at emboss.open-bio.org
> Subject: [EMBOSS] infoseq table format issues
>
> Hi all,
> I have noticed that the output table format of infoseq is not
> consistent. In the sense that the whitespace field separator
> is sometimes two spaces and sometimes a tab followed by a
> space. This makes it impossible (at least, if you don't want
> an extra processing step) to properly import the infoseq
> output into a spreadsheet. Is there a way to force infoseq to
> use tabs only as field separator ?
> Regards,
> Marc
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at emboss.open-bio.org
> http://newportal.open-bio.org/mailman/listinfo/emboss
>

From pmr at ebi.ac.uk Fri Feb 10 06:45:41 2006
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 10 Feb 2006 11:45:41 +0000
Subject: [EMBOSS] infoseq table format issues
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com>
Message-ID: <43EC7CE5.8020101@ebi.ac.uk>

Marc Logghe wrote:
> Hi all,
> I have noticed that the output table format of infoseq is not
> consistent. In the sense that the whitespace field separator is
> sometimes two spaces and sometimes a tab followed by a space. This makes
> it impossible (at least, if you don't want an extra processing step) to
> properly import the infoseq output into a spreadsheet. Is there a way to
> force infoseq to use tabs only as field separator ?

Yes .... but ...

infoseq produces a report or HTML.

Is the HTML useful?

Is the text output useful (and is it better to use spaces as text)

Is tab-delimited output useful (also for other programs)?

Do we need some kind of XML output to replace the HTML?
Meanwhile, I will look at cleaning up the current infoseq

regards,

Peter

From Marc.Logghe at DEVGEN.com Fri Feb 10 08:18:13 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Fri, 10 Feb 2006 14:18:13 +0100
Subject: [EMBOSS] infoseq table format issues
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B0A@ANTARESIA.be.devgen.com>

Hi Peter,

> > step) to properly import the infoseq output into a spreadsheet. Is
> > there a way to force infoseq to use tabs only as field separator ?
>
> Yes .... but ...
>
> infoseq produces a report or HTML.
>
> Is the HTML useful?

Yes, just found out that you can perfectly import this format into Excel
...

>
> Is the text output useful (and is it better to use spaces as text)
>
> Is tab-delimited output useful (also for other programs)?

I think tab delimited is better (compared to spaces as delimiter). That
way you can easily process the table with cut or sort and that kind of
thing.

>
> Do we need some kind of XML output to replace the HTML?
>
> Meanwhile, I will look at cleaning up the current infoseq

Thanks and regards,
Marc

From fnovo at unav.es Mon Feb 13 07:29:29 2006
From: fnovo at unav.es (F.J. Novo)
Date: Mon, 13 Feb 2006 13:29:29 +0100
Subject: [EMBOSS] dottup and dotmatcher in 3.0.0
Message-ID: <43F07BA9.6020606@unav.es>

Hi all.

There seems to be something wrong with dottup and dotmatcher in 3.0.0
(under Fedora Core 4). I consistently get an error:

[fnovo at localhost ~]$ dotmatcher
Displays a thresholded dotplot of two sequences
Input sequence: p53mRNA.seq
Second sequence: p53mRNA.seq
Graph type [x11]: png
Devices allowed are:-
postscript ps hpgl hp7470 hp7580 meta colourps cps tektronics tekt tek4107t tek none null text data png
Error: Invalid XY graph value 'x11'
Died: dotmatcher terminated: Bad value for '-xygraph' and no prompt

and the same for dottup.
Needless to say, png is properly installed and works fine with other
applications (including dotpath and polydot), so it seems to be something
specific to these two programs.

Has anyone else seen this? Any suggestions?

Thanks,

From maximilianh at gmail.com Tue Feb 14 05:11:42 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Tue, 14 Feb 2006 11:11:42 +0100
Subject: [EMBOSS] [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu>
Message-ID: <76f031ae0602140211n2a0bbf4fl@mail.gmail.com>

The tool ROSE also evolves sequences on a tree. There is a web interface
and downloadable source at http://bibiserv.techfak.uni-bielefeld.de/rose/

Max

On 09/02/06, Jason Stajich wrote:
> Depending on whether or not you want to use evolutionarily realistic
> models...
> * evolver, which comes with PAML, lets you evolve sequences on a tree
> * SeqGen from Andrew Rambaut
>   http://evolve.zoo.ox.ac.uk/software.html?id=seqgen
>   also lets you do this
> I believe there are PISE interfaces to both of these at the Pasteur
> bioweb site - http://bioweb.pasteur.fr/
>
> -jason
> On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote:
>
> > Does anyone know of a tool to mutate a DNA sequence by a specified
> > amount?
> > For instance, say I have a DNA sequence 1000 bases long, and I want to
> > simulate mutations to make it 75% (or 80%, etc) similar to the
> > original.
> >
> > Ryan
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
> _______________________________________________
> Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>

--
Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler

From scott at cs.wits.ac.za Wed Feb 15 13:41:02 2006
From: scott at cs.wits.ac.za (Scott Hazelhurst)
Date: Wed, 15 Feb 2006 20:41:02 +0200 (SAST)
Subject: [EMBOSS] Emma options
Message-ID: <20060215184102.6176F40F54B@midi.cs.wits.ac.za>

Dear all,

I am not sure if this is a question for this list or the wemboss forum --
but their site seems down so I'll ask it here.

I have users who want to specify their own penalty matrices. For example,
for nucleotide sequences the options are clustalw, iub or own. However,
if you pick "own" nothing changes: the option to specify your own file
doesn't appear.

I had a look at the acd file -- not that I understand ACD syntax
properly, so it's a bit like my trying to fix a broken Jumbo engine.
This seems to be the relevant list:

dnamatrix [
  additional: "@(!$(acdprotein))"
  default: "i"
  minimum: "1"
  maximum: "1"
  header: "Nucleotide multiple alignment matrix options"
  values: "i:iub, c:clustalw, o:own"
  delimiter: ","
  codedelimiter: ":"
  information: "Select matrix"
  button: "Y"
  help: "This gives a menu where you are offered amenu where a single matrix (not a series) can be selected."
]

variable: umamatrix "@($(dnamatrix) == own)"

infile: mamatrix [
  additional: "@($(usermamatrix)?True:$(umamatrix))"
  default: ""
  nullok: "@($(usermamatrix)?@(!$(umamatrix)):True)"
  information: "Filename of user multiple alignment matrix"
  knowntype: "comparison matrix"
]

I changed the "additional" field in mamatrix to be

  additional: "@($(umamatrix))"

and it seems to work, in that the file input box now appears when the
"own" option is selected.

As an aside, does anyone know what the format of the matrix file should
look like? I did some web searching and looked at the clustalw source
code but it's not so easy to reverse-engineer.

Can anyone help with either of these queries?

Thanks
Scott

From gbottu at ben.vub.ac.be Fri Feb 17 06:27:11 2006
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Fri, 17 Feb 2006 12:27:11 +0100
Subject: [EMBOSS] Emma options
In-Reply-To: <20060215184102.6176F40F54B@midi.cs.wits.ac.za>
References: <20060215184102.6176F40F54B@midi.cs.wits.ac.za>
Message-ID: <20060217112711.GA23686@bigben.ulb.ac.be>

On Wed, Feb 15, 2006 at 08:41:02PM +0200, Scott Hazelhurst wrote:
> As an aside, does anyone know the format of what the matrix file
> should look like. I did some web searching and looked at the clustalw
> source code but it's not so easy to re-engineer..

In case you have not already found it yourself, here's the answer
(copied from a user manual I composed some time ago):

-------------------------------------------------------------
Data files

clustal uses symbol comparison matrices for scoring bases or amino
acids. CLUSTAL has built-in symbol comparison matrices, but allows you to
provide your own matrix. For proteins, but not for nucleic acids, you can
give a series of matrices as input. You can choose different matrices for
pairwise alignment and for multiple alignment.

Single matrix input file

The format used for a single matrix is the same as that used by the
BLAST program.
The scores in the new weight matrix should be similarities. You can use
negative as well as positive values if you wish, although for proteins
the matrix will be automatically adjusted to all positive scores, unless
the -norescale option is selected. Any lines beginning with a # character
are assumed to be comments. The first non-comment line should contain a
list of bases or amino acids in any order, using the 1 letter code,
followed by a * character. This should be followed by a square matrix of
scores, with one row and one column for each base or amino acid. The last
row and column of the matrix (corresponding to the * character) contain
the minimum score over the whole matrix.

# Matrix made by matblas from blosum62.iij
# * column uses minimum score
# BLOSUM Clustered Scoring Matrix in 1/2 Bit Units
# Blocks Database = /data/blocks_5.0/blocks.dat
# Cluster Percentage: >= 62
# Entropy = 0.6979, Expected = -0.5209
   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -2 -1  0 -4
R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -1  0 -1 -4
N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3  3  0 -1 -4
D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3  4  1 -1 -4
C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2 -4
Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2  0  3 -1 -4
E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4
G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -1 -2 -1 -4
H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3  0  0 -1 -4
I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -3 -3 -1 -4
L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4 -3 -1 -4
K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2  0  1 -1 -4
M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -3 -1 -1 -4
F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -3 -3 -1 -4
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -2 -1 -2 -4
S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2  0  0  0 -4
T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -1 -1  0 -4
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4 -3 -2 -4
Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -3 -2 -1 -4
V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -3 -2 -1 -4
B -2 -1  3  4 -3  0  1 -1  0 -3 -4  0 -3 -3 -2  0 -1 -4 -3 -3  4  1 -1 -4
Z -1  0  0  1 -3  3  4 -2  0 -3 -3  1 -1 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4
X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1 -1 -1 -4
* -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4  1

Matrix series input format

For proteins, CLUSTAL uses by default different matrices depending on the
mean percent identity of the sequences to be aligned. For proteins, but
not for nucleic acids, you can specify yourself a series of matrices and
the range of the percent identity for each matrix in a matrix series
file. The file is automatically recognised by the word CLUSTAL_SERIES at
the beginning of the file. Each matrix in the series is then specified on
one line which should start with the word MATRIX. This is followed by the
lower and upper limits of the sequence percent identities for which you
want to apply the matrix. The final entry on the matrix line is the
filename of a BLAST format matrix file (see above for details of the
single matrix file format).

CLUSTAL_SERIES

MATRIX 81 100 blosum80
MATRIX 61 80 blosum62
MATRIX 31 60 blosum45
MATRIX 0 30 blosum30
----------------------------------------------------------------------

Regards,
Guy Bottu, Belgian EMBnet Node

From biopyte at yahoo.de Fri Feb 17 20:03:19 2006
From: biopyte at yahoo.de (Hans Meier)
Date: Sat, 18 Feb 2006 02:03:19 +0100 (CET)
Subject: [EMBOSS] 'octanol' - output to stdout not possible?
Message-ID: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com>

Dear friends,

with the general qualifiers "-filter" and "-stdout" I tried to make
'octanol -graph png' write to stdout and not to a file (for certain
reasons). However, a file is still written and I can't catch the output
directly from stdout.

Is there a solution for this problem, or is writing a file hardcoded?

Regards,
Harald

From dstates at bioinformatics.med.umich.edu Fri Feb 17 16:00:19 2006
From: dstates at bioinformatics.med.umich.edu (David States)
Date: Fri, 17 Feb 2006 16:00:19 -0500
Subject: [EMBOSS] Building EMBOSS in cygwin - undefined reference to _c_pl...
Message-ID: <46949FAC535B4245B58BB9A1262F5EC40824@mail.bicc.med.umich.edu>

I am trying to build EMBOSS-3.0.0 under cygwin (recent install),
following the directions in
http://emboss.sourceforge.net/download/cygwin.html, but when I get to
.libs/ajgraph.o, the build fails with a large number of error messages
of the form

.libs/ajgraph.o:ajgraph.c:(.text+0xc4): undefined reference to `_c_plxsfnam'
.libs/ajgraph.o:ajgraph.c:(.text+0x15d): undefined reference to `_c_pladv'
.libs/ajgraph.o:ajgraph.c:(.text+0x195): undefined reference to `_c_plssub'
...

Any suggestions?

David

David J. States, M.D., Ph.D.
Professor of Human Genetics
University of Michigan School of Medicine
Palmer Commons, 2035B
100 Washtenaw Rd.
Ann Arbor, MI 48109 USA
email: dstates at umich.edu
tel: (734) 615-5510
fax: (734) 615-6553
URL: http://stateslab.bioinformatics.med.umich.edu

From maximilianh at gmail.com Sun Feb 19 08:52:37 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Sun, 19 Feb 2006 14:52:37 +0100
Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <76f031ae0602190552v5f2542dbv@mail.gmail.com>

Hi bio-mailinglists,

does anyone here know of a tool or a library to display two (or more)
sequences at the same time with coloured features? Possibly with lines,
connecting some features from one sequence to the other (synteny-plot) ?
Or to display two multiple alignments, one on top of each other, with
colored features added?

It's not that it would be difficult to write, but programming
visualisation usually takes a lot of time.

Bio::Graphics seems mainly concerned with one main sequence and features
on it. Well, I could copy together two of these gif-images, but then
there would be no connecting lines. Same applies for the graphics in
Biojava or the gff2ps tool or all the multiple alignment viewers that I
know (Bioedit, ClustalX).

There is something called Toucan in Java, which displays at least several
lines of gff-style-features, but no visible sequences and more
importantly, no connecting lines. A recent software, Djinn lite, is using
a similar kind of visualization to compare different spliced genes from
various species, but it's mainly aimed at splicing and written in Visual
Basic. I guess a good compromise might be the 3D viewer Sockeye, but I
haven't seen any synteny-lines in sockeye yet.

I guess I must have missed something here. I cannot be the first one that
would like to compare, say, two gff files, or two multiple alignments?
Thanks a lot for any idea, Max From gbottu at ben.vub.ac.be Mon Feb 20 03:04:11 2006 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Mon, 20 Feb 2006 09:04:11 +0100 Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence - Checked by AntiVir In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <20060220080411.GB26540@bigben.ulb.ac.be> On Sun, Feb 19, 2006 at 02:52:37PM +0100, Maximilian Haeussler wrote: > does anyone here know of a tool or a library to display two (or more) > sequences at the same time with coloured features? Possibly with lines, > connecting some features from one sequence to the other (synteny-plot) ? > Or to display two multiple alignments, one on top of each other, with > colored features added? Well, there is Alfresco http://www.sanger.ac.uk/Software/Alfresco/ Guy Bottu, Belgian EMBnet Node From pmr at ebi.ac.uk Mon Feb 20 03:19:07 2006 From: pmr at ebi.ac.uk (pmr at ebi.ac.uk) Date: Mon, 20 Feb 2006 08:19:07 -0000 (GMT) Subject: [EMBOSS] 'octanol' - output to stdout not possible? In-Reply-To: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com> References: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com> Message-ID: <1256.86.137.129.90.1140423547.squirrel@webmail.ebi.ac.uk> Dear Harald, > with the general qualifiers "-filter" and "-stdout" > I tried to make 'octanol -graph png' write to stdout > and not to a file (for certain reasons). > Anyway, a file is written and I can't catch the output > directly from stdout. For graphs output must go to a file because many programs will write more than one PNG file. > Is there a solution for this problem or > is writing a file hardcoded? Unfortunately the -goutfile stdout option will not work - the graphics library writes the message "Created octanol.1.png" to stdout. 
But for octanol there is a single image file produced, so you can use the command line option: -goutfile x.png and then cat x.png to stdout. For programs that write more than one PNG file (prettyplot for example) you will get x.png.1.png x.png.2.png and so on. I do not see an easy way around this for PNG files. Hope that helps, Peter From shameer at ncbs.res.in Mon Feb 20 01:21:01 2006 From: shameer at ncbs.res.in (Shameer Khadar) Date: Mon, 20 Feb 2006 11:51:01 +0530 (IST) Subject: [EMBOSS] Matrix Average Code / Module ? In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <59825.192.168.1.176.1140416461.squirrel@192.168.1.176> Hi all, Is there any program/module to calculate the average of a BLOSUM/PAM (or any other) matrix? I have a matrix and I need to see the average, for example: 11 22 43 54 50 27 87 74 32 10 66 58 98 78 20 22 23 44 16 34 I have gone through Bio::Matrix::MatrixI and Bio::Matrix::GenericMatrix and other perl modules like Math::Matrix http://search.cpan.org/~ulpfr/Math-Matrix-0.4/Matrix.pm and Math::Cephes::Matrix - but none of them have a provision to do matrix average calculation. Any help ??? thanks in advance, Happy biocomputing !!! -- Shameer Khadar National Centre for Biological Sciences (TIFR) UAS - GKVK Campus - Bellary Road Bangalore - 65 - Karnataka - India T - 91-080-23636420-32 EXT 4241 F - 91-080-23636662/23636675 W - http://www.ncbs.res.in -------------------------------------------------- "Refrain from illusions, insist on work and not words, patiently seek divine and scientific truth."
MM From d.gatherer at vir.gla.ac.uk Wed Feb 22 07:03:11 2006 From: d.gatherer at vir.gla.ac.uk (Derek Gatherer) Date: Wed, 22 Feb 2006 12:03:11 +0000 Subject: [EMBOSS] showseq and overlapping ORFs Message-ID: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> Hi EMBOSSers I'm using showseq as follows: [gath01d at gamma ebv]$ showseq hhv8.fa -format 4 -trans 105-944,1112-2764,3179-6577,6594-8681,8665-11202,11329-14367,14485-15741,15756-16979,28774-29154,30242-30769 -out test.showseq -auto EMBOSS An error in showseq.c at line 198: Translation ranges are not in ascending, non-overlapping order. As can be seen, the failure originates with the subsequence 6594-8681 slightly overlapping with the next one 8665-11202. Is there a way round this on the command line or would it require a code tweak? It would be good if there was, since often in viral genomes (this is HHV8, as it happens) ORFs are not cleanly "in ascending, non-overlapping order" as the program would seem to require. A related question: all the above are top-strand ORFs, but further down there are a few complementary strand ones. What combination of parameters would I use to indicate that I want some translated on the top and some on the bottom? I could of course use format -6 and get all six frames, but that is a bit messy for the output I want. I was thinking that maybe it would need to be something like: showseq hhv8.fa -things B,N,T,S,B,1,A,F -trans 105-944,1112-2764,3179-6577,6594-8681,8665-11202,11329-14367 -things B,N,T,S,B,-1,A,F -trans 14485-15741,15756-16979,28774-29154,30242-30769 -out test.showseq -auto ie, using things to specify that for some ORFs I want the translation on -1 instead of 1, but the above command just outputs DNA sequence with no translation. 
Any ideas gratefully appreciated Derek From d.gatherer at vir.gla.ac.uk Wed Feb 22 10:19:58 2006 From: d.gatherer at vir.gla.ac.uk (Derek Gatherer) Date: Wed, 22 Feb 2006 15:19:58 +0000 Subject: [EMBOSS] showseq and overlapping ORFs In-Reply-To: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> References: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> Message-ID: <6.2.3.4.1.20060222151212.02bb2328@lenzie.gla.ac.uk> Hello again I wonder if showseq is bugged. Look at the following: [gath01d at gamma ebv]$ showseq hhv8.fa -format 0 -trans 100-200 -out test.showseq Display a sequence with features, translation etc.. Specify your own things to display S : Sequence B : Blank line 1 : Frame1 translation 2 : Frame2 translation 3 : Frame3 translation -1 : CompFrame1 translation -2 : CompFrame2 translation -3 : CompFrame3 translation T : Ticks line N : Number ticks line C : Complement sequence F : Features R : Restriction enzyme cut sites in forward sense -R : Restriction enzyme cut sites in reverse sense A : Annotation Enter a list of things to display [B,N,T,S,A,F]: b,s,1 Choosing b,s,1 here gives: AF148805.2 Human herpesvirus 8 isolate GK18, complete genome TACTAATTTTGAAAGGCGGGGTTCTGCCAGGCATAGTCTTTTTTTGTGGCGGCCCTTGTG TAAACCTGTCTTTCAGACCTTGTTGGACATCCCGTACAATCAAGATGTTCCTGTATGTTG S R C S C M L TTTGCAGTCTGGCGGTTTGCTTTCGAGGACTATTAAGCCTTTCTCTGCAATCGTCTCCAA F A V W R F A F E D Y * A F L C N R L Q ATCTCTGCCCTGGAGTGATTTCAACGCCTTACACGTTGACCTGTCCGTCTAATACATCCT I S A L E * X TGCCAACATCCTGGTATTGCAACGATACTCGGCTTTTACGAGTGACGCAGGGAACATTGA ie. a nice frame 1 translation in the frame requested. 
but if I choose b,s,-1, I get: AF148805.2 Human herpesvirus 8 isolate GK18, complete genome TACTAATTTTGAAAGGCGGGGTTCTGCCAGGCATAGTCTTTTTTTGTGGCGGCCCTTGTG V L K S L R P E A L C L R K K H R G K H TAAACCTGTCTTTCAGACCTTGTTGGACATCCCGTACAATCAAGATGTTCCTGTATGTTG L G T K * V K N S M G Y L * S T G T H Q TTTGCAGTCTGGCGGTTTGCTTTCGAGGACTATTAAGCCTTTCTCTGCAATCGTCTCCAA K C D P P K S E L V I L G K E A I T E L ATCTCTGCCCTGGAGTGATTTCAACGCCTTACACGTTGACCTGTCCGTCTAATACATCCT D R G Q L S K L A K C T S R D T * Y M R TGCCAACATCCTGGTATTGCAACGATACTCGGCTTTTACGAGTGACGCAGGGAACATTGA A L M R T N C R Y E A K V L S A P F M S ie the frame -1 translation over the whole thing. So is -trans only really supposed to work with frame 1?? or is this a bug? I notice that transeq has in its "known bugs": "When using the '-regions' option, you should always leave the '-frames' option at the default of frame '1'." Has this been carried over into showseq? Cheers Derek From golharam at umdnj.edu Thu Feb 23 15:25:39 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Thu, 23 Feb 2006 15:25:39 -0500 Subject: [EMBOSS] MEME for EMBOSS Message-ID: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> I downloaded and installed MEME for EMBOSS. When I run 'tfm' and enter meme I get the output. The description and algorithm information is missing.... Function Motif detection Description **************** EDIT HERE **************** Algorithm **************** EDIT HERE **************** Usage Here is a sample session with meme -- Ryan Golhar - golharam at umdnj.edu The Informatics Institute of UMDNJ From ajb at ebi.ac.uk Thu Feb 23 16:32:16 2006 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Thu, 23 Feb 2006 21:32:16 -0000 (GMT) Subject: [EMBOSS] MEME for EMBOSS In-Reply-To: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> References: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> Message-ID: <55463.81.96.70.96.1140730336.squirrel@webmail.ebi.ac.uk> Hello Ryan, Yes, it is rather sparse. 
We hope to update meme to version 3.5.1 in the near future. However, the authors supply no documentation with their releases other than that gleaned from using the -help flag. The meme homepage gives similarly sparse information, although you can download a 1994 paper as a PDF. See http://meme.sdsc.edu/meme/papers.html In the meantime, meme -help will give some information. We will try to write more material for tfm in the next release. Thanks for reminding us. Alan From Ahmad.N.Abou.Tayoun at Dartmouth.EDU Fri Feb 24 21:50:28 2006 From: Ahmad.N.Abou.Tayoun at Dartmouth.EDU (Ahmad N. Abou Tayoun) Date: 24 Feb 2006 21:50:28 EST Subject: [EMBOSS] Accessing Applications Message-ID: <51444772@comet.Dartmouth.EDU> Hello, I am a graduate student at Dartmouth Medical School. I was trying to use some of EMBOSS applications but it always gave me the error shown below. Can you please help me figure that out ? An error has been encountered in accessing this page. 1. Server: emboss.sourceforge.net 2. URL path: /apps/groups/diffseq.html 3. Error notes: File does not exist: /home/groups/e/em/emboss/htdocs/apps/groups/diffseq.html 4. Error type: 404 5. Request method: GET 6. Request query string: 7. Time: 2006-02-24 18:48:16 PST (1140835696) Thank you a lot, username: Ahmadtayoun Ahmad From pmr at ebi.ac.uk Tue Feb 28 04:44:12 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 28 Feb 2006 09:44:12 +0000 Subject: [EMBOSS] Accessing Applications In-Reply-To: <51444772@comet.Dartmouth.EDU> References: <51444772@comet.Dartmouth.EDU> Message-ID: <44041B6C.6030600@ebi.ac.uk> Ahmad N. Abou Tayoun wrote: > Hello, > > I am a graduate student at Dartmouth Medical School. I was trying to use some of EMBOSS applications but it always gave me the error shown below. Can you please help me figure that out ? > > > An error has been encountered in accessing this page. > > 1. Server: emboss.sourceforge.net > 2. URL path: /apps/groups/diffseq.html > 3.
Error notes: File does not exist: /home/groups/e/em/emboss/htdocs/apps/groups/diffseq.html Oops ... we are making changes to the documentation on the website. We appear to have broken some links. What we are doing is to make separate documentation for the latest release and for the current development code. This has broken the links if you go via the application groups pages. You will still be able to use your local EMBOSS installation (which includes full documentation on the programs). From your message it could be you are looking for a site where you can run EMBOSS through a web interface ... if so, there are many to choose from. We do not advertise any in particular, but perhaps we should. Meanwhile, we will fix the broken links. Hope that helps, Peter From Marc.Logghe at DEVGEN.com Tue Feb 28 04:56:12 2006 From: Marc.Logghe at DEVGEN.com (Marc Logghe) Date: Tue, 28 Feb 2006 10:56:12 +0100 Subject: [EMBOSS] Accessing Applications Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> Hi Ahmad, > Hello, > > I am a graduate student at Dartmouth Medical School. I was > trying to use some of EMBOSS applications but it always gave > me the error shown below. Can you please help me figure that out ? I have noticed some changes at the EMBOSS web site as well. The docs for the individual applications can be found at this url: http://emboss.sourceforge.net/apps/cvs/index.html A direct link to the docs of diffseq is http://emboss.sourceforge.net/apps/cvs/diffseq.html. Beware that at the sourceforge web site you can not actually run the applications ! You have to install it yourself and run applications at the command line or check out some sites that publish an EMBOSS web interface. I wanted to point you to some wEMBOSS implementations at various EMBL nodes but ... in order to use that you need to register, unfortunately. 
I don't know their policies (http://bigben.vub.ac.be:6080/wEMBOSS, https://emb1.bcc.univie.ac.at/component/option,com_wrapper/Itemid,104/). Anyhow, the EMBL policies seem to differ from those used by the NCBI or EBI for instance, cos there everybody can make use of the tools. Even if you don't pay taxes ;-) HTH, Marc From pmr at ebi.ac.uk Tue Feb 28 06:06:25 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 28 Feb 2006 11:06:25 +0000 Subject: [EMBOSS] Accessing Applications In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> References: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> Message-ID: <44042EB1.8000204@ebi.ac.uk> Marc Logghe wrote: > I have noticed some changes at the EMBOSS web site as well. The docs for > the individual applications can be found at this url: > http://emboss.sourceforge.net/apps/cvs/index.html > A direct link to the docs of diffseq is > http://emboss.sourceforge.net/apps/cvs/diffseq.html. These URLs will change as we update the website (we will change the apps/cvs part). A new path will appear for the docs for release 3.0.0. This will be the first time we have documented the latest release on the website - although few users noticed we were always documenting the latest CVS developers release :-) EMBOSS does include full documentation for all programs which is installed into the share/EMBOSS/doc/programs/html/ directory. This will change in release 4.0.0 (July 2006). In release 3.0.0 the EMBASSY packages are still documented in the same directory as the EMBOSS main programs. > I wanted to point you to some wEMBOSS implementations at various EMBL > nodes but ... in order to use that you need to register, unfortunately. > I don't know their policies (http://bigben.vub.ac.be:6080/wEMBOSS, > https://emb1.bcc.univie.ac.at/component/option,com_wrapper/Itemid,104/).
> Anyhow, the EMBL policies seem to differ from those used by the NCBI or > EBI for instance, cos there everybody can make use of the tools. Even if > you don't pay taxes ;-) I think you mean "EMBnet" ... I have to be picky as EBI is part of EMBL (European Molecular Biology Laboratory in Heidelberg, Germany) ... and also a member of EMBnet :-) National EMBnet servers provide access to their registered scientists, - that is what they are funded for - and may also allow access to those outside. If you are in the UK, you no longer have a national EMBnet server as that was the HGMP/RFCGR where EMBOSS was developed until it closed in July 2005. The Austrian server machine (univie.ac.at) is about to move so may be down for a while. EBI does provide access to EMBOSS, but not a simple web interface to all the programs by name. You can also use many of EMBOSS programs through the EBI "Toolbox", but not diffseq, which was the program that started this thread. EBI also provides simple web service access through WSEmboss from our External Services Group http://www.ebi.ac.uk/Tools/webservices/WSEmboss.html and fully featured web service access through SoapLab http://www.ebi.ac.uk/soaplab/ developed by Martin Senger in my group as part of the myGrid project (and now part of the EMBRACE project). I still plan to survey all the EMBOSS interfaces in the near future ... once funding is sorted out ... Hope that helps, Peter From stefan.rensing at biologie.uni-freiburg.de Fri Feb 3 15:16:14 2006 From: stefan.rensing at biologie.uni-freiburg.de (Stefan Rensing) Date: Fri, 03 Feb 2006 16:16:14 +0100 Subject: [EMBOSS] patmatdb Message-ID: <43E373BE.10202@biologie.uni-freiburg.de> Dear all, what exactly does the flag -snucleotide1 toggle in patmatdb? From the doc I was thinking it would enable searching against nucleotide acid sequences instead of proteins. However, execution aborts with an error message saying that the sequence is not protein. Cheers, Stefan -- Dr.
Stefan Rensing, Group Leader Computational Biology Plant Biotechnology, Faculty of Biology, University of Freiburg Schaenzlestr. 1, D-79104 Freiburg, Fon: +49 761 203-6974, Fax: -6945 http://www.plant-biotech.net/ http://www.cosmoss.org/ stefan.rensing at biologie.uni-freiburg.de "An old man dies. A young girl lives. A fair trade. I love you, Nancy." From pmr at ebi.ac.uk Fri Feb 3 17:31:39 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 03 Feb 2006 17:31:39 +0000 Subject: [EMBOSS] patmatdb In-Reply-To: <43E373BE.10202@biologie.uni-freiburg.de> References: <43E373BE.10202@biologie.uni-freiburg.de> Message-ID: <43E3937B.607@ebi.ac.uk> Hi Stefan, > what exactly does the flag -snucleotide1 toggle in patmatdb? -snucleotide (and -sprotein) are available for all sequence inputs. They are used for programs that can read DNA or protein sequences, where you have a sequence that can be both types (a short sequence in FASTA format for example) > From the doc I was thinking it would enable searching against nucleotide > acid sequences instead of proteins. However, execution aborts with an > error message saying that the sequence is not protein. The sequence type tells patmatdb to only accept protein. Thinking about this ... we can change the -help output (and the program documentation) to describe the sequence type much better than the current "sequence database USA". We can, for example, say whether the sequence can be DNA or protein and whether gaps, stops, and other characters are used. All we need is a short description (which we have) for each sequence type. This is something we should have added long ago :-) The pattern syntax was defined for the PROSITE database ... but we can also allow it to search nucleotide data. It is only a small change to the program. Does anyone need to search a nucleotide database with patmatdb? I suspect patmatdb is rather redundant ... you can get the same results from fuzzpro (for protein) or fuzznuc (for nucleotide) ...
-rformat dbmotif will give you the same output format as patmatdb. Any preferences? Peter From andrespinzon at gmail.com Mon Feb 6 19:18:50 2006 From: andrespinzon at gmail.com (Andres Pinzon) Date: Mon, 6 Feb 2006 14:18:50 -0500 Subject: [EMBOSS] seqretsplit in batch mode In-Reply-To: <8968fc7e0602061015r551ce5dcn@mail.gmail.com> References: <8968fc7e0602061015r551ce5dcn@mail.gmail.com> Message-ID: <8968fc7e0602061118s2bf6fdc4l@mail.gmail.com> 2006/2/6, Andres Pinzon : > [...] > Anybody knows how can i tell seqretsplit not to prompt for the Output sequence? > Best regards, Ok, I answer my own question, just in case ;-) ============ seqretsplit -sequence mysequence.fasta -auto ============ -- --------- Andrés Pinzón [http://www.andrespinzon.com] Bioinformatics Center, Colombia EMBnet node Biotechnology Institute - National University of Colombia http://bioinf.ibun.unal.edu.co Tel +57 3165000 ext 16961 Fax +571 3165415 ---------- From andrespinzon at gmail.com Mon Feb 6 18:15:15 2006 From: andrespinzon at gmail.com (Andres Pinzon) Date: Mon, 6 Feb 2006 13:15:15 -0500 Subject: [EMBOSS] seqretsplit in batch mode Message-ID: <8968fc7e0602061015r551ce5dcn@mail.gmail.com> Hi all, I'm trying to use seqretsplit on thousands of fasta files in a directory at once. But it fails (I think) because for each fasta file one is supposed to hit ENTER, for instance: andipin at linux:~/BYB/byDir> seqretsplit 11150.fasta Reads and writes (returns) sequences in individual files Output sequence [q3ybd3.fasta]: <--- HIT ENTER!!!! Anybody knows how can i tell seqretsplit not to prompt for the Output sequence?
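For the batch case asked about here, the -auto qualifier given in the answer in this thread can be applied to every file in a directory. What follows is only a minimal Python sketch, assuming seqretsplit is on the PATH, that the inputs end in .fasta, and that output files are written to the working directory; the helper names build_commands and run_all are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(directory, pattern="*.fasta"):
    """Return one seqretsplit argument list per matching file, sorted by name."""
    files = sorted(Path(directory).glob(pattern))
    # -sequence and -auto are the qualifiers quoted in this thread;
    # -auto is what stops the "Output sequence" prompt.
    return [["seqretsplit", "-sequence", f.name, "-auto"] for f in files]


def run_all(directory):
    """Run seqretsplit once per FASTA file (requires an EMBOSS installation)."""
    for cmd in build_commands(directory):
        # Run inside the directory so the bare file names resolve there.
        subprocess.run(cmd, cwd=directory, check=True)
```

Calling run_all("byDir") would then process each file in turn without prompting. Setting EMBOSS_AUTO=1 in the environment, as suggested in David's reply, should make the explicit -auto argument unnecessary.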
Best regards, -- --------- Andrés Pinzón [http://www.andrespinzon.com] Bioinformatics Center, Colombia EMBnet node Biotechnology Institute - National University of Colombia http://bioinf.ibun.unal.edu.co Tel +57 3165000 ext 16961 Fax +571 3165415 ---------- From David.Bauer at schering.de Tue Feb 7 06:46:16 2006 From: David.Bauer at schering.de (David.Bauer at schering.de) Date: Tue, 7 Feb 2006 07:46:16 +0100 Subject: [EMBOSS] seqretsplit in batch mode In-Reply-To: <8968fc7e0602061118s2bf6fdc4l@mail.gmail.com> Message-ID: And if you become more familiar with emboss you can set the environment variable EMBOSS_AUTO=1 which causes all emboss programs to run as if -auto had been specified on the command line. Cheers, David. emboss-bounces at emboss.open-bio.org wrote on 06/02/2006 20:18:50: > 2006/2/6, Andres Pinzon : > > > [...] > > Anybody knows how can i tell seqretsplit not to prompt for the > Output sequence? > > Best regards, > > Ok, I answer my own question, just in case ;-) > ============ > seqretsplit -sequence mysequence.fasta -auto > ============ From golharam at umdnj.edu Thu Feb 9 04:46:43 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Wed, 08 Feb 2006 23:46:43 -0500 Subject: [EMBOSS] Tool to mutate DNA sequence Message-ID: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Does anyone know of tool to mutate a DNA sequence by a specified amount? For instance, say I have a DNA sequence 1000 bases long, and I want to simulate mutations to make it 75% (or 80%, etc) similar to the original.
Ryan From pmr at ebi.ac.uk Thu Feb 9 08:25:24 2006 From: pmr at ebi.ac.uk (pmr at ebi.ac.uk) Date: Thu, 9 Feb 2006 08:25:24 -0000 (GMT) Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <2714.86.132.216.50.1139473524.squirrel@webmail.ebi.ac.uk> Ryan Golhar writes: > Does anyone know of tool to mutate a DNA sequence by a specified amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the original. EMBOSS has the msbar program ("mutate sequence beyond all recognition") which allows you to select the number and type of changes. With some tuning of options to match the sequence length you should be able to get results that match whatever your definition of 75% similar might be (amazing how much more similarity you can get by adding gaps in an alignment :-) If you can specify a clear and generally useful way to define what you need we could of course add a "percent change" option to the msbar program for a future release. Hope that helps, Peter From torsten.seemann at infotech.monash.edu.au Thu Feb 9 11:15:28 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Thu, 09 Feb 2006 22:15:28 +1100 Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <43EB2450.6000606@infotech.monash.edu.au> Ryan, > Does anyone know of tool to mutate a DNA sequence by a specified amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the original. 
The EMBOSS suite comes with a tool called "msbar" which can controllably mutate sequences: http://emboss.sourceforge.net/apps/msbar.html -- Torsten Seemann Victorian Bioinformatics Consortium, Monash University, Australia http://www.vicbioinformatics.com/ From jason.stajich at duke.edu Thu Feb 9 19:10:54 2006 From: jason.stajich at duke.edu (Jason Stajich) Date: Thu, 9 Feb 2006 14:10:54 -0500 Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu> Depending on whether or not you want to use evolutionarily realistic models... * evolver which comes with PAML lets you evolve sequences on a tree * SeqGen from Andrew Rambaut http://evolve.zoo.ox.ac.uk/software.html?id=seqgen also lets you do this I believe there are PISE interfaces to both of these at the pasteur bioweb site - http://bioweb.pasteur.fr/ -jason On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified > amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the > original.
> > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich Duke University http://www.duke.edu/~jes12 From heikki at sanbi.ac.za Thu Feb 9 11:31:20 2006 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Thu, 9 Feb 2006 13:31:20 +0200 Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <200602091331.21690.heikki@sanbi.ac.za> Ryan, Instructions in pseudo code:
take the sequence string out of the object
use a hash to store changed locations
repeat
    pick a location in the string randomly
    if the location is not in the hash, i.e. not changed already,
        change it into something else
        add the changed location into the hash
    if enough locations have been changed (scalar keys hash), exit loop
put the sequence string back into the seq object
-Heikki On Thursday 09 February 2006 06:46, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the original.
> > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From heikki at sanbi.ac.za Thu Feb 9 14:54:30 2006 From: heikki at sanbi.ac.za (Heikki Lehvaslaiho) Date: Thu, 9 Feb 2006 16:54:30 +0200 Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <200602091654.30890.heikki@sanbi.ac.za> Ryan, I should have made this very clear in my first reply: You have to plan very carefully what rules you use when you mutate your sequence because it will affect directly the resulting sequences. Of course, all that depends on what you will be using the sequences for. If you are going to draw evolutionary conclusions from those sequences, you must mutate them in a way that simulates evolutionary principles. My earlier pseudocode example, for example, should allow mutations in every location. Mutations do occur multiple times in same places as sequences get saturated by mutations. Also, you should decide the relative occurrence of transversions versus transitions. Then there are indels; do you want to take those into account? Also, check the EMBOSS program 'msbar'. You did not ask this, but... 
I remember that during the early days of Celera, one of the tools that enabled them to estimate the feasibility of the whole genome shotgun sequence assembly, was a very complete program to 'synthesize' in-silico the whole complexity of the human genome. I have no idea if that program is generally available now. Yours, -Heikki On Thursday 09 February 2006 06:46, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the original. > > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ From golharam at umdnj.edu Thu Feb 9 21:19:46 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Thu, 09 Feb 2006 16:19:46 -0500 Subject: [EMBOSS] [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <200602091654.30890.heikki@sanbi.ac.za> Message-ID: <002801c62dbe$8d4d7e20$e6028a0a@GOLHARMOBILE1> Thanks all. The responses I got were definitely more than helpful. FYI - I did initially look at msbar. I glanced over the "Number of times to perform mutation operations", which is what I was looking for. I'm looking to statistically test some simple scoring matrices. I think msbar will do.
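For readers who want the do-it-yourself route instead of msbar, the pseudocode posted earlier in this thread can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it makes substitutions only (no indels), never hits the same position twice, applies no transition/transversion weighting, and treats "75% similar" as plain per-position identity, so it does not follow the evolutionary cautions raised in the thread. The function name mutate_to_identity and its parameters are hypothetical.

```python
import random


def mutate_to_identity(seq, identity, alphabet="ACGT", rng=None):
    """Substitute distinct random positions so that `identity` of them stay unchanged."""
    rng = rng or random.Random()
    n_changes = round(len(seq) * (1.0 - identity))
    bases = list(seq)
    # sample() returns distinct positions, which plays the role of the
    # "hash of changed locations" in the pseudocode.
    for pos in rng.sample(range(len(bases)), n_changes):
        # Always pick a base different from the current one, so every
        # chosen position really changes.
        choices = [b for b in alphabet if b != bases[pos]]
        bases[pos] = rng.choice(choices)
    return "".join(bases)
```

For a 1000-base input and identity=0.75 this changes exactly 250 positions; passing a seeded random.Random instance as rng makes a run repeatable.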
Ryan -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Heikki Lehvaslaiho Sent: Thursday, February 09, 2006 9:55 AM To: bioperl-l at lists.open-bio.org; golharam at umdnj.edu Cc: 'The general forum at Bioinformatics.Org'; 'bioperl-l'; emboss at emboss.open-bio.org Subject: Re: [Bioperl-l] Tool to mutate DNA sequence Ryan, I should have made this very clear in my first reply: You have to plan very carefully what rules you use when you mutate your sequence because it will affect directly the resulting sequences. Of course, all that depends on what you will be using the sequences for. If you are going to draw evolutionary conclusions from those sequences, you must mutate them in a way that simulates evolutionary principles. My earlier pseudocode example, for example, should allow mutations in every location. Mutations do occur multiple times in same places as sequences get saturated by mutations. Also, you should decide the relative occurrence of transversions versus transitions. Then there are indels; do you want to take those into account? Also, check the EMBOSS program 'msbar'. You did not ask this, but... I remember that during the early days of Celera, one of the tools that enabled them to estimate the feasibility of the whole genome shotgun sequence assembly, was a very complete program to 'synthesize' in-silico the whole complexity of the human genome. I have no idea if that program is generally available now. Yours, -Heikki On Thursday 09 February 2006 06:46, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified > amount? For instance, say I have a DNA sequence 1000 bases long, and I > want to simulate mutations to make it 75% (or 80%, etc) similar to the > original.
> > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From Marc.Logghe at DEVGEN.com Fri Feb 10 11:06:35 2006 From: Marc.Logghe at DEVGEN.com (Marc Logghe) Date: Fri, 10 Feb 2006 12:06:35 +0100 Subject: [EMBOSS] infoseq table format issues Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com> Hi all, I have noticed that the output table format of infoseq is not consistent. In the sense that the whitespace field separator is sometimes two spaces and sometimes a tab followed by a space. This makes it impossible (at least, if you don't want an extra processing step) to properly import the infoseq output into a spreadsheet. Is there a way to force infoseq to use tabs only as field separator ? Regards, Marc From Marc.Logghe at DEVGEN.com Fri Feb 10 11:39:26 2006 From: Marc.Logghe at DEVGEN.com (Marc Logghe) Date: Fri, 10 Feb 2006 12:39:26 +0100 Subject: [EMBOSS] infoseq table format issues Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B07@ANTARESIA.be.devgen.com> Never mind. Just found out that in Excel you have the option to define multiple delimiters AND treat consecutive delimiters as one ... Guess I don't use spreadsheets regularly enough ;-) On the other hand, I think it would be 'cleaner' if infoseq would stick to a single type of delimiter. 
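The "extra processing step" mentioned in this thread can be quite small. Below is a minimal Python sketch, assuming the separators are exactly as described here (a tab with optional padding spaces, or a run of two or more spaces) and that single spaces only occur inside a field such as the description. The function name to_tab_delimited is hypothetical, and this is not an infoseq option.

```python
import re

# A field break is taken to be a tab (with any padding spaces around it)
# or a run of two or more spaces; single spaces inside a field are kept.
_FIELD_BREAK = re.compile(r" *\t[ \t]*| {2,}")


def to_tab_delimited(line):
    """Rewrite one whitespace-padded table row with single tabs between fields."""
    return "\t".join(_FIELD_BREAK.split(line.strip()))
```

Applied line by line to the saved text report, this gives rows that cut, sort, or a spreadsheet import can split on tabs alone. Two caveats: a description that itself contains two consecutive spaces would be split, and an empty column cannot be told apart from padding.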
Cheers, Marc > -----Original Message----- > From: emboss-bounces at emboss.open-bio.org > [mailto:emboss-bounces at emboss.open-bio.org] On Behalf Of Marc Logghe > Sent: Friday, February 10, 2006 12:07 PM > To: emboss at emboss.open-bio.org > Subject: [EMBOSS] infoseq table format issues > > Hi all, > I have noticed that the output table format of infoseq is not > consistent. In the sense that the whitespace field separator > is sometimes two spaces and sometimes a tab followed by a > space. This makes it impossible (at least, if you don't want > an extra processing step) to properly import the infoseq > output into a spreadsheet. Is there a way to force infoseq to > use tabs only as field separator ? > Regards, > Marc > > _______________________________________________ > EMBOSS mailing list > EMBOSS at emboss.open-bio.org > http://newportal.open-bio.org/mailman/listinfo/emboss > From pmr at ebi.ac.uk Fri Feb 10 11:45:41 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 10 Feb 2006 11:45:41 +0000 Subject: [EMBOSS] infoseq table format issues In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com> References: <0C528E3670D8CE4B8E013F6749231AA6746B04@ANTARESIA.be.devgen.com> Message-ID: <43EC7CE5.8020101@ebi.ac.uk> Marc Logghe wrote: > Hi all, > I have noticed that the output table format of infoseq is not > consistent. In the sense that the whitespace field separator is > sometimes two spaces and sometimes a tab followed by a space. This makes > it impossible (at least, if you don't want an extra processing step) to > properly import the infoseq output into a spreadsheet. Is there a way to > force infoseq to use tabs only as field separator ? Yes .... but ... infoseq produces a report or HTML. Is the HTML useful? Is the text output useful (and is it better to use spaces as text) Is tab-delimited output useful (also for other programs)? Do we need some kind of XML output to replace the HTML? 
Meanwhile, I will look at cleaning up the current infoseq regards, Peter From Marc.Logghe at DEVGEN.com Fri Feb 10 13:18:13 2006 From: Marc.Logghe at DEVGEN.com (Marc Logghe) Date: Fri, 10 Feb 2006 14:18:13 +0100 Subject: [EMBOSS] infoseq table format issues Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B0A@ANTARESIA.be.devgen.com> Hi Peter, > > step) to properly import the infoseq output into a spreadsheet. Is > > there a way to force infoseq to use tabs only as field separator ? > > Yes .... but ... > > infoseq produces a report or HTML. > > Is the HTML useful? Yes, just found out that you can perfectly import this format into Excel ... > > Is the text output useful (and is it better to use spaces as text) > > Is tab-delimited output useful (also for other programs)? I think tab delimited is better (compared to spaces as delimiter). That way you can easily process the table with cut or sort and that kind of things. > > Do we need some kind of XML output to replace the HTML? > > Meanwhile, I will look at cleaning up the current infoseq Thanks and regards, Marc From fnovo at unav.es Mon Feb 13 12:29:29 2006 From: fnovo at unav.es (F.J. Novo) Date: Mon, 13 Feb 2006 13:29:29 +0100 Subject: [EMBOSS] dottup and dotmatcher in 3.0.0 Message-ID: <43F07BA9.6020606@unav.es> Hi all. There seems to be something wrong with dottup and dotmatcher in 3.0.0 (under Fedora Core 4). I consistently get an error: [fnovo at localhost ~]$ dotmatcher Displays a thresholded dotplot of two sequences Input sequence: p53mRNA.seq Second sequence: p53mRNA.seq Graph type [x11]: png Devices allowed are:- postscript ps hpgl hp7470 hp7580 meta colourps cps tektronics tekt tek4107t tek none null text data png Error: Invalid XY graph value 'x11' Died: dotmatcher terminated: Bad value for '-xygraph' and no prompt and the same for dottup. 
Needless to say, png is properly installed and works fine with other applications (including dotpath and polydot), so it seems something specific to these two programs. Has anyone else seen this? Any suggestions? Thanks, From maximilianh at gmail.com Tue Feb 14 10:11:42 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Tue, 14 Feb 2006 11:11:42 +0100 Subject: [EMBOSS] [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu> Message-ID: <76f031ae0602140211n2a0bbf4fl@mail.gmail.com> The tool ROSE also evolves sequences on a tree. There is a web interface and downloadable source at http://bibiserv.techfak.uni-bielefeld.de/rose/ Max On 09/02/06, Jason Stajich wrote: > Depending on whether or not you want to use evolutionary realistic > models... > * evolver which comes with PAML lets you evolve sequences on a tree > * SeqGen from Andrew Rambaut http://evolve.zoo.ox.ac.uk/software.html? > id=seqgen > also lets you do this > I believe there are PISE interfaces to both of these at the pasteur > bioweb site - http://bioweb.pasteur.fr/ > > -jason > On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote: > > > Does anyone know of tool to mutate a DNA sequence by a specified > > amount? > > For instance, say I have a DNA sequence 1000 bases long, and I want to > > simulate mutations to make it 75% (or 80%, etc) similar to the > > original. 
> > > > > > Ryan > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > Duke University > http://www.duke.edu/~jes12 > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris tel: +33 6 12 82 76 16 icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de skype: maximilianhaeussler From scott at cs.wits.ac.za Wed Feb 15 18:41:02 2006 From: scott at cs.wits.ac.za (Scott Hazelhurst) Date: Wed, 15 Feb 2006 20:41:02 +0200 (SAST) Subject: [EMBOSS] Emma options Message-ID: <20060215184102.6176F40F54B@midi.cs.wits.ac.za> Dear all, I am not sure if this is a question for this list or the wemboss forum -- but their site seems down so I'll ask it here. I have users who want to specify their own penalty matrices. For example, for nucleotide sequences the options are clustalw, iub or own. However, if you pick "own" nothing changes: the option to specify your own file doesn't appear. I had a look at the acd file -- not that I understand acd syntax properly, so it's like my trying to fix a broken Jumbo engine.

This seems to be the relevant list: dnamatrix [
  additional: "@(!$(acdprotein))"
  default: "i"
  minimum: "1"
  maximum: "1"
  header: "Nucleotide multiple alignment matrix options"
  values: "i:iub, c:clustalw, o:own"
  delimiter: ","
  codedelimiter: ":"
  information: "Select matrix"
  button: "Y"
  help: "This gives a menu where you are offered amenu where a single matrix (not a series) can be selected."
]

variable: umamatrix "@($(dnamatrix) == own)"

infile: mamatrix [
  additional: "@($(usermamatrix)?True:$(umamatrix))"
  default: ""
  nullok: "@($(usermamatrix)?@(!$(umamatrix)):True)"
  information: "Filename of user multiple alignment matrix"
  knowntype: "comparison matrix"
]

I changed the "additional" field in mamatrix to be additional: "@($(umamatrix))" and it seems to work, in that the file input box now appears when the "own" option is selected. As an aside, does anyone know what format the matrix file should have? I did some web searching and looked at the clustalw source code but it's not so easy to reverse-engineer. Can anyone help with either of these queries? Thanks Scott From gbottu at ben.vub.ac.be Fri Feb 17 11:27:11 2006 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Fri, 17 Feb 2006 12:27:11 +0100 Subject: [EMBOSS] Emma options - Checked by AntiVir DEMO version - In-Reply-To: <20060215184102.6176F40F54B@midi.cs.wits.ac.za> References: <20060215184102.6176F40F54B@midi.cs.wits.ac.za> Message-ID: <20060217112711.GA23686@bigben.ulb.ac.be> On Wed, Feb 15, 2006 at 08:41:02PM +0200, Scott Hazelhurst wrote: > As an aside, does anyone know the format of what the matrix file > should look like. I did some web searching and looked at the clustalw > source code but it's not so easy to re-engineer.. In case you have not already found it yourself, here's the answer (copied from a user manual I composed some time ago) :
-------------------------------------------------------------
Data files

clustal uses symbol comparison matrices for scoring bases or amino acids. CLUSTAL has built-in symbol comparison matrices, but allows you to provide your own matrix. For proteins, but not for nucleic acids, you can give a series of matrices as input. You can choose different matrices for pairwise alignment and for multiple alignment.

Single matrix input file

The format used for a single matrix is the same as that used by the BLAST program.
The scores in the new weight matrix should be similarities. You can use negative as well as positive values if you wish, although for proteins the matrix will be automatically adjusted to all positive scores, unless the -norescale option is selected. Any lines beginning with a # character are assumed to be comments. The first non-comment line should contain a list of bases or amino acids in any order, using the 1 letter code, followed by a * character. This should be followed by a square matrix of scores, with one row and one column for each base or amino acid. The last row and column of the matrix (corresponding to the * character) contain the minimum score over the whole matrix.

# Matrix made by matblas from blosum62.iij
# * column uses minimum score
# BLOSUM Clustered Scoring Matrix in 1/2 Bit Units
# Blocks Database = /data/blocks_5.0/blocks.dat
# Cluster Percentage: >= 62
# Entropy = 0.6979, Expected = -0.5209
   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -2 -1  0 -4
R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -1  0 -1 -4
N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3  3  0 -1 -4
D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3  4  1 -1 -4
C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2 -4
Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2  0  3 -1 -4
E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4
G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -1 -2 -1 -4
H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3  0  0 -1 -4
I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -3 -3 -1 -4
L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4 -3 -1 -4
K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2  0  1 -1 -4
M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -3 -1 -1 -4
F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -3 -3 -1 -4
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -2 -1 -2 -4
S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2  0  0  0 -4
T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -1 -1  0 -4
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4 -3 -2 -4
Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -3 -2 -1 -4
V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -3 -2 -1 -4
B -2 -1  3  4 -3  0  1 -1  0 -3 -4  0 -3 -3 -2  0 -1 -4 -3 -3  4  1 -1 -4
Z -1  0  0  1 -3  3  4 -2  0 -3 -3  1 -1 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4
X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1 -1 -1 -4
* -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4  1

Matrix series input format

For proteins, CLUSTAL uses by default different matrices depending on the mean percent identity of the sequences to be aligned. For proteins, but not for nucleic acids, you can specify yourself a series of matrices and the range of the percent identity for each matrix in a matrix series file. The file is automatically recognised by the word CLUSTAL_SERIES at the beginning of the file. Each matrix in the series is then specified on one line which should start with the word MATRIX. This is followed by the lower and upper limits of the sequence percent identities for which you want to apply the matrix. The final entry on the matrix line is the filename of a BLAST format matrix file (see above for details of the single matrix file format).

CLUSTAL_SERIES
MATRIX 81 100 blosum80
MATRIX 61 80 blosum62
MATRIX 31 60 blosum45
MATRIX 0 30 blosum30
----------------------------------------------------------------------
Regards, Guy Bottu, Belgian EMBnet Node From biopyte at yahoo.de Sat Feb 18 01:03:19 2006 From: biopyte at yahoo.de (Hans Meier) Date: Sat, 18 Feb 2006 02:03:19 +0100 (CET) Subject: [EMBOSS] 'octanol' - output to stdout not possible?
Message-ID: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com> Dear friends, with the general qualifiers "-filter" and "-stdout" I tried to make 'octanol -graph png' write to stdout and not to a file (for certain reasons). Anyway, a file is written and I can't catch the output directly from stdout. Is there a solution for this problem or is writing a file hardcoded? Regards, Harald From dstates at bioinformatics.med.umich.edu Fri Feb 17 21:00:19 2006 From: dstates at bioinformatics.med.umich.edu (David States) Date: Fri, 17 Feb 2006 16:00:19 -0500 Subject: [EMBOSS] Building EMBOSS in cygwin - undefined reference to _c_pl... Message-ID: <46949FAC535B4245B58BB9A1262F5EC40824@mail.bicc.med.umich.edu> I am trying to build EMBOSS-3.0.0 under cygwin (recent install) and follow the directions in http://emboss.sourceforge.net/download/cygwin.html, but when I get to ./libs/ajgraph.o, the build fails with a large number of error messages of the form
.libs/ajgraph.o:ajgraph.c:(.text+0xc4): undefined reference to `_c_plxsfnam'
.libs/ajgraph.o:ajgraph.c:(.text+0x15d): undefined reference to `_c_pladv'
.libs/ajgraph.o:ajgraph.c:(.text+0x195): undefined reference to `_c_plssub'
...
Any suggestions? David David J. States, M.D., Ph.D. Professor of Human Genetics University of Michigan School of Medicine Palmer Commons, 2035B 100 Washtenaw Rd.
Ann Arbor, MI 48109 USA email: dstates at umich.edu tel: (734) 615-5510 fax: (734) 615-6553 URL: http://stateslab.bioinformatics.med.umich.edu From maximilianh at gmail.com Sun Feb 19 13:52:37 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Sun, 19 Feb 2006 14:52:37 +0100 Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <76f031ae0602190552v5f2542dbv@mail.gmail.com> Hi bio-mailinglists, does anyone here know of a tool or a library to display two (or more) sequences at the same time with coloured features? Possibly with lines, connecting some features from one sequence to the other (synteny-plot) ? Or to display two multiple alignments, one on top of each other, with colored features added? It's not that it would be difficult to write, but programming visualisation usually takes a lot of time. Bio::Graphics seems mainly concerned with one main sequence and features on it. Well, I could copy together two of these gif-images, but then there would be no connecting lines. Same applies for the graphics in Biojava or the gff2ps tool or all the multiple alignment viewers that I know (Bioedit, ClustalX). There is something called Toucan in Java, which displays at least several lines of gff-style-features, but no visible sequences and more importantly, no connecting lines. A recent software, Djinn lite, is using a similar kind of visualization to compare different spliced genes from various species, but it's mainly aimed at splicing and written in Visual Basic. I guess a good compromise might be the 3D viewer Sockeye, but I haven't seen any synteny-lines in sockeye yet. I guess I must have missed something here. I cannot be the first one that would like to compare, say, two gff files, or two multiple alignments? 
Thanks a lot for any idea, Max From gbottu at ben.vub.ac.be Mon Feb 20 08:04:11 2006 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Mon, 20 Feb 2006 09:04:11 +0100 Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence - Checked by AntiVir In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <20060220080411.GB26540@bigben.ulb.ac.be> On Sun, Feb 19, 2006 at 02:52:37PM +0100, Maximilian Haeussler wrote: > does anyone here know of a tool or a library to display two (or more) > sequences at the same time with coloured features? Possibly with lines, > connecting some features from one sequence to the other (synteny-plot) ? > Or to display two multiple alignments, one on top of each other, with > colored features added? Well, there is Alfresco http://www.sanger.ac.uk/Software/Alfresco/ Guy Bottu, Belgian EMBnet Node From pmr at ebi.ac.uk Mon Feb 20 08:19:07 2006 From: pmr at ebi.ac.uk (pmr at ebi.ac.uk) Date: Mon, 20 Feb 2006 08:19:07 -0000 (GMT) Subject: [EMBOSS] 'octanol' - output to stdout not possible? In-Reply-To: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com> References: <20060218010319.77871.qmail@web26314.mail.ukl.yahoo.com> Message-ID: <1256.86.137.129.90.1140423547.squirrel@webmail.ebi.ac.uk> Dear Harald, > with the general qualifiers "-filter" and "-stdout" > I tried to make 'octanol -graph png' write to stdout > and not to a file (for certain reasons). > Anyway, a file is written and I can't catch the output > directly from stdout. For graphs output must go to a file because many programs will write more than one PNG file. > Is there a solution for this problem or > is writing a file hardcoded? Unfortunately the -goutfile stdout option will not work - the graphics library writes the message "Created octanol.1.png" to stdout. 
But for octanol there is a single image file produced, so you can use the command line option: -goutfile x.png and then cat x.png to stdout. For programs that write more than one PNG file (prettyplot for example) you will get x.png.1.png x.png.2.png and so on. I do not see an easy way around this for PNG files. Hope that helps, Peter From shameer at ncbs.res.in Mon Feb 20 06:21:01 2006 From: shameer at ncbs.res.in (Shameer Khadar) Date: Mon, 20 Feb 2006 11:51:01 +0530 (IST) Subject: [EMBOSS] Matrix Average Code / Module ? In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <59825.192.168.1.176.1140416461.squirrel@192.168.1.176> Hi all, Is there any program/module to calculate the average of a BLOSUM/PAM (or any other) matrix? I have a matrix and I need to see the average, for example: 11 22 43 54 50 27 87 74 32 10 66 58 98 78 20 22 23 44 16 34 I have gone through Bio::Matrix::MatrixI and Bio::Matrix::GenericMatrix and other perl modules like Math::Matrix http://search.cpan.org/~ulpfr/Math-Matrix-0.4/Matrix.pm and Math::Cephes::Matrix - but none of them has a provision for calculating a matrix average. Any help ??? thanks in advance, Happy biocomputing !!! -- Shameer Khadar National Centre for Biological Sciences (TIFR) UAS - GKVK Campus - Bellary Road Bangalore - 65 - Karnataka - India T - 91-080-23636420-32 EXT 4241 F - 91-080-23636662/23636675 W - http://www.ncbs.res.in -------------------------------------------------- "Refrain from illusions, insist on work and not words, patiently seek divine and scientific truth."
MM From d.gatherer at vir.gla.ac.uk Wed Feb 22 12:03:11 2006 From: d.gatherer at vir.gla.ac.uk (Derek Gatherer) Date: Wed, 22 Feb 2006 12:03:11 +0000 Subject: [EMBOSS] showseq and overlapping ORFs Message-ID: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> Hi EMBOSSers I'm using showseq as follows: [gath01d at gamma ebv]$ showseq hhv8.fa -format 4 -trans 105-944,1112-2764,3179-6577,6594-8681,8665-11202,11329-14367,14485-15741,15756-16979,28774-29154,30242-30769 -out test.showseq -auto EMBOSS An error in showseq.c at line 198: Translation ranges are not in ascending, non-overlapping order. As can be seen, the failure originates with the subsequence 6594-8681 slightly overlapping with the next one 8665-11202. Is there a way round this on the command line or would it require a code tweak? It would be good if there was, since often in viral genomes (this is HHV8, as it happens) ORFs are not cleanly "in ascending, non-overlapping order" as the program would seem to require. A related question: all the above are top-strand ORFs, but further down there are a few complementary strand ones. What combination of parameters would I use to indicate that I want some translated on the top and some on the bottom? I could of course use format -6 and get all six frames, but that is a bit messy for the output I want. I was thinking that maybe it would need to be something like: showseq hhv8.fa -things B,N,T,S,B,1,A,F -trans 105-944,1112-2764,3179-6577,6594-8681,8665-11202,11329-14367 -things B,N,T,S,B,-1,A,F -trans 14485-15741,15756-16979,28774-29154,30242-30769 -out test.showseq -auto ie, using things to specify that for some ORFs I want the translation on -1 instead of 1, but the above command just outputs DNA sequence with no translation. 
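The ordering rule behind the error message can be illustrated with a short sketch. It is not showseq's actual code: it assumes a range passes only if it starts after the previous range ends, and the function names are hypothetical. It also splits the list into groups that each satisfy the rule; whether running showseq once per group gives acceptable output is not confirmed in this thread.

```python
def parse_ranges(spec):
    """Parse 'start-end,start-end,...' into a list of (start, end) tuples."""
    ranges = []
    for part in spec.split(","):
        start, end = part.strip().split("-")
        ranges.append((int(start), int(end)))
    return ranges


def first_conflict(ranges):
    """Return the first adjacent pair that is not in ascending,
    non-overlapping order, or None if every pair passes."""
    for prev, cur in zip(ranges, ranges[1:]):
        if cur[0] <= prev[1]:  # assumed rule: must start after previous end
            return prev, cur
    return None


def split_into_passes(ranges):
    """Greedily distribute ranges (assumed sorted by start) over groups
    so that each group on its own is ascending and non-overlapping."""
    passes = []
    for rng in ranges:
        for group in passes:
            if rng[0] > group[-1][1]:
                group.append(rng)
                break
        else:
            passes.append([rng])
    return passes


if __name__ == "__main__":
    # The -trans ranges quoted in the message above.
    spec = ("105-944,1112-2764,3179-6577,6594-8681,8665-11202,"
            "11329-14367,14485-15741,15756-16979,28774-29154,30242-30769")
    ranges = parse_ranges(spec)
    print(first_conflict(ranges))
    for group in split_into_passes(ranges):
        print(",".join("%d-%d" % r for r in group))
```

For the ranges above, the only conflicting pair is 6594-8681 and 8665-11202, so the second group contains just 8665-11202.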
Any ideas gratefully appreciated Derek From d.gatherer at vir.gla.ac.uk Wed Feb 22 15:19:58 2006 From: d.gatherer at vir.gla.ac.uk (Derek Gatherer) Date: Wed, 22 Feb 2006 15:19:58 +0000 Subject: [EMBOSS] showseq and overlapping ORFs In-Reply-To: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> References: <6.2.3.4.1.20060222114454.02b36000@lenzie.gla.ac.uk> Message-ID: <6.2.3.4.1.20060222151212.02bb2328@lenzie.gla.ac.uk> Hello again I wonder if showseq is bugged. Look at the following: [gath01d at gamma ebv]$ showseq hhv8.fa -format 0 -trans 100-200 -out test.showseq Display a sequence with features, translation etc.. Specify your own things to display S : Sequence B : Blank line 1 : Frame1 translation 2 : Frame2 translation 3 : Frame3 translation -1 : CompFrame1 translation -2 : CompFrame2 translation -3 : CompFrame3 translation T : Ticks line N : Number ticks line C : Complement sequence F : Features R : Restriction enzyme cut sites in forward sense -R : Restriction enzyme cut sites in reverse sense A : Annotation Enter a list of things to display [B,N,T,S,A,F]: b,s,1 Choosing b,s,1 here gives: AF148805.2 Human herpesvirus 8 isolate GK18, complete genome TACTAATTTTGAAAGGCGGGGTTCTGCCAGGCATAGTCTTTTTTTGTGGCGGCCCTTGTG TAAACCTGTCTTTCAGACCTTGTTGGACATCCCGTACAATCAAGATGTTCCTGTATGTTG S R C S C M L TTTGCAGTCTGGCGGTTTGCTTTCGAGGACTATTAAGCCTTTCTCTGCAATCGTCTCCAA F A V W R F A F E D Y * A F L C N R L Q ATCTCTGCCCTGGAGTGATTTCAACGCCTTACACGTTGACCTGTCCGTCTAATACATCCT I S A L E * X TGCCAACATCCTGGTATTGCAACGATACTCGGCTTTTACGAGTGACGCAGGGAACATTGA ie. a nice frame 1 translation in the frame requested. 
but if I choose b,s,-1, I get: AF148805.2 Human herpesvirus 8 isolate GK18, complete genome TACTAATTTTGAAAGGCGGGGTTCTGCCAGGCATAGTCTTTTTTTGTGGCGGCCCTTGTG V L K S L R P E A L C L R K K H R G K H TAAACCTGTCTTTCAGACCTTGTTGGACATCCCGTACAATCAAGATGTTCCTGTATGTTG L G T K * V K N S M G Y L * S T G T H Q TTTGCAGTCTGGCGGTTTGCTTTCGAGGACTATTAAGCCTTTCTCTGCAATCGTCTCCAA K C D P P K S E L V I L G K E A I T E L ATCTCTGCCCTGGAGTGATTTCAACGCCTTACACGTTGACCTGTCCGTCTAATACATCCT D R G Q L S K L A K C T S R D T * Y M R TGCCAACATCCTGGTATTGCAACGATACTCGGCTTTTACGAGTGACGCAGGGAACATTGA A L M R T N C R Y E A K V L S A P F M S ie the frame -1 translation over the whole thing. So is -trans only really supposed to work with frame 1?? or is this a bug? I notice that transeq has in its "known bugs": "When using the '-regions' option, you should always leave the '-frames' option at the default of frame '1'." Has this been carried over into showseq? Cheers Derek From golharam at umdnj.edu Thu Feb 23 20:25:39 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Thu, 23 Feb 2006 15:25:39 -0500 Subject: [EMBOSS] MEME for EMBOSS Message-ID: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> I downloaded and installed MEME for EMBOSS. When I run 'tfm' and enter meme I get the output. The description and algorithm information is missing.... Function Motif detection Description **************** EDIT HERE **************** Algorithm **************** EDIT HERE **************** Usage Here is a sample session with meme -- Ryan Golhar - golharam at umdnj.edu The Informatics Institute of UMDNJ From ajb at ebi.ac.uk Thu Feb 23 21:32:16 2006 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Thu, 23 Feb 2006 21:32:16 -0000 (GMT) Subject: [EMBOSS] MEME for EMBOSS In-Reply-To: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> References: <001901c638b7$4f8625c0$e6028a0a@GOLHARMOBILE1> Message-ID: <55463.81.96.70.96.1140730336.squirrel@webmail.ebi.ac.uk> Hello Ryan, Yes, it is rather sparse. 
We hope to update meme to version 3.5.1 in the near future. However, the authors supply no documentation with their releases other than that gleaned from using the -help flag. The meme homepage gives similarly sparse information, although you can download a 1994 paper as a PDF. See http://meme.sdsc.edu/meme/papers.html In the meantime, meme -help will give some information. We will try to write more material for tfm in the next release. Thanks for reminding us. Alan From Ahmad.N.Abou.Tayoun at Dartmouth.EDU Sat Feb 25 02:50:28 2006 From: Ahmad.N.Abou.Tayoun at Dartmouth.EDU (Ahmad N. Abou Tayoun) Date: 24 Feb 2006 21:50:28 EST Subject: [EMBOSS] Accessing Applications Message-ID: <51444772@comet.Dartmouth.EDU> Hello, I am a graduate student at Dartmouth Medical School. I was trying to use some of the EMBOSS applications but it always gave me the error shown below. Can you please help me figure that out?

An error has been encountered in accessing this page.
1. Server: emboss.sourceforge.net
2. URL path: /apps/groups/diffseq.html
3. Error notes: File does not exist: /home/groups/e/em/emboss/htdocs/apps/groups/diffseq.html
4. Error type: 404
5. Request method: GET
6. Request query string:
7. Time: 2006-02-24 18:48:16 PST (1140835696)

Thank you a lot, username: Ahmadtayoun Ahmad From pmr at ebi.ac.uk Tue Feb 28 09:44:12 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 28 Feb 2006 09:44:12 +0000 Subject: [EMBOSS] Accessing Applications In-Reply-To: <51444772@comet.Dartmouth.EDU> References: <51444772@comet.Dartmouth.EDU> Message-ID: <44041B6C.6030600@ebi.ac.uk> Ahmad N. Abou Tayoun wrote: > Hello, > > I am a graduate student at Dartmouth Medical School. I was trying to use some of EMBOSS applications but it always gave me the error shown below. Can you please help me figure that out ? > > > An error has been encountered in accessing this page. > > 1. Server: emboss.sourceforge.net > 2. URL path: /apps/groups/diffseq.html > 3.
Error notes: File does not exist: /home/groups/e/em/emboss/htdocs/apps/groups/diffseq.html Oops ... we are making changes to the documentation on the website. We appear to have broken some links. What we are doing is to make separate documentation for the latest release and for the current development code. This has broken the links if you go via the application groups pages. You will still be able to use your local EMBOSS installation (which includes full documentation on the programs). From your message it could be you are looking for a site where you can run EMBOSS through a web interface ... if so, there are many to choose from. We do not advertise any in particular, but perhaps we should. Meanwhile, we will fix the broken links. Hope that helps, Peter From Marc.Logghe at DEVGEN.com Tue Feb 28 09:56:12 2006 From: Marc.Logghe at DEVGEN.com (Marc Logghe) Date: Tue, 28 Feb 2006 10:56:12 +0100 Subject: [EMBOSS] Accessing Applications Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> Hi Ahmad, > Hello, > > I am a graduate student at Dartmouth Medical School. I was > trying to use some of EMBOSS applications but it always gave > me the error shown below. Can you please help me figure that out ? I have noticed some changes at the EMBOSS web site as well. The docs for the individual applications can be found at this url: http://emboss.sourceforge.net/apps/cvs/index.html A direct link to the docs of diffseq is http://emboss.sourceforge.net/apps/cvs/diffseq.html. Beware that at the sourceforge web site you can not actually run the applications ! You have to install it yourself and run applications at the command line or check out some sites that publish an EMBOSS web interface. I wanted to point you to some wEMBOSS implementations at various EMBL nodes but ... in order to use that you need to register, unfortunately. 
I don't know their policies (http://bigben.vub.ac.be:6080/wEMBOSS, https://emb1.bcc.univie.ac.at/component/option,com_wrapper/Itemid,104/). Anyhow, the EMBL policies seem to differ from those used by the NCBI or EBI for instance, cos there everybody can make use of the tools. Even if you don't pay taxes ;-) HTH, Marc From pmr at ebi.ac.uk Tue Feb 28 11:06:25 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 28 Feb 2006 11:06:25 +0000 Subject: [EMBOSS] Accessing Applications In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> References: <0C528E3670D8CE4B8E013F6749231AA6746B65@ANTARESIA.be.devgen.com> Message-ID: <44042EB1.8000204@ebi.ac.uk> Marc Logghe wrote: > I have noticed some changes at the EMBOSS web site as well. The docs for > the individual applications can be found at this url: > http://emboss.sourceforge.net/apps/cvs/index.html > A direct link to the docs of diffseq is > http://emboss.sourceforge.net/apps/cvs/diffseq.html. These URLs will change as we update the website (we will change the apps/cvs part). A new path will appear for the docs for release 3.0.0. This will be the first time we have documented the latest release on the website - although few users noticed we were always documenting the latest CVS developers release :-) EMBOSS does include full documentation for all programs which is installed into the share/EMBOSS/doc/programs/html/ directory. This will change in release 4.0.0 (July 2006). In release 3.0.0 the EMBASSY packages are still documented in the same directory as the EMBOSS main programs. > I wanted to point you to some wEMBOSS implementations at various EMBL > nodes but ... in order to use that you need to register, unfortunately. > I don't know their policies (http://bigben.vub.ac.be:6080/wEMBOSS, > https://emb1.bcc.univie.ac.at/component/option,com_wrapper/Itemid,104/).
> Anyhow, the EMBL policies seem to differ from those used by the NCBI or > EBI for instance, cos there everybody can make use of the tools. Even if > you don't pay taxes ;-) I think you mean "EMBnet" ... I have to be picky as EBI is part of EMBL (European Molecular Biology Laboratory in Heidelberg, Germany) ... and also a member of EMBnet :-) National EMBnet servers provide access to their registered scientists - that is what they are funded for - and may also allow access to those outside. If you are in the UK, you no longer have a national EMBnet server as that was the HGMP/RFCGR where EMBOSS was developed until it closed in July 2005. The Austrian server machine (univie.ac.at) is about to move so may be down for a while. EBI does provide access to EMBOSS, but not a simple web interface to all the programs by name. You can also use many of the EMBOSS programs through the EBI "Toolbox", but not diffseq, which was the program that started this thread. EBI also provides simple web service access through WSEmboss from our External Services Group http://www.ebi.ac.uk/Tools/webservices/WSEmboss.html and fully featured web service access through SoapLab http://www.ebi.ac.uk/soaplab/ developed by Martin Senger in my group as part of the myGrid project (and now part of the EMBRACE project). I still plan to survey all the EMBOSS interfaces in the near future ... once funding is sorted out ... Hope that helps, Peter