From Bastien.Chevreux at dsm.com Mon May 3 10:24:29 2010 From: Bastien.Chevreux at dsm.com (Chevreux, Bastien) Date: Mon, 3 May 2010 16:24:29 +0200 Subject: [EMBOSS] Convert scf to fasta In-Reply-To: References: <0D2BD44B-6881-48DA-BA12-8465A038A048@liverpool.ac.uk>

Message-ID: > Thanks very much for the replies. I've applied to get phred and will > give that a try - sounds promising. I'll let you know how I get on. (as long as EMBOSS doesn't have a converter ...) I'm late in the reply, but if you have .ab1 files, I'd definitively take a look at TraceTuner (on SourceForge) for doing the base calling. If that doesn't suit your need: in MIRA (also on SourceForge), take a look at "scftool" if you just want to extract the called bases of an SCF. It's simple, but it should do. Regards, Bastien -- DSM Nutritional Products AG R&D Human Nutrition & Health Bioinformatics - Bldg. 203 / 115 P.O. Box 2676 CH-4002 Basel / Switzerland Tel. +41 61 815 8264 DISCLAIMER : This e-mail is for the intended recipient only If you have received it by mistake please let us know by reply and then delete it from your system; access, disclosure, copying, distribution or reliance on any of it by anyone else is prohibited. If you as intended recipient have received this e-mail incorrectly, please notify the sender (via e-mail) immediately. From jison at ebi.ac.uk Tue May 4 04:51:37 2010 From: jison at ebi.ac.uk (Jon Ison) Date: Tue, 4 May 2010 09:51:37 +0100 (BST) Subject: [EMBOSS] Workshop for Web Service Providers Message-ID: <49936.172.22.100.208.1272963097.squirrel@webmail.ebi.ac.uk> Details for a workshop which may be of interest ... EMBRACE Workshop for Web Service Providers in Bioinformatics: Syntax, Semantics and Publishing June 2-4, 2010 CBS, Department of Systems Biology, Technical University of Denmark The workshop will introduce comprehensive guidelines for design, semantic annotation and publishing of SOAP-based Web Services in the area of life sciences. The participants should be familiar with the basic concepts of Web Services, and should preferably already have produced or have been involved in the production of one or more such services. The emerging EMBRACE Guidelines for Web Service providers will be discussed in detail. 
The following issues of key importance to Web Service interoperability, easiness of use and visibility to the community will then be covered in depth: use of externally defined data types (example: BioXSD), semantic annotation (example: EDAM) and publishing (example: BioCatalogue). All the points above will be illustrated with hands-on exercises. They will be performed on the participants' own Web Services. Thus, each participant should have remote access to his/her own Web Service generation workbench. Windows laptops with SSH and X11 will be provided on request. The workshop will be held at the Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, building 208, in Lyngby, Denmark, very close to Copenhagen. It will both start and end at mid-day (Jun 2 and 4, respectively) leaving ample time to reach most European destinations on the same day. To apply for the workshop, please sign up through the on-line form at http://www.cbs.dtu.dk/courses/embrace/2010-06-02/ [1] From jeedward at yahoo.com Fri May 14 16:13:52 2010 From: jeedward at yahoo.com (John Edward) Date: Fri, 14 May 2010 13:13:52 -0700 (PDT) Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010 Message-ID: <245719.48654.qm@web45903.mail.sp1.yahoo.com> It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas. Call for papers: BCBGC-10, USA, July 2010 The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference. 
The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10.

- International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10)
- International Conference on Automation, Robotics and Control Systems (ARCS-10)
- International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10)
- International Conference on Computer Communications and Networks (CCN-10)
- International Conference on Enterprise Information Systems and Web Technologies (EISWT-10)
- International Conference on High Performance Computing Systems (HPCS-10)
- International Conference on Information Security and Privacy (ISP-10)
- International Conference on Image and Video Processing and Computer Vision (IVPCV-10)
- International Conference on Software Engineering Theory and Practice (SETP-10)
- International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World Resort, Universal Studios and Sea World Orlando.
Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining, all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando. We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details. Sincerely John Edward From charles-listes-emboss at plessy.org Tue May 25 08:24:42 2010 From: charles-listes-emboss at plessy.org (Charles Plessy) Date: Tue, 25 May 2010 21:24:42 +0900 Subject: [EMBOSS] Is it reasonable to build EMBOSS with a system copy of the zlib? Message-ID: <20100525122442.GH6611@kunpuu.plessy.org> Dear EMBOSS developers, I just realised that EMBOSS uses a local copy of the zlib. Doing so is discouraged in large distributions like Debian and Fedora, where source code factorisation is an essential strategy for keeping thousands of packages up to date. I am tempted to apply to the Debian source package the patch that is already in use in Fedora, to use the system's zlib (now version 1.2.3.4, but version 1.2.5 is under way). http://cvs.fedoraproject.org/viewvc/rpms/EMBOSS/devel/EMBOSS-6.2.0-system-zlib.patch?revision=1.1&view=markup Does EMBOSS depend on a specific modification of the zlib? Have a nice day, -- Charles Plessy Debian Med packaging team http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan From jchavird at gmail.com Wed May 26 14:50:05 2010 From: jchavird at gmail.com (Justin Havird) Date: Wed, 26 May 2010 13:50:05 -0500 Subject: [EMBOSS] Tranalign relaxation? Message-ID: Hi, I am trying to align nucleic acid sequences based on amino acid alignments using the program tranalign. The program normally works fine for me, but lately I have been using mitochondrial genes and am beginning to run into problems. These occur when the nucleotide sequence does not match the amino acid translation exactly. For example, in the prawn M. 
japonicus, the first amino acid (MET) in the COX1 gene is encoded by the codon "ACG" rather than the typical "ATG". Tranalign doesn't recognize ACG as encoding MET, so it throws up this message: Error: Guide protein sequence M. japonicus not found in nucleic sequence M. japonicus These errors occur on a taxa by taxa basis and are usually because of the first codon. However, the error also occurs when the nucleotide sequence has an ambiguous nucleotide (e.g., Y), even if the ambiguous nucleotide position doesn't affect the translation (e.g., both GTC and GTT = VAL). I can usually pinpoint the error to a specific nucleotide/codon like in these examples. These errors are relatively rare, but happen more frequently in some groups (inverts and fishes mostly). So, does anyone know a way to "relax" the tranalign translation rules to circumvent this problem? Or have another program/solution? I think the user from the message below had a similar problem, but I see no answer. :( Thanks! Justin >From Nov 12, 2006: > Hello - I'm trying to use tranalign to align DNA sequences but it > keeps throwing errors. I tested it on the example input files from > the documentation web pages and those work fine. > > Error: Guide protein sequence SS1G01814 not found in nucleic sequence > SS1G01814 > > it throws the same error for every pair of proteins in the file > > here are the sequence names in the files. I can supply the full > files if anyone thinks they can help. Yes, please send the full files to emboss-bug at emboss.open-bio.org The message is not an ID mismatch - it says the protein sequence did not match the DNA sequence. regards, Peter Rice ********************************************************************** Justin C. 
Havird Department of Biological Sciences & Cellular and Molecular Biosciences Program Auburn University 101 Life Science Building Auburn, AL 36849 Tele # (334) 844-3223 Fax # (334) 844-1645 Email: jhavird at auburn.edu Lab Website: http://www.auburn.edu/~santosr/ ********************************************************************** From sebastien.moretti at unil.ch Fri May 28 06:35:57 2010 From: sebastien.moretti at unil.ch (Sébastien MORETTI) Date: Fri, 28 May 2010 12:35:57 +0200 Subject: [EMBOSS] Output files based on input argument Message-ID: <4BFF9C8D.4010006@unil.ch> Hi I try to define output section for one of my tool but I cannot find the right way: The tool takes a mandatory argument and returns three different files with names starting by the mandatory argument. e.g.: input ./tool --pdb=2SRC outputs 2SRC.out, 2SRC.gff, 2SRC.bed How to properly code this in ACD format ? Thanks -- Sébastien Moretti SIB Vital-IT EMBnet, Quartier Sorge - Genopode CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4079/4221 http://ch.embnet.org/ http://myhits.vital-it.ch/ From pmr at ebi.ac.uk Fri May 28 12:01:47 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 28 May 2010 17:01:47 +0100 Subject: [EMBOSS] Output files based on input argument In-Reply-To: <4BFF9C8D.4010006@unil.ch> References: <4BFF9C8D.4010006@unil.ch> Message-ID: <4BFFE8EB.7070304@ebi.ac.uk> On 28/05/2010 11:35, Sébastien MORETTI wrote: > Hi > > I try to define output section for one of my tool but I cannot find the > right way: > > The tool takes a mandatory argument and returns three different files > with names starting by the mandatory argument. > e.g.: input ./tool --pdb=2SRC > outputs 2SRC.out, 2SRC.gff, 2SRC.bed > > > How to properly code this in ACD format ? Two choices. 
The easy way is to get the 2SRC from an input file:

infile: pdb [
  information: "PDB file name"
]

outfile: outfile [
  extension: "out"
]

outfile: gffoutfile [
  extension: "gff"
]

outfile: bedoutfile [
  extension: "bed"
]

This will use the first part of the input file name to make output filenames. There is a slight catch that the names will be converted to lower case (2src.out 2src.gff 2src.bed).

If you need exactly what you ask for, assuming --pdb is a string, add to each outfile definition:

  name: "$(pdb)"

The first part of the name will now default to the value of the --pdb qualifier.

Hope that helps

Peter

From koenvanderdrift at gmail.com Fri May 28 22:44:11 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Fri, 28 May 2010 22:44:11 -0400 Subject: [EMBOSS] accessing emboss ftp site In-Reply-To: <4BD1C05D.5010109@sonsorol.org> References: <6F57C2D1-8927-420C-940C-C6EC0C62AABE@gmail.com> <4BD0DB9B.5050005@sonsorol.org> <4BD1C05D.5010109@sonsorol.org> Message-ID: Hi Chris, Did you have a chance to look at this? Just tried again, and Transmit still won't let me access the emboss ftp site. Thanks, - Koen. On Apr 23, 2010, at 11:44 AM, Chris Dagdigian wrote: > > In the last few months the open-bio.org servers switched > datacenters, IP addresses and firewall/IDS appliances. Lots of juicy > things to look at and debug. > > Koen - if you have a chance can you send me the IP address that you > are using to connect from? I might be able to find some relevant log > entries with that info. > > -Chris > > > > Koen van der Drift wrote: >> Just for the record, it used to work with Transmit, this is only from >> the last few months. >> >> - Koen. >> >> On Thu, Apr 22, 2010 at 7:28 PM, Chris Dagdigian >> wrote: >>> Might be an issue with the Juniper Netscreen firewall/IDS security >>> appliance >>> that sits upstream of the EMBOSS FTP server. I'll take a look at the >>> security logs and alerts. 
>>> >>> -Chris >>> >>> >>> Koen van der Drift wrote: >>>> Hi, >>>> >>>> For a while now I am unable to access the emboss ftp site using >>>> the OS X >>>> client Transmit. Loggin in works fine, but it chokes on the LIST >>>> command. I have no problems accessing it from the command line. I >>>> have >>>> added the output from Transmit below. I don't know if this is a >>>> Transmit >>>> or emboss issue, but just wanted to let you know. >>>> >>>> Thanks, >>>> >>>> - Koen. >>>> >>>> >>>> Transmit 3.6.9 Session Transcript >>>> LibNcFTP 3.2.1 (August 13, 2007) compiled for UNIX >>>> Uname: Darwin|exile.local|9.8.0|Darwin Kernel Version 9.8.0: Wed >>>> Jul 15 >>>> 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC|Power Macintosh >>>> 220: (vsFTPd 2.0.1) >>>> Connected to emboss.open-bio.org. >>>> Cmd: USER anonymous >>>> 331: Please specify the password. >>>> Cmd: PASS NcFTP@ >>>> 230: Login successful. >>>> Cmd: TYPE A >>>> 200: Switching to ASCII mode. >>>> Logged in to emboss.open-bio.org as anonymous. >>>> Cmd: SYST >>>> 215: UNIX Type: L8 >>>> Cmd: PWD >>>> 257: "/" >>>> Cmd: CWD /pub/EMBOSS/fixes >>>> 250: Directory successfully changed. >>>> Cmd: PWD >>>> 257: "/pub/EMBOSS/fixes" >>>> Cmd: PASV >>>> 227: Entering Passive Mode (208,94,50,58,83,232) >>>> Cmd: LIST -a >>>> Could not read reply from control connection -- timed out. >>>> (SReadline 1) >>>> 220: (vsFTPd 2.0.1) >>>> Connected to emboss.open-bio.org. >>>> Cmd: USER anonymous >>>> 331: Please specify the password. >>>> Cmd: PASS NcFTP@ >>>> 230: Login successful. >>>> Logged in to emboss.open-bio.org as anonymous. >>>> Cmd: SYST >>>> 215: UNIX Type: L8 >>>> Cmd: PWD >>>> 257: "/" >>>> Cmd: CWD /pub/EMBOSS/fixes >>>> 250: Directory successfully changed. >>>> Cmd: PWD >>>> 257: "/pub/EMBOSS/fixes" >>>> Cmd: PASV >>>> 227: Entering Passive Mode (208,94,50,58,222,100) >>>> Cmd: LIST -a >>>> Could not read reply from control connection -- timed out. 
>>>> (SReadline 1) >>>> >>>> _______________________________________________ >>>> EMBOSS mailing list >>>> EMBOSS at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/emboss From sebastien.moretti at unil.ch Sat May 29 05:09:59 2010 From: sebastien.moretti at unil.ch (Sébastien Moretti) Date: Sat, 29 May 2010 11:09:59 +0200 Subject: [EMBOSS] Output files based on input argument In-Reply-To: <4BFFE8EB.7070304@ebi.ac.uk> References: <4BFF9C8D.4010006@unil.ch> <4BFFE8EB.7070304@ebi.ac.uk> Message-ID: <4C00D9E7.1020705@unil.ch> I will try this. Thanks >> Hi >> >> I try to define output section for one of my tool but I cannot find the >> right way: >> >> The tool takes a mandatory argument and returns three different files >> with names starting by the mandatory argument. >> e.g.: input ./tool --pdb=2SRC >> outputs 2SRC.out, 2SRC.gff, 2SRC.bed >> >> >> How to properly code this in ACD format ? > > Two choices. The easy way is to get the 2SRC from an input file: > > infile: pdb [ > information: "PDB file name" > ] > > outfile: outfile [ > extension: "out" > ] > > outfile: gffoutfile [ > extension: "gff" > ] > > outfile: bedoutfile [ > extension: "bed" > ] > > This will use the first part of the input file name to make output > filenames. There is a slight catch that the names will be converted to > lower case (2src.out 2src.gff 2src.bed) > > If you need exactly what you ask for, assuming --pdb is a string, add to > each outfile definition: > > name: "$(pdb)" > > The first part of the name will now default to the value of the --pdb > qualifier. > > Hope that helps > > Peter -- Sébastien Moretti SIB Vital-IT EMBnet, Quartier Sorge - Genopode CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4079/4221 From biopython at maubp.freeserve.co.uk Mon May 31 06:57:44 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 May 2010 11:57:44 +0100 Subject: [EMBOSS] Tranalign relaxation? 
In-Reply-To: References: Message-ID: On Wed, May 26, 2010 at 7:50 PM, Justin Havird wrote: > > Hi, > > I am trying to align nucleic acid sequences based on amino acid alignments > using the program tranalign. The program normally works fine for me, but > lately I have been using mitochondrial genes and am beginning to run into > problems. > > These occur when the nucleotide sequence does not match the amino acid > translation exactly. For example, in the prawn M. japonicus, the first amino > acid (MET) in the COX1 gene is encoded by the codon "ACG" rather than the > typical "ATG". Tranalign doesn't recognize ACG as encoding MET, so it throws > up this message: > > Error: Guide protein sequence M. japonicus not found in nucleic sequence M. > japonicus > > These errors occur on a taxa by taxa basis and are usually because of the > first codon. However, the error also occurs when the nucleotide sequence has > an ambiguous nucleotide (e.g., Y), even if the ambiguous nucleotide position > doesn't affect the translation (e.g., both GTC and GTT = VAL). I can usually > pinpoint the error to a specific nucleotide/codon like in these examples. > > These errors are relatively rare, but happen more frequently in some groups > (inverts and fishes mostly). > > So, does anyone know a way to "relax" the tranalign translation rules to > circumvent this problem? Or have another program/solution? Hi Justin, This might be a silly question, but have you used the tranalign argument -table to specify which genetic code table to use? I'd guess you probably want the Vertebrate Mitochondrial Code instead of the Standard Code. Peter C. From Bastien.Chevreux at dsm.com Mon May 3 14:24:29 2010 From: Bastien.Chevreux at dsm.com (Chevreux, Bastien) Date: Mon, 3 May 2010 16:24:29 +0200 Subject: [EMBOSS] Convert scf to fasta In-Reply-To: References: <0D2BD44B-6881-48DA-BA12-8465A038A048@liverpool.ac.uk>

Message-ID: > Thanks very much for the replies. I've applied to get phred and will > give that a try - sounds promising. I'll let you know how I get on. (as long as EMBOSS doesn't have a converter ...) I'm late in the reply, but if you have .ab1 files, I'd definitively take a look at TraceTuner (on SourceForge) for doing the base calling. If that doesn't suit your need: in MIRA (also on SourceForge), take a look at "scftool" if you just want to extract the called bases of an SCF. It's simple, but it should do. Regards, Bastien -- DSM Nutritional Products AG R&D Human Nutrition & Health Bioinformatics - Bldg. 203 / 115 P.O. Box 2676 CH-4002 Basel / Switzerland Tel. +41 61 815 8264 DISCLAIMER : This e-mail is for the intended recipient only If you have received it by mistake please let us know by reply and then delete it from your system; access, disclosure, copying, distribution or reliance on any of it by anyone else is prohibited. If you as intended recipient have received this e-mail incorrectly, please notify the sender (via e-mail) immediately. From jison at ebi.ac.uk Tue May 4 08:51:37 2010 From: jison at ebi.ac.uk (Jon Ison) Date: Tue, 4 May 2010 09:51:37 +0100 (BST) Subject: [EMBOSS] Workshop for Web Service Providers Message-ID: <49936.172.22.100.208.1272963097.squirrel@webmail.ebi.ac.uk> Details for a workshop which may be of interest ... EMBRACE Workshop for Web Service Providers in Bioinformatics: Syntax, Semantics and Publishing June 2-4, 2010 CBS, Department of Systems Biology, Technical University of Denmark The workshop will introduce comprehensive guidelines for design, semantic annotation and publishing of SOAP-based Web Services in the area of life sciences. The participants should be familiar with the basic concepts of Web Services, and should preferably already have produced or have been involved in the production of one or more such services. The emerging EMBRACE Guidelines for Web Service providers will be discussed in detail. 
The following issues of key importance to Web Service interoperability, easiness of use and visibility to the community will then be covered in depth: use of externally defined data types (example: BioXSD), semantic annotation (example: EDAM) and publishing (example: BioCatalogue). All the points above will be illustrated with hands-on exercises. They will be performed on the participants' own Web Services. Thus, each participant should have remote access to his/her own Web Service generation workbench. Windows laptops with SSH and X11 will be provided on request. The workshop will be held at the Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, building 208, in Lyngby, Denmark, very close to Copenhagen. It will both start and end at mid-day (Jun 2 and 4, respectively) leaving ample time to reach most European destinations on the same day. To apply for the workshop, please sign up through the on-line form at http://www.cbs.dtu.dk/courses/embrace/2010-06-02/ [1] From jeedward at yahoo.com Fri May 14 20:13:52 2010 From: jeedward at yahoo.com (John Edward) Date: Fri, 14 May 2010 13:13:52 -0700 (PDT) Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010 Message-ID: <245719.48654.qm@web45903.mail.sp1.yahoo.com> It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas. Call for papers: BCBGC-10, USA, July 2010 The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference. 
The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10. ? International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10) ? International Conference on Automation, Robotics and Control Systems (ARCS-10) ? International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) ? International Conference on Computer Communications and Networks (CCN-10) ? International Conference on Enterprise Information Systems and Web Technologies (EISWT-10) ? International Conference on High Performance Computing Systems (HPCS-10) ? International Conference on Information Security and Privacy (ISP-10) ? International Conference on Image and Video Processing and Computer Vision (IVPCV-10) ? International Conference on Software Engineering Theory and Practice (SETP-10) ? International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10) MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World? Resort, Universal Studios and Sea World Orlando. 
Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining ? all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando. We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details. Sincerely John Edward From charles-listes-emboss at plessy.org Tue May 25 12:24:42 2010 From: charles-listes-emboss at plessy.org (Charles Plessy) Date: Tue, 25 May 2010 21:24:42 +0900 Subject: [EMBOSS] Is it reasonable to build EMBOSS with a system copy of the zlib? Message-ID: <20100525122442.GH6611@kunpuu.plessy.org> Dear EMBOSS developers, I just realised that EMBOSS uses a local copy of the zlib. Doing so is discouraged in large distributions like Debian and Fedora, where source code factorisation is an essential strategy for keeping thousands of packages up to date. I am tempted to apply to the Debian source package the patch that is already in use in Fedora, to use the system's zlib (now version 1.2.3.4, but version 1.2.5 is under way). http://cvs.fedoraproject.org/viewvc/rpms/EMBOSS/devel/EMBOSS-6.2.0-system-zlib.patch?revision=1.1&view=markup Does EMBOSS depend on a specific modification of the zlib? Have a nice day, -- Charles Plessy Debian Med packaging team http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan From jchavird at gmail.com Wed May 26 18:50:05 2010 From: jchavird at gmail.com (Justin Havird) Date: Wed, 26 May 2010 13:50:05 -0500 Subject: [EMBOSS] Tranalign relaxation? Message-ID: Hi, I am trying to align nucleic acid sequences based on amino acid alignments using the program tranalign. The program normally works fine for me, but lately I have been using mitochondrial genes and am beginning to run into problems. These occur when the nucleotide sequence does not match the amino acid translation exactly. For example, in the prawn M. 
japonicus, the first amino acid (MET) in the COX1 gene is encoded by the codon "ACG" rather than the typical "ATG". Tranalign doesn't recognize ACG as encoding MET, so it throws up this message: Error: Guide protein sequence M. japonicus not found in nucleic sequence M. japonicus These errors occur on a taxa by taxa basis and are usually because of the first codon. However, the error also occurs when the nucleotide sequence has an ambiguous nucleotide (e.g., Y), even if the ambiguous nucleotide position doesn't affect the translation (e.g., both GTC and GTT = VAL). I can usually pinpoint the error to a specific nucleotide/codon like in these examples. These errors are relatively rare, but happen more frequently in some groups (inverts and fishes mostly). So, does anyone know a way to "relax" the tranalign translation rules to circumvent this problem? Or have another program/solution? I think the user from the message below had a similar problem, but I see no answer. :( Thanks! Justin >From Nov 12, 2006: > Hello - I'm trying to use tranalign to align DNA sequences but it > keeps throwing errors. I tested it on the example input files from > the documentation web pages and those work fine. > > Error: Guide protein sequence SS1G01814 not found in nucleic sequence > SS1G01814 > > it throws the same error for every pair of proteins in the file > > here are the sequence names in the files. I can supply the full > files if anyone thinks they can help. Yes, please send the full files to emboss-bug at emboss.open-bio.org The message is not an ID mismatch - it says the protein sequence did not match the DNA sequence. regards, Peter Rice ********************************************************************** Justin C. 
Havird Department of Biological Sciences & Cellular and Molecular Biosciences Program Auburn University 101 Life Science Building Auburn, AL 36849 Tele # (334) 844-3223 Fax # (334) 844-1645 Email: jhavird at auburn.edu Lab Website: http://www.auburn.edu/~santosr/ ********************************************************************** From sebastien.moretti at unil.ch Fri May 28 10:35:57 2010 From: sebastien.moretti at unil.ch (=?UTF-8?B?U8KOw6liYXN0aWVuIE1PUkVUVEk=?=) Date: Fri, 28 May 2010 12:35:57 +0200 Subject: [EMBOSS] Output files based on input argument Message-ID: <4BFF9C8D.4010006@unil.ch> Hi I try to define output section for one of my tool but I cannot find the right way: The tool takes a mandatory argument and returns three different files with names starting by the mandatory argument. e.g.: input ./tool --pdb=2SRC outputs 2SRC.out, 2SRC.gff, 2SRC.bed How to properly code this in ACD format ? Thanks -- S?bastien Moretti SIB Vital-IT EMBnet, Quartier Sorge - Genopode CH-1015 Lausanne, Switzerland Tel.: +41 (21) 692 4079/4221 http://ch.embnet.org/ http://myhits.vital-it.ch/ From pmr at ebi.ac.uk Fri May 28 16:01:47 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 28 May 2010 17:01:47 +0100 Subject: [EMBOSS] Output files based on input argument In-Reply-To: <4BFF9C8D.4010006@unil.ch> References: <4BFF9C8D.4010006@unil.ch> Message-ID: <4BFFE8EB.7070304@ebi.ac.uk> On 28/05/2010 11:35, S??bastien MORETTI wrote: > Hi > > I try to define output section for one of my tool but I cannot find the > right way: > > The tool takes a mandatory argument and returns three different files > with names starting by the mandatory argument. > e.g.: input ./tool --pdb=2SRC > outputs 2SRC.out, 2SRC.gff, 2SRC.bed > > > How to properly code this in ACD format ? Two choices. 
The easy way is to get the 2SRC from an input file: infile: pdb [ information: "PDB file name" ] outfile: outfile [ extension: "out" ] outfile: gffoutfile [ extension; "gff" ] outfile: bedoutfile [ extension; "bed" ] This will use the first part of the input file name to make output filenames. There is a slight catch that the names will be converted to lower case (2src.out 2src.gff 2src.bed) If you need exactly what you ask for, assuming --pdb is a string, add to each outfile definition: name: "$(pdb)" The first part of the name will now default to the value of the --pdb qualifier. Hope that helps Peter From koenvanderdrift at gmail.com Sat May 29 02:44:11 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Fri, 28 May 2010 22:44:11 -0400 Subject: [EMBOSS] accessing emboss ftp site In-Reply-To: <4BD1C05D.5010109@sonsorol.org> References: <6F57C2D1-8927-420C-940C-C6EC0C62AABE@gmail.com> <4BD0DB9B.5050005@sonsorol.org> <4BD1C05D.5010109@sonsorol.org> Message-ID: Hi Chris, Did you have a chance to look at this? Just tried again, and Transmit still won't let me access the emboss ftp site. Thanks, - Koen. On Apr 23, 2010, at 11:44 AM, Chris Dagdigian wrote: > > In the last few months the open-bio.org servers switched > datacenters, IP addresses and firewall/IDS appliances. Lots of juicy > things to look at and debug. > > Koen - if you have a chance can you send me the IP address that you > are using to connect from? I might be able to find some relevant log > entries with that info. > > -Chris > > > > Koen van der Drift wrote: >> Just for the record, it used to work with Transmit, this is only from >> the last few months. >> >> - Koen. >> >> On Thu, Apr 22, 2010 at 7:28 PM, Chris Dagdigian >> wrote: >>> Might be an issue with the Juniper Netscreen firewall/IDS security >>> appliance >>> that sits upstream of the EMBOSS FTP server. I'll take a look at the >>> security logs and alerts. 
>>>
>>> -Chris
>>>
>>>
>>> Koen van der Drift wrote:
>>>> Hi,
>>>>
>>>> For a while now I am unable to access the emboss ftp site using the
>>>> OS X client Transmit. Logging in works fine, but it chokes on the
>>>> LIST command. I have no problems accessing it from the command line.
>>>> I have added the output from Transmit below. I don't know if this is
>>>> a Transmit or emboss issue, but just wanted to let you know.
>>>>
>>>> Thanks,
>>>>
>>>> - Koen.
>>>>
>>>>
>>>> Transmit 3.6.9 Session Transcript
>>>> LibNcFTP 3.2.1 (August 13, 2007) compiled for UNIX
>>>> Uname: Darwin|exile.local|9.8.0|Darwin Kernel Version 9.8.0: Wed Jul 15 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC|Power Macintosh
>>>> 220: (vsFTPd 2.0.1)
>>>> Connected to emboss.open-bio.org.
>>>> Cmd: USER anonymous
>>>> 331: Please specify the password.
>>>> Cmd: PASS NcFTP@
>>>> 230: Login successful.
>>>> Cmd: TYPE A
>>>> 200: Switching to ASCII mode.
>>>> Logged in to emboss.open-bio.org as anonymous.
>>>> Cmd: SYST
>>>> 215: UNIX Type: L8
>>>> Cmd: PWD
>>>> 257: "/"
>>>> Cmd: CWD /pub/EMBOSS/fixes
>>>> 250: Directory successfully changed.
>>>> Cmd: PWD
>>>> 257: "/pub/EMBOSS/fixes"
>>>> Cmd: PASV
>>>> 227: Entering Passive Mode (208,94,50,58,83,232)
>>>> Cmd: LIST -a
>>>> Could not read reply from control connection -- timed out.
>>>> (SReadline 1)
>>>> 220: (vsFTPd 2.0.1)
>>>> Connected to emboss.open-bio.org.
>>>> Cmd: USER anonymous
>>>> 331: Please specify the password.
>>>> Cmd: PASS NcFTP@
>>>> 230: Login successful.
>>>> Logged in to emboss.open-bio.org as anonymous.
>>>> Cmd: SYST
>>>> 215: UNIX Type: L8
>>>> Cmd: PWD
>>>> 257: "/"
>>>> Cmd: CWD /pub/EMBOSS/fixes
>>>> 250: Directory successfully changed.
>>>> Cmd: PWD
>>>> 257: "/pub/EMBOSS/fixes"
>>>> Cmd: PASV
>>>> 227: Entering Passive Mode (208,94,50,58,222,100)
>>>> Cmd: LIST -a
>>>> Could not read reply from control connection -- timed out.
>>>> (SReadline 1)
>>>>
>>>> _______________________________________________
>>>> EMBOSS mailing list
>>>> EMBOSS at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/emboss

From sebastien.moretti at unil.ch Sat May 29 09:09:59 2010
From: sebastien.moretti at unil.ch (Sébastien Moretti)
Date: Sat, 29 May 2010 11:09:59 +0200
Subject: [EMBOSS] Output files based on input argument
In-Reply-To: <4BFFE8EB.7070304@ebi.ac.uk>
References: <4BFF9C8D.4010006@unil.ch> <4BFFE8EB.7070304@ebi.ac.uk>
Message-ID: <4C00D9E7.1020705@unil.ch>

I will try this.
Thanks

--
Sébastien Moretti
SIB Vital-IT EMBnet, Quartier Sorge - Genopode
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4079/4221

From biopython at maubp.freeserve.co.uk Mon May 31 10:57:44 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 May 2010 11:57:44 +0100
Subject: [EMBOSS] Tranalign relaxation?
In-Reply-To:
References:
Message-ID:

On Wed, May 26, 2010 at 7:50 PM, Justin Havird wrote:
>
> Hi,
>
> I am trying to align nucleic acid sequences based on amino acid alignments
> using the program tranalign. The program normally works fine for me, but
> lately I have been using mitochondrial genes and am beginning to run into
> problems.
>
> These occur when the nucleotide sequence does not match the amino acid
> translation exactly. For example, in the prawn M. japonicus, the first amino
> acid (MET) in the COX1 gene is encoded by the codon "ACG" rather than the
> typical "ATG". Tranalign doesn't recognize ACG as encoding MET, so it throws
> up this message:
>
> Error: Guide protein sequence M. japonicus not found in nucleic sequence
> M. japonicus
>
> These errors occur on a taxa by taxa basis and are usually because of the
> first codon. However, the error also occurs when the nucleotide sequence has
> an ambiguous nucleotide (e.g., Y), even if the ambiguous nucleotide position
> doesn't affect the translation (e.g., both GTC and GTT = VAL). I can usually
> pinpoint the error to a specific nucleotide/codon like in these examples.
>
> These errors are relatively rare, but happen more frequently in some groups
> (inverts and fishes mostly).
>
> So, does anyone know a way to "relax" the tranalign translation rules to
> circumvent this problem? Or have another program/solution?

Hi Justin,

This might be a silly question, but have you used the tranalign argument -table to specify which genetic code table to use? I'd guess you probably want the Vertebrate Mitochondrial Code instead of the Standard Code.

Peter C.
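
To illustrate the two points raised in this thread (the choice of genetic code table, and ambiguous nucleotides that do not change the translation), here is a minimal Python sketch. It is not how tranalign works internally. The function name translate_codon and the two partial codon tables are assumptions made for this example, and only the handful of codons needed here are listed.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Illustrative subset of the standard genetic code (NOT a full table).
STANDARD = {
    "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V",
    "ATG": "M", "ATA": "I", "ACG": "T", "TGA": "*",
}

# The vertebrate mitochondrial code differs for these two codons
# (again only a subset, enough for the illustration).
VERTEBRATE_MITO = dict(STANDARD, ATA="M", TGA="W")


def translate_codon(codon, table):
    """Translate one codon, tolerating IUPAC ambiguity codes.

    Hypothetical helper: the codon is expanded into every unambiguous
    codon it could represent. If all of them give the same amino acid,
    that amino acid is returned; otherwise (or if a codon is missing
    from the partial table) 'X' is returned.
    """
    choices = [IUPAC[base] for base in codon.upper()]
    expansions = ("".join(bases) for bases in product(*choices))
    amino_acids = {table.get(c, "X") for c in expansions}
    return amino_acids.pop() if len(amino_acids) == 1 else "X"


if __name__ == "__main__":
    # GTY expands to GTC and GTT, both valine, so the ambiguity is harmless.
    print(translate_codon("GTY", STANDARD))
    # ATA is isoleucine in the standard code, methionine in vertebrate mitochondria.
    print(translate_codon("ATA", STANDARD), translate_codon("ATA", VERTEBRATE_MITO))
    # ACG is threonine in both of these tables.
    print(translate_codon("ACG", STANDARD), translate_codon("ACG", VERTEBRATE_MITO))
```

In this sketch the table choice changes how ATA and TGA translate, while ACG stays threonine in both tables, so an ACG start codon annotated as MET would still mismatch under either of them.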