From ajb at ebi.ac.uk Tue Mar 2 05:44:51 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 10:44:51 -0000 (UTC)
Subject: [EMBOSS] EMBOSS-6.2.0 patch 1-18 available
Message-ID: <34743.86.26.12.63.1267526691.squirrel@webmail.ebi.ac.uk>

The first patch file for the EMBOSS-6.2.0 release is now available at:

ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/patch-1-18.gz

Discrete files used to create the above patch are held in the directory:

ftp://emboss.open-bio.org/pub/EMBOSS/fixes/

The file README.fixes in the same directory describes what the fixes address and is attached to this email for convenience.

A new mEMBOSS incorporating all relevant changes from the above is available as:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/mEMBOSS-6.2.0.2-setup.exe

Alan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: README.fixes
Type: application/octet-stream
Size: 5674 bytes
Desc: not available
URL:

From ajb at ebi.ac.uk Tue Mar 2 06:22:58 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 11:22:58 -0000 (UTC)
Subject: [EMBOSS] mEMBOSS-6.2.0.1 reinstated
Message-ID: <56161.86.26.12.63.1267528978.squirrel@webmail.ebi.ac.uk>

A stability problem has been noticed with mEMBOSS-6.2.0.2 on the ftp server. As a result we've reinstated mEMBOSS-6.2.0.1. A further announcement will be posted when things are resolved.

Apologies for any inconvenience.

Alan

From ajb at ebi.ac.uk Tue Mar 2 11:03:30 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 16:03:30 -0000 (UTC)
Subject: [EMBOSS] mEMBOSS 6.2.0.2 re-released
Message-ID: <57834.86.26.12.63.1267545810.squirrel@webmail.ebi.ac.uk>

The stability issues have been resolved.
mEMBOSS 6.2.0.2 is now re-released as:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/beta/mEMBOSS-6.2.0.2-setup.exe

Note that this release is based on the current developers' CVS code and, as such, has not had the rigorous testing performed for major releases (or for patches to the UNIX version of EMBOSS). We are providing it as a beta release in the event it may be useful.

Alan

From michael.watson at bbsrc.ac.uk Fri Mar 5 09:26:06 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 5 Mar 2010 14:26:06 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to bootstrap with FPROTPARS
Message-ID: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>

Hello

Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.

The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.

I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

Many thanks
Mick

From pmr at ebi.ac.uk Fri Mar 5 10:40:54 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 05 Mar 2010 15:40:54 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to bootstrap with FPROTPARS
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4B912606.9080906@ebi.ac.uk>

Dear Michael,

On 05/03/10 14:26, michael watson (IAH-C) wrote:
> Hello
>
> Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.
>
> The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.
>
> I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

My understanding of the phylip documentation is that you use (EMBOSS name) fseqboot to generate the bootstrap resampling of your original sequences and then use fprotpars to analyse the resulting output.

In the original phylip package the seqboot application bootstraps several types of data. In the EMBASSY package, to make the input types clearer, we split it into fseqboot, fseqbootall, fdiscboot, ffreqboot and frestboot.

Hope that helps,

Peter Rice

From jeedward at yahoo.com Fri Mar 5 19:35:35 2010
From: jeedward at yahoo.com (John Edward)
Date: Fri, 5 Mar 2010 16:35:35 -0800 (PST)
Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010
Message-ID: <915762.86810.qm@web45916.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas.

Call for papers: BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org) will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference.

The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields.
Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10)
* International Conference on Automation, Robotics and Control Systems (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision (IVPCV-10)
* International Conference on Software Engineering Theory and Practice (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World Resort, Universal Studios and Sea World Orlando. Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining, all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details.
Sincerely
John Edward

From mbk0asis at gmail.com Sun Mar 7 10:05:27 2010
From: mbk0asis at gmail.com (Byungkuk Min)
Date: Sun, 7 Mar 2010 07:05:27 -0800
Subject: [EMBOSS] A question about 'showdb'
Message-ID: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>

When I typed 'showdb', no list of databases appeared like the example in the tutorial.
How can I set up the databases?

xxxxx at ubuntu:~$ showdb
Displays information on configured databases
# Name         Type ID  Qry All Comment
# ============ ==== ==  === === =======

From pmr at ebi.ac.uk Sun Mar 7 17:28:23 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 07 Mar 2010 22:28:23 +0000
Subject: [EMBOSS] A question about 'showdb'
In-Reply-To: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
References: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
Message-ID: <4B942887.7090608@ebi.ac.uk>

Dear Byungkuk,

On 07/03/2010 15:05, Byungkuk Min wrote:
> When I typed 'showdb', no list of databases appeared like the example in the
> tutorial.
> How can I set up the databases?

The databases are defined in a file emboss.default in the share/EMBOSS/ directory where EMBOSS is installed.

In that directory you will find a file emboss.default.template with example database definitions.

Some databases are remote (e.g. method: "srs") and can be defined and used.

Others need local data files and a local index (method: emboss and method: emblcd) created by the dbx* and dbi* programs in EMBOSS.

Let us know if you need any more help. We are working on more detailed instructions which will appear on the EMBOSS website.

regards,

Peter Rice

From shrish at ccmb.res.in Mon Mar 8 03:58:26 2010
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Mon, 8 Mar 2010 14:28:26 +0530 (IST)
Subject: [EMBOSS] (no subject)
Message-ID: <777482836.160381268038706946.JavaMail.root@127.0.0.1>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From simon.andrews at bbsrc.ac.uk Mon Mar 8 08:53:06 2010
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Mon, 8 Mar 2010 13:53:06 +0000
Subject: [EMBOSS] Data for Jasextract
Message-ID:

I've been trying to use EMBOSS to search using the Jaspar database (jaspextract / jaspscan), but with no success.

I think the problem is coming from jaspextract. TFM says:

Input file format

The input files are the uncompressed and extracted JASPAR_CORE.tgz, JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the JASPAR MatrixDir download directory of the JASPAR homepage (http://jaspar.genereg.net).

..but there are no files named that way (the only google hit to those names is the jaspextract manpage!).

The main jaspar archive file is Archive.zip. If I unzip this and run jaspextract on the expanded directory it runs with no errors or warnings, but if I subsequently try to run jaspscan I get an error saying:

Warning: Matrix file(s) *.pfm not found

EMBOSS An error in jaspscan.c at line 870:
Matrix list file JASPAR_CORE/matrix_list.txt not found

I've tried loads of different subdirectories within the JASPAR database dump, but can't find anything which actually puts data into the appropriate EMBOSS data directories.

Can anyone else make this work?

Thanks

Simon.

From ajb at ebi.ac.uk Mon Mar 8 11:07:48 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 8 Mar 2010 16:07:48 -0000 (UTC)
Subject: [EMBOSS] Data for Jasextract
In-Reply-To:
References:
Message-ID: <45660.86.26.12.63.1268064468.squirrel@webmail.ebi.ac.uk>

Hello Simon,

The Jaspar people altered the structure and content of their ftp server recently. There is a patch in the fixes/patches area of the EMBOSS ftp server which updates jaspextract and jaspscan appropriately. The README.fixes file in the 'fixes' directory explains further.

HTH

Alan

> I've been trying to use EMBOSS to search using the Jaspar database
> (jaspextract / jaspscan), but with no success.
>
> I think the problem is coming from jaspextract. TFM says:
>
> Input file format
>
> The input files are the uncompressed and extracted JASPAR_CORE.tgz,
> JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the
> JASPAR MatrixDir download directory of the JASPAR homepage
> (http://jaspar.genereg.net).
>
>
> ..but there are no files named that way (the only google hit to those
> names is the jaspextract manpage!).
>
> The main jaspar archive file is Archive.zip. If I unzip this and run
> jaspextract on the expanded directory it runs with no errors or
> warnings, but if I subsequently try to run jaspscan I get an error
> saying:
>
> Warning: Matrix file(s) *.pfm not found
>
> EMBOSS An error in jaspscan.c at line 870:
> Matrix list file JASPAR_CORE/matrix_list.txt not found
>
> I've tried loads of different subdirectories within the JASPAR
> database dump, but can't find anything which actually puts data into
> the appropriate EMBOSS data directories.
>
> Can anyone else make this work?
>
> Thanks
>
> Simon.
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>

From stephen.taylor at imm.ox.ac.uk Tue Mar 9 09:20:38 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 14:20:38 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
Message-ID: <4B965936.6030102@imm.ox.ac.uk>

Hi,

I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone on this list know if there is planned support for EMBOSS 6? We have found using our local installation of EMBOSS 6 that a few tools don't work. Is there a person who maintains the Galaxy/EMBOSS configuration?

I know this is *really* a Galaxy question but I posted this to the Galaxy list but haven't had any response so far.
:-)

Thanks,

Steve

From pmr at ebi.ac.uk Tue Mar 9 10:29:59 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 15:29:59 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
References: <4B965936.6030102@imm.ox.ac.uk>
Message-ID: <4B966977.9050101@ebi.ac.uk>

On 09/03/2010 14:20, Steve Taylor wrote:
> Hi,
>
> I notice in the Galaxy distribution there is support for EMBOSS 5. Does
> anyone on this list know if there is planned support for EMBOSS 6? We have
> found using our local installation of EMBOSS 6 that a few tools don't
> work. Is there a person who maintains the Galaxy/EMBOSS configuration?
>
> I know this is *really* a Galaxy question but I posted this to the
> Galaxy list but haven't had any response so far. :-)

I am looking into it and will be going to the Galaxy Developers meeting in May.

Any other interest among the EMBOSS users?

regards,

Peter Rice

From hrh at fmi.ch Tue Mar 9 10:40:19 2010
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Tue, 09 Mar 2010 16:40:19 +0100
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
Message-ID:

Steve

> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
> on this list know if there is planned support for EMBOSS 6? We have found using
> our local installation of EMBOSS 6 that a few tools don't work.

which tools don't work?

We are using most of the EMBOSS 6.1.0 tools in our local galaxy server. And I don't remember running into problems with the galaxy emboss 5 tool definitions (ie emboss_*.xml files).

I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions (based on the release notes) you need to make. Generally speaking: EMBOSS tools are pretty stable.

Maybe if you provide a list of problems/incompatibilities and resend this to the galaxy mailing list, you will get a response...

Hans

> Is there a person who maintains the Galaxy/EMBOSS configuration?
>
> I know this is *really* a Galaxy question but I posted this to the Galaxy list
> but haven't had any response so far. :-)
>
> Thanks,
>
> Steve
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss

From stephen.taylor at imm.ox.ac.uk Tue Mar 9 11:30:08 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 16:30:08 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To:
References:
Message-ID: <4B967790.8000609@imm.ox.ac.uk>

Hi Hans,

>> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
>> on this list know if there is planned support for EMBOSS 6? We have found using
>> our local installation of EMBOSS 6 that a few tools don't work.
>
> which tools don't work?
>
> We are using most of the EMBOSS 6.1.0 tools in our local galaxy server. And I
> don't remember running into problems with the galaxy emboss 5 tool
> definitions (ie emboss_*.xml files).
>

Ok. That's good to know.

> I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
> (based on the release notes) you need to make. Generally speaking: EMBOSS
> tools are pretty stable.
>

To give a bit of history, we are fairly new to using Galaxy and previously we used EMBOSS Explorer as our main web interface. With this we found that when EMBOSS releases changed lots of things broke, so we ended up staying with EMBOSS v3. I am hoping this is not going to be true for EMBOSS/Galaxy because they are both great tools and I want them to be used routinely without us/users worrying if things are going to break, especially if they are going to be incorporated routinely into workflows.

> Maybe if you provide a list of problems/incompatibilities and resend this to
> the galaxy mailing list, you will get a response...
>

Maybe I was a bit unlucky because I tried a few more tools and generally things are ok.
A couple of minor issues I came across:

* antigenic

(ran but gave an error)

14: antigenic on data 13
An error occurred running this job: Error: Unable to read feature tags data file 'Etags.gff3protein'

* etandem produced two outputs (not exactly an error but I wondered if it was a misconfiguration in the xml)

there may be more ...

Your email answers my question that in general EMBOSS 6 is compatible with EMBOSS 5 but probably some minor tweaks may be required for certain tools. It would be great if some form of unit testing could be employed to check compatibility with new builds.

Thanks,

Steve

> Hans
>
>
>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>
>> I know this is *really* a Galaxy question but I posted this to the Galaxy list
>> but haven't had any response so far. :-)
>>
>> Thanks,
>>
>> Steve
>> _______________________________________________
>> EMBOSS mailing list
>> EMBOSS at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/emboss
>

From pmr at ebi.ac.uk Tue Mar 9 12:11:30 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 17:11:30 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B968142.9080905@ebi.ac.uk>

On 09/03/2010 16:30, Steve Taylor wrote:

> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface. With this
> we found when EMBOSS releases changed lots of things broke, so we ended
> up staying with EMBOSS v3. I am hoping this is not going to be true for
> EMBOSS/Galaxy because they are both great tools and I want them to be
> used routinely without us/users worrying if things are going to break,
> especially if they are going to be incorporated routinely into workflows.

>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to the galaxy mailing list, you will get a response...

Yes please do ...
I am on the Galaxy list too.

> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'

Could be you have more than one version of EMBOSS running. That looks like a pure EMBOSS error suggesting EMBOSS 6 is trying to use EMBOSS 5's data directory.

Should be fixable by copying the missing file.

Peter

From n.binns at ed.ac.uk Tue Mar 9 11:59:13 2010
From: n.binns at ed.ac.uk (Nigel Binns)
Date: Tue, 09 Mar 2010 16:59:13 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B967E61.4060502@ed.ac.uk>

I'm running EMBOSS 6.2.0 (and Jemboss via JWS) with the latest patch applied and EMBOSS Explorer (v2.2.0) without any problems. The only issue I've experienced is that the link to the EMBOSS help files is broken. The workaround is to copy the EMBOSS HTML help files to the location EE expects to find them - which is not where the current release of EMBOSS places them :-)

Nigel

On 09/03/2010 16:30, Steve Taylor wrote:
> Hi Hans,
>
>>> I notice in the Galaxy distribution there is support for EMBOSS 5.
>>> Does anyone on this list know if there is planned support for EMBOSS 6?
>>> We have found using our local installation of EMBOSS 6 that a few
>>> tools don't work.
>>
>> which tools don't work?
>>
>> We are using most of the EMBOSS 6.1.0 tools in our local galaxy
>> server. And I don't remember running into problems with the galaxy
>> emboss 5 tool definitions (ie emboss_*.xml files).
>>
>
> Ok. That's good to know.
>> I haven't checked EMBOSS 6.2.0, but I guess there are just a few
>> additions (based on the release notes) you need to make. Generally
>> speaking: EMBOSS tools are pretty stable.
>>
>
> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface.
> With this we found when EMBOSS releases changed lots of things broke, so we
> ended up staying with EMBOSS v3. I am hoping this is not going to be
> true for EMBOSS/Galaxy because they are both great tools and I want
> them to be used routinely without us/users worrying if things are
> going to break, especially if they are going to be incorporated
> routinely into workflows.
>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to the galaxy mailing list, you will get a response...
>>
>
>
> Maybe I was a bit unlucky because I tried a few more tools and
> generally things are ok. A couple of minor issues I came across:
>
> * antigenic
>
> (ran but gave an error)
>
> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'
>
> * etandem produced two outputs (not exactly an error but I wondered if
> it was a misconfiguration in the xml)
>
> there may be more ...
>
> Your email answers my question that in general EMBOSS 6 is compatible
> with EMBOSS 5 but probably some minor tweaks may be required for
> certain tools. It would be great if some form of unit testing could be
> employed to check compatibility with new builds.
>
> Thanks,
>
> Steve
>
>
>
>
>> Hans
>>
>>
>>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>>
>>> I know this is *really* a Galaxy question but I posted this to the
>>> Galaxy list but haven't had any response so far. :-)
>>>
>>> Thanks,
>>>
>>> Steve
>>> _______________________________________________
>>> EMBOSS mailing list
>>> EMBOSS at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/emboss
>

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From biopython at maubp.freeserve.co.uk Fri Mar 12 07:07:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 12 Mar 2010 12:07:48 +0000
Subject: [EMBOSS] Broken links on Emboss webpages
Message-ID: <320fb6e01003120407u794bf5e6ue6c84522ac588c91@mail.gmail.com>

Hi,

I was just looking for the EMBOSS EMBASSY documentation for the PHYLIPNEW packages, and noticed they are missing from this page:

http://emboss.sourceforge.net/embassy/

Perhaps this should redirect to the latest release? i.e.

http://emboss.sourceforge.net/apps/release/6.2/embassy/index.html

I also found the links on this page seem to be broken:

http://emboss.sourceforge.net/apps/release/6.2/emboss/apps/phylogeny_molecular_sequence_group.html

Regards,

Peter

From Perdeep.Mehta at STJUDE.ORG Fri Mar 12 10:22:35 2010
From: Perdeep.Mehta at STJUDE.ORG (Mehta, Perdeep)
Date: Fri, 12 Mar 2010 09:22:35 -0600
Subject: [EMBOSS] Antwort: restrict
In-Reply-To:
References: <6EAE916704479E4BB6AB5A133BA224F728A54626D5@SJMEMXMBS11.stjude.sjcrh.local>
Message-ID: <6EAE916704479E4BB6AB5A133BA224F728A5462746@SJMEMXMBS11.stjude.sjcrh.local>

Hi List,

We now have Rebase installed locally. Strangely, I see a new error:

"Input nucleotide sequence(s): chr10.fa
Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2170"

The above example is just a test with chromosome 10. I plan to do either the whole genome (all 23 chromosomes) or 23 runs, one per chromosome. I have tested running on a queue with higher memory using the following command:

qsub -q normal-ib /path/restrict -sequence chr10.fa -enzymes hinfI -fragments -outfile chr10.res

Then it threw the following error:

"Unable to run job: Script length does not match declared length."

It may not be a restrict problem; I am just mentioning it here to see if anyone else has seen such a problem. Any guesses?
Thanks,
perdeep

From: david.bauer at bayerhealthcare.com [mailto:david.bauer at bayerhealthcare.com]
Sent: Tuesday, February 23, 2010 12:56 AM
To: Mehta, Perdeep
Cc: emboss; emboss-bounces at lists.open-bio.org
Subject: Antwort: [EMBOSS] restrict

Hi,

emboss-bounces at lists.open-bio.org wrote on 23/02/2010 00:21:38:

> I have a few questions on EMBOSS restriction analysis and will
> appreciate any ideas or thoughts on these.
>
> 1. What Rebase file we need to download to get "restrict" working? I
> tried but there are files with different formats.

Go to the /pub/rebase dir on ftp.neb.com.
Download the withrefm.xxx and proto.xxx files (xxx stands for the version number, just take the latest that's there)
Run rebaseextract -infile withrefm.xxx -protofile proto.xxx
This reformats the neb files for use with emboss.
You should now see 4 files embossre.... in the REBASE directory

> 2. Is there a maximum size limit of a nucleotide sequence that I can
> use? Can I use the whole Human genome or at least a full chromosome
> to digest with a particular restriction enzyme?

I'm not sure about the whole genome but I have used it for individual chromosomes without problems.

> 3. What program can give me the list of all possible fragments
> generated as well? Since I have not seen the output of "restrict",
> perhaps that is already doing that.

You can run restrict with the option -fragments to get them.

Hope this helps,

David.

________________________________
Email Disclaimer: www.stjude.org/emaildisclaimer

From michael.watson at bbsrc.ac.uk Thu Mar 18 05:11:59 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 18 Mar 2010 09:11:59 +0000
Subject: [EMBOSS] Memory problem with extractseq
Message-ID: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>

Hi

I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
I find it strange that extractseq reports a memory problem:

-bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
Extract regions from a sequence
Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2406

Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.

I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

Thanks
Mick

From david.bauer at bayerhealthcare.com Thu Mar 18 06:01:33 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Thu, 18 Mar 2010 11:01:33 +0100
Subject: [EMBOSS] Antwort: Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID:

Hi,

I tested this on a larger machine and the job grew to ~7.3 Gb before it output the requested sequence part. The memory size is the same for extractseq and seqret. The chromosome 1 fasta file size is ~250 Mb so it seems that EMBOSS is not very memory efficient ;-)

David.

emboss-bounces at lists.open-bio.org wrote on 18/03/2010 10:11:59:

> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta
> -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory
> available, raised at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunc()
> function, it works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory
> efficient than Bioperl?
>
> Thanks
> Mick
>
>
>
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss

From pmr at ebi.ac.uk Thu Mar 18 08:39:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 12:39:28 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA21F00.2060609@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

It appears to be in the buffering of input to detect the format.

While we try to improve the performance, you can simply specify the format:

-sformat fasta

to turn off the file input buffering.

Reading an unknown format requires a lot of input to be buffered, in case a GCG ".." checksum line appears.
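
With the qualifier above, the command from the original report would read: extractseq -sequence chr1.fasta -sformat fasta -outseq chr1_.1.fasta -regions '34415690-34415711'

To illustrate why a known format removes the need to buffer, here is a minimal Python sketch. It is not EMBOSS code: the function name extract_region and the demo data are made up for illustration, and it assumes plain FASTA input. It reads one line at a time and keeps only the pieces that overlap the requested 1-based, inclusive range, so memory use does not depend on the size of the file.

```python
import os
import tempfile


def extract_region(fasta_path, start, end):
    """Return bases start..end (1-based, inclusive) of the first FASTA record.

    Hypothetical helper for illustration only; it assumes the file is FASTA,
    so no lines have to be held back for format detection.
    """
    pieces = []
    seen = 0            # number of sequence characters read so far
    in_record = False
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if in_record:
                    break           # a second record starts; stop reading
                in_record = True
                continue
            if not in_record:
                continue
            chunk = line.strip()
            # This line covers positions seen+1 .. seen+len(chunk).
            lo = max(start, seen + 1)
            hi = min(end, seen + len(chunk))
            if lo <= hi:
                pieces.append(chunk[lo - seen - 1:hi - seen])
            seen += len(chunk)
            if seen >= end:
                break               # nothing further is needed
    return "".join(pieces)


# Small demo on a throwaway file (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.fasta")
    with open(demo, "w") as out:
        out.write(">chr_demo\nACGTACGT\nTTGGCCAA\n")
    print(extract_region(demo, 7, 10))  # GTTT
```

The sketch stops reading as soon as the end coordinate has been passed, which is only safe because the format is assumed rather than detected.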

Hope that helps

Peter

From pmr at ebi.ac.uk Thu Mar 18 09:30:12 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 13:30:12 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA22AE4.4050507@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:

Some further investigation suggests several improvements for the next release:

The input was being buffered with the entire input buffer (2000 bytes) saved per line. That is why it used so much memory. This can be reduced to a more reasonable figure (and we can save space in some other string copies).

When processing FASTA format (and various others), once the '>' line has been found it cannot fail. It will read everything up to the next '>' or continue to the end of the file. This means we can turn off buffering of FASTA input (and other formats) once they no longer have any format tests that can fail.

Both changes will have a similar effect to specifying the format on the command line for large input files. That should work for any release.

Hope that helps,

Peter

From d.m.a.martin at dundee.ac.uk Tue Mar 23 07:12:42 2010
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Tue, 23 Mar 2010 11:12:42 +0000
Subject: [EMBOSS] tfscan output
Message-ID: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>

TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?

..d

David Martin PhD
College of Life Sciences
University of Dundee
01382 388704
The University of Dundee is a Scottish Registered Charity, No. SC015096.

************************************************************
Please consider the environment. Do you really need to print this email?

The University of Dundee is a registered Scottish charity, No: SC015096

From david.bauer at bayerhealthcare.com Tue Mar 23 07:59:16 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Tue, 23 Mar 2010 12:59:16 +0100
Subject: [EMBOSS] Antwort: tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID:

Have you considered using jaspscan? It uses the JASPAR database of transcription factors (http://jaspar.cgb.ki.se/)

David.

emboss-bounces at lists.open-bio.org wrote on 23/03/2010 12:12:42:

> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no
> option to change the report format. It would be really nice to be
> able to get (eg) GFF output or similar. How easy would this be to do?
>
> ..d
>
>
> David Martin PhD
> College of Life Sciences
> University of Dundee
> 01382 388704
> The University of Dundee is a Scottish Registered Charity, No. SC015096.
>
>
>
> ************************************************************
> Please consider the environment. Do you really need to print this email?
>
> The University of Dundee is a registered Scottish charity, No: SC015096
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss

From pmr at ebi.ac.uk Tue Mar 23 09:09:51 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Mar 2010 13:09:51 +0000
Subject: [EMBOSS] tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
References: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <4BA8BD9F.1090104@ebi.ac.uk>

On 23/03/10 11:12, David Martin wrote:
> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar.
How easy would this be to do? Not difficult, but the extra line needs to be attached to all hits to meet the requirements of report formats It will be in the next release. Peter From jeedward at yahoo.com Wed Mar 24 19:59:52 2010 From: jeedward at yahoo.com (John Edward) Date: Wed, 24 Mar 2010 16:59:52 -0700 (PDT) Subject: [EMBOSS] Call for papers (Deadline Extended): BCBGC-10, USA, July 2010 Message-ID: <268706.8648.qm@web45903.mail.sp1.yahoo.com> It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas. Call for papers (Deadline Extended): BCBGC-10, USA, July 2010 The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference. The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10. ? International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10) ? 
International Conference on Automation, Robotics and Control Systems (ARCS-10) ? International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) ? International Conference on Computer Communications and Networks (CCN-10) ? International Conference on Enterprise Information Systems and Web Technologies (EISWT-10) ? International Conference on High Performance Computing Systems (HPCS-10) ? International Conference on Information Security and Privacy (ISP-10) ? International Conference on Image and Video Processing and Computer Vision (IVPCV-10) ? International Conference on Software Engineering Theory and Practice (SETP-10) ? International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10) MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World? Resort, Universal Studios and Sea World Orlando. Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining ? all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando. We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details. Sincerely John Edward From biopython at maubp.freeserve.co.uk Tue Mar 30 07:46:10 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 30 Mar 2010 12:46:10 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret Message-ID: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> Hi all, I've got some "Sanger" capillary sequence files in ABI trace file format, which I understand includes the probabilities of the 4 bases along the sequencing run. 
I'd like to extract this as a FASTQ file with meaningful quality scores based on the trace data (for use in assembly).

This doesn't seem to work - the FASTQ quality score characters are all double quotes (ASCII 34), meaning PHRED quality 1.

seqret -sformat abi -osformat fastq-sanger -sequence example.ab1 -outseq example.fastq -auto

Output as FASTA seems fine:

seqret -sformat abi -osformat fasta -sequence example.ab1 -outseq example.fasta -auto

Is ABI to FASTQ a reasonable thing to expect seqret to support? If so, could it be added to the TODO list please?

Peter C.

P.S. I'd be interested to hear suggestions for alternative tools to tackle this conversion.

From pmr at ebi.ac.uk Tue Mar 30 08:02:25 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 13:02:25 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1E851.1060607@ebi.ac.uk>

On 30/03/2010 12:46, Peter C. wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

I will take a look.
I don't recall anyone using the quality scores from ABI data when we first implemented it (at that time Staden Experiment files were the only supported output format with any quality scores).

regards,

Peter R

From biopython at maubp.freeserve.co.uk Tue Mar 30 08:17:23 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 13:17:23 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1E851.1060607@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1E851.1060607@ebi.ac.uk>
Message-ID: <320fb6e01003300517q6e9358bj4a45112d3e23c57f@mail.gmail.com>

On Tue, Mar 30, 2010 at 1:02 PM, Peter Rice wrote:
>
> On 30/03/2010 12:46, Peter C. wrote:
>>
>> Hi all,
>>
>> I've got some "Sanger" capillary sequence files in ABI trace file
>> format, which I understand includes the probabilities of the 4 bases
>> along the sequencing run. I'd like to extract this as a FASTQ file
>> with meaningful quality scores based on the trace data (for use in
>> assembly).
>>
>> This doesn't seem to work - the FASTQ quality score characters are all
>> double quotes (ASCII 34), meaning PHRED quality 1.
>
> I will take a look. I don't recall anyone using the quality scores from ABI
> data when we first implemented it (at that time Staden Experiment files
> were the only supported output format with any quality scores)
>

Thanks Peter,

Regarding other possible tools, there is the obvious choice of PHRED (although getting a copy is non-trivial), and based on this thread:
http://seqanswers.com/forums/showthread.php?t=3165

I've just tried TraceTuner 3.0.6beta which is open source (specifically, GPL v2 or later):
https://sourceforge.net/projects/tracetuner/

Running ttuner with the -nocall option, to reuse the sequence as-is from the ABI file, results in zero quality scores. Allowing ttuner to re-call the bases (the default), it can output FASTA/QUAL/PHD with meaningful qualities (from which I can easily make a FASTQ file).

Peter C.
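
For readers following the encoding discussed in this thread: in the Sanger FASTQ variant each quality character is the PHRED score plus an ASCII offset of 33, which is why a double quote (ASCII 34) decodes to PHRED 1. The Python sketch below only illustrates that arithmetic; the function names are made up for illustration and are not part of EMBOSS or seqret.

```python
# Sketch of the Sanger FASTQ quality encoding (PHRED score + ASCII offset 33).
# All names here are illustrative only; none of this is EMBOSS code.

SANGER_OFFSET = 33   # ASCII offset used by fastq-sanger
MAX_PHRED = 93       # highest score that still maps to printable ASCII ('~')


def phred_from_char(ch):
    """Decode one FASTQ quality character into its PHRED score."""
    return ord(ch) - SANGER_OFFSET


def char_from_phred(score):
    """Encode one PHRED score as a FASTQ quality character."""
    if not 0 <= score <= MAX_PHRED:
        raise ValueError("PHRED score outside the fastq-sanger range 0-93")
    return chr(score + SANGER_OFFSET)


def error_probability(score):
    """Estimated probability that the base call is wrong: 10^(-Q/10)."""
    return 10 ** (-score / 10)


if __name__ == "__main__":
    # A quality line made only of double quotes, as reported in the thread.
    for ch in '""""':
        q = phred_from_char(ch)
        print(ch, q, round(error_probability(q), 3))
```

A quality line consisting only of double quotes therefore says every base has roughly a 79% chance of being wrong, which is consistent with the observation that the scores were placeholders rather than values taken from the trace.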
From pmr at ebi.ac.uk Tue Mar 30 09:13:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:13:28 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1F8F8.8050608@ebi.ac.uk>

On 30/03/2010 12:46, Peter wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

We have code to extract various fields from ABI trace files, but I'm not familiar with the details of the format, and documentation appears hard to find.

Where do I look to find scores that we can use (and how do we convert those to phred quality scores)?

regards,

Peter

From pmr at ebi.ac.uk Tue Mar 30 09:25:53 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:25:53 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1F8F8.8050608@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk>
Message-ID: <4BB1FBE1.8030400@ebi.ac.uk>

On 30/03/2010 14:13, Peter Rice wrote:
> Where do I look to find scores that we can use (and how do we convert
> those to phred quality scores)?

Aha, found something. The field is called PCON (confidence values), with values 0-255.

There is a possibility that these could be phred scores, but I suspect they are whatever the basecaller has decided to write there.

http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf

Peter R.
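
If the PCON bytes did turn out to be PHRED scores (which, as noted above, is not confirmed), turning them into a Sanger FASTQ record would be a simple offset. The Python sketch below works under that assumption only. `pcon_to_fastq_quality` and `fastq_record` are hypothetical helpers, the input is a plain list of integers rather than a parsed ABIF file, and capping at 93 is a choice made here to stay within printable ASCII, not something taken from the ABIF specification.

```python
# Sketch only: ASSUMES the PCON confidence values (0-255) are PHRED scores,
# which the thread has not established. Helper names are hypothetical.

SANGER_OFFSET = 33   # ASCII offset used by fastq-sanger
MAX_PHRED = 93       # cap so that every score maps to printable ASCII


def pcon_to_fastq_quality(pcon_values):
    """Map a sequence of 0-255 confidence values to a FASTQ quality string."""
    chars = []
    for value in pcon_values:
        if not 0 <= value <= 255:
            raise ValueError("PCON values are single bytes (0-255)")
        capped = min(value, MAX_PHRED)   # assumption: clip rather than rescale
        chars.append(chr(capped + SANGER_OFFSET))
    return "".join(chars)


def fastq_record(name, bases, pcon_values):
    """Format one four-line FASTQ record from bases and confidence values."""
    if len(bases) != len(pcon_values):
        raise ValueError("need exactly one confidence value per base")
    quality = pcon_to_fastq_quality(pcon_values)
    return "@{}\n{}\n+\n{}\n".format(name, bases, quality)


if __name__ == "__main__":
    # Made-up values purely to show the output layout.
    print(fastq_record("example", "ACGT", [0, 1, 40, 255]), end="")
```

If the values were instead on some other scale (for example 0 for a poor base up to 255 for a perfect one), the clipping step would have to be replaced by whatever rescaling the basecaller documents.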
From ztu at msi.umn.edu Tue Mar 30 09:33:56 2010 From: ztu at msi.umn.edu (Zheng Jin Tu) Date: Tue, 30 Mar 2010 08:33:56 -0500 (CDT) Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk> References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> Message-ID: Hi Peter: You may want to check this URL about how to convert quality score: http://maq.sourceforge.net/fastq.shtml Thanks, TU ======================================= On Tue, 30 Mar 2010, Peter Rice wrote: > On 30/03/2010 14:13, Peter Rice wrote: > > > Where do I look to find scores that we can use (and how do we convert > > those to phred quality scores)? > > Aha, found something. The field is called PCON (confidence values), with > values 0-255. > > There is a possibility that these could be phred scores, but I suspect they > are whatever the basecaller has decided to write there. > > http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf > > Peter R. > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > From biopython at maubp.freeserve.co.uk Tue Mar 30 09:56:34 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 30 Mar 2010 14:56:34 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk> References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> Message-ID: <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> On Tue, Mar 30, 2010 at 2:25 PM, Peter Rice wrote: > > On 30/03/2010 14:13, Peter Rice wrote: > >> Where do I look to find scores that we can use (and how do we convert >> those to phred quality scores)? > > Aha, found something. The field is called PCON (confidence values), with > values 0-255. 
> > There is a possibility that these could be phred scores, but I suspect they > are whatever the basecaller has decided to write there. > > http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf > > Peter R. Hmm. Good question - I don't know, although if they are PHRED scores they could go unusually high (we'd expect say 0 to 50 for a raw read). It could be some other encoding (e.g. scaled from 0 for a poor base to 255 for a perfect base). Do you have any contacts at Applied Biosystems to ask? Peter C. From biopython at maubp.freeserve.co.uk Tue Mar 30 09:58:18 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 30 Mar 2010 14:58:18 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> Message-ID: <320fb6e01003300658l2742a656ge766c3fcd5a2fa44@mail.gmail.com> On Tue, Mar 30, 2010 at 2:33 PM, Zheng Jin Tu wrote: > > > Hi Peter: > > You may want to check this URL about how to > convert quality score: > > ?http://maq.sourceforge.net/fastq.shtml > > Thanks, TU Thanks - but that just covers converting between PHRED scores and Solexa Scores. Peter Rice and I are well aware of this. The question here is what do the numbers in ABI files mean? Peter C. From georgios at biotek.uio.no Wed Mar 31 14:08:07 2010 From: georgios at biotek.uio.no (Georgios Magklaras) Date: Wed, 31 Mar 2010 20:08:07 +0200 Subject: [EMBOSS] MRS/EMBOSS lecture notes and videos Message-ID: <4BB38F87.3020803@biotek.uio.no> Hi, Just to let people know (some folks expressed interest). 
You can find some interesting lecture notes, as part of an EMBnet course given in Mexico about sequence mining with EMBOSS/MRS here: http://folk.uio.no/georgios/other/mrskurs.pdf Some video shots of the presented material can be obtained from this URL: http://www.nnb.unam.mx/video/track (I will try and obtain the videos in a non-flash format, however the URL should make them available in the meantime). Best regards, GM -- Best regards, -- George Magklaras BSc (Hons) MPhil RHCE IT Systems Manager/Senior Systems Engineer The Biotechnology Center of Oslo University of Oslo http://www.biotek.uio.no http://www.no.embnet.org http://folk.uio.no/georgios
From pmr at ebi.ac.uk Fri Mar 5 15:40:54 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 05 Mar 2010 15:40:54 +0000 Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to bootstrap with FPROTPARS In-Reply-To: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk> References: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk> Message-ID: <4B912606.9080906@ebi.ac.uk> Dear Michael, On 05/03/10 14:26, michael watson (IAH-C) wrote: > Hello > > Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence. > > The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS. > > I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68 My understanding of the phylip documentation is that you use (EMBOSS name) fseqboot to generate the bootstrap resampling of your original sequences and then use fprotpars to analyse the resulting output. In the original phylip package the seqboot application bootstraps several types of data. In the EMBASSY package, to make the input types clearer, we split it into fseqboot, fseqbootall, fdiscboot, ffreqboot and frestboot. Hope that helps, Peter Rice From jeedward at yahoo.com Sat Mar 6 00:35:35 2010 From: jeedward at yahoo.com (John Edward) Date: Fri, 5 Mar 2010 16:35:35 -0800 (PST) Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010 Message-ID: <915762.86810.qm@web45916.mail.sp1.yahoo.com> It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas.
Call for papers: BCBGC-10, USA, July 2010 The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference. The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10. ? International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10) ? International Conference on Automation, Robotics and Control Systems (ARCS-10) ? International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) ? International Conference on Computer Communications and Networks (CCN-10) ? International Conference on Enterprise Information Systems and Web Technologies (EISWT-10) ? International Conference on High Performance Computing Systems (HPCS-10) ? International Conference on Information Security and Privacy (ISP-10) ? International Conference on Image and Video Processing and Computer Vision (IVPCV-10) ? International Conference on Software Engineering Theory and Practice (SETP-10) ? 
International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun! Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World Resort, Universal Studios and Sea World Orlando. Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining, all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details.

Sincerely
John Edward

From mbk0asis at gmail.com Sun Mar 7 15:05:27 2010
From: mbk0asis at gmail.com (Byungkuk Min)
Date: Sun, 7 Mar 2010 07:05:27 -0800
Subject: [EMBOSS] A question about 'showdb'
Message-ID: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>

When I typed 'showdb', no list of databases appeared like the example in the tutorial.
How can I set up the databases?

xxxxx at ubuntu:~$ showdb
Displays information on configured databases
# Name         Type ID  Qry All Comment
# ============ ==== ==  === === =======

From pmr at ebi.ac.uk Sun Mar 7 22:28:23 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 07 Mar 2010 22:28:23 +0000
Subject: [EMBOSS] A question about 'showdb'
In-Reply-To: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
References: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
Message-ID: <4B942887.7090608@ebi.ac.uk>

Dear Byungkuk,

On 07/03/2010 15:05, Byungkuk Min wrote:
> When I typed 'showdb', no list of databases appeared like the example in the
> tutorial.
> How can I set up the databases?

The databases are defined in a file emboss.default in the share/EMBOSS/ directory where EMBOSS is installed.
In that directory you will find a file emboss.default.template with example database definitions.

Some databases are remote (e.g. method: "srs") and can be defined and used. Others need local data files and a local index (method: emboss and method: emblcd) created by the dbx* and dbi* programs in EMBOSS.

Let us know if you need any more help. We are working on more detailed instructions which will appear on the EMBOSS website.

regards,

Peter Rice

From shrish at ccmb.res.in Mon Mar 8 08:58:26 2010
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Mon, 8 Mar 2010 14:28:26 +0530 (IST)
Subject: [EMBOSS] (no subject)
Message-ID: <777482836.160381268038706946.JavaMail.root@127.0.0.1>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From simon.andrews at bbsrc.ac.uk Mon Mar 8 13:53:06 2010
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Mon, 8 Mar 2010 13:53:06 +0000
Subject: [EMBOSS] Data for Jasextract
Message-ID:

I've been trying to use EMBOSS to search using the Jaspar database (jaspextract / jaspscan), but with no success.

I think the problem is coming from jaspextract. TFM says:

    Input file format

    The input files are the uncompressed and extracted JASPAR_CORE.tgz,
    JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the JASPAR
    MatrixDir download directory of the JASPAR homepage
    (http://jaspar.genereg.net).

..but there are no files named that way (the only google hit to those names is the jaspextract manpage!).

The main jaspar archive file is Archive.zip.
If I unzip this and run jaspextract on the expanded directory it runs with no errors or warnings, but if I subsequently try to run jaspscan I get an error saying: Warning: Matrix file(s) *.pfm not found EMBOSS An error in jaspscan.c at line 870: Matrix list file JASPAR_CORE/matrix_list.txt not found I've tried loads of different subdirectories within the JASPAR database dump, but can't find anything which actually puts data into the appropriate EMBOSS data directories. Can anyone else make this work? Thanks Simon. From ajb at ebi.ac.uk Mon Mar 8 16:07:48 2010 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Mon, 8 Mar 2010 16:07:48 -0000 (UTC) Subject: [EMBOSS] Data for Jasextract In-Reply-To: References: Message-ID: <45660.86.26.12.63.1268064468.squirrel@webmail.ebi.ac.uk> Hello Simon, The Jaspar people altered the structure and content of their ftp server recently. There is a patch in the fixes/patches area of the EMBOSS ftp server which updates jaspextract and jaspscan appropriately. The README.fixes file in the 'fixes' directory explains further. HTH Alan > I've been trying to use EMBOSS to search using the Jaspar database > (jaspextract / jaspscan), but with no success. > > I think the problem is coming from jaspextract. TFM says: > > Input file format > > The input files are the uncompressed and extracted JASPAR_CORE.tgz, > JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the > JASPAR > MatrixDir download directory of the JASPAR homepage > (http://jaspar.genereg.net). > > > ..but there are no files named that way (the only google hit to those > names is the jaspextract manpage!). > > The main jaspar archive file is Archive.zip. 
If I unzip this and run > jaspextract on the expanded directory it runs with no errors or > warnings, but if I subsequently try to run jaspscan I get an error > saying: > > Warning: Matrix file(s) *.pfm not found > > EMBOSS An error in jaspscan.c at line 870: > Matrix list file JASPAR_CORE/matrix_list.txt not found > > I've tried loads of different subdirectories within the JASPAR > database dump, but can't find anything which actually puts data into > the appropriate EMBOSS data directories. > > Can anyone else make this work? > > Thanks > > Simon. > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From stephen.taylor at imm.ox.ac.uk Tue Mar 9 14:20:38 2010 From: stephen.taylor at imm.ox.ac.uk (Steve Taylor) Date: Tue, 09 Mar 2010 14:20:38 +0000 Subject: [EMBOSS] Galaxy and EMBOSS Message-ID: <4B965936.6030102@imm.ox.ac.uk> Hi, I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone on this list know if there planned support for EMBOSS 6? We have found using our local installation of EMBOSS 6 that a few tools don't work. Is there a person who maintains the Galaxy/EMBOSS configuration? I know this is *really* a Galaxy question but I posted this to the Galaxy list but haven't had any response so far. :-) Thanks, Steve From pmr at ebi.ac.uk Tue Mar 9 15:29:59 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 09 Mar 2010 15:29:59 +0000 Subject: [EMBOSS] Galaxy and EMBOSS In-Reply-To: <4B965936.6030102@imm.ox.ac.uk> References: <4B965936.6030102@imm.ox.ac.uk> Message-ID: <4B966977.9050101@ebi.ac.uk> On 09/03/2010 14:20, Steve Taylor wrote: > Hi, > > I notice in the Galaxy distribution there is support for EMBOSS 5. Does > anyone on this list know if there planned support for EMBOSS 6? We have > found using our local installation of EMBOSS 6 that a few tools don't > work. Is there a person who maintains the Galaxy/EMBOSS configuration? 
> > I know this is *really* a Galaxy question but I posted this to the > Galaxy list but haven't had any response so far. :-) I am looking into it and will be going to the Galaxy Developers meeting in May. Any other interest among the EMBOSS users? regards, Peter Rice From hrh at fmi.ch Tue Mar 9 15:40:19 2010 From: hrh at fmi.ch (Hotz, Hans-Rudolf) Date: Tue, 09 Mar 2010 16:40:19 +0100 Subject: [EMBOSS] Galaxy and EMBOSS In-Reply-To: <4B965936.6030102@imm.ox.ac.uk> Message-ID: Steve > I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone > on this list know if there planned support for EMBOSS 6? We have found using > our local installation of EMBOSS 6 that a few tools don't work. which tools don't work? We are using most of the EMBOSS 6.1.0 tools in or local galaxy server. And I don't remember running into problems with the galaxy emboss 5 tool definitions (ie emboss_*.xml files). I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions (based on the release notes) you need to make. Generally speaking: EMBOSS tools are pretty stable. Maybe if you provide a list of problems/incompatibilities and resend this to the galaxy mailing list, you will get a response... Hans > Is there a person who maintains the Galaxy/EMBOSS configuration? > > I know this is *really* a Galaxy question but I posted this to the Galaxy list > but haven't had any response so far. :-) > > Thanks, > > Steve > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From stephen.taylor at imm.ox.ac.uk Tue Mar 9 16:30:08 2010 From: stephen.taylor at imm.ox.ac.uk (Steve Taylor) Date: Tue, 09 Mar 2010 16:30:08 +0000 Subject: [EMBOSS] Galaxy and EMBOSS In-Reply-To: References: Message-ID: <4B967790.8000609@imm.ox.ac.uk> Hi Hans, >> I notice in the Galaxy distribution there is support for EMBOSS 5. 
Does anyone
>> on this list know if there planned support for EMBOSS 6? We have found using
>> our local installation of EMBOSS 6 that a few tools don't work.
>
> which tools don't work?
>
> We are using most of the EMBOSS 6.1.0 tools in our local galaxy server. And I
> don't remember running into problems with the galaxy emboss 5 tool
> definitions (ie emboss_*.xml files).
>

Ok. That's good to know.

> I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
> (based on the release notes) you need to make. Generally speaking: EMBOSS
> tools are pretty stable.
>

To give a bit of history, we are fairly new to using Galaxy and previously we used EMBOSS Explorer as our main web interface. With this we found when EMBOSS releases changed lots of things broke, so we ended up staying with EMBOSS v3. I am hoping this is not going to be true for EMBOSS/Galaxy because they are both great tools and I want them to be used routinely without us/users worrying if things are going to break, especially if they are going to be incorporated routinely into workflows.

> Maybe if you provide a list of problems/incompatibilities and resend this to
> the galaxy mailing list, you will get a response...
>

Maybe I was a bit unlucky because I tried a few more tools and generally things are ok. A couple of minor issues I came across:

* antigenic

(ran but gave an error)

14: antigenic on data 13
An error occurred running this job: Error: Unable to read feature tags data file 'Etags.gff3protein'

* etandem produced two outputs (not exactly an error but I wondered if it was a misconfiguration in the xml)

there may be more ...

Your email answers my question that in general EMBOSS 6 is compatible with EMBOSS 5 but probably some minor tweaks may be required for certain tools. It would be great if some form of unit testing could be employed to check compatibility with new builds.

Thanks,

Steve

> Hans
>
>
>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>
>> I know this is *really* a Galaxy question but I posted this to the Galaxy list
>> but haven't had any response so far. :-)
>>
>> Thanks,
>>
>> Steve
>> _______________________________________________
>> EMBOSS mailing list
>> EMBOSS at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/emboss
>

From pmr at ebi.ac.uk Tue Mar 9 17:11:30 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 17:11:30 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B968142.9080905@ebi.ac.uk>

On 09/03/2010 16:30, Steve Taylor wrote:
> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface. With this
> we found when EMBOSS releases changed lots of things broke, so we ended
> up staying with EMBOSS v3. I am hoping this is not going to be true for
> EMBOSS/Galaxy because they are both great tools and I want them to be
> used routinely without us/users worrying if things are going to break,
> especially if they are going to be incorporated routinely into workflows.

>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to
>> the galaxy mailing list, you will get a response...

Yes please do ... I am on the Galaxy list too.

> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'

Could be you have more than one version of EMBOSS running. That looks like a pure EMBOSS error suggesting EMBOSS 6 is trying to use EMBOSS 5's data directory. Should be fixable by copying the missing file.
Peter

From n.binns at ed.ac.uk Tue Mar 9 16:59:13 2010
From: n.binns at ed.ac.uk (Nigel Binns)
Date: Tue, 09 Mar 2010 16:59:13 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B967E61.4060502@ed.ac.uk>

I'm running EMBOSS 6.2.0 (and Jemboss via JWS) with the latest patch applied and EMBOSS Explorer (v2.2.0) without any problems. The only issue I've experienced is that the link to the EMBOSS help files is broken. The workaround is to copy the EMBOSS HTML help files to the location EE expects to find them, which is not where the current release of EMBOSS places them :-)

Nigel

On 09/03/2010 16:30, Steve Taylor wrote:
> Hi Hans,
>
>>> I notice in the Galaxy distribution there is support for EMBOSS 5.
>>> Does anyone
>>> on this list know if there is planned support for EMBOSS 6? We have
>>> found using
>>> our local installation of EMBOSS 6 that a few tools don't work.
>>
>> which tools don't work?
>>
>> We are using most of the EMBOSS 6.1.0 tools in our local galaxy
>> server. And I
>> don't remember running into problems with the galaxy emboss 5 tool
>> definitions (ie emboss_*.xml files).
>>
>
> Ok. That's good to know.
>> I haven't checked EMBOSS 6.2.0, but I guess there are just a few
>> additions
>> (based on the release notes) you need to make. Generally speaking:
>> EMBOSS
>> tools are pretty stable.
>>
>
> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface. With
> this we found when EMBOSS releases changed lots of things broke, so we
> ended up staying with EMBOSS v3. I am hoping this is not going to be
> true for EMBOSS/Galaxy because they are both great tools and I want
> them to be used routinely without us/users worrying if things are
> going to break, especially if they are going to be incorporated
> routinely into workflows.
>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to
>> the galaxy mailing list, you will get a response...
>>
>
>
> Maybe I was a bit unlucky because I tried a few more tools and
> generally things are ok. A couple of minor issues I came across:
>
> * antigenic
>
> (ran but gave an error)
>
> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'
>
> * etandem produced two outputs (not exactly an error but I wondered if
> it was a misconfiguration in the xml)
>
> there may be more ...
>
> Your email answers my question that in general EMBOSS 6 is compatible
> with EMBOSS 5 but probably some minor tweaks may be required for
> certain tools. It would be great if some form of unit testing could be
> employed to check compatibility with new builds.
>
> Thanks,
>
> Steve
>
>
>
>
>> Hans
>>
>>
>>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>>
>>> I know this is *really* a Galaxy question but I posted this to the
>>> Galaxy list
>>> but haven't had any response so far. :-)
>>>
>>> Thanks,
>>>
>>> Steve
>

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From biopython at maubp.freeserve.co.uk Fri Mar 12 12:07:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 12 Mar 2010 12:07:48 +0000
Subject: [EMBOSS] Broken links on Emboss webpages
Message-ID: <320fb6e01003120407u794bf5e6ue6c84522ac588c91@mail.gmail.com>

Hi,

I was just looking for the EMBOSS EMBASSY documentation for the PHYLIPNEW packages, and noticed they are missing from this page:

http://emboss.sourceforge.net/embassy/

Perhaps this should redirect to the latest release? i.e.
http://emboss.sourceforge.net/apps/release/6.2/embassy/index.html

I also found the links on this page seem to be broken:

http://emboss.sourceforge.net/apps/release/6.2/emboss/apps/phylogeny_molecular_sequence_group.html

Regards,

Peter

From Perdeep.Mehta at STJUDE.ORG Fri Mar 12 15:22:35 2010
From: Perdeep.Mehta at STJUDE.ORG (Mehta, Perdeep)
Date: Fri, 12 Mar 2010 09:22:35 -0600
Subject: [EMBOSS] Antwort: restrict
In-Reply-To:
References: <6EAE916704479E4BB6AB5A133BA224F728A54626D5@SJMEMXMBS11.stjude.sjcrh.local>
Message-ID: <6EAE916704479E4BB6AB5A133BA224F728A5462746@SJMEMXMBS11.stjude.sjcrh.local>

Hi List,

We now have Rebase installed locally. Strangely, I see a new error:

    Input nucleotide sequence(s): chr10.fa
    Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2170

The above example is just for testing with chromosome 10. I plan to either run the whole genome (all 23 chromosomes) at once or run 23 times, once per chromosome.

I have tested running on a queue with more memory using the following command:

    qsub -q normal-ib /path/restrict -sequence chr10.fa -enzymes hinfI -fragments -outfile chr10.res

Then it threw the following error:

    Unable to run job: Script length does not match declared length.

It may not be a restrict problem; I was just throwing it in here to see if anyone else has seen such a problem. Any guesses?

Thanks,
perdeep

From: david.bauer at bayerhealthcare.com [mailto:david.bauer at bayerhealthcare.com]
Sent: Tuesday, February 23, 2010 12:56 AM
To: Mehta, Perdeep
Cc: emboss; emboss-bounces at lists.open-bio.org
Subject: Antwort: [EMBOSS] restrict

Hi,

emboss-bounces at lists.open-bio.org wrote on 23/02/2010 00:21:38:

> I have a few questions on EMBOSS restriction analysis and will
> appreciate any ideas or thoughts on these.
>
> 1. What Rebase file we need to download to get "restrict" working? I
> tried but there are files with different formats.

Go to the /pub/rebase dir on ftp.neb.com.
Download the withrefm.xxx and proto.xxx files (xxx stands for the version number, just take the latest that's there).

Run

    rebaseextract -infile withrefm.xxx -protofile proto.xxx

This reformats the neb files for use with emboss. You should now see 4 files embossre.... in the REBASE directory.

> 2. Is there a maximum size limit of a nucleotide sequence that I can
> use? Can I use the whole Human genome or at least a full chromosome
> to digest with a particular restriction enzyme?

I'm not sure about the whole genome but I have used it for individual chromosomes without problems.

> 3. What program can give me the list of all possible fragments
> generated as well? Since I have not seen the output of "restrict",
> perhaps that is already doing that.

You can run restrict with the option -fragments to get them.

Hope this helps,
David.

________________________________
Email Disclaimer: www.stjude.org/emaildisclaimer

From michael.watson at bbsrc.ac.uk Thu Mar 18 09:11:59 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 18 Mar 2010 09:11:59 +0000
Subject: [EMBOSS] Memory problem with extractseq
Message-ID: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>

Hi

I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.

I find it strange that extractseq reports a memory problem:

    -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
    Extract regions from a sequence
    Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2406

Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.

I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?
Thanks
Mick

From david.bauer at bayerhealthcare.com Thu Mar 18 10:01:33 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Thu, 18 Mar 2010 11:01:33 +0100
Subject: [EMBOSS] Antwort: Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID:

Hi,

I tested this on a larger machine and the job grew to ~7.3 Gb before it output the requested sequence part. The memory size is the same for extractseq and seqret. The chromosome 1 fasta file size is ~250 Mb so it seems that EMBOSS is not very memory efficient ;-)

David.

emboss-bounces at lists.open-bio.org wrote on 18/03/2010 10:11:59:

> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory
> available, raised at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunc()
> function, it works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory
> efficient than Bioperl?
>
> Thanks
> Mick

From pmr at ebi.ac.uk Thu Mar 18 12:39:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 12:39:28 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA21F00.2060609@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

It appears to be in the buffering of input to detect the format.

While we try to improve the performance, you can simply specify the format:

    -sformat fasta

to turn off the file input buffering.

Reading an unknown format requires a lot of input to be buffered, in case a GCG ".." checksum line appears.

Hope that helps

Peter

From pmr at ebi.ac.uk Thu Mar 18 13:30:12 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 13:30:12 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA22AE4.4050507@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:

Some further investigation suggests several improvements for the next release:

The input was being buffered with the entire input buffer (2000 bytes) saved per line. That is why it used so much memory. This can be reduced to a more reasonable figure (and we can save space in some other string copies).

When processing FASTA format (and various others), once the '>' line has been found it cannot fail. It will read everything up to the next '>' or continue to the end of the file.
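
As a rough check of the per-line buffering explanation earlier in this message, the memory figure can be estimated with a little arithmetic. This is only a sketch: the 60-residue line width is an assumption (the thread does not say how chr1.fasta is wrapped), while the ~250 Mb file size and the 2000-byte buffer are the figures quoted in this thread. The function name is made up for illustration.

```python
# Rough estimate of memory used when every input line is stored in a
# full-size buffer. The 60-column line width is an ASSUMPTION; the file
# size and buffer size are the figures quoted in this thread.

def estimate_buffered_bytes(file_bytes, line_width, buffer_bytes):
    """Return (number_of_lines, total_bytes) if each line keeps a whole buffer."""
    bytes_per_line = line_width + 1          # sequence characters plus newline
    n_lines = file_bytes // bytes_per_line
    return n_lines, n_lines * buffer_bytes

FILE_BYTES = 250 * 10**6   # ~250 Mb chromosome 1 FASTA file
LINE_WIDTH = 60            # assumed wrap width (not stated in the thread)
BUFFER_BYTES = 2000        # per-line buffer size mentioned above

n_lines, total = estimate_buffered_bytes(FILE_BYTES, LINE_WIDTH, BUFFER_BYTES)
print(f"{n_lines} lines, about {total / 2**30:.1f} GiB buffered")
```

Under that assumption the estimate comes out at several GiB, the same order of magnitude as the ~7.3 Gb David observed, and far more than the 3Gb available on the VM. A different wrap width changes the number but not the conclusion.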
This means we can turn off buffering of FASTA input (and other formats) once they no longer have any format tests that can fail.

Both changes will have a similar effect to specifying the format on the command line for large input files. That should work for any release.

Hope that helps,

Peter

From d.m.a.martin at dundee.ac.uk Tue Mar 23 11:12:42 2010
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Tue, 23 Mar 2010 11:12:42 +0000
Subject: [EMBOSS] tfscan output
Message-ID: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>

TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?

..d

David Martin PhD
College of Life Sciences
University of Dundee
01382 388704
The University of Dundee is a Scottish Registered Charity, No. SC015096.

************************************************************
Please consider the environment. Do you really need to print this email?

The University of Dundee is a registered Scottish charity, No: SC015096

From david.bauer at bayerhealthcare.com Tue Mar 23 11:59:16 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Tue, 23 Mar 2010 12:59:16 +0100
Subject: [EMBOSS] Antwort: tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID:

Have you considered using jaspscan? It uses the JASPAR database of transcription factors (http://jaspar.cgb.ki.se/)

David.

emboss-bounces at lists.open-bio.org wrote on 23/03/2010 12:12:42:

> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no
> option to change the report format. It would be really nice to be
> able to get (eg) GFF output or similar. How easy would this be to do?
>
> ..d
>
>
> David Martin PhD
> College of Life Sciences
> University of Dundee
> 01382 388704
> The University of Dundee is a Scottish Registered Charity, No. SC015096.
From pmr at ebi.ac.uk Tue Mar 23 13:09:51 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Mar 2010 13:09:51 +0000
Subject: [EMBOSS] tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
References: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <4BA8BD9F.1090104@ebi.ac.uk>

On 23/03/10 11:12, David Martin wrote:
> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?

Not difficult, but the extra line needs to be attached to all hits to meet the requirements of report formats.

It will be in the next release.

Peter

From jeedward at yahoo.com Wed Mar 24 23:59:52 2010
From: jeedward at yahoo.com (John Edward)
Date: Wed, 24 Mar 2010 16:59:52 -0700 (PDT)
Subject: [EMBOSS] Call for papers (Deadline Extended): BCBGC-10, USA, July 2010
Message-ID: <268706.8648.qm@web45903.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your colleagues, students and individuals whose research is in bioinformatics, computational biology, genomics, data-mining, and related areas.

Call for papers (Deadline Extended): BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in Orlando, FL, USA.
BCBGC is an important event in the areas of bioinformatics, computational biology, genomics and chemoinformatics and focuses on all areas related to the conference.

The conference will be held at the same time and location where several other major international conferences will be taking place. The conference will be held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to promote research and developmental activities in computer science, information technology, control engineering, and related fields. Another goal is to promote the dissemination of research to a multidisciplinary audience and to facilitate communication among researchers, developers, practitioners in different fields. The following conferences are planned to be organized as part of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10)
* International Conference on Automation, Robotics and Control Systems (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision (IVPCV-10)
* International Conference on Software Engineering Theory and Practice (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a full-service resort that puts you in the middle of the fun!
Located 1/2 block south of the famed International Drive, the hotel is just minutes from great entertainment like Walt Disney World® Resort, Universal Studios and Sea World Orlando. Guests can enjoy free scheduled transportation to these theme parks, as well as spacious accommodations, outdoor pools and on-site dining, all situated on 10 tropically landscaped acres. Here, guests can experience a full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website http://www.PromoteResearch.org for more details.

Sincerely
John Edward

From biopython at maubp.freeserve.co.uk Tue Mar 30 11:46:10 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 12:46:10 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
Message-ID: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>

Hi all,

I've got some "Sanger" capillary sequence files in ABI trace file format, which I understand includes the probabilities of the 4 bases along the sequencing run. I'd like to extract this as a FASTQ file with meaningful quality scores based on the trace data (for use in assembly).

This doesn't seem to work - the FASTQ quality score characters are all double quotes (ASCII 34), meaning PHRED quality 1.

    seqret -sformat abi -osformat fastq-sanger -sequence example.ab1 -outseq example.fastq -auto

Output as FASTA seems fine:

    seqret -sformat abi -osformat fasta -sequence example.ab1 -outseq example.fasta -auto

Is ABI to FASTQ a reasonable thing to expect seqret to support? If so, could it be added to the TODO list please?

Peter C.

P.S. I'd be interested to hear suggestions for alternative tools to tackle this conversion.
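
For readers unfamiliar with why a double quote means PHRED quality 1: Sanger-style FASTQ stores each score as a single character whose ASCII code is the score plus 33. The sketch below illustrates that mapping only; the helper names are made up for this note and are not part of EMBOSS or any other tool mentioned in the thread.

```python
# Sanger-style FASTQ encodes each PHRED score as one printable character
# with an ASCII offset of 33. Helper names here are illustrative only.

SANGER_OFFSET = 33

def phred_to_char(q):
    """Encode a PHRED score (0..93) as a Sanger FASTQ quality character."""
    if not 0 <= q <= 93:
        raise ValueError("Sanger FASTQ can only hold PHRED scores 0..93")
    return chr(q + SANGER_OFFSET)

def char_to_phred(c):
    """Decode a Sanger FASTQ quality character back to a PHRED score."""
    return ord(c) - SANGER_OFFSET

def error_probability(q):
    """PHRED definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)

# A double quote is ASCII 34, so it decodes to PHRED 1, which corresponds
# to an error probability of roughly 0.79 (essentially "no information").
print(char_to_phred('"'), round(error_probability(1), 3))
```

So a quality line made up entirely of double quotes says that every base call is more likely wrong than right, which is why the output is not usable for assembly.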
From pmr at ebi.ac.uk Tue Mar 30 12:02:25 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 13:02:25 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1E851.1060607@ebi.ac.uk>

On 30/03/2010 12:46, Peter C. wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

I will take a look. I don't recall anyone using the quality scores from ABI data when we first implemented it (at that time Staden Experiment files were the only supported output format with any quality scores)

regards,

Peter R

From biopython at maubp.freeserve.co.uk Tue Mar 30 12:17:23 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 13:17:23 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1E851.1060607@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1E851.1060607@ebi.ac.uk>
Message-ID: <320fb6e01003300517q6e9358bj4a45112d3e23c57f@mail.gmail.com>

On Tue, Mar 30, 2010 at 1:02 PM, Peter Rice wrote:
>
> On 30/03/2010 12:46, Peter C. wrote:
>>
>> Hi all,
>>
>> I've got some "Sanger" capillary sequence files in ABI trace file
>> format, which I understand includes the probabilities of the 4 bases
>> along the sequencing run. I'd like to extract this as a FASTQ file
>> with meaningful quality scores based on the trace data (for use in
>> assembly).
>>
>> This doesn't seem to work - the FASTQ quality score characters are all
>> double quotes (ASCII 34), meaning PHRED quality 1.
>
> I will take a look. I don't recall anyone using the quality scores from ABI
> data when we first implemented it (at that time Staden Experiment files
> were the only supported output format with any quality scores)
>

Thanks Peter,

Regarding other possible tools, there is the obvious choice of PHRED (although getting a copy is non-trivial), and based on this thread:

http://seqanswers.com/forums/showthread.php?t=3165

I've just tried TraceTuner 3.0.6beta which is open source (specifically, GPL v2 or later):

https://sourceforge.net/projects/tracetuner/

Using the ttuner -nocall option to reuse the sequence as-is from the ABI file results in zero quality scores. Allowing ttuner to re-call the bases (the default), it can output FASTA/QUAL/PHD with meaningful qualities (from which I can easily make a FASTQ file).

Peter C.

From pmr at ebi.ac.uk Tue Mar 30 13:13:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:13:28 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1F8F8.8050608@ebi.ac.uk>

On 30/03/2010 12:46, Peter wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

We have code to extract various fields from ABI trace files, but I'm not familiar with the details of the format, and documentation appears hard to find.

Where do I look to find scores that we can use (and how do we convert those to phred quality scores)?
regards,

Peter

From pmr at ebi.ac.uk Tue Mar 30 13:25:53 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:25:53 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1F8F8.8050608@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk>
Message-ID: <4BB1FBE1.8030400@ebi.ac.uk>

On 30/03/2010 14:13, Peter Rice wrote:
> Where do I look to find scores that we can use (and how do we convert
> those to phred quality scores)?

Aha, found something. The field is called PCON (confidence values), with values 0-255.

There is a possibility that these could be phred scores, but I suspect they are whatever the basecaller has decided to write there.

http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf

Peter R.

From ztu at msi.umn.edu Tue Mar 30 13:33:56 2010
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Tue, 30 Mar 2010 08:33:56 -0500 (CDT)
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID:

Hi Peter:

You may want to check this URL about how to convert quality score:

http://maq.sourceforge.net/fastq.shtml

Thanks,
TU

=======================================

On Tue, 30 Mar 2010, Peter Rice wrote:

> On 30/03/2010 14:13, Peter Rice wrote:
>
> > Where do I look to find scores that we can use (and how do we convert
> > those to phred quality scores)?
>
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
>
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
>
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
>
> Peter R.
From biopython at maubp.freeserve.co.uk Tue Mar 30 13:56:34 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:56:34 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:25 PM, Peter Rice wrote:
>
> On 30/03/2010 14:13, Peter Rice wrote:
>
>> Where do I look to find scores that we can use (and how do we convert
>> those to phred quality scores)?
>
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
>
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
>
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
>
> Peter R.

Hmm. Good question - I don't know, although if they are PHRED scores they could go unusually high (we'd expect say 0 to 50 for a raw read). It could be some other encoding (e.g. scaled from 0 for a poor base to 255 for a perfect base).

Do you have any contacts at Applied Biosystems to ask?

Peter C.
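
To make the open question above concrete, here is a small sketch of what treating the PCON bytes as PHRED scores would look like. It rests on exactly the assumption the thread has not confirmed (that the 0-255 values are PHRED scores), the range check is only a heuristic with an arbitrary ceiling, and both function names are hypothetical.

```python
# Sketch only: ASSUMES the ABI PCON confidence values are PHRED scores,
# which this thread has not established. Function names are hypothetical.

def looks_like_phred(values, ceiling=60):
    """Heuristic: True if every value sits in a range typical of PHRED
    scores for a raw read. The ceiling of 60 is an arbitrary choice."""
    return all(0 <= v <= ceiling for v in values)

def fastq_record(name, seq, quals):
    """Format one Sanger FASTQ record from a sequence and per-base scores."""
    if len(seq) != len(quals):
        raise ValueError("need exactly one quality value per base")
    # Sanger FASTQ can represent scores 0..93, so clamp anything higher.
    qual_string = "".join(chr(min(q, 93) + 33) for q in quals)
    return f"@{name}\n{seq}\n+\n{qual_string}\n"

pcon = [12, 35, 48, 51]                 # made-up example values
if looks_like_phred(pcon):
    print(fastq_record("example", "ACGT", pcon), end="")
```

If the values instead turn out to be scaled 0-255 confidences, the clamp would silently flatten most of them to the maximum, so the range check is worth running before trusting any converted file.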
From biopython at maubp.freeserve.co.uk Tue Mar 30 13:58:18 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:58:18 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <320fb6e01003300658l2742a656ge766c3fcd5a2fa44@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:33 PM, Zheng Jin Tu wrote:
>
>
> Hi Peter:
>
> You may want to check this URL about how to
> convert quality score:
>
> http://maq.sourceforge.net/fastq.shtml
>
> Thanks, TU

Thanks - but that just covers converting between PHRED scores and Solexa scores. Peter Rice and I are well aware of this. The question here is what do the numbers in ABI files mean?

Peter C.

From georgios at biotek.uio.no Wed Mar 31 18:08:07 2010
From: georgios at biotek.uio.no (Georgios Magklaras)
Date: Wed, 31 Mar 2010 20:08:07 +0200
Subject: [EMBOSS] MRS/EMBOSS lecture notes and videos
Message-ID: <4BB38F87.3020803@biotek.uio.no>

Hi,

Just to let people know (some folks expressed interest). You can find some interesting lecture notes, as part of an EMBnet course given in Mexico about sequence mining with EMBOSS/MRS here:

http://folk.uio.no/georgios/other/mrskurs.pdf

Some video shots of the presented material can be obtained from this URL:

http://www.nnb.unam.mx/video/track

(I will try and obtain the videos in a non-flash format, however the URL should make them available in the meantime).

Best regards,
GM

--
Best regards,
--
George Magklaras BSc (Hons) MPhil RHCE
IT Systems Manager/Senior Systems Engineer
The Biotechnology Center of Oslo
University of Oslo
http://www.biotek.uio.no
http://www.no.embnet.org
http://folk.uio.no/georgios