From aengus.stewart at cancer.org.uk Wed Nov 3 08:19:13 2004
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Wed, 03 Nov 2004 13:19:13 +0000
Subject: [EMBOSS] Sequence format problem
Message-ID: <4188DAD1.2060204@cancer.org.uk>

Had a user complain that his sequence wasn't being recognised.

It was a GCG format sequence.

The error was

Died: Unknown sequence type code for 'u'

There were no 'u's in the sequence!

After a bit of fiddling I discovered it was the * at the end of the
sequence.

I believe that the * appeared in GCG format sequences when a translation
was incomplete.

I imagine some other people may have problems with this.

Cheers

Aengus

--
Aengus Stewart
Group Leader Bioinformatics at CGAL
Tel: +44 (0)20 7269 3679
Cancer Research UK, Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK

From pmr at ebi.ac.uk Wed Nov 3 08:34:16 2004
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Wed, 3 Nov 2004 13:34:16 -0000 (GMT)
Subject: [EMBOSS] Sequence format problem
In-Reply-To: <4188DAD1.2060204@cancer.org.uk>
References: <4188DAD1.2060204@cancer.org.uk>
Message-ID: <1113.217.134.24.85.1099488856.squirrel@webmail.ebi.ac.uk>

Hi Aengus,

> Had a user complain that his sequence wasn't being recognised.
>
> It was a GCG format sequence.
>
> The error was
>
> Died: Unknown sequence type code for 'u'
>
> There were no 'u's in the sequence!
>
> After a bit of fiddling I discovered it was the * at the end of the
> sequence.
>
> I believe that the * appeared in GCG format sequences when a translation
> was incomplete.

Oops. It should be automatically converted to 'n' (for nucleotide) or
'x' (for protein).

I am currently going through all the sequence input types and formats,
cleaning up a few issues that have been raised. Now would be a good time
to send (to emboss-bug at embnet.org) any other oddities anyone may have
noticed.

regards,

Peter Rice

From areagp61 at yahoo.it Wed Nov 3 08:44:03 2004
From: areagp61 at yahoo.it (Graziano P.)
Date: Wed, 3 Nov 2004 14:44:03 +0100 (CET)
Subject: [EMBOSS] water exception
Message-ID: <20041103134403.35588.qmail@web54206.mail.yahoo.com>

Hi all,

I have to align two complete mitochondrial genome sequences (about
16500 nt long) using water (EMBOSS version 2.7.1). I have tried this
alignment on a Linux server with 2 Gbytes of RAM and water returns this
error:

"Uncaught exception: Allocation failed, insufficient memory available,
raised at water.c:126"

I have tried the same alignment using EMBOSS version 2.8.0 on a Unix
server with 1 Gbyte of RAM and water returned the same error message.
I have tried to launch the same alignment on EBI-SRS (which reports
EMBOSS version 2.2.0) and the alignment works.
Is this a problem with the amount of RAM or with the EMBOSS version?

Best regards

Graziano
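For a sense of scale in the question above, here is a rough back-of-envelope estimate, offered only as a sketch. It assumes a full dynamic-programming alignment that holds one or more complete n x m matrices in memory with 4-byte cells; the cell size and the number of matrices are assumptions made for illustration, not figures taken from this thread.

```python
def full_matrix_bytes(len_a: int, len_b: int,
                      bytes_per_cell: int = 4, matrices: int = 1) -> int:
    """Bytes needed to hold `matrices` complete len_a x len_b tables.

    bytes_per_cell and matrices are illustrative assumptions.
    """
    return len_a * len_b * bytes_per_cell * matrices


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    n = 16500  # approximate sequence length quoted in the message
    for matrices in (1, 2):
        total = full_matrix_bytes(n, n, matrices=matrices)
        print(f"{matrices} matrix(es): {total} bytes, {to_gib(total):.2f} GiB")
```

Under these assumptions a single table is already about 1 GiB and two tables are just over 2 GiB, so an allocation failure on a machine with 1 to 2 Gbytes of RAM would not be surprising. The estimate says nothing about why another server succeeds.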
From pmr at ebi.ac.uk Wed Nov 3 09:32:49 2004
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Wed, 3 Nov 2004 14:32:49 -0000 (GMT)
Subject: [EMBOSS] water exception
In-Reply-To: <20041103134403.35588.qmail@web54206.mail.yahoo.com>
References: <20041103134403.35588.qmail@web54206.mail.yahoo.com>
Message-ID: <1181.217.134.254.120.1099492369.squirrel@webmail.ebi.ac.uk>

Hi Graziano,

> I have to align two complete mitochondrial genome
> sequences (about 16500 nt long) using water (emboss
> version 2.7.1). I have tried this alignment on a linux
> server with 2 Gbytes of RAM and water returns this
> error:
>
> "Uncaught exception: Allocation failed, insufficient
> memory available, raised at water.c:126"

It will be virtual memory rather than RAM. Something in your shell will
limit how much memory you can have for water, and it is not limited to
(or by) the physical RAM in the machine.

But this is a huge alignment problem - just to get a "best local match".

You could get fast results with dottup or (perhaps better) polydot. You
need a large wordsize (the shortest identical sequence you expect in a
real match). Polydot can save the best matches as feature tables.

hope this helps,

Peter

From areagp61 at yahoo.it Wed Nov 3 11:05:26 2004
From: areagp61 at yahoo.it (Graziano P.)
Date: Wed, 3 Nov 2004 17:05:26 +0100 (CET)
Subject: Antwort: Re: [EMBOSS] water exception
In-Reply-To:
Message-ID: <20041103160526.57191.qmail@web54201.mail.yahoo.com>

Hi David,

I tried launching "limit" and these are the results:

127 /home/life> limit
cputime         unlimited
filesize        unlimited
datasize        unlimited
stacksize       unlimited
coredumpsize    0 kbytes
memoryuse       unlimited
vmemoryuse      unlimited
descriptors     1024
memorylocked    unlimited
maxproc         6132

Both memoryuse and vmemoryuse are unlimited.
I am not obliged to use water, but I am curious to understand why the
EBI-SRS server is able to perform this alignment and my server is not.

Best regards

Graziano

P.S.
I am having trouble sending mail to this mailing list; I regularly
receive mails but I cannot send mails using Outlook. Has the EMBOSS
mailing list got an IP blacklist?

--- David.Bauer at SCHERING.DE wrote:

> Hi Graziano,
>
> with the command "limit" (in tcsh or bash) you can see if memoryuse
> and/or vmemoryuse is limited.
> You can try to increase the limit or set it to "unlimited".
>
> Regards,
> David.
>
> Hi Graziano,
>
> > I have to align two complete mitochondrial genome
> > sequences (about 16500 nt long) using water (emboss
> > version 2.7.1). I have tried this alignment on a linux
> > server with 2 Gbytes of RAM and water returns this
> > error:
> >
> > "Uncaught exception: Allocation failed, insufficient
> > memory available, raised at water.c:126"
>
> It will be virtual memory rather than RAM. Something in your shell
> will limit how much memory you can have for water, and it is not
> limited to (or by) the physical RAM in the machine.
>
> But this is a huge alignment problem - just to get a "best local
> match".
>
> You could get fast results with dottup or (perhaps better) polydot.
> You need a large wordsize (the shortest identical sequence you expect
> in a real match). Polydot can save the best matches as feature tables.
>
> hope this helps,
>
> Peter

From David.Bauer at Schering.de Wed Nov 3 10:28:09 2004
From: David.Bauer at Schering.de (David.Bauer at Schering.de)
Date: Wed, 3 Nov 2004 16:28:09 +0100
Subject: Antwort: Re: [EMBOSS] water exception
Message-ID:

Hi Graziano,

with the command "limit" (in tcsh or bash) you can see if memoryuse
and/or vmemoryuse is limited.
You can try to increase the limit or set it to "unlimited".

Regards,
David.
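A note on the suggestion above: "limit" is the csh/tcsh builtin, and in bash the corresponding command is "ulimit -a". The same per-process limits can also be read from a script. The following is a minimal sketch using Python's standard resource module on a Unix system; the mapping from the tcsh labels to the RLIMIT_* constants is an approximate correspondence assumed here for illustration.

```python
import resource


def describe(value: int) -> str:
    """Render a limit value the way the shell builtins do."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def memory_limits() -> dict:
    """Return (soft, hard) limits for a few memory-related resources.

    The tcsh-style labels are an assumed, approximate mapping.
    """
    names = {
        "datasize": resource.RLIMIT_DATA,
        "stacksize": resource.RLIMIT_STACK,
        "vmemoryuse": resource.RLIMIT_AS,
    }
    return {
        label: tuple(describe(v) for v in resource.getrlimit(res))
        for label, res in names.items()
    }


if __name__ == "__main__":
    for label, (soft, hard) in memory_limits().items():
        print(f"{label:12s} soft={soft} hard={hard}")
```

Limits reported as unlimited only rule out the shell as the cause; they do not guarantee that a very large single allocation will succeed.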
Hi Graziano,

> I have to align two complete mitochondrial genome
> sequences (about 16500 nt long) using water (emboss
> version 2.7.1). I have tried this alignment on a linux
> server with 2 Gbytes of RAM and water returns this
> error:
>
> "Uncaught exception: Allocation failed, insufficient
> memory available, raised at water.c:126"

It will be virtual memory rather than RAM. Something in your shell will
limit how much memory you can have for water, and it is not limited to
(or by) the physical RAM in the machine.

But this is a huge alignment problem - just to get a "best local match".

You could get fast results with dottup or (perhaps better) polydot. You
need a large wordsize (the shortest identical sequence you expect in a
real match). Polydot can save the best matches as feature tables.

hope this helps,

Peter

From pmr at ebi.ac.uk Mon Nov 8 08:39:25 2004
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Nov 2004 13:39:25 +0000
Subject: [EMBOSS] can't access databases indexed by dbifasta
In-Reply-To: <1077015509.5582.15.camel@morpheus.ucc.ie>
References: <1077015509.5582.15.camel@morpheus.ucc.ie>
Message-ID: <418F770D.8060000@ebi.ac.uk>

An old bug report that was fixed at the time (by using a different
database name) ... now we know the real reason.

The failed database name was ecoli.nt

Database names can only contain letters, numbers or underscores. The '.'
in the database name means that EMBOSS fails to read it as a database
name. It can read it as a file - this is why it appears to work if you
are in the same directory as the database - because there is a file
called ecoli.nt in that directory, and this is what EMBOSS is reading.

In the next release, database names with '.' will cause a warning
message when the emboss.default and .embossrc files are read.

Also in the next release, database names must be at least 2 characters
long (so we can use Windows filenaming conventions on Windows systems).
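The naming rules stated above (letters, numbers or underscores only, and at least 2 characters from the next release) can be checked before a DB entry is added to emboss.default. This is a small sketch of such a check; is_valid_db_name is a hypothetical helper written for this note, not part of EMBOSS, and it encodes only the rules as worded in this message.

```python
import re

# Letters, digits or underscores, at least two characters,
# following the rules as stated in the message above.
_DB_NAME = re.compile(r"[A-Za-z0-9_]{2,}")


def is_valid_db_name(name: str) -> bool:
    """Return True if `name` satisfies the stated database naming rules."""
    return _DB_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("ecoli.nt", "ecoli_nt", "e"):
        verdict = "ok" if is_valid_db_name(candidate) else "rejected"
        print(f"{candidate}: {verdict}")
```

Renaming the database to something like ecoli_nt (an example name, chosen here) would satisfy both rules.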
I doubt whether anyone is using single letter database names, but there
may be some "test databases" defined which will cause a warning message
in the next release.

Marcus Claesson wrote:
> Hello,
>
> I have a silly little problem indexing databases in Emboss-2.8.0. After
> running dbifasta and adding DB entries in emboss.default I can only
> access the database when being in the same directory as the fasta file.
> Here is what I did:
>
> [blast_db]$ uname -a
> Linux neo.ucc.ie 2.4.9-e.35enterprise #1 SMP Tue Dec 23 00:06:16 EST
> 2003 i686 unknown
>
> [blast_db]$ pwd
> /var/data/blast_db
>
> [blast_db]$ ll ecoli.nt
> -rw-r--r-- 1 marcus bioinfo 4763013 Jan 15 01:38 ecoli.nt
>
> [blast_db]$ dbifasta
> Index a fasta database
>     simple : >ID
>      idacc : >ID ACC
>      gcgid : >db:ID
>   gcgidacc : >db:ID ACC
>       dbid : >db ID
>       ncbi : | formats
> ID line format [idacc]:
> Database directory [.]: /var/data/blast_db
> Wildcard database filename [*.dat]: ecoli.nt
> Database name: ecoli.nt
> Release number [0.0]:
> Index date [00/00/00]:
>
> [blast_db]$ ll entrynam.idx division.lkp acnum.*
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 acnum.hit
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 acnum.trg
> -rw-rw-r-- 1 marcus bioinfo 330 Feb 17 10:39 division.lkp
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 entrynam.idx
>
> Added these lines in /usr/local/EMBOSS-2.8.0/emboss/emboss.default:
>
> DB ecoli.nt [
>    type: "N"
>    format: "fasta"
>    method: "emblcd"
>    dir: "/var/data/blast_db/"
> ]
>
> [blast_db]$ showdb
> Displays information on the currently available databases
> # Name     Type ID  Qry All Comment
> # ====     ==== ==  === === =======
> ecoli.nt   N    OK  OK  OK  -
>
> [blast_db]$ cd ~
>
> [marcus]$ seqret ecoli.nt
> Reads and writes (returns) sequences
> Error: failed to open filename 'ecoli.nt'
> Error: Unable to read sequence 'ecoli.nt'
> Died: seqret terminated: Bad value for '-sequence' and no prompt
>
> But it works when I'm in the same directory as ecoli.nt:
>
> [blast_db]$ seqret ecoli.nt
> Reads and writes (returns) sequences
> Output sequence [ae000111.fasta]:
> etc...
>
> Clearly it must be possible to access ecoli.nt from other directories?
>
> Extremely grateful for any help on this!
>
> Regards,
> Marcus

From pmr at ebi.ac.uk Mon Nov 8 09:53:25 2004
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Nov 2004 14:53:25 +0000
Subject: [EMBOSS] question about databank access methods
In-Reply-To: <20040325165952.GC24102@bigben.ulb.ac.be>
References: <20040325165952.GC24102@bigben.ulb.ac.be>
Message-ID: <418F8865.4020003@ebi.ac.uk>

Guy Bottu wrote:
> Dear colleagues,
>
> I am currently using EMBOSS version 2.8.0. The manual at
> http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/adminguide/node4.html#SECTION00421000000000000000
> mentions for the databank access method "DIRECT"
>
> DB mydb [
>    #required parameters
>    method: "direct"
>    format: "embl"
>    type: "N"
>    dir: "\$emboss_db_dir/mydb"
>    file: "*.dat"
>    #optional parameters
>    fields: "sv des key org"
>    release: "63.0"
>    comment: "My own database with no indices"
>    exclude: "est*.dat"
> ]
>
> I tried the "exclude" parameter and it did not work. Am I missing
> something? Has someone already used it successfully?

Fixed in the next release. Not difficult to do, but needed some time to
think about it.

Direct access will allow exclude to specify a list of file wildcards to
be excluded, and then filename to specify a list of file wildcards to be
included. The exclude list has priority.

Both can be a list of wildcards, for example:

exclude: "est*.dat gss*.dat sts*.dat wgs*.dat"

From stefan.rensing at biologie.uni-freiburg.de Mon Nov 8 09:43:25 2004
From: stefan.rensing at biologie.uni-freiburg.de (Stefan Rensing)
Date: Mon, 08 Nov 2004 15:43:25 +0100
Subject: [EMBOSS] URL dbs
Message-ID: <418F860D.2020308@biologie.uni-freiburg.de>

Hi there,

can somebody please advise me which current public servers I might use
to configure URL-based emboss sequence databases?
I'm mainly interested in Genbank nr and Uniprot. A syntax example would
be gratefully acknowledged.

Cheers, Stefan

From msarachu at biol.unlp.edu.ar Mon Nov 8 10:21:26 2004
From: msarachu at biol.unlp.edu.ar (Martin Sarachu)
Date: Mon, 08 Nov 2004 12:21:26 -0300
Subject: [EMBOSS] wEMBOSS-1.3 announcement
Message-ID: <418F8EF6.5010509@biol.unlp.edu.ar>

This is to announce version 1.3 of wEMBOSS, a web interface for EMBOSS.

wEMBOSS-1.3 has a clever interface and contains Jalview and ATV applets
for multiple alignment and tree visualization. It also includes the
wrappers4EMBOSS package as an optional install. If you also choose to
install the wrappers, do not forget to look at the INSTALL file in its
directory for important requirements.

wEMBOSS can be downloaded from http://www.wemboss.org

Enjoy your wEMBOSS experience!

The wEMBOSS team.

--
Martin Sarachu
msarachu at biol.unlp.edu.ar
AR.EMBnet
http://www.ar.embnet.org

From d.m.a.martin at dundee.ac.uk Mon Nov 8 10:50:03 2004
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Mon, 08 Nov 2004 15:50:03 +0000
Subject: [EMBOSS] URL dbs
In-Reply-To: <418F860D.2020308@biologie.uni-freiburg.de>
Message-ID:

On 8/11/04 2:43 pm, "Stefan Rensing" wrote:

> Hi there,
>
> can somebody please advise me which current public servers I might use
> to configure URL-based emboss sequence databases?
>
> I'm mainly interested in Genbank nr and Uniprot. A syntax example would
> be gratefully acknowledged.
>
> Cheers, Stefan

Apologies for the wrapping:

DB gp [
   type: P
   method: url
   format: genbank
   url: "http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&uid=%s&db=Protein&dopt=genpept"
   comment: "GenPept by IDs (gi)"
]

DB genbank [
   type: N
   method: url
   format: genbank
   url: "http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=Nucleotide&uid=%s&dopt=GenBank"
   comment: "Genbank by IDs (gi)"
]

DB srs_embl [
   type: N
   format: embl
   method: url
   url: "http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[embl-AllText:%s]"
   comment: "text search against EMBL using EBI SRS server"
]

From pmr at ebi.ac.uk Mon Nov 8 11:01:20 2004
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Nov 2004 16:01:20 +0000
Subject: [EMBOSS] URL dbs
In-Reply-To: <418F860D.2020308@biologie.uni-freiburg.de>
References: <418F860D.2020308@biologie.uni-freiburg.de>
Message-ID: <418F9850.3040604@ebi.ac.uk>

Stefan Rensing wrote:
> Hi there,
>
> can somebody please advise me which current public servers I might use
> to configure URL-based emboss sequence databases?
>
> I'm mainly interested in Genbank nr and Uniprot. A syntax example would
> be gratefully acknowledged.

I am about to implement access to SeqHound (seqhound.blueprint.org).
They have a simple API interface that we can use as an alternative to
the SRSWWW interface.

The main server is in Toronto, but there may be others around.

Would anyone be interested in testing it?

Peter Rice

From pmr at ebi.ac.uk Wed Nov 24 03:57:42 2004
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 24 Nov 2004 08:57:42 +0000
Subject: [EMBOSS] Re: input file names
In-Reply-To:
References:
Message-ID: <41A44D06.60500@ebi.ac.uk>

Julian Mintseris wrote to emboss-bug:
> Dear EMBOSS,
>
> I realize that this is not really a bug:
>
> EMBOSS programs (such as water) do not like input filenames containing
> colons.
>
> Of course I could just rename my files, but if there is another simple
> workaround, I would appreciate it if you let me know.

For filenames themselves, there is a possible workaround...

If the part before the : is not a database - and we do have rules for
database names - or for :: if it is not a valid format, we could try to
look for an existing file with that name.

For output files, sorry - we would have to object that the format (no
database names on output - yet) is not valid, rather than using the
string as a new filename. This means EMBOSS would not be able to create
sequence files with such names.

We will possibly be using a "format:" syntax for other input and output
files in the near future as it is a nice clean way

Hope this helps. I have copied this to the emboss mailing list in case
the topic is of more general interest.

Peter
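
The input-side workaround described in the message above can be sketched as follows. This is only an illustration of the decision order, not EMBOSS code: classify_input, the sets of database and format names, and the example inputs are all hypothetical and chosen here.

```python
import os


def classify_input(spec, databases, formats, exists=os.path.exists):
    """Decide how an input string containing ':' might be interpreted.

    A 'format::...' prefix wins if the format is known, a 'database:...'
    prefix wins if the database is known; otherwise fall back to looking
    for an existing file with the full name.
    """
    if "::" in spec:
        prefix, _ = spec.split("::", 1)
        if prefix in formats:
            return "format"
    elif ":" in spec:
        prefix, _ = spec.split(":", 1)
        if prefix in databases:
            return "database"
    return "file" if exists(spec) else "unknown"


if __name__ == "__main__":
    dbs, fmts = {"embl"}, {"fasta"}
    # The file-existence check is stubbed so the example needs no files.
    print(classify_input("embl:someid", dbs, fmts, exists=lambda p: False))
    print(classify_input("run:1.fasta", dbs, fmts, exists=lambda p: True))
```

The fallback applies only to input, where an existing file can be looked up. For output there is no file to find, which matches the point made above that such names cannot be used to create new sequence files.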