From tmargus at ebc.ee Tue Jun 4 10:46:05 2002
From: tmargus at ebc.ee (Tõnu Margus)
Date: Tue, 4 Jun 2002 17:46:05 +0300
Subject: problems with dbiblast
Message-ID: <008e01c20bd6$8e616690$1e1728c1@ebc.ee>

Hi,

I have problems formatting a BLAST database for EMBOSS. dbiblast gives an unexpected error: it looks for the file swissprot.phr.pin rather than swissprot.pin. What could be the reason, and how can I overcome it?

Here is the program output from the screen:

tmargus at kobra:blast$ dbiblast
Index a BLAST database
Database name: swissprot
Database directory [.]:
Wildcard database filename [swissprot]: swissprot*
Release number [0.0]: 40
Index date [00/00/00]:
         N : nucleic
         P : protein
         ? : unknown
Sequence type [unknown]: P
         1 : wublast and setdb/pressdb
         2 : formatdb
         0 : unknown
Blast index version [unknown]: 2

   EMBOSS An error in dbiblast.c at line 640:
cannot open ./swissprot.phr table file ./swissprot.phr.pin

Sincerely Yours
Tõnu Margus
Estonian Biocentre
Riia 23
Tartu 51010
Estonia
E-mail tmargus at ebc.ee

From mathog at mendel.bio.caltech.edu Tue Jun 4 11:36:21 2002
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Tue, 04 Jun 2002 08:36:21 -0700
Subject: BLAST, X vs. U, and EMBOSS
Message-ID:

In nr there is an entry with gi= 2018236 which ends with:

   gcxg

When nr is made into a BLAST database with

   % formatdb -i nr -p T -o T

the database can be dumped out again with

   % fastacmd -d nr -D

The problem is that when it comes out, that sequence fragment is now

   GCUG

This causes problems when piping the output of fastacmd into fuzzpro (for instance), because fuzzpro complains that this sequence is not a protein, because of the U. This can be verified by inserting a tr 'U' 'X' into the data stream, which lets fuzzpro digest that particular entry.

So what to do?
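
A note on the tr 'U' 'X' check above: tr rewrites every U in the stream, including any in the FASTA header lines. A minimal Python sketch of a filter that only touches the sequence lines follows. It assumes plain FASTA input in which header lines start with ">", and the function name mask_selenocysteine is made up for this illustration; it is not part of EMBOSS or the NCBI tools.

```python
def mask_selenocysteine(lines):
    """Yield FASTA lines with U/u replaced by X/x in sequence lines only.

    Header lines (those starting with '>') are passed through unchanged,
    so identifiers and descriptions that contain a U are not altered.
    """
    table = str.maketrans("Uu", "Xx")
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield line.translate(table)
```

To sit in a pipe between fastacmd and fuzzpro, a small wrapper script would call sys.stdout.writelines(mask_selenocysteine(sys.stdin)). Like the tr test, this hides the U from the downstream program rather than settling which alphabet is correct.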
Modify EMBOSS to add a switch to allow "U" to be part of the protein alphabet, or lobby the NCBI to use 'X' for unknown in BLAST databases? I lean pretty strongly towards the latter option, although getting the NCBI to change this may be difficult.

In this regard it's also interesting that there is another problem with any sequence containing an X (U) in a BLAST database. To see what happens, BLAST the NCBI sequence (with the x) against the BLAST database containing the same entry. The sequence alignment terminates at the C before the X/U. That isn't very informative, because BLAST alignments terminate there no matter what letter I substitute in for X (i.e., Y, L, etc.). So try another sequence containing a U, this time towards the middle of the sequence, for instance emb|CAC39234.1| (AJ312124) FdhA-I protein. BLAST that against nr and you find this odd alignment:

   LX-H
   L  H
   L-UH

and it does that even if the query uses a U instead of an X. Here's the relevant fragment if you'd like to verify this for yourself:

   YLFQKLLRAVVGTNNVDHCARLXHASTVAGLATTLGSGAMT

Now, what happens here if the query sequence is mutated X->Y? The exact same alignment, with X->Y.

Are we all agreed that this is a BLAST bug? X against any valid amino acid or against X should be a mismatch (since without further information there's a 19/20 probability it is a mismatch and only a 1/20 chance it is a match), but there's no reason to insert gaps in the alignment at this point. So it should have been:

   LXH
   L H
   LUH

This was with BLAST versions 2.2.2 (local) and 2.2.3 (NCBI server).

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From mathog at mendel.bio.caltech.edu Tue Jun 4 14:34:25 2002
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Tue, 04 Jun 2002 11:34:25 -0700
Subject: BLAST, X vs. U, and EMBOSS
Message-ID:

> In nr there is an entry with gi= 2018236 which ends with:

Sorry, I dropped a character in the cut and paste, it's:

   12018236

Tao Tao points out that U is IUPAC for selenocysteine, see:

   http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html#AA212

This gets very confusing because Entrez returns GenBank format with U->X, but FASTA (and ASN.1) with U as U. Which protein alphabet is EMBOSS supposed to recognize for protein?

And all that aside, X vs. X or X vs. U in blastp really does introduce two unnecessary gaps in the alignment, which can be easily demonstrated with bl2seq on gi 14250938 vs. itself.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From oddmund.nordgard at biokjemi.uio.no Wed Jun 5 02:57:59 2002
From: oddmund.nordgard at biokjemi.uio.no (Oddmund Nordgård)
Date: Wed, 5 Jun 2002 08:57:59 +0200 (MET DST)
Subject: Multiple plots on one page
Message-ID:

Hello!

I have been trying to get more than one plot per page with PostScript output from EMBOSS. I have tried both psnup and mpage, but the results are not as expected. Strange things happen. Is there a solution to this?

Oddmund

*******************************************
Oddmund Nordgård
Bj?rnev. 23
4323 SANDNES
Tlf.: 51 67 25 65
Mob: 48 20 51 72
Tlf. arb.: 51 87 54 94
********************************************
Registered linux user #44149

From David.Bauer at SCHERING.DE Wed Jun 5 03:40:31 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 5 Jun 2002 09:40:31 +0200
Subject: Antwort: Multiple plots on one page
Message-ID:

Hi,

I have observed that the PostScript outputs from EMBOSS mostly (always?) have an additional empty page definition at the end:

################
%%Page: 2 2
bop S 0.0000 G S 0.0000 G S 0.0000 G
%%Trailer
%%Pages: 2
@end
###############

Maybe this confuses psnup and mpage?

David.

From mad at biol.unlp.edu.ar Fri Jun 7 07:42:08 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Fri, 07 Jun 2002 14:42:08 +0300
Subject: dbiblast problem
Message-ID: <3D009C10.ACF8FC@biol.unlp.edu.ar>

Hi,

Has anyone found a solution for the dbiblast problem?

I'm trying to index the formatdb'ed humrep.fas (nuc) FASTA database

> % dbiblast
> Index a BLAST database
> Database name: humrep.fsa
> Database directory [.]:
> Wildcard database filename [humrep.fsa]: humrep.fsa.*
> Release number [0.0]:
> Index date [00/00/00]:
>          N : nucleic
>          P : protein
>          ? : unknown
> Sequence type [unknown]: N
>          1 : wublast and setdb/pressdb
>          2 : formatdb
>          0 : unknown
> Blast index version [unknown]: 2
>
>    EMBOSS An error in dbiblast.c at line 640:
> cannot open ./humrep.fsa.nhr table file ./humrep.fsa.nhr.nin

I also get a similar error when trying with the nr (prot) database

> $ dbiblast
> Index a BLAST database
> Database name: nr
> Database directory [.]:
> Wildcard database filename [nr]: nr.*
> Release number [0.0]:
> Index date [00/00/00]:
>          N : nucleic
>          P : protein
>          ? : unknown
> Sequence type [unknown]: P
>          1 : wublast and setdb/pressdb
>          2 : formatdb
>          0 : unknown
> Blast index version [unknown]: 2
>
>    EMBOSS An error in dbiblast.c at line 640:
> cannot open ./nr.phr table file ./nr.phr.pin

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ztu at msi.umn.edu Fri Jun 7 13:49:37 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 7 Jun 2002 12:49:37 -0500 (CDT)
Subject: dbiblast problem
In-Reply-To: <3D009C10.ACF8FC@biol.unlp.edu.ar>
Message-ID:

You may need to run formatdb first and then run dbiblast.

   formatdb -i input_seq -p T -o T   (for protein)
   formatdb -i input_seq -p F -o T   (for nucleic acid)

Thanks,

Tu

From mad at biol.unlp.edu.ar Fri Jun 7 07:52:47 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Fri, 07 Jun 2002 14:52:47 +0300
Subject: dbiblast problem
References:
Message-ID: <3D009E8F.5CD201E4@biol.unlp.edu.ar>

Zheng Jin Tu wrote:
>
> You may need to run formatdb first then run dbiblast.

yes, I did that before trying dbiblast

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ztu at msi.umn.edu Fri Jun 7 14:13:32 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 7 Jun 2002 13:13:32 -0500 (CDT)
Subject: dbiblast problem
In-Reply-To: <3D009E8F.5CD201E4@biol.unlp.edu.ar>
Message-ID:

On Fri, 7 Jun 2002, Martin Sarachu wrote:

> yes, I did that before trying dbiblast

Oh! Yes, my EMBOSS installation has the same problem. Earlier I found that, with a database indexed by dbiblast, seqret could only retrieve the first 10000 lines of records and after that gave an error message. I have no idea whether this has been fixed in newer EMBOSS releases.

I am now using dbiflat instead. It needs a flat file rather than a FASTA file, so it takes more space. That is my experience.

Thanks,

Tu

From sgmd at genetik.fu-berlin.de Mon Jun 10 11:13:34 2002
From: sgmd at genetik.fu-berlin.de (Thomas Siegmund)
Date: Mon, 10 Jun 2002 17:13:34 +0200
Subject: EMBOSS.kaptn - X11/KDE GUI for EMBOSS 2.4.1 and EPHYLIP-3.573c
Message-ID: <200206101713.34800.sgmd@genetik.fu-berlin.de>

Dear EMBOSS list,

a new version of the Kaptain user interface for EMBOSS is finished. In addition to EMBOSS the current EMBOSS.kaptn 0.80 supports the complete EPHYLIP 3.5.
There are not so many GUIs for Phylip out there, are there?

As usual, you will find a compressed tar archive for download, additional information, and a few screenshots at http://userpage.fu-berlin.de/~sgmd .

From the ChangeLog:

Version 0.80 - Finished support for EPHYLIP-3.573c
----------------------------------------------
- new: econtml, econtrast.kaptn
- modified install script for systems where the nedit client has a name different from "nc". Thanks to P. Martineau for the hint!

Version 0.79
- new: ednaml.kaptn, ednaml.kaptn, ednacomp.kaptn eclique.kaptn, econsense.kaptn

Have fun
Thomas

--
Thomas Siegmund
Freie Universität Berlin
Institut für Genetik
Arnimallee 7
14195 Berlin
Germany
Tel: +49 30 838 54868
Fax: +49 30 838 54395
http://userpage.fu-berlin.de/~sgmd

From ztu at msi.umn.edu Wed Jun 12 17:08:26 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Wed, 12 Jun 2002 16:08:26 -0500 (CDT)
Subject: How do I specific human genome CHR_01 database at .embossrc file
Message-ID:

I have downloaded the human genome CHR_01 data from the ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_01 directory, which holds the files hs_chr1.asn.gz, hs_chr1.fa.gz, hs_chr1.gbk.gz, hs_chr1.gbs.gz and hs_chr1.mfa.gz. After gunzipping them, I ran dbiflat on hs_chr1.gbk. It works fine and creates these files:

   division.lkp  entrynam.idx  acnum.trg  acnum.hit

The question comes down to how I specify this database in .embossrc, especially the method field:

   method: ???
   format: genbank ?

Any suggestion is greatly appreciated.

Thanks,

Tu

--------------------------------------
Zheng Jin Tu
Supercomputing Institute
University of Minnesota
--------------------------------------

From charles at moulinette.dyndns.org Fri Jun 14 09:48:09 2002
From: charles at moulinette.dyndns.org (Charles Plessy)
Date: Fri, 14 Jun 2002 15:48:09 +0200
Subject: How do I specific human genome CHR_01 database at .embossrc file
In-Reply-To:
References:
Message-ID: <20020614134809.GA13692@moulinette.dyndns.org>

Hi,

I used in-house sequences, so I had to reformat the headers to something "ncbi compliant" like lcl|foo.

See the formatdb readme for more info:
http://www.inb.mu-luebeck.de/biosoft/blast/README.formatdb

I had to use formatdb with the option -o T, to generate more files.

I then used dbiflat normally.

Here is an extract of my embossrc:

DB ngn [
   type: N
   method: blast
   format: ncbi
   dir: /home/charles/bioinfo/blast
   indexdir: /home/charles/bioinfo/blast/ngn
   file: "ngn"
   release: "0.0"
]

Charles

From ztu at msi.umn.edu Fri Jun 14 10:03:59 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 14 Jun 2002 09:03:59 -0500 (CDT)
Subject: How do I specific human genome CHR_01 database at .embossrc file
In-Reply-To: <20020614134809.GA13692@moulinette.dyndns.org>
Message-ID:

Many thanks to the kind people who replied to my message. Most people suggested

   method: emblcd
   format: genbank

It works well with this definition.

Best regards,

Zheng Jin Tu
Supercomputing Institute
University of Minnesota

From mad at biol.unlp.edu.ar Sun Jun 16 11:20:15 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Sun, 16 Jun 2002 18:20:15 +0300
Subject: problem with dbifasta and long files?
Message-ID: <3D0CACAF.E51FD0D6@biol.unlp.edu.ar>

Hi,

I'm getting a "cannot read" error when trying to index with dbifasta. Here is the output:

> $ ls -lsa
> .
> .
> .
> 5894336 -rw-r--r-- 1 martin 6032840325 Jun 15 17:10 nt
> .
> $ dbifasta -debug
> Index a fasta database
>     simple : >ID
>      idacc : >ID ACC
>      gcgid : >db:ID
>   gcgidacc : >db:ID ACC
>       dbid : >db ID
>       ncbi : | formats
> ID line format [idacc]: ncbi
> Database directory [.]:
> Wildcard database filename [*.dat]: nt
> Database name: nt
> Release number [0.0]:
> Index date [00/00/00]:
>
>    EMBOSS An error in embdbi.c at line 295:
> Cannot open ./nt for reading
>
> $

I'm also sending dbifasta.dbg just in case.

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

-------------- next part --------------
acdArgsScan acdDebug Yes acdDoHelp No ajFileNewIn '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' ajNamResolve of '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' EOF ajFileGetsL file /usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd closing file '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' acdSetDefC auto 'N' 2bec8 acdSetDefC stdout 'N' 2bf68 acdSetDefC filter 'N' 2c008 acdSetDefC options 'N' 2c0a8 acdSetDefC debug 'N' 2c148 acdSetDefC acdlog 'N' 2c1e8 acdSetDefC acdpretty 'N' 2c288 acdSetDefC acdtable 'N' 2c328 acdSetDefC help 'N' 2c3c8 acdSetDefC verbose 'N' 2c468 acdSetDefC warning 'Y' 2c508 acdSetDefC error 'Y' 2c5a8 acdSetDefC fatal 'Y' 2c648 acdSetDefC die 'N' 2cf60 acdSetVarDef today '16/06/02' ff2650cc acdNewSec 'input'
acdSecList length 0 acdNewSec acdSecList push 'input' new length 1 acdSet attr 'idformat' val 'required' type '' acdSet attr 'idformat' val 'information' type '' acdSet attr 'idformat' val 'values' type '' acdSet attr 'idformat' val 'maximum' type '' acdSet attr 'idformat' val 'minimum' type '' acdSet attr 'idformat' val 'default' type '' acdSet attr 'idformat' val 'delimiter' type '' acdSet attr 'idformat' val 'codedelimiter' type '' acdSet attr 'directory' val 'required' type '' acdSet attr 'directory' val 'default' type '' acdSet attr 'directory' val 'information' type '' acdSet attr 'filenames' val 'required' type '' acdSet attr 'filenames' val 'default' type '' acdSet attr 'filenames' val 'information' type '' acdNewEndsec 'input' acdSecList length 1 Pop from acdSecList 'input' new length 0 acdNewSec 'required' acdSecList length 0 acdNewSec acdSecList push 'required' new length 1 acdSet attr 'dbname' val 'parameter' type '' acdSet attr 'dbname' val 'maxlength' type '' acdSet attr 'dbname' val 'minlength' type '' acdSet attr 'dbname' val 'information' type '' acdSet attr 'release' val 'required' type '' acdSet attr 'release' val 'default' type '' acdSet attr 'release' val 'maxlength' type '' acdSet attr 'release' val 'information' type '' acdSet attr 'date' val 'required' type '' acdSet attr 'date' val 'default' type '' acdSet attr 'date' val 'valid' type '' acdSet attr 'date' val 'information' type '' acdSet attr 'date' val 'pattern' type '' acdNewEndsec 'required' acdSecList length 1 Pop from acdSecList 'required' new length 0 acdNewSec 'advanced' acdSecList length 0 acdNewSec acdSecList push 'advanced' new length 1 acdSet attr 'fields' val 'required' type '' acdSet attr 'fields' val 'information' type '' acdSet attr 'fields' val 'values' type '' acdSet attr 'fields' val 'minimum' type '' acdSet attr 'fields' val 'maximum' type '' acdSet attr 'fields' val 'default' type '' acdSet attr 'exclude' val 'information' type '' acdSet attr 'indexdirectory' val 
'default' type '' acdSet attr 'indexdirectory' val 'information' type '' acdSet attr 'maxindex' val 'default' type '' acdSet attr 'maxindex' val 'minimum' type '' acdSet attr 'maxindex' val 'information' type '' acdSet attr 'sortoptions' val 'default' type '' acdSet attr 'sortoptions' val 'information' type '' acdSet attr 'sortoptions' val 'help' type '' acdSet attr 'systemsort' val 'default' type '' acdSet attr 'systemsort' val 'information' type '' acdSet attr 'cleanup' val 'default' type '' acdSet attr 'cleanup' val 'information' type '' acdNewEndsec 'advanced' acdSecList length 1 Pop from acdSecList 'advanced' new length 0 -- All Done : acdSecList length 0 acdSet attr 'dbname' val 'required' type '' acdDef debug 'Y' 2c148 acdSetDef debug 'Y' 2c148 acdSetBool -auto def: N acdSetQualAppl 'auto' acdSetBool -auto val: No acdSetBool -stdout def: N acdSetQualAppl 'stdout' acdSetBool -stdout val: No acdSetBool -filter def: N acdSetQualAppl 'filter' acdSetBool -filter val: No acdSetBool -options def: N acdSetQualAppl 'options' acdSetBool -options val: No acdSetBool -debug def: Y acdSetQualAppl 'debug' acdSetBool -debug val: Yes acdSetBool -acdlog def: N acdSetQualAppl 'acdlog' acdSetBool -acdlog val: No acdSetBool -acdpretty def: N acdSetQualAppl 'acdpretty' acdSetBool -acdpretty val: No acdSetBool -acdtable def: N acdSetQualAppl 'acdtable' acdSetBool -acdtable val: No acdSetBool -help def: N acdSetQualAppl 'help' acdSetBool -help val: No acdHelp No acdSetBool -verbose def: N acdSetQualAppl 'verbose' acdSetBool -verbose val: No acdSetBool -warning def: Y acdSetQualAppl 'warning' acdSetBool -warning val: Yes acdSetBool -error def: Y acdSetQualAppl 'error' acdSetBool -error val: Yes acdSetBool -fatal def: Y acdSetQualAppl 'fatal' acdSetBool -fatal val: Yes acdSetBool -die def: N acdSetQualAppl 'die' acdSetBool -die val: No acdUserGet 'idformat' reply 'idacc' acdUserGet 'idformat' defreply 'idacc' msg 'ID line format' ajUserGet buffer len: 5 res: 2048 ptr: 30bc8 testing 
'ncbi' Found 1 matches OK: Y min: 1 max: 1 Accept: 'ncbi' Found 1 matches Menu length now 0 before return val[0] 'ncbi' before return val[0] 'ncbi' Storing val[0] 'ncbi' acdUserGet 'directory' reply '.' acdUserGet 'directory' defreply '.' msg 'Database directory' ajUserGet buffer len: 1 res: 2048 ptr: 31418 acdUserGet 'filenames' reply '*.dat' acdUserGet 'filenames' defreply '*.dat' msg 'Wildcard database filename' ajUserGet buffer len: 5 res: 2048 ptr: 31418 acdUserGet 'dbname' reply '' acdUserGet 'dbname' defreply '' msg 'Database name' ajUserGet buffer len: 0 res: 2048 ptr: 31418 acdUserGet 'release' reply '0.0' acdUserGet 'release' defreply '0.0' msg 'Release number' ajUserGet buffer len: 3 res: 2048 ptr: 31418 acdUserGet 'date' reply '00/00/00' acdUserGet 'date' defreply '00/00/00' msg 'Index date' ajUserGet buffer len: 8 res: 2048 ptr: 31418 testing 'acnum' Found 1 matches OK: Y min: 0 max: 5 Accept: 'acnum' Found 1 matches Menu length now 0 before return val[0] 'acnum' before return val[0] 'acnum' Storing val[0] 'acnum' acdSetBool -systemsort def: Y acdSetQualAppl 'systemsort' acdSetBool -systemsort val: Yes acdSetBool -cleanup def: Y acdSetQualAppl 'cleanup' acdSetBool -cleanup val: Yes reading './nt' writing './' embDbiFileListExc dir '.' wildfile 'nt' exclude '' dirfix './' ajFileTestSkip: file '.' exclude: '' include: 'nt' ajFileTestSkip: file '..' 
exclude: '' include: 'nt' ajFileTestSkip: file 'update.log' exclude: '' include: 'nt' ajFileTestSkip: file 'README' exclude: '' include: 'nt' ajFileTestSkip: file 'update.error' exclude: '' include: 'nt' ajFileTestSkip: file 'release.error' exclude: '' include: 'nt' ajFileTestSkip: file 'release.log' exclude: '' include: 'nt' ajFileTestSkip: file 'nt' exclude: '' include: 'nt' ajFileTestSkip: file 'nt' included by 'nt' accept 'nt' ajFileTestSkip: file 'fmerge.log' exclude: '' include: 'nt' ajFileTestSkip: file 'dbifasta.dbg' exclude: '' include: 'nt' ajFileTestSkip: file 'index.nr' exclude: '' include: 'nt' ajFileTestSkip: file 'division.lkp' exclude: '' include: 'nt' ajFileTestSkip: file 'entrynam.idx' exclude: '' include: 'nt' ajFileTestSkip: file 'acnum.trg' exclude: '' include: 'nt' ajFileTestSkip: file 'acnum.hit' exclude: '' include: 'nt' 1 files for '.' 'nt' ajFileNewIn './nt' ajNamResolve of './nt'

From AUnderwood at PHLS.org.uk Tue Jun 18 06:18:07 2002
From: AUnderwood at PHLS.org.uk (AUnderwood at PHLS.org.uk)
Date: Tue, 18 Jun 2002 11:18:07 +0100
Subject: Compilation problem
Message-ID:

Hi all,

When installing EMBOSS I get the following error message after typing the make command:

make[2]: Entering directory `/tmp/EMBOSS-2.4.1/emboss'
/bin/sh ../libtool --mode=link gcc -g -O2 -o abiview abiview.o ../nucleus/libnucleus.la ../ajax/libajaxg.la ../ajax/libajax.la ../plplot/libplplot.la -lX11 -lm
gcc -g -O2 -o .libs/abiview abiview.o ../nucleus/.libs/libnucleus.so ../ajax/.libs/libajaxg.so ../ajax/.libs/libajax.so ../plplot/.libs/libplplot.so -lX11 -lm -Wl,--rpath -Wl,/usr/local/lib
/usr/bin/ld: cannot find -lX11
collect2: ld returned 1 exit status
make[2]: *** [abiview] Error 1
make[2]: Leaving directory `/tmp/EMBOSS-2.4.1/emboss'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/tmp/EMBOSS-2.4.1/emboss'
make: *** [install-recursive] Error 1

From searching newsgroups I understand that this could be solved by adding the line

   -L/usr/X11R6/lib

somewhere in the makefile. I am running Linux Red Hat 7.3. Please can you tell me how to solve this problem.

Many thanks,

Dr Anthony Underwood
Bioinformatics Unit
Central Public Health Laboratory
61 Colindale Avenue
London NW9 5HT
t: 0208 2004400 ext. 3618
f: 0208 3583138
e: aunderwood at phls.org.uk

From tmargus at ebc.ee Wed Jun 19 09:31:28 2002
From: tmargus at ebc.ee (Tõnu Margus)
Date: Wed, 19 Jun 2002 16:31:28 +0300
Subject: formatin EMBL to fasta for Blast. hum01.dat is large than 2GB
Message-ID: <003901c21795$9dc478a0$1e1728c1@ebc.ee>

Hi all,

This is not exactly an EMBOSS question, but it is related.

I am trying to format EMBL rel 71.0 for BLAST. The first step is converting it into FASTA format, for which I am using sp2fasta. Unfortunately it can't manage files that are larger than 2GB (I am running this on Linux 2.4, Slackware 8.0), and I couldn't figure out how to send standard input as input to sp2fasta (cat hum01.dat | sp2fasta didn't work).

Does someone know how to overcome this problem? Is there a newer build of sp2fasta, or some other tool?
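
On the standard-input part of the question, a minimal Python sketch of a converter that streams an EMBL flat file one entry at a time may be useful. It is not a replacement for sp2fasta and makes several assumptions: the entry name is the first word on the ID line, the description comes from the DE lines, sequence lines follow the SQ line, and "//" ends each entry. The function name embl_to_fasta is made up for this illustration.

```python
def embl_to_fasta(lines, width=60):
    """Yield FASTA lines for each entry in an EMBL flat-file stream.

    Only one entry is held in memory at a time, so the total size of
    the input does not matter to this function.
    """
    entry_id = None
    description = []
    residues = []
    in_sequence = False
    for line in lines:
        if line.startswith("//"):
            # End of entry: emit the header, then the wrapped sequence.
            if entry_id is not None:
                header = (">" + entry_id + " " + " ".join(description)).rstrip()
                yield header + "\n"
                sequence = "".join(residues)
                for start in range(0, len(sequence), width):
                    yield sequence[start:start + width] + "\n"
            entry_id, description, residues, in_sequence = None, [], [], False
        elif in_sequence:
            # Sequence lines carry spaces and a trailing position count;
            # keep the letters only.
            residues.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("ID   "):
            entry_id = line[5:].split()[0].rstrip(";")
        elif line.startswith("DE   "):
            description.append(line[5:].strip())
        elif line.startswith("SQ   "):
            in_sequence = True
```

A wrapper that calls sys.stdout.writelines(embl_to_fasta(sys.stdin)) would read from a pipe, so the converter itself never opens the large file. Whether the program feeding the pipe can read a file over 2GB on a given system is a separate question.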

Thanks in advance

Tõnu Margus
Estonian Biocentre
Riia 23
Tartu 51010
Estonia

From ableasby at hgmp.mrc.ac.uk Wed Jun 19 09:34:10 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 19 Jun 2002 14:34:10 +0100 (BST)
Subject: formatin EMBL to fasta for Blast. hum01.dat is large than 2GB
Message-ID: <200206191334.OAA20044@bromine.hgmp.mrc.ac.uk>

This is becoming a common sort of question with the new EMBL release. EMBOSS (on many operating systems) can cope with files >2Gb if it is configured with the --enable-large option. Of course you need to 'make clean' and 'make' again afterwards. It certainly works for RedHat and SuSE, so it may well do so with Slackware. If it does, then you could use 'seqret' to do the FASTA conversion.

HTH

Alan Bleasby
HGMP

From gbottu at ben.vub.ac.be Wed Jun 19 11:34:49 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 19 Jun 2002 17:34:49 +0200 (CEST)
Subject: question about List Files
Message-ID: <200206191534.RAA0001243481@ben.vub.ac.be>

from : BEN

I have a question. Where does the List File format used by EMBOSS come from? I think I read it somewhere, but I cannot find it again in the EMBOSS documentation.

By the way, are there plans to make List Files generated by GCG usable in EMBOSS?

Guy Bottu

From gwilliam at hgmp.mrc.ac.uk Wed Jun 19 11:50:14 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 19 Jun 2002 16:50:14 +0100
Subject: question about List Files
References: <200206191534.RAA0001243481@ben.vub.ac.be>
Message-ID: <3D10A836.EDCE4F66@hgmp.mrc.ac.uk>

The List File is described in the Uniform Sequence Address docs:

http://www.uk.embnet.org/Software/EMBOSS/Themes/UniformSequenceAddress.html#list

And also in the formal summary section of the USA page:

http://www.uk.embnet.org/Software/EMBOSS/Themes/UniformSequenceAddress.html#listfile

Gary

--
Gary Williams
Tel: +44 1223 494522
Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk
http://www.hgmp.mrc.ac.uk/
Bioinformatics, MRC HGMP Resource Centre, Hinxton, Cambridge, CB10 1SB, UK

From peter.rice at uk.lionbioscience.com Wed Jun 19 11:51:35 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Wed, 19 Jun 2002 16:51:35 +0100
Subject: question about List Files
References: <200206191534.RAA0001243481@ben.vub.ac.be>
Message-ID: <3D10A887.A5F83B4A@uk.lionbioscience.com>

Guy Bottu wrote:
>
> I have a question. Where does the List File format used by EMBOSS come from ? I
> think I read it somewhere but I cannot find it back in the EMBOSS documentation.

The format comes from VMS ...

Under VAX/VMS (later OpenVMS) the syntax "@filename" opened a file and used each line of the file as input (Unix users may think of this as being like redirecting standard input).

Each line of the list file is a separate USA.

EMBOSS USAs similarly come from VMS logical names (EMBL:entryname), and the original ideas for ACD came from VMS DCL's "CLD" files.
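
The "one USA per line" rule can be sketched in a few lines of Python. This is an illustration only, not how EMBOSS itself reads list files. It assumes that blank lines and lines starting with "#" are skipped and that a line starting with "@" names another list file to expand in place; the function name expand_listfile is made up here.

```python
import os


def expand_listfile(path, _seen=None):
    """Return the USAs named in a list file, in order.

    Assumed conventions: blank lines and '#' comment lines are skipped,
    and a line of the form '@other' expands another list file in place.
    """
    seen = set() if _seen is None else _seen
    real = os.path.realpath(path)
    if real in seen:
        return []  # guard against list files that include each other
    seen.add(real)
    usas = []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith("@"):
                usas.extend(expand_listfile(line[1:], seen))
            else:
                usas.append(line)
    return usas
```

Given a file mylist containing the two lines embl:hsfau and swissprot:opsd_human, expand_listfile("mylist") would return those two strings, which is roughly what passing @mylist as a sequence USA stands for.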
GCG and SRS also show their VMS 'roots' by using similar sequence naming. Bioinformatics users are now used to this, although many have probably never seen a VMS system :-) > By the way, are there plans to make List Files genrated by GCG usable in > EMBOSS ? GCG do silly things with these files, including having GCG headers ending in ".." and of course GCG database names. Yes, we could try to make some GCG listfiles work with EMBOSS, for example adding a new format "gcglist::" that strips everything up to ".." (something EMBOSS would not want to do by default). regards, Peter From gbottu at ben.vub.ac.be Tue Jun 25 11:30:07 2002 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Tue, 25 Jun 2002 17:30:07 +0200 (CEST) Subject: question about restriction maps Message-ID: <200206251530.RAA0001125452@ben.vub.ac.be> from : BEN Dear All, Does anyone know this ? Is there a way to run an EMBOSS restriction program and give the output to cirdna or lindna, thus reproducing the functionality of GCG mapsort-plasmidmap ? Guy Bottu From peter.rice at uk.lionbioscience.com Tue Jun 25 11:45:47 2002 From: peter.rice at uk.lionbioscience.com (Peter Rice) Date: Tue, 25 Jun 2002 16:45:47 +0100 Subject: question about restriction maps References: <200206251530.RAA0001125452@ben.vub.ac.be> Message-ID: <3D18902B.15E58147@uk.lionbioscience.com> Guy Bottu wrote: > > Does anyone know this ? Is there a way to run an EMBOSS restriction program and > give the output to cirdna or lindna, thus reproducing the functionality of GCG > mapsort-plasmidmap ? Sounds like a new report format to me ... which would allow any features to be drawn. We could set colours for each feature somewhere (DNA or protein would work) by classifying them in sets with 1 colour for each. Any volunteers to classify the EMBL/GenBank and SwissProt/PIR feature keys? 
Peter

From dmerberg at Phylos.com Tue Jun 25 17:41:11 2002
From: dmerberg at Phylos.com (David Merberg)
Date: Tue, 25 Jun 2002 17:41:11 -0400
Subject: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <888CC5DBA518D511A72D00A0C9E97E8F513D71@ntserver1.phylos.com>

Hello,

We've recently installed EMBOSS and cannot run applications that open X
windows. We see this error:

Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).

We've tried it both on Red Hat 7.1 and Solaris 7.

Has anybody seen this before? Know how to fix it?

Thanks,

David Merberg
Phylos, Inc.
Lexington, MA

From gbottu at ben.vub.ac.be Wed Jun 26 06:18:11 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 26 Jun 2002 12:18:11 +0200 (CEST)
Subject: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <200206261018.MAA0001144803@ben.vub.ac.be>

from : BEN

> Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).

That sounds familiar to me. I had installed EMBOSS on a mainframe and when
trying to access it via an X-terminal emulator I had the same message. The fix
was to put the X-terminal emulator in "pseudocolor" mode or the PC in
"256 colors" mode. Your configuration is not the same, but maybe there is
also a problem with the color maps.

Guy Bottu

From David.Bauer at SCHERING.DE Wed Jun 26 07:04:12 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 26 Jun 2002 13:04:12 +0200
Subject: Antwort: Re: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID:

I thought that this was only a problem with the Exceed X-Server for PC.
Btw. the settings needed for EMBOSS do not work for Blixem-Dotter.
So I use 'cps' as the default for graphical output and view this with
ghostview.

On the long run it would be nice to get the PLPLOT X11 libraries fixed,
so that they also work with other color settings.

David.

from : BEN

> Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).

That sounds familiar to me. I had installed EMBOSS on a mainframe and when
trying to access it via an X-terminal emulator I had the same message. The fix
was to put the X-terminal emulator in "pseudocolor" mode or the PC in
"256 colors" mode. Your configuration is not the same, but maybe there is
also a problem with the color maps.

Guy Bottu

From David.Bauer at SCHERING.DE Wed Jun 26 07:32:22 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 26 Jun 2002 13:32:22 +0200
Subject: dbiblast problem
Message-ID:

Hi,

I observed the following problem when retrieving entries from blast databases
formatted with dbiblast:

I get the CORRECT sequence if I specify the ID:
----------------------------------------------------------------------
seqret -auto -stdout cgdb_nt:celsl2a
>CELSL2A M27263 C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank.
seqret -auto -stdout cgdb_nt:celsl2b
>CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
######################################################################

But I get the WRONG sequences if I specify the ACC:
----------------------------------------------------------------------
seqret -auto -stdout cgdb_nt:M27263
>CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
seqret -auto -stdout cgdb_nt:M27264
>CELSNTI L15302 C.elegans synaptotagmin I mRNA, complete cds and flanking regions.
######################################################################

With fastacmd the headers look like this:
----------------------------------------------------------------------
>gb|M27263|CELSL2A C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank
>gb|M27264|CELSL2B C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank
######################################################################

Any ideas?

Thanks, David.

From mad at biol.unlp.edu.ar Wed Jun 26 11:16:57 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Wed, 26 Jun 2002 18:16:57 +0300
Subject: NCBI nt database index
Message-ID: <3D19DAE9.1D12D0C@biol.unlp.edu.ar>

Hi,

we have the non-redundant NCBI nucleotide database (nt) indexed with

> $ dbifasta -idformat ncbi

the raw nt database looks like this

> >gi|4003368|dbj|AB000282.1|AB000282 Navel orange infectious mottling virus gene for polyprotein (coat protein region), partial cds
> AATGTCACCATTGAAAGTGGTGACAATAATAATAATAATTGTCCCACCGGTAATGTAGATAATAGAGAAATACCGGTGGT
> .......
> >gi|1827449|dbj|AB000449.1|AB000449 Homo sapiens mRNA for VRK1, complete cds
> CCGAGTTACGAGTCGGCGAAAGCGGCGGGAAGTTCGTACTGGGCAGAACGCGACGGGTCTGCGGCTTAGGTGAAAATGCC
> etc

and when we run

> $ fuzznuc -raccshow2 -rdesshow2 -rusashow2
> Nucleic acid pattern search
> Input sequence(s): nt:*
> Search pattern: GGTTTCsanttyggnac
> Number of mismatches [0]: 3
> Output report [gi.fuzznuc]: xx.fuzznuc

we get this

> $ more xx.fuzznuc
> ########################################
> # Program: fuzznuc
> # Rundate: Wed Jun 26 18:09:24 2002
> # Report_file: xx.fuzznuc
> ########################################
>
> #=======================================
> #
> # Sequence: nt-id:gi from: 1 to: 1904
> # Accession:
> # Description: Schizosaccharomyces pombe DNA for SUI1 homologue, complete cds
> # HitCount: 1
> #
> # Pattern: GGTTTCsanttyggnac
> # Mismatch: 3
> # Complement: No
> #
> #=======================================
>
> Start End Mismatch Sequence
>     9  25        3 GGTTACCATTTTGGCTA
>
> ....
> # Sequence: nt-id:gi from: 1 to: 17070
> # Accession:
> # Description: Oryza sativa gene for NADH-dependent glutamate synthase
> # HitCount: 1
> #
> etc

The acnum.hit, acnum.trg, division.lkp and entrynam.idx files for the nt
database seem to be correct.

Any idea why the accession numbers don't show up in the fuzznuc results?

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ableasby at hgmp.mrc.ac.uk Wed Jun 26 20:06:12 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Thu, 27 Jun 2002 01:06:12 +0100 (BST)
Subject: Antwort: Re: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <200206270006.BAA22905@bromine.hgmp.mrc.ac.uk>

>On the long run it would be nice to get the PLPLOT X11 libraries fixed,

For some time we have been working on replacing PLPLOT altogether.
Development is being done on OpenGL/Java3D. We can't commit to a timescale
yet, but things are looking promising.
Let's just say we think it will be shorter than "in the long run".

Alan

From john.walshaw at bbsrc.ac.uk Thu Jun 27 05:04:11 2002
From: john.walshaw at bbsrc.ac.uk (john walshaw (JIC))
Date: Thu, 27 Jun 2002 10:04:11 +0100
Subject: dbiblast problem
Message-ID:

I have experienced a similar problem, trying to index WU-BLAST-formatted
databases with dbiblast. However I got it to work properly with
NCBI-BLAST-formatted databases.

John Walshaw,
John Innes Centre,
Norwich Research Park,
Colney,
Norwich NR4 7UH,
UK.
+44(0)1603 450827

> -----Original Message-----
> From: David.Bauer at SCHERING.DE [mailto:David.Bauer at SCHERING.DE]
> Sent: 26 June 2002 12:32
> To: emboss at embnet.org
> Subject: dbiblast problem
>
>
> Hi,
>
> I observed the following problem when retrieving entries from blast databases
> formatted with dbiblast:
>
> I get the CORRECT sequence if I specify the ID:
> ----------------------------------------------------------------------
> seqret -auto -stdout cgdb_nt:celsl2a
> >CELSL2A M27263 C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank.
> seqret -auto -stdout cgdb_nt:celsl2b
> >CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
> ######################################################################
>
> But I get the WRONG sequences if I specify the ACC:
> ----------------------------------------------------------------------
> seqret -auto -stdout cgdb_nt:M27263
> >CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
> seqret -auto -stdout cgdb_nt:M27264
> >CELSNTI L15302 C.elegans synaptotagmin I mRNA, complete cds and flanking regions.
> ######################################################################
>
> With fastacmd the headers look like this:
> ----------------------------------------------------------------------
> >gb|M27263|CELSL2A C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank
> >gb|M27264|CELSL2B C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank
> ######################################################################
>
> Any ideas?
>
> Thanks, David.
>
>

From David.Bauer at SCHERING.DE Fri Jun 28 01:32:42 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Fri, 28 Jun 2002 07:32:42 +0200
Subject: dbiblast problem
Message-ID:

Dear John,

thanks for the info. Unfortunately the databases I am trying to index are
formatted with formatdb 2.2.1 (NCBI).
The blast databases were updated yesterday, so I rebuilt the index.
But the error is highly reproducible. I get the same wrong sequences back when
using the ACC.

David.

I have experienced a similar problem, trying to index WU-BLAST-formatted
databases with dbiblast. However I got it to work properly with
NCBI-BLAST-formatted databases.

From areagp61 at yahoo.it Fri Jun 28 06:00:19 2002
From: areagp61 at yahoo.it (Graziano P.)
Date: Fri, 28 Jun 2002 12:00:19 +0200
Subject: fuzznuc output
Message-ID: <000201c21e8c$132b1340$18105709@italy.ibm.com>

Hi,

I have installed the EMBOSS version 2.3.1.
I have run an analysis with the fuzznuc program in this way:

$ fuzznuc
Nucleic acid pattern search
Input sequence(s): mysequence
Search pattern: acgtggac
Number of mismatches [0]: 2
Output report [af049916.fuzznuc]:

The output file is:

$ more af049916.fuzznuc
AF049916  148 ATGTGGAT
AF049916  416 ACGTGGGC
AF049916  845 TCTTGGAC
AF049916 1722 ACGTGGGC
AF049916 2007 ACGTGTGC
AF049916 2257 ACATGTAC
AF049916 3183 ACGTAAAC
AF049916 3377 TCGTGGAA
AF049916 3914 ACGTGCAC
AF049916 4058 ACATGGAC
AF049916 4317 ACGTAAAC
AF049916 4534 TCGTGGAA
AF049916 4906 ACATGGAA
AF049916 5877 ACATGGAC
AF049916 5954 TCCTGGAC
AF049916 6024 CCCTGGAC
AF049916 6094 ACGTGGCA
AF049916 6127 GCCTGGAC
AF049916 6148 ACATGGAC
AF049916 6160 TCGTGGAC
AF049916 6208 ACGTCGAC

As you can see there is no name for each column of the table; moreover
this output is different from the one shown in the EMBOSS HELP, for example:

########################################
# Program: fuzznuc
# Rundate: Thu Apr 11 13:34:06 2002
# Report_file: stdout
########################################

#=======================================
#
# Sequence: HHTETRA from: 1 to: 1272
# HitCount: 2
#
# Pattern: aagctt
# Mismatch: 0
# Complement: No
#
#=======================================

Start  End Mismatch Sequence
    1    6        . aagctt
 1267 1272        . aagctt

#---------------------------------------
#---------------------------------------

I have tried to use the option -rformat seqtable, as suggested in the help,
but the program says:

" EMBOSS An error in ajacd.c at line 11225:
unknown qualifier -rformat"

Have you got any idea about this problem? Is it a matter of the EMBOSS
version?

Thanks

Graziano

--------------------------------------------------------------------------------------
Graziano Pappad?
areagp61 at yahoo.it
--------------------------------------------------------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/emboss/attachments/20020628/1e7190fb/attachment.html

From peter.rice at uk.lionbioscience.com Fri Jun 28 08:45:10 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Fri, 28 Jun 2002 13:45:10 +0100
Subject: fuzznuc output
References: <000201c21e8c$132b1340$18105709@italy.ibm.com>
Message-ID: <3D1C5A56.B0D4D600@uk.lionbioscience.com>

Hi Graziano,

> "Graziano P." wrote:
>
> I have installed the EMBOSS version 2.3.1.
> I have run an analysis with the fuzznuc program in this way:
> $ fuzznuc
>
> As you can see there is no name for each column of the table; moreover
> this output is different from the one shown in the EMBOSS HELP, for example:
>
> I have tried to use the option -rformat seqtable, as suggested in the help,
> but the program says:
>
> " EMBOSS An error in ajacd.c at line 11225:
> unknown qualifier -rformat"
>
> Have you got any idea about this problem? Is it a matter of the EMBOSS
> version?

This is, as you say, an EMBOSS version issue.

We are converting EMBOSS programs to use formatted reports. Fuzznuc was
converted for EMBOSS 2.4.0.

The web pages show the current (development) documentation.

The EMBOSS distribution includes the documentation for each release in the
doc/programs/html directory.

In this case, I would recommend upgrading to EMBOSS 2.4.x because the fuzznuc
report output is much nicer.

regards,

Peter

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From mathog at mendel.bio.caltech.edu Tue Jun 4 18:34:25 2002
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Tue, 04 Jun 2002 11:34:25 -0700
Subject: BLAST, X vs. U, and EMBOSS
Message-ID:

> In nr there is an entry with gi= 2018236 which ends with:

Sorry, I dropped a character in the cut and paste, it's:

12018236

Tao Tao points out that U is IUPAC for selenocysteine, see:

http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html#AA212

This gets very confusing because entrez returns Genbank format with U->X,
but fasta (and ASN.1) with U as U. Which protein alphabet is EMBOSS supposed
to recognize for protein?

And all that aside, X vs. X or X vs. U in blastp really does introduce two
unnecessary gaps in the alignment, which can be easily demonstrated with
bl2seq on gi 14250938 vs. itself.
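
In the meantime, the tr 'U' 'X' workaround from the first message of this
thread can be narrowed a little: tr rewrites every U in the stream, including
those in the FASTA header lines. A minimal sketch in Python, assuming plain
FASTA text as input (the name mask_selenocysteine is only illustrative; it is
not part of EMBOSS or the NCBI tools):

```python
def mask_selenocysteine(lines):
    """Yield FASTA lines with U/u changed to X/x in sequence lines only.

    Header lines (those starting with '>') are passed through untouched,
    which is the one difference from piping the stream through tr 'U' 'X'.
    """
    table = str.maketrans("Uu", "Xx")
    for line in lines:
        if line.startswith(">"):
            yield line                    # header: keep IDs and description intact
        else:
            yield line.translate(table)   # sequence: U -> X, u -> x


# Made-up record for illustration; not a real database entry.
demo = [">demo|1 UNKNOWN protein\n", "MKTGCUG\n"]
print("".join(mask_selenocysteine(demo)), end="")
```

To use it as a filter in the pipe between fastacmd and fuzzpro, one would call
sys.stdout.writelines(mask_selenocysteine(sys.stdin)) after importing sys.
This only hides the selenocysteine from the protein check; it does not answer
which alphabet EMBOSS should accept.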

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From oddmund.nordgard at biokjemi.uio.no Wed Jun 5 06:57:59 2002
From: oddmund.nordgard at biokjemi.uio.no (=?ISO8859-1?Q?Oddmund_Nordg=E5rd?=)
Date: Wed, 5 Jun 2002 08:57:59 +0200 (MET DST)
Subject: Multiple plots on one page
Message-ID:

Hello!

I have been trying to get more than one plot per page with postscript
output from emboss. I have tried both psnup and mpage, but the results are
not as expected. Strange things happen. Is there a solution to this?

Oddmund

*******************************************
Oddmund Nordgård
Bjørnev. 23
4323 SANDNES
Tlf.: 51 67 25 65
Mob: 48 20 51 72
Tlf. arb.: 51 87 54 94
********************************************
Registered linux user #44149

From David.Bauer at SCHERING.DE Wed Jun 5 07:40:31 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 5 Jun 2002 09:40:31 +0200
Subject: Antwort: Multiple plots on one page
Message-ID:

Hi,

I have observed that the postscript outputs from emboss mostly (always?)
have an additional empty page definition at the end:

################
%%Page: 2 2
bop
S 0.0000 G S 0.0000 G S 0.0000 G
%%Trailer
%%Pages: 2
@end
###############

Maybe this confuses psnup and mpage?

David.

From: Oddmund Nordgård
Sent by: owner-emboss at hgmp.mrc.ac.uk
Date: 05.06.02 08:57
Cc:
Subject: Multiple plots on one page

Hello!

I have been trying to get more than one plot per page with postscript
output from emboss. I have tried both psnup and mpage, but the results are
not as expected. Strange things happen. Is there a solution to this?

Oddmund

*******************************************
Oddmund Nordgård
Bjørnev. 23
4323 SANDNES
Tlf.: 51 67 25 65
Mob: 48 20 51 72
Tlf. arb.: 51 87 54 94
********************************************
Registered linux user #44149

From mad at biol.unlp.edu.ar Fri Jun 7 11:42:08 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Fri, 07 Jun 2002 14:42:08 +0300
Subject: dbiblast problem
Message-ID: <3D009C10.ACF8FC@biol.unlp.edu.ar>

Hi,

has anyone found a solution for the dbiblast problem?

I'm trying to index the formatdb'ed humrep.fas (nuc) FASTA database

> % dbiblast
> Index a BLAST database
> Database name: humrep.fsa
> Database directory [.]:
> Wildcard database filename [humrep.fsa]: humrep.fsa.*
> Release number [0.0]:
> Index date [00/00/00]:
> N : nucleic
> P : protein
> ? : unknown
> Sequence type [unknown]: N
> 1 : wublast and setdb/pressdb
> 2 : formatdb
> 0 : unknown
> Blast index version [unknown]: 2
>
> EMBOSS An error in dbiblast.c at line 640:
> cannot open ./humrep.fsa.nhr table file ./humrep.fsa.nhr.nin

I also get a similar error when trying with the nr (prot) database

> $ dbiblast
> Index a BLAST database
> Database name: nr
> Database directory [.]:
> Wildcard database filename [nr]: nr.*
> Release number [0.0]:
> Index date [00/00/00]:
> N : nucleic
> P : protein
> ? : unknown
> Sequence type [unknown]: P
> 1 : wublast and setdb/pressdb
> 2 : formatdb
> 0 : unknown
> Blast index version [unknown]: 2
>
> EMBOSS An error in dbiblast.c at line 640:
> cannot open ./nr.phr table file ./nr.phr.pin

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ztu at msi.umn.edu Fri Jun 7 17:49:37 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 7 Jun 2002 12:49:37 -0500 (CDT)
Subject: dbiblast problem
In-Reply-To: <3D009C10.ACF8FC@biol.unlp.edu.ar>
Message-ID:

You may need to run formatdb first then run dbiblast.

formatdb -i input_seq -p F -o T (for nucleic acid)
formatdb -i input_seq -p T -o T (for protein)

Thanks,

Tu
============================================================

On Fri, 7 Jun 2002, Martin Sarachu wrote:

> Hi,
>
> has anyone found a solution for the dbiblast problem?
>
> I'm trying to index the formatdb'ed humrep.fas (nuc) FASTA database
>
> > % dbiblast
> > Index a BLAST database
> > Database name: humrep.fsa
> > Database directory [.]:
> > Wildcard database filename [humrep.fsa]: humrep.fsa.*
> > Release number [0.0]:
> > Index date [00/00/00]:
> > N : nucleic
> > P : protein
> > ? : unknown
> > Sequence type [unknown]: N
> > 1 : wublast and setdb/pressdb
> > 2 : formatdb
> > 0 : unknown
> > Blast index version [unknown]: 2
> >
> > EMBOSS An error in dbiblast.c at line 640:
> > cannot open ./humrep.fsa.nhr table file ./humrep.fsa.nhr.nin
>
> I also get a similar error when trying with the nr (prot) database
>
> > $ dbiblast
> > Index a BLAST database
> > Database name: nr
> > Database directory [.]:
> > Wildcard database filename [nr]: nr.*
> > Release number [0.0]:
> > Index date [00/00/00]:
> > N : nucleic
> > P : protein
> > ? : unknown
> > Sequence type [unknown]: P
> > 1 : wublast and setdb/pressdb
> > 2 : formatdb
> > 0 : unknown
> > Blast index version [unknown]: 2
> >
> > EMBOSS An error in dbiblast.c at line 640:
> > cannot open ./nr.phr table file ./nr.phr.pin
>
> martin
>
> --
> Martin Sarachu
> mad at biol.unlp.edu.ar
> EMBnet Argentina
> http://www.ar.embnet.org
>

From mad at biol.unlp.edu.ar Fri Jun 7 11:52:47 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Fri, 07 Jun 2002 14:52:47 +0300
Subject: dbiblast problem
References:
Message-ID: <3D009E8F.5CD201E4@biol.unlp.edu.ar>

yes, I did that before trying dbiblast

martin

Zheng Jin Tu wrote:
>
> You may need to run formatdb first then run dbiblast.
>
> formatdb -i input_seq -p F -o T (for nucleic acid)
> formatdb -i input_seq -p T -o T (for protein)
>
> Thanks,
>
> Tu
> ============================================================
>
> On Fri, 7 Jun 2002, Martin Sarachu wrote:
>
> > Hi,
> >
> > has anyone found a solution for the dbiblast problem?
> >
> > I'm trying to index the formatdb'ed humrep.fas (nuc) FASTA database
> >
> > > % dbiblast
> > > Index a BLAST database
> > > Database name: humrep.fsa
> > > Database directory [.]:
> > > Wildcard database filename [humrep.fsa]: humrep.fsa.*
> > > Release number [0.0]:
> > > Index date [00/00/00]:
> > > N : nucleic
> > > P : protein
> > > ? : unknown
> > > Sequence type [unknown]: N
> > > 1 : wublast and setdb/pressdb
> > > 2 : formatdb
> > > 0 : unknown
> > > Blast index version [unknown]: 2
> > >
> > > EMBOSS An error in dbiblast.c at line 640:
> > > cannot open ./humrep.fsa.nhr table file ./humrep.fsa.nhr.nin
> >
> > I also get a similar error when trying with the nr (prot) database
> >
> > > $ dbiblast
> > > Index a BLAST database
> > > Database name: nr
> > > Database directory [.]:
> > > Wildcard database filename [nr]: nr.*
> > > Release number [0.0]:
> > > Index date [00/00/00]:
> > > N : nucleic
> > > P : protein
> > > ? : unknown
> > > Sequence type [unknown]: P
> > > 1 : wublast and setdb/pressdb
> > > 2 : formatdb
> > > 0 : unknown
> > > Blast index version [unknown]: 2
> > >
> > > EMBOSS An error in dbiblast.c at line 640:
> > > cannot open ./nr.phr table file ./nr.phr.pin
> >
> > martin
> >
> > --
> > Martin Sarachu
> > mad at biol.unlp.edu.ar
> > EMBnet Argentina
> > http://www.ar.embnet.org
> >

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ztu at msi.umn.edu Fri Jun 7 18:13:32 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 7 Jun 2002 13:13:32 -0500 (CDT)
Subject: dbiblast problem
In-Reply-To: <3D009E8F.5CD201E4@biol.unlp.edu.ar>
Message-ID:

Oh! Yes, my EMBOSS runs into the same problem.

Earlier I found that when a database was indexed with dbiblast, seqret only
retrieved the first 10000 lines of records and gave an error message after
that. I have no idea whether this has been fixed in newer EMBOSS releases.

I am now using dbiflat instead. It needs a flat file rather than a fasta
file, so it takes more space.

That is my experience.

Thanks,

Tu
====================================================================

On Fri, 7 Jun 2002, Martin Sarachu wrote:

> yes, I did that before trying dbiblast
>
> martin
>
> Zheng Jin Tu wrote:
> >
> > You may need to run formatdb first then run dbiblast.
> >
> > formatdb -i input_seq -p F -o T (for nucleic acid)
> > formatdb -i input_seq -p T -o T (for protein)
> >
> > Thanks,
> >
> > Tu
> > ============================================================
> >
> > On Fri, 7 Jun 2002, Martin Sarachu wrote:
> >
> > > Hi,
> > >
> > > has anyone found a solution for the dbiblast problem?
> > >
> > > I'm trying to index the formatdb'ed humrep.fas (nuc) FASTA database
> > >
> > > > % dbiblast
> > > > Index a BLAST database
> > > > Database name: humrep.fsa
> > > > Database directory [.]:
> > > > Wildcard database filename [humrep.fsa]: humrep.fsa.*
> > > > Release number [0.0]:
> > > > Index date [00/00/00]:
> > > > N : nucleic
> > > > P : protein
> > > > ? : unknown
> > > > Sequence type [unknown]: N
> > > > 1 : wublast and setdb/pressdb
> > > > 2 : formatdb
> > > > 0 : unknown
> > > > Blast index version [unknown]: 2
> > > >
> > > > EMBOSS An error in dbiblast.c at line 640:
> > > > cannot open ./humrep.fsa.nhr table file ./humrep.fsa.nhr.nin
> > >
> > > I also get a similar error when trying with the nr (prot) database
> > >
> > > > $ dbiblast
> > > > Index a BLAST database
> > > > Database name: nr
> > > > Database directory [.]:
> > > > Wildcard database filename [nr]: nr.*
> > > > Release number [0.0]:
> > > > Index date [00/00/00]:
> > > > N : nucleic
> > > > P : protein
> > > > ? : unknown
> > > > Sequence type [unknown]: P
> > > > 1 : wublast and setdb/pressdb
> > > > 2 : formatdb
> > > > 0 : unknown
> > > > Blast index version [unknown]: 2
> > > >
> > > > EMBOSS An error in dbiblast.c at line 640:
> > > > cannot open ./nr.phr table file ./nr.phr.pin
> > >
> > > martin
> > >
> > > --
> > > Martin Sarachu
> > > mad at biol.unlp.edu.ar
> > > EMBnet Argentina
> > > http://www.ar.embnet.org
> > >
>
> --
> Martin Sarachu
> mad at biol.unlp.edu.ar
> EMBnet Argentina
> http://www.ar.embnet.org
>

From sgmd at genetik.fu-berlin.de Mon Jun 10 15:13:34 2002
From: sgmd at genetik.fu-berlin.de (Thomas Siegmund)
Date: Mon, 10 Jun 2002 17:13:34 +0200
Subject: EMBOSS.kaptn - X11/KDE GUI for EMBOSS 2.4.1 and EPHYLIP-3.573c
Message-ID: <200206101713.34800.sgmd@genetik.fu-berlin.de>

Dear EMBOSS list,

a new version of the Kaptain user interface for EMBOSS is finished.
In addition to EMBOSS the current EMBOSS.kaptn 0.80 supports the
complete EPHYLIP 3.5. There are not so many GUIs for Phylip out
there, are there?

As usual, you will find a compressed tar archive for download,
additional information, and a few screenshots at
http://userpage.fu-berlin.de/~sgmd .

From the ChangeLog:

Version 0.80 - Finished support for EPHYLIP-3.573c
----------------------------------------------
- new: econtml, econtrast.kaptn
- modified install script for systems where the nedit client has
  a name different from "nc". Thanks to P. Martineau for the hint!
Version 0.79 - new: ednaml.kaptn, ednaml.kaptn, ednacomp.kaptn eclique.kaptn, econsense.kaptn Have fun Thomas -- Thomas Siegmund Freie Universit?t Berlin Institut f?r Genetik Arnimallee 7 14195 Berlin Germany Tel: +49 30 838 54868 Fax: +49 30 838 54395 http://userpage.fu-berlin.de/~sgmd From ztu at msi.umn.edu Wed Jun 12 21:08:26 2002 From: ztu at msi.umn.edu (Zheng Jin Tu) Date: Wed, 12 Jun 2002 16:08:26 -0500 (CDT) Subject: How do I specific human genome CHR_01 database at .embossrc file Message-ID: I have downloaded human genome CHR_01 data from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_01 directory with hs_chr1.asn.gz, hs_chr1.fa.gz, hs_chr1.gbk.gz, hs_chr1.gbs.gz, hs_chr1.mfa.gz files. After gunzip them, I use hs_chr1.gbk to run dbiflat, it works fine and creates these files: division.lkp entrynam.idx acnum.trg acnum.hit. Question is come down how do I specify this database at .embossrc for these fields especially for method: method: ??? format: genbank ? Any suggestion is great appreciated. Thanks, Tu -------------------------------------- Zheng Jin Tu Supercomputing Institute University of Minnesota -------------------------------------- From charles at moulinette.dyndns.org Fri Jun 14 13:48:09 2002 From: charles at moulinette.dyndns.org (Charles Plessy) Date: Fri, 14 Jun 2002 15:48:09 +0200 Subject: How do I specific human genome CHR_01 database at .embossrc file In-Reply-To: References: Message-ID: <20020614134809.GA13692@moulinette.dyndns.org> Hi, I used home sequences so I had to reformat the headers to something "ncbi compliant" like lcl|foo. see the formatdb readme for more info : http://www.inb.mu-luebeck.de/biosoft/blast/README.formatdb I had to use formatdb vith the option -o T , to generate more files. 
I then used dbiflat normally.

Here is an extract of my embossrc:

DB ngn [
   type: N
   method: blast
   format: ncbi
   dir: /home/charles/bioinfo/blast
   indexdir: /home/charles/bioinfo/blast/ngn
   file: "ngn"
   release: "0.0"
]

Charles

From ztu at msi.umn.edu Fri Jun 14 14:03:59 2002
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Fri, 14 Jun 2002 09:03:59 -0500 (CDT)
Subject: How do I specific human genome CHR_01 database at .embossrc file
In-Reply-To: <20020614134809.GA13692@moulinette.dyndns.org>
Message-ID:

Many thanks to the kind people who replied to my message.
Most people suggest

method: emblcd
format: genbank

It works well with this definition.

Best regards,

Zheng Jin Tu
Supercomputing Institute
University of Minnesota
________________________________________________________________________

On Fri, 14 Jun 2002, Charles Plessy wrote:

> Hi,
>
> I used my own sequences, so I had to reformat the headers to something
> "ncbi compliant" like lcl|foo.
>
> See the formatdb readme for more info:
> http://www.inb.mu-luebeck.de/biosoft/blast/README.formatdb
>
> I had to use formatdb with the option -o T, to generate more files.
>
> I then used dbiflat normally.
>
> Here is an extract of my embossrc:
>
> DB ngn [
> type: N
> method: blast
> format: ncbi
> dir: /home/charles/bioinfo/blast
> indexdir: /home/charles/bioinfo/blast/ngn
> file: "ngn"
> release: "0.0"
> ]
>
> Charles
>

From mad at biol.unlp.edu.ar Sun Jun 16 15:20:15 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Sun, 16 Jun 2002 18:20:15 +0300
Subject: problem with dbifasta and long files?
Message-ID: <3D0CACAF.E51FD0D6@biol.unlp.edu.ar>

Hi,

I'm getting a "cannot read" error when trying to index with dbifasta.
Here is the output:

> $ ls -lsa
> .
> .
> .
> 5894336 -rw-r--r-- 1 martin 6032840325 Jun 15 17:10 nt
> .
> $ dbifasta -debug > Index a fasta database > simple : >ID > idacc : >ID ACC > gcgid : >db:ID > gcgidacc : >db:ID ACC > dbid : >db ID > ncbi : | formats > ID line format [idacc]: ncbi > Database directory [.]: > Wildcard database filename [*.dat]: nt > Database name: nt > Release number [0.0]: > Index date [00/00/00]: > > EMBOSS An error in embdbi.c at line 295: > Cannot open ./nt for reading > > $ I'm also sending dbifasta.dbg just in case martin -- Martin Sarachu mad at biol.unlp.edu.ar EMBnet Argentina http://www.ar.embnet.org -------------- next part -------------- acdArgsScan acdDebug Yes acdDoHelp No ajFileNewIn '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' ajNamResolve of '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' EOF ajFileGetsL file /usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd closing file '/usr/local/EMBOSS/share/EMBOSS/acd/dbifasta.acd' acdSetDefC auto 'N' 2bec8 acdSetDefC stdout 'N' 2bf68 acdSetDefC filter 'N' 2c008 acdSetDefC options 'N' 2c0a8 acdSetDefC debug 'N' 2c148 acdSetDefC acdlog 'N' 2c1e8 acdSetDefC acdpretty 'N' 2c288 acdSetDefC acdtable 'N' 2c328 acdSetDefC help 'N' 2c3c8 acdSetDefC verbose 'N' 2c468 acdSetDefC warning 'Y' 2c508 acdSetDefC error 'Y' 2c5a8 acdSetDefC fatal 'Y' 2c648 acdSetDefC die 'N' 2cf60 acdSetVarDef today '16/06/02' ff2650cc acdNewSec 'input' acdSecList length 0 acdNewSec acdSecList push 'input' new length 1 acdSet attr 'idformat' val 'required' type '' acdSet attr 'idformat' val 'information' type '' acdSet attr 'idformat' val 'values' type '' acdSet attr 'idformat' val 'maximum' type '' acdSet attr 'idformat' val 'minimum' type '' acdSet attr 'idformat' val 'default' type '' acdSet attr 'idformat' val 'delimiter' type '' acdSet attr 'idformat' val 'codedelimiter' type '' acdSet attr 'directory' val 'required' type '' acdSet attr 'directory' val 'default' type '' acdSet attr 'directory' val 'information' type '' acdSet attr 'filenames' val 'required' type '' acdSet attr 'filenames' val 'default' type '' 
acdSet attr 'filenames' val 'information' type '' acdNewEndsec 'input' acdSecList length 1 Pop from acdSecList 'input' new length 0 acdNewSec 'required' acdSecList length 0 acdNewSec acdSecList push 'required' new length 1 acdSet attr 'dbname' val 'parameter' type '' acdSet attr 'dbname' val 'maxlength' type '' acdSet attr 'dbname' val 'minlength' type '' acdSet attr 'dbname' val 'information' type '' acdSet attr 'release' val 'required' type '' acdSet attr 'release' val 'default' type '' acdSet attr 'release' val 'maxlength' type '' acdSet attr 'release' val 'information' type '' acdSet attr 'date' val 'required' type '' acdSet attr 'date' val 'default' type '' acdSet attr 'date' val 'valid' type '' acdSet attr 'date' val 'information' type '' acdSet attr 'date' val 'pattern' type '' acdNewEndsec 'required' acdSecList length 1 Pop from acdSecList 'required' new length 0 acdNewSec 'advanced' acdSecList length 0 acdNewSec acdSecList push 'advanced' new length 1 acdSet attr 'fields' val 'required' type '' acdSet attr 'fields' val 'information' type '' acdSet attr 'fields' val 'values' type '' acdSet attr 'fields' val 'minimum' type '' acdSet attr 'fields' val 'maximum' type '' acdSet attr 'fields' val 'default' type '' acdSet attr 'exclude' val 'information' type '' acdSet attr 'indexdirectory' val 'default' type '' acdSet attr 'indexdirectory' val 'information' type '' acdSet attr 'maxindex' val 'default' type '' acdSet attr 'maxindex' val 'minimum' type '' acdSet attr 'maxindex' val 'information' type '' acdSet attr 'sortoptions' val 'default' type '' acdSet attr 'sortoptions' val 'information' type '' acdSet attr 'sortoptions' val 'help' type '' acdSet attr 'systemsort' val 'default' type '' acdSet attr 'systemsort' val 'information' type '' acdSet attr 'cleanup' val 'default' type '' acdSet attr 'cleanup' val 'information' type '' acdNewEndsec 'advanced' acdSecList length 1 Pop from acdSecList 'advanced' new length 0 -- All Done : acdSecList length 0 acdSet attr 
'dbname' val 'required' type '' acdDef debug 'Y' 2c148 acdSetDef debug 'Y' 2c148 acdSetBool -auto def: N acdSetQualAppl 'auto' acdSetBool -auto val: No acdSetBool -stdout def: N acdSetQualAppl 'stdout' acdSetBool -stdout val: No acdSetBool -filter def: N acdSetQualAppl 'filter' acdSetBool -filter val: No acdSetBool -options def: N acdSetQualAppl 'options' acdSetBool -options val: No acdSetBool -debug def: Y acdSetQualAppl 'debug' acdSetBool -debug val: Yes acdSetBool -acdlog def: N acdSetQualAppl 'acdlog' acdSetBool -acdlog val: No acdSetBool -acdpretty def: N acdSetQualAppl 'acdpretty' acdSetBool -acdpretty val: No acdSetBool -acdtable def: N acdSetQualAppl 'acdtable' acdSetBool -acdtable val: No acdSetBool -help def: N acdSetQualAppl 'help' acdSetBool -help val: No acdHelp No acdSetBool -verbose def: N acdSetQualAppl 'verbose' acdSetBool -verbose val: No acdSetBool -warning def: Y acdSetQualAppl 'warning' acdSetBool -warning val: Yes acdSetBool -error def: Y acdSetQualAppl 'error' acdSetBool -error val: Yes acdSetBool -fatal def: Y acdSetQualAppl 'fatal' acdSetBool -fatal val: Yes acdSetBool -die def: N acdSetQualAppl 'die' acdSetBool -die val: No acdUserGet 'idformat' reply 'idacc' acdUserGet 'idformat' defreply 'idacc' msg 'ID line format' ajUserGet buffer len: 5 res: 2048 ptr: 30bc8 testing 'ncbi' Found 1 matches OK: Y min: 1 max: 1 Accept: 'ncbi' Found 1 matches Menu length now 0 before return val[0] 'ncbi' before return val[0] 'ncbi' Storing val[0] 'ncbi' acdUserGet 'directory' reply '.' acdUserGet 'directory' defreply '.' 
msg 'Database directory' ajUserGet buffer len: 1 res: 2048 ptr: 31418 acdUserGet 'filenames' reply '*.dat' acdUserGet 'filenames' defreply '*.dat' msg 'Wildcard database filename' ajUserGet buffer len: 5 res: 2048 ptr: 31418 acdUserGet 'dbname' reply '' acdUserGet 'dbname' defreply '' msg 'Database name' ajUserGet buffer len: 0 res: 2048 ptr: 31418 acdUserGet 'release' reply '0.0' acdUserGet 'release' defreply '0.0' msg 'Release number' ajUserGet buffer len: 3 res: 2048 ptr: 31418 acdUserGet 'date' reply '00/00/00' acdUserGet 'date' defreply '00/00/00' msg 'Index date' ajUserGet buffer len: 8 res: 2048 ptr: 31418 testing 'acnum' Found 1 matches OK: Y min: 0 max: 5 Accept: 'acnum' Found 1 matches Menu length now 0 before return val[0] 'acnum' before return val[0] 'acnum' Storing val[0] 'acnum' acdSetBool -systemsort def: Y acdSetQualAppl 'systemsort' acdSetBool -systemsort val: Yes acdSetBool -cleanup def: Y acdSetQualAppl 'cleanup' acdSetBool -cleanup val: Yes reading './nt' writing './' embDbiFileListExc dir '.' wildfile 'nt' exclude '' dirfix './' ajFileTestSkip: file '.' exclude: '' include: 'nt' ajFileTestSkip: file '..' 
exclude: '' include: 'nt' ajFileTestSkip: file 'update.log' exclude: '' include: 'nt' ajFileTestSkip: file 'README' exclude: '' include: 'nt' ajFileTestSkip: file 'update.error' exclude: '' include: 'nt' ajFileTestSkip: file 'release.error' exclude: '' include: 'nt' ajFileTestSkip: file 'release.log' exclude: '' include: 'nt' ajFileTestSkip: file 'nt' exclude: '' include: 'nt' ajFileTestSkip: file 'nt' included by 'nt' accept 'nt' ajFileTestSkip: file 'fmerge.log' exclude: '' include: 'nt' ajFileTestSkip: file 'dbifasta.dbg' exclude: '' include: 'nt' ajFileTestSkip: file 'index.nr' exclude: '' include: 'nt' ajFileTestSkip: file 'division.lkp' exclude: '' include: 'nt' ajFileTestSkip: file 'entrynam.idx' exclude: '' include: 'nt' ajFileTestSkip: file 'acnum.trg' exclude: '' include: 'nt' ajFileTestSkip: file 'acnum.hit' exclude: '' include: 'nt' 1 files for '.' 'nt' ajFileNewIn './nt' ajNamResolve of './nt' From AUnderwood at PHLS.org.uk Tue Jun 18 10:18:07 2002 From: AUnderwood at PHLS.org.uk (AUnderwood at PHLS.org.uk) Date: Tue, 18 Jun 2002 11:18:07 +0100 Subject: Compilation problem Message-ID: Hi all, When installing EMBOSS I have the following error message after typing the make command: make[2]: Entering directory `/tmp/EMBOSS-2.4.1/emboss' /bin/sh ../libtool --mode=link gcc -g -O2 -o abiview abiview.o ../nucleus/libnucleus.la ../ajax/libajaxg.la ../ajax/libajax.la ../plplot/libplplot.la -lX11 -lm gcc -g -O2 -o .libs/abiview abiview.o ../nucleus/.libs/libnucleus.so ../ajax/.libs/libajaxg.so ../ajax/.libs/libajax.so ../plplot/.libs/libplplot.so -lX11 -lm -Wl,--rpath -Wl,/usr/local/lib /usr/bin/ld: cannot find -lX11 collect2: ld returned 1 exit status make[2]: *** [abiview] Error 1 make[2]: Leaving directory `/tmp/EMBOSS-2.4.1/emboss' make[1]: *** [install-recursive] Error 1 make[1]: Leaving directory `/tmp/EMBOSS-2.4.1/emboss' make: *** [install-recursive] Error 1 By searching newsgroups I understand that this could be solved by adding the line 
-L/usr/X11R6/lib
somewhere in the makefile. I am running Red Hat Linux 7.3.

Please can you tell me how to solve this problem.

Many thanks,

Dr Anthony Underwood
Bioinformatics Unit
Central Public Health Laboratory
61 Colindale Avenue
London NW9 5HT
t: 0208 2004400 ext. 3618
f: 0208 3583138
e: aunderwood at phls.org.uk

From tmargus at ebc.ee Wed Jun 19 13:31:28 2002
From: tmargus at ebc.ee (=?iso-8859-1?Q?T=F5nu_Margus?=)
Date: Wed, 19 Jun 2002 16:31:28 +0300
Subject: formatin EMBL to fasta for Blast. hum01.dat is large than 2GB
Message-ID: <003901c21795$9dc478a0$1e1728c1@ebc.ee>

Hi all,

This is not exactly an EMBOSS question, but it is related.
I am trying to format EMBL rel 71.0 for BLAST.
The first step is converting it into FASTA format, for which I am using sp2fasta.
Unfortunately it can't manage files that are larger than 2GB
(I am running this on Linux 2.4, Slackware 8.0),
and I couldn't figure out how to send standard input as input to sp2fasta
(cat hum01.dat | sp2fasta didn't work).

Does someone know how to overcome this problem?
Is there a newer build of sp2fasta, or some other tool?
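
[Editor's sketch, not part of the original post.] The question above asks for a converter that can take an EMBL flat file on standard input. A minimal Python sketch of such a filter might look like the following. It assumes the usual EMBL line codes (ID, DE, SQ and the // terminator); the function name embl_to_fasta and the sample entry are made up for illustration, and none of this reflects how sp2fasta or seqret actually work. Because it only reads line by line, it never has to open or seek within the large file itself.

```python
def embl_to_fasta(lines, width=60):
    """Yield FASTA lines from an iterable of EMBL flat-file lines.

    Hypothetical helper: only the ID, DE, SQ and // line codes are
    interpreted; every other line type is ignored.
    """
    entry_id, desc, seq, in_seq = None, [], [], False
    for line in lines:
        line = line.rstrip("\n")
        code = line[:2]
        if code == "ID":
            fields = line[2:].split()
            entry_id = fields[0].rstrip(";") if fields else None
            desc, seq, in_seq = [], [], False
        elif code == "DE":
            desc.append(line[2:].strip())
        elif code == "SQ":
            in_seq = True
        elif code == "//":
            if entry_id is not None:
                header = ">" + entry_id
                if desc:
                    header += " " + " ".join(desc)
                yield header
                residues = "".join(seq)
                for i in range(0, len(residues), width):
                    yield residues[i:i + width]
            entry_id, desc, seq, in_seq = None, [], [], False
        elif in_seq:
            # Sequence lines hold blocks of letters plus a trailing
            # position counter; keep only the letters.
            seq.append("".join(ch for ch in line if ch.isalpha()))


# Made-up entry, for illustration only.
SAMPLE = """\
ID   TEST0001   standard; DNA; HUM; 24 BP.
DE   Made-up entry for illustration only
SQ   Sequence 24 BP; 6 A; 6 C; 6 G; 6 T; 0 other;
     acgtacgtac gtacgtacgt acgt                                               24
//
"""

for fasta_line in embl_to_fasta(SAMPLE.splitlines()):
    print(fasta_line)
```

To use this as a pipe filter, one would pass sys.stdin in place of SAMPLE.splitlines(), for example as cat hum01.dat | python embl2fasta.py > hum01.fasta (the script name is hypothetical).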
Thanks in advance

Tõnu Margus
Estonian Biocentre
Riia 23
Tartu 51010
Estonia

From ableasby at hgmp.mrc.ac.uk Wed Jun 19 13:34:10 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 19 Jun 2002 14:34:10 +0100 (BST)
Subject: formatin EMBL to fasta for Blast. hum01.dat is large than 2GB
Message-ID: <200206191334.OAA20044@bromine.hgmp.mrc.ac.uk>

This is becoming a common sort of question with the new EMBL release.
EMBOSS (on many operating systems) can cope with files >2Gb if it is
configured with the --enable-large option. Of course you need to
'make clean' and 'make' again afterwards. It certainly works for RedHat
and SuSE so it may well do so with Slackware. If it does then you could
use 'seqret' to do the fasta conversion.

HTH

Alan Bleasby
HGMP

From gbottu at ben.vub.ac.be Wed Jun 19 15:34:49 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 19 Jun 2002 17:34:49 +0200 (CEST)
Subject: question about List Files
Message-ID: <200206191534.RAA0001243481@ben.vub.ac.be>

from : BEN

I have a question. Where does the List File format used by EMBOSS come from? I
think I read it somewhere but I cannot find it again in the EMBOSS documentation.
By the way, are there plans to make List Files generated by GCG usable in
EMBOSS?
Guy Bottu

From gwilliam at hgmp.mrc.ac.uk Wed Jun 19 15:50:14 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 19 Jun 2002 16:50:14 +0100
Subject: question about List Files
References: <200206191534.RAA0001243481@ben.vub.ac.be>
Message-ID: <3D10A836.EDCE4F66@hgmp.mrc.ac.uk>

The List File is described in the Uniform Sequence Address docs:
http://www.uk.embnet.org/Software/EMBOSS/Themes/UniformSequenceAddress.html#list

And also in the formal summary section of the USA page:
http://www.uk.embnet.org/Software/EMBOSS/Themes/UniformSequenceAddress.html#listfile

Gary

Guy Bottu wrote:
>
> from : BEN
>
> I have a question. Where does the List File format used by EMBOSS come from ? I
> think I read it somewhere but I cannot find it back in the EMBOSS documentation.
> By the way, are there plans to make List Files genrated by GCG usable in
> EMBOSS ?
>
> Guy Bottu

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From peter.rice at uk.lionbioscience.com Wed Jun 19 15:51:35 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Wed, 19 Jun 2002 16:51:35 +0100
Subject: question about List Files
References: <200206191534.RAA0001243481@ben.vub.ac.be>
Message-ID: <3D10A887.A5F83B4A@uk.lionbioscience.com>

Guy Bottu wrote:
>
> I have a question. Where does the List File format used by EMBOSS come from ? I
> think I read it somewhere but I cannot find it back in the EMBOSS documentation.

The format comes from VMS ...

Under VAX/VMS (later OpenVMS) the syntax "@filename" opened a file and used
each line of the file as input (Unix users may think this is like redirecting
standard input). Each line of the list file is a separate USA.

EMBOSS USAs similarly come from VMS logical names (EMBL:entryname) and the
original ideas for ACD came from VMS DCL's "CLD" files.
GCG and SRS also show their VMS 'roots' by using similar sequence naming.
Bioinformatics users are now used to this, although many have probably never
seen a VMS system :-)

> By the way, are there plans to make List Files genrated by GCG usable in
> EMBOSS ?

GCG do silly things with these files, including having GCG headers ending in
".." and of course GCG database names.

Yes, we could try to make some GCG listfiles work with EMBOSS, for example
adding a new format "gcglist::" that strips everything up to ".." (something
EMBOSS would not want to do by default).

regards,

Peter

From gbottu at ben.vub.ac.be Tue Jun 25 15:30:07 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Tue, 25 Jun 2002 17:30:07 +0200 (CEST)
Subject: question about restriction maps
Message-ID: <200206251530.RAA0001125452@ben.vub.ac.be>

from : BEN

Dear All,

Does anyone know this ? Is there a way to run an EMBOSS restriction program and
give the output to cirdna or lindna, thus reproducing the functionality of GCG
mapsort-plasmidmap ?

Guy Bottu

From peter.rice at uk.lionbioscience.com Tue Jun 25 15:45:47 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 25 Jun 2002 16:45:47 +0100
Subject: question about restriction maps
References: <200206251530.RAA0001125452@ben.vub.ac.be>
Message-ID: <3D18902B.15E58147@uk.lionbioscience.com>

Guy Bottu wrote:
>
> Does anyone know this ? Is there a way to run an EMBOSS restriction program and
> give the output to cirdna or lindna, thus reproducing the functionality of GCG
> mapsort-plasmidmap ?

Sounds like a new report format to me ... which would allow any features to be
drawn. We could set colours for each feature somewhere (DNA or protein would
work) by classifying them in sets with 1 colour for each.

Any volunteers to classify the EMBL/GenBank and SwissProt/PIR feature keys?
Peter

From dmerberg at Phylos.com Tue Jun 25 21:41:11 2002
From: dmerberg at Phylos.com (David Merberg)
Date: Tue, 25 Jun 2002 17:41:11 -0400
Subject: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <888CC5DBA518D511A72D00A0C9E97E8F513D71@ntserver1.phylos.com>

Hello,

We've recently installed EMBOSS and cannot run applications that open X
windows. We see this error:

Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).

We've tried it both on Red Hat 7.1 and Solaris 7.

Has anybody seen this before? Know how to fix it?

Thanks,
David Merberg
Phylos, Inc.
Lexington, MA

From gbottu at ben.vub.ac.be Wed Jun 26 10:18:11 2002
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 26 Jun 2002 12:18:11 +0200 (CEST)
Subject: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <200206261018.MAA0001144803@ben.vub.ac.be>

from : BEN

> Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).

That sounds familiar to me. I had installed EMBOSS on a mainframe and when
trying to access it via an X-terminal emulator I had the same message. The fix
was to put the X-terminal emulator in "pseudocolor" mode or the PC in
"256 colors" mode. Your configuration is not the same, but there is maybe also
a problem with the color maps.

Guy Bottu

From David.Bauer at SCHERING.DE Wed Jun 26 11:04:12 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 26 Jun 2002 13:04:12 +0200
Subject: Antwort: Re: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID:

I thought that this was only a problem with the Exceed X server for PC.
Btw. the settings needed for EMBOSS do not work for Blixem-Dotter.
So I use 'cps' as the default for graphical output and view the result with
ghostview.

In the long run it would be nice to get the PLPLOT X11 libraries fixed,
so that they also work with other color settings.

David.

from : BEN

> Error in XCreatePixmap: BadDrawable (invalid Pixmap or Window parameter).
That sounds familiar to me. I had installed EMBOSS on a mainframe and when
trying to access it via an X-terminal emulator I had the same message. The fix
was to put the X-terminal emulator in "pseudocolor" mode or the PC in
"256 colors" mode. Your configuration is not the same, but there is maybe also
a problem with the color maps.

Guy Bottu

From David.Bauer at SCHERING.DE Wed Jun 26 11:32:22 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 26 Jun 2002 13:32:22 +0200
Subject: dbiblast problem
Message-ID:

Hi,

I observed the following problem when retrieving entries from blast databases
formatted with dbiblast:

I get the CORRECT sequence if I specify the ID:
-----------------------------------------------------------------------------------------------------------------------------
seqret -auto -stdout cgdb_nt:celsl2a
>CELSL2A M27263 C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank.
seqret -auto -stdout cgdb_nt:celsl2b
>CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
#########################################################################

But I get the WRONG sequences if I specify the ACC:
------------------------------------------------------------------------------------------------------------------------------
seqret -auto -stdout cgdb_nt:M27263
>CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank.
seqret -auto -stdout cgdb_nt:M27264
>CELSNTI L15302 C.elegans synaptotagmin I mRNA, complete cds and flanking regions.
###########################################################################

With fastacmd the headers look like this:
---------------------------------------------------------------------------------------------------------------------------------
>gb|M27263|CELSL2A C.elegans trans-spliced leader 2 (SL2 RNA-alpha) gene, 5' flank
>gb|M27264|CELSL2B C.elegans trans-spliced leader 2 (SL2 RNA-beta) gene, 5' flank
##########################################################################

Any ideas ?

Thanks, David.

From mad at biol.unlp.edu.ar Wed Jun 26 15:16:57 2002
From: mad at biol.unlp.edu.ar (Martin Sarachu)
Date: Wed, 26 Jun 2002 18:16:57 +0300
Subject: NCBI nt database index
Message-ID: <3D19DAE9.1D12D0C@biol.unlp.edu.ar>

Hi,

we have the non-redundant NCBI nucleotide database (nt) indexed with

> $ dbifasta -idformat ncbi

the raw nt database looks like this

> >gi|4003368|dbj|AB000282.1|AB000282 Navel orange infectious mottling virus gene for polyprotein (coat protein region), partial cds
> AATGTCACCATTGAAAGTGGTGACAATAATAATAATAATTGTCCCACCGGTAATGTAGATAATAGAGAAATACCGGTGGT
> .......
> >gi|1827449|dbj|AB000449.1|AB000449 Homo sapiens mRNA for VRK1, complete cds
> CCGAGTTACGAGTCGGCGAAAGCGGCGGGAAGTTCGTACTGGGCAGAACGCGACGGGTCTGCGGCTTAGGTGAAAATGCC
> etc

and when we run

> $ fuzznuc -raccshow2 -rdesshow2 -rusashow2
> Nucleic acid pattern search
> Input sequence(s): nt:*
> Search pattern: GGTTTCsanttyggnac
> Number of mismatches [0]: 3
> Output report [gi.fuzznuc]: xx.fuzznuc

we get this

> $ more xx.fuzznuc
> ########################################
> # Program: fuzznuc
> # Rundate: Wed Jun 26 18:09:24 2002
> # Report_file: xx.fuzznuc
> ########################################
>
> #=======================================
> #
> # Sequence: nt-id:gi from: 1 to: 1904
> # Accession:
> # Description: Schizosaccharomyces pombe DNA for SUI1 homologue, complete cds
> # HitCount: 1
> #
> # Pattern: GGTTTCsanttyggnac
> # Mismatch: 3
> # Complement: No
> #
> #=======================================
>
> Start End Mismatch Sequence
> 9 25 3 GGTTACCATTTTGGCTA
>
> ....
> # Sequence: nt-id:gi from: 1 to: 17070
> # Accession:
> # Description: Oryza sativa gene for NADH-dependent glutamate synthase
> # HitCount: 1
> #
> etc

The acnum.hit, acnum.trg, division.lkp and entrynam.idx files for the nt
database seem to be correct.

Any idea why the accession numbers don't show up in the fuzznuc results?

martin

--
Martin Sarachu
mad at biol.unlp.edu.ar
EMBnet Argentina
http://www.ar.embnet.org

From ableasby at hgmp.mrc.ac.uk Thu Jun 27 00:06:12 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Thu, 27 Jun 2002 01:06:12 +0100 (BST)
Subject: Antwort: Re: Can't run X apps in EMBOSS - XCreatePixmap: BadDrawable
Message-ID: <200206270006.BAA22905@bromine.hgmp.mrc.ac.uk>

>In the long run it would be nice to get the PLPLOT X11 libraries fixed,

For some time we have been working on replacing PLPLOT altogether.
Development is being done on OpenGL/Java3D. We can't commit to
a timescale yet, but things are looking promising.
Let's just say we think it will be shorter than "in the long run".

Alan

From john.walshaw at bbsrc.ac.uk Thu Jun 27 09:04:11 2002
From: john.walshaw at bbsrc.ac.uk (john walshaw (JIC))
Date: Thu, 27 Jun 2002 10:04:11 +0100
Subject: dbiblast problem
Message-ID:

I have experienced a similar problem, trying to index WU-BLAST-formatted
databases with dbiblast. However I got it to work properly with
NCBI-BLAST-formatted databases.

John Walshaw, John Innes Centre, Norwich Research Park,
Colney, Norwich NR4 7UH, UK. +44(0)1603 450827

> -----Original Message-----
> From: David.Bauer at SCHERING.DE [mailto:David.Bauer at SCHERING.DE]
> Sent: 26 June 2002 12:32
> To: emboss at embnet.org
> Subject: dbiblast problem
>
>
> Hi,
>
> I observed the following problem when retrieving entries from
> blast databases
> formatted with dbiblast:
>
> I get the CORRECT sequence if I specify the ID:
> --------------------------------------------------------------
> ---------------------------------------------------------------
> seqret -auto -stdout cgdb_nt:celsl2a
> >CELSL2A M27263 C.elegans trans-spliced leader 2 (SL2
> RNA-alpha) gene, 5' flank.
> seqret -auto -stdout cgdb_nt:celsl2b
> >CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2
> RNA-beta) gene, 5' flank.
> ##############################################################
> ###########
>
> But I get the WRONG sequences if I specify the ACC:
> --------------------------------------------------------------
> ----------------------------------------------------------------
> seqret -auto -stdout cgdb_nt:M27263
> >CELSL2B M27264 C.elegans trans-spliced leader 2 (SL2
> RNA-beta) gene, 5' flank.
> seqret -auto -stdout cgdb_nt:M27264
> >CELSNTI L15302 C.elegans synaptotagmin I mRNA, complete cds
> and flanking
> regions.
> ##############################################################
> #############
>
> With fastacmd the headers look like this:
> --------------------------------------------------------------
> -------------------------------------------------------------------
> >gb|M27263|CELSL2A C.elegans trans-spliced leader 2 (SL2
> RNA-alpha) gene, 5'
> flank
> >gb|M27264|CELSL2B C.elegans trans-spliced leader 2 (SL2
> RNA-beta) gene, 5'
> flank
> ##############################################################
> ############
>
> Any ideas ?
>
> Thanks, David.
>
>

From David.Bauer at SCHERING.DE Fri Jun 28 05:32:42 2002
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Fri, 28 Jun 2002 07:32:42 +0200
Subject: dbiblast problem
Message-ID:

Dear John,

thanks for the info. Unfortunately the databases I am trying to index are
formatted with formatdb 2.2.1 (NCBI). The blast databases were updated
yesterday, so I rebuilt the index. But the error is highly reproducible.
I get the same wrong sequences back when using the ACC.

David.

I have experienced a similar problem, trying to index WU-BLAST-formatted
databases with dbiblast. However I got it to work properly with
NCBI-BLAST-formatted databases.

From areagp61 at yahoo.it Fri Jun 28 10:00:19 2002
From: areagp61 at yahoo.it (Graziano P.)
Date: Fri, 28 Jun 2002 12:00:19 +0200
Subject: fuzznuc output
Message-ID: <000201c21e8c$132b1340$18105709@italy.ibm.com>

Hi,

I have installed EMBOSS version 2.3.1.
I ran an analysis with the fuzznuc program in this way:

$ fuzznuc
Nucleic acid pattern search
Input sequence(s): mysequence
Search pattern: acgtggac
Number of mismatches [0]: 2
Output report [af049916.fuzznuc]:

The output file is:

$ more af049916.fuzznuc
AF049916 148 ATGTGGAT
AF049916 416 ACGTGGGC
AF049916 845 TCTTGGAC
AF049916 1722 ACGTGGGC
AF049916 2007 ACGTGTGC
AF049916 2257 ACATGTAC
AF049916 3183 ACGTAAAC
AF049916 3377 TCGTGGAA
AF049916 3914 ACGTGCAC
AF049916 4058 ACATGGAC
AF049916 4317 ACGTAAAC
AF049916 4534 TCGTGGAA
AF049916 4906 ACATGGAA
AF049916 5877 ACATGGAC
AF049916 5954 TCCTGGAC
AF049916 6024 CCCTGGAC
AF049916 6094 ACGTGGCA
AF049916 6127 GCCTGGAC
AF049916 6148 ACATGGAC
AF049916 6160 TCGTGGAC
AF049916 6208 ACGTCGAC

As you can see, the columns of the table have no names; moreover, this output
is different from what is shown in the EMBOSS help, for example:

########################################
# Program: fuzznuc
# Rundate: Thu Apr 11 13:34:06 2002
# Report_file: stdout
########################################

#=======================================
#
# Sequence: HHTETRA from: 1 to: 1272
# HitCount: 2
#
# Pattern: aagctt
# Mismatch: 0
# Complement: No
#
#=======================================

  Start     End Mismatch Sequence
      1       6        . aagctt
   1267    1272        . aagctt

#---------------------------------------
#---------------------------------------

I have tried to use the option -rformat seqtable, as suggested in the help,
but the program says:

" EMBOSS An error in ajacd.c at line 11225:
unknown qualifier -rformat"

Do you have any idea about this problem? Is it a matter of the EMBOSS version?

Thanks

Graziano

--------------------------------------------------------------------------------------
Graziano Pappad?
areagp61 at yahoo.it
--------------------------------------------------------------------------------------

From peter.rice at uk.lionbioscience.com Fri Jun 28 12:45:10 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Fri, 28 Jun 2002 13:45:10 +0100
Subject: fuzznuc output
References: <000201c21e8c$132b1340$18105709@italy.ibm.com>
Message-ID: <3D1C5A56.B0D4D600@uk.lionbioscience.com>

Hi Graziano,

> "Graziano P." wrote:
>
> I have installed the EMBOSS version 2.3.1.
> I have made an analisys with the fuzznuc program in this way:
> $ fuzznuc
>
> As you can see there is no name for each coloumn of the table; moreover
> this output is different from that you can see in the EMBOSS HELP, i.e.
> for example:
>
> I have tried to use the option -rformat seqtable, like suggested in the help, but the the program says:
>
> " EMBOSS An error in ajacd.c at line 11225:
> unknown qualifier -rformat"
>
> Have you got any idea about this problem? Is the matter in the EMBOSS
> version?

This is, as you say, an EMBOSS version issue.

We are converting EMBOSS programs to use formatted reports.
Fuzznuc was converted for EMBOSS 2.4.0.

The web pages show the current (development) documentation.

The EMBOSS distribution includes the documentation for each release in the
doc/programs/html directory.

In this case, I would recommend upgrading to EMBOSS 2.4.x because the fuzznuc
report output is much nicer.

regards,

Peter

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723
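
[Editor's sketch, not part of the original thread.] For anyone who has to stay on 2.3.1 for a while, the unlabelled three-column output quoted in this thread can be given column names with a few lines of Python. This is only a sketch under assumptions: the columns are taken to be sequence name, 1-based start position and matched sequence, which is inferred from the sample output and not confirmed by the EMBOSS documentation, and the function name label_old_fuzznuc is made up. The end position is derived as start plus match length minus one, in line with the Start/End columns of the newer report.

```python
def label_old_fuzznuc(lines):
    """Parse 'NAME START MATCH' rows into (name, start, end, match) tuples.

    Assumption: the second column is a 1-based start position. Rows that
    do not have exactly three fields with a numeric second field are skipped.
    """
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        name, start, match = parts[0], int(parts[1]), parts[2]
        rows.append((name, start, start + len(match) - 1, match))
    return rows


# Three rows copied from the output posted in this thread.
OLD_OUTPUT = """\
AF049916 148 ATGTGGAT
AF049916 416 ACGTGGGC
AF049916 845 TCTTGGAC
"""

print("%-10s %7s %7s  %s" % ("Sequence", "Start", "End", "Match"))
for name, start, end, match in label_old_fuzznuc(OLD_OUTPUT.splitlines()):
    print("%-10s %7d %7d  %s" % (name, start, end, match))
```

Upgrading, as recommended above, remains the simpler route; the script only relabels what 2.3.1 already prints and cannot recover the mismatch count.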