From gbottu at black.vub.ac.be Thu May 1 14:43:58 2003 From: gbottu at black.vub.ac.be (Guy Bottu) Date: Thu, 1 May 2003 20:43:58 +0200 Subject: Preferred isoschizomer ? In-Reply-To: <200304301829.h3UITwG29534@sulphur.hgmp.mrc.ac.uk>; from ableasby@hgmp.mrc.ac.uk on Wed, Apr 30, 2003 at 07:29:58PM +0100 References: <200304301829.h3UITwG29534@sulphur.hgmp.mrc.ac.uk> Message-ID: <20030501204358.A1336237@black.vub.ac.be> from : BEN On Wed, Apr 30, 2003 at 07:29:58PM +0100, ableasby at hgmp.mrc.ac.uk wrote: > There are replacement files for rebaseextract.c and rebaseextract.acd > in the ftp://ftp.uk.embnet.org/pub/EMBOSS/patchfiles/ > directory. By default this program will now produce an > embossre.equ file. Re-extract the withrefm file using the new > program. If you then use the -preferred option to 'restrict' > it should behave as you wish. Fine ! There is however a problem : the programs restrict and restover now behave as they should, but, the programs remap and showseq seem to ignore the parameter -preferred, or do I make a mistake ? Regards, Guy Bottu From eija.korpelainen at csc.fi Fri May 2 01:41:04 2003 From: eija.korpelainen at csc.fi (Eija Korpelainen) Date: Fri, 2 May 2003 08:41:04 +0300 Subject: Preferred isoschizomer ? References: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk> <20030430181945.GD3138@iib.unsam.edu.ar> Message-ID: <002b01c3106d$6ea5b8a0$0402a6c1@windows.csc.fi> Dear Fernan, Guy and others, we have been looking into this problem with Alan and as he told you, the embossre.equ -file is now made automatically. -preferred works (gives you PstI instead of BspMAI) because the default value of -limit is true (this is defined in the restrict.acd file). So if one is using a graphical interface one has to tick both -preferred and -limit to get the right thing. This is because in the code of restrict.c -preferred (called "equiv" in the code) is considered only when -limit has been chosen. 
What the program actually does is that it first limits to one isoschizomer
and picks the alphabetically first one (!), and then converts this to the
prototype enzyme using the embossre.equ file. The limiting step is performed
by the function embPatRestrictRestrict in embpat.c (in the nucleus directory).

The problem with the current set up is that the user doesn't know that
-limit and -preferred are interconnected. This could of course be documented,
but the easy fix would be to set the equiv boolean true in the code and
abolish the -preferred qualifier altogether. This way -limit would give you
automatically PstI, and -nolimit all isoschizomers.

As Guy pointed out, the problem with remap is that it does not take any
notice of the -preferred. This is simply because the code reads the value of
preferred (or equiv) but doesn't use it for anything. In other words, most of
remap.c code comes from Alan's restrict.c code, but the following critical
bit was accidentally left out.

    if(equiv && limit)
    {
        value = ajTableGet(table,m->cod);
        if (value)
            ajStrAss(&m->cod,value);
    }

I think it would be important to fix these problems because these are quite
central programs for molecular biologists and expensive projects like
transgenic design depend heavily on proper restriction maps.

Cheers,
Eija

_____________________________________________
Eija Korpelainen, Ph.D
Science Support/Biosciences
CSC - Center for Scientific Computing
P.O.Box 405, FIN-02101 Espoo, Finland
Phone +358 9 457 2030
Mobile +358 50 381 9726
Fax +358 9 457 2302
E-Mail Eija.Korpelainen at csc.fi
________________________________________________

From ableasby at hgmp.mrc.ac.uk Fri May 2 02:50:57 2003
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Fri, 2 May 2003 07:50:57 +0100 (BST)
Subject: Preferred isoschizomer ?
Message-ID: <200305020650.h426ovQ24261@bromine.hgmp.mrc.ac.uk>

Eija's analysis is quite correct.
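The behaviour described in that analysis can be sketched as follows. This is only a minimal illustration in Python, not the C code of restrict.c: the function name and the data layout are hypothetical, and the only things taken from the thread are the order of the two steps (limit to the alphabetically first isoschizomer, then map it through the embossre.equ table) and the fact that the mapping is applied only when both -limit and -preferred are set.

```python
def select_enzymes(isoschizomer_groups, equ_table, limit=True, equiv=False):
    """Return the enzyme names to report for each group of isoschizomers.

    isoschizomer_groups: list of lists of enzyme names that cut the same
        site (a hypothetical layout, not the EMBOSS data structure).
    equ_table: dict mapping an enzyme name to its prototype, standing in
        for the contents of embossre.equ.
    """
    selected = []
    for group in isoschizomer_groups:
        if not limit:
            # -nolimit: report every isoschizomer; equiv is never consulted
            selected.extend(group)
            continue
        # -limit: keep only the alphabetically first name of the group
        name = min(group)
        # -preferred (equiv) only takes effect together with -limit
        if equiv:
            name = equ_table.get(name, name)
        selected.append(name)
    return selected


groups = [["PstI", "BspMAI"]]
equ = {"BspMAI": "PstI"}
print(select_enzymes(groups, equ, limit=True, equiv=True))    # ['PstI']
print(select_enzymes(groups, equ, limit=True, equiv=False))   # ['BspMAI']
print(select_enzymes(groups, equ, limit=False, equiv=True))   # ['PstI', 'BspMAI']
```

The third call shows why a GUI user who ticks only -preferred sees no change: without -limit the equivalence table is never looked at.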
In fact the modifications to remap/showseq (or their equivalent) were made yesterday and passed on to the original author so they can be tested for any knock-on effects. It is true that, when the program was written, there were no GUIs for EMBOSS so the '-limit' confusion didn't arise. Eija's suggestion is a good one and will be tested Alan From Marc.Logghe at devgen.com Wed May 7 06:18:13 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Wed, 7 May 2003 12:18:13 +0200 Subject: dbiflat question Message-ID: Hi all, I feel a little dumb but I'll ask it anyhow. I seem not to succeed in creating indices for a database using dbiflat. As a test I just wanted to index the genbank file /data/genbank/gbest226.seq Ok, I wanted my indices to be in /data/emboss/est so I have run dbiflat in that folder. dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq -dbname est I added this entry to emboss.default DB est [ type: N format: genbank method: emblcd directory: /data/emboss/est ] But, you guessed it, this did not work. What am I doing wrong here ?
What happens with the passed dbname (could not find any file with that name
after running dbiflat) ?

TIA,
marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 83
fax: +32 (0) 9 324 24 25
***********************************************************

From pmr at ebi.ac.uk Wed May 7 06:34:58 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 07 May 2003 11:34:58 +0100
Subject: dbiflat question
References:
Message-ID: <3EB8E152.7090706@ebi.ac.uk>

Marc Logghe wrote:
> I added this entry to emboss.default
> DB est [
> type: N
> format: genbank
> method: emblcd
> directory: /data/emboss/est
> ]

You need:

directory: /data/genbank
indexdirectory: /data/emboss/est

EMBOSS needs to find the index files and the data files. Just specifying
"directory" works if both files are there (it becomes the default for
indexdirectory), so your confusion is quite understandable.

Hope this helps,

Peter Rice

From pemberaj at pugh.bip.bham.ac.uk Wed May 7 07:17:31 2003
From: pemberaj at pugh.bip.bham.ac.uk (Tony Pemberton)
Date: Wed, 7 May 2003 12:17:31 +0100
Subject: dbiflat question
In-Reply-To:
References:
Message-ID:

On Wed, 7 May 2003, Marc Logghe wrote:

> Hi all,
> I feel a little dumb but I'll ask it anyhow. I seem not to succeed in
> creating indices for a database using dbiflat.
> As a test I just wanted to index the genbank file /data/genbank/gbest226.seq
> Ok, I wanted my indices to be in /data/emboss/est
> so I have run dbiflat in that folder.
> dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq
> -dbname est
> I added this entry to emboss.default
> DB est [
> type: N
> format: genbank
> method: emblcd
> directory: /data/emboss/est
> ]
>
> But, you guessed it, this did not work.
> What am I doing wrong here ? What happens with the passed dbname (could not
> find any file with that name after running dbiflat) ?
> TIA,
> marc
>
> ***********************************************************
> Marc Logghe, Ph.D.
> Senior Scientist
> Scientific Computing Group
> deVGen
> Technologiepark 9
> 9052 Zwijnaarde
> Belgium
> tel: +32 (0) 9 324 24 83
> fax: +32 (0) 9 324 24 25
> ***********************************************************
>
>

Marc,

You need the .seq file also to be in the directory where you run dbiflat.
Or make symbolic links!

You will note that the dialogue of dbiflat asks about the files to process
(*.seq). At this stage, I think I am correct in saying that the database
directory file emboss.default is not operable. This merely directs the user
programs, e.g. seqret, to the formatted database (indices) as shown by showdb.

Regards,

Tony

*********************************************************************
Mr. A.J.Pemberton Tel: +121-414-3388
c/o Dept. Rheumatology, Fax: +121-414-6794
Medical School, E-mail: A.J.Pemberton at bham.ac.uk
The University of Birmingham,
Birmingham B15 2TT.
U.K.
*********************************************************************

From Marc.Logghe at devgen.com Wed May 7 08:05:25 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Wed, 7 May 2003 14:05:25 +0200
Subject: dbiflat question
Message-ID:

Thanks for the reply !
This is what I have figured out: running

dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq

in the index directory (e.g. /data/emboss/est) has the same effect as running

dbiflat -idformat genbank -directory /data/genbank -indexdirectory /data/emboss/est -filenames gbest226.seq

meaning the index files are created in the desired place.
But still the sequences themselves are not accessible using the mentioned
entry in emboss.default ('seqret est -firstonly' gives a segmentation fault).
I suppose the 'directory' key should point to the indexdirectory, right ?
Because the index itself should be pointing to the correct sequence path.
At least that is what I expect.
And indeed, as suggested by Tony, everything worked fine when putting index
and sequence files in the same directory (indexdirectory and directory are
the same).

OK, just tried something which appears to work now. Switch to the first
scenario again: separate paths for index and sequence files. When I changed
the emboss.default to the following, everything worked fine:

DB est [
type: N
format: genbank
method: emblcd
indexdirectory: /data/emboss/est
directory: /data/genbank
]

Apparently you have to set the indexdirectory and directory explicitly in
the configuration file also; pointing to the indexdirectory alone is not
sufficient !

Regards,
Marc

From Stephan.Hurling at evotecoai.com Thu May 8 08:41:50 2003
From: Stephan.Hurling at evotecoai.com (Stephan.Hurling at evotecoai.com)
Date: Thu, 8 May 2003 14:41:50 +0200
Subject: Problems with dbigcg...
Message-ID:

Hello Everyone,

I would like to use EMBOSS version 2.6.0 together with the GCG Wisconsin
package version 10.3 on a Red Hat 7.2 linux server. I followed the
installation instructions from the administrators guide and, doing the usual

./configure
make
make install

I compiled and installed emboss on my system without any error messages.
But when I want to make indexes from a gcg database I run into trouble.
See the following output of an interactive session: 14:18 [root at kepler] ~/Temp # dbigcg Index a GCG formatted database EMBL : EMBL SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew GENBANK : Genbank, DDBJ PIR : NBRF Entry format [EMBL]: Database directory [.]: /usr/local/share/EMBOSS/data/GCG_DATABASES/gcgembl Wildcard database filename [*.seq]: Database name: embl Release number [0.0]: 73.0 Index date [00/00/00]: 01/12/02 EMBOSS An error in embdbi.c at line 590: Cannot open embl.idsrt for reading 14:21 [root at kepler] ~/Temp # ll total 252k drwxr-xr-x 2 root root 4.0k May 8 14:18 ./ drwxr-x--- 24 root root 4.0k May 8 14:15 ../ -rw------- 1 root root 1.4M May 8 14:18 core -rw-r--r-- 1 root root 1.5k May 8 14:18 division.lkp -rw-r--r-- 1 root root 675 May 8 14:18 embl001.acnum -rw-r--r-- 1 root root 104 May 8 14:18 embl002.acnum -rw-r--r-- 1 root root 504 May 8 14:18 embl003.acnum -rw-r--r-- 1 root root 51 May 8 14:18 embl004.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl005.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl006.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl007.acnum -rw-r--r-- 1 root root 126 May 8 14:18 embl008.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl009.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl010.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl011.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl012.acnum -rw-r--r-- 1 root root 36 May 8 14:18 embl013.acnum -rw-r--r-- 1 root root 25k May 8 14:18 embl014.acnum -rw-r--r-- 1 root root 161 May 8 14:18 embl015.acnum -rw-r--r-- 1 root root 850 May 8 14:18 embl016.acnum -rw-r--r-- 1 root root 1.3k May 8 14:18 embl017.acnum -rw-r--r-- 1 root root 2.6k May 8 14:18 embl018.acnum -rw-r--r-- 1 root root 121 May 8 14:18 embl019.acnum -rw-r--r-- 1 root root 290 May 8 14:18 embl020.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl021.acnum -rw-r--r-- 1 root root 104 May 8 14:18 embl022.acnum -rw-r--r-- 1 root root 188 May 8 14:18 embl023.acnum -rw-r--r-- 1 root root 0 May 8 14:18 embl024.acnum -rw-r--r-- 1 root 
root 34 May 8 14:18 embl025.acnum
-rw-r--r-- 1 root root 14 May 8 14:18 embl026.acnum
-rw-r--r-- 1 root root 490 May 8 14:18 embl027.acnum
-rw-r--r-- 1 root root 300 May 8 14:18 entrynam.idx
-rw------- 1 root root 0 May 8 14:18 sort9YCQfK

Can somebody help me? Have I done something wrong during the compilation
step of emboss? Any hint would help me.

Thanks in advance...

All the best,
Stephan

From ablavier at wanadoo.fr Sun May 11 14:36:24 2003
From: ablavier at wanadoo.fr (André Blavier)
Date: Sun, 11 May 2003 20:36:24 +0200
Subject: EMBOSS for Windows: DLL build
Message-ID: <001e01c317ec$3c9feb60$5ca03551@bach>

EMBOSS for Windows is now built with ajax and nucleus compiled as DLLs, so
the EMBOSS programs are now much smaller, and the distribution as well.
dbiblast is now in the package.

See http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html.

--
André Blavier

From arunanirudhan at yahoo.co.in Mon May 12 04:27:44 2003
From: arunanirudhan at yahoo.co.in (arun anirudhan)
Date: Mon, 12 May 2003 09:27:44 +0100 (BST)
Subject: seqret
Message-ID: <20030512082744.65129.qmail@web8203.mail.in.yahoo.com>

Hello all

How can I use seqret to retrieve sequences from a database like we do in
Entrez? For example, I want to get the sequences of all insulins from
genbank. What should I give as the command?

seqret embl:insulin ?

Arun

From maoj at mail.nih.gov Mon May 12 10:16:05 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Mon, 12 May 2003 10:16:05 -0400
Subject: about ftp site of EMBOSS Administrators Guide
Message-ID: <00e801c31891$0c341410$618a70a5@citjmao>

Hi, where can I find the pdf version of emboss administrators guide?
the link on the website doesn't work. thanks.

Jean

From gwilliam at hgmp.mrc.ac.uk Mon May 12 10:40:35 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 12 May 2003 15:40:35 +0100
Subject: about ftp site of EMBOSS Administrators Guide
References: <00e801c31891$0c341410$618a70a5@citjmao>
Message-ID: <3EBFB263.95B0DDAB@hgmp.mrc.ac.uk>

There is no PDF version of the current guide.

The link on the web was left there by accident and has now been tidied
away - sorry.

Gary

> Jean Mao wrote:
>
> Hi, where can I find the pdf version of emboss administrators guide?
> the link on the website doesn't work. thanks.
>
> Jean

--
Gary Williams
Tel: +44 1223 494522
Fax: +44 1223 494512
mailto:G.Williams at rfcgr.mrc.ac.uk
http://www.rfcgr.mrc.ac.uk/
Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK

From gwilliam at hgmp.mrc.ac.uk Mon May 12 11:40:10 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 12 May 2003 16:40:10 +0100
Subject: about ftp site of EMBOSS Administrators Guide
References: <00e801c31891$0c341410$618a70a5@citjmao> <3EBFB263.95B0DDAB@hgmp.mrc.ac.uk>
Message-ID: <3EBFC05A.16A96393@hgmp.mrc.ac.uk>

The .ps and .pdf versions of the current guide are now on the web page:

http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html

See:
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/admin.ps
and
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/admin.pdf

Gary

> > Jean Mao wrote:
> >
> > Hi, where can I find the pdf version of emboss administrators guide?
> > the link on the website doesn't work. thanks.

--
Gary Williams
Tel: +44 1223 494522
Fax: +44 1223 494512
mailto:G.Williams at rfcgr.mrc.ac.uk
http://www.rfcgr.mrc.ac.uk/
Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK

From yann-francois.bizouerne at bayercropscience.com Thu May 15 11:29:46 2003
From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com)
Date: Thu, 15 May 2003 17:29:46 +0200
Subject: Search for organism of entry
Message-ID:

Hello,

I installed EMBOSS on our server recently. I really enjoy the different
tools, but I have a little problem and I can't find the solution anywhere.

I have indexed the SwissProt database with the command line:

dbiflat -idformat SWISS -directory . -filenames sprot.dat -dbname sprot -fields acnum,seqvn,des,keyword,taxon

After that I can find sequence information when searching by a particular
organism or keyword.

For the moment, what I can retrieve with the accession number is the
following:

>infoseq sprot:P15711
Displays some simple information about sequences
# USA Name Accession Type Length Description
ian-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa microneme-rhoptry antigen.

And when I search by organism I obtain:

>infoseq sprot-org:"*Theileria*" -outfile stdout
Displays some simple information about sequences
# USA Name Accession Type Length Description
sprot-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa microneme-rhoptry antigen.

So now I want to know whether, for one particular entry (sprot:P15711), I
can find the organism (Theileria parva) or not ?
Thanks in advance for your answer.

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From pmr at ebi.ac.uk Thu May 15 12:26:31 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 15 May 2003 17:26:31 +0100
Subject: Search for organism of entry
References:
Message-ID: <3EC3BFB7.5000605@ebi.ac.uk>

yann-francois.bizouerne at bayercropscience.com wrote:
> And when I am looking with the organism I obtain :
> >infoseq sprot-org:"*Theileria*" -outfile stdout
> Displays some simple information about sequences
> # USA Name Accession Type Length Description
> sprot-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa
> microneme-rhoptry antigen.
>
>
> So now I want to know if I could for one particular entry (sprot:P15711) find
> the Organism (Theileria parva) or not ?

EMBOSS can search a database by organism, but it then reads only the
sequence (in most programs) or the whole entry (entret) ... I am looking
into ways to parse out more detail, including organism, citation, and
features.

The database definition would have a list of fields that can be retrieved,
and a program like (for example) entret could check the fields and let you
choose the ones you need.

For now, you can run entret and look for the organism in the text.

Hope this helps,

Peter Rice

From henrikki.almusa at helsinki.fi Tue May 20 02:42:36 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Tue, 20 May 2003 09:42:36 +0300
Subject: Support for nexus in alignment format
Message-ID: <200305200942.36206.henrikki.almusa@helsinki.fi>

Hello

I read through the alignment formats that emboss supports. I was wondering
if nexus is supported as an alignment format (-aformat nexus)? It seems to
be supported as a sequence format but it wasn't mentioned as an alignment
format.
-- Henrikki Almusa From pmr at ebi.ac.uk Tue May 20 05:29:57 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 20 May 2003 10:29:57 +0100 Subject: Support for nexus in alignment format References: <200305200942.36206.henrikki.almusa@helsinki.fi> Message-ID: <3EC9F595.5000008@ebi.ac.uk> Henrikki Almusa wrote: > I read through the alignment formats that emboss supports. I was wondering if > nexus is supported as alignment format (-aformat nexus)? I seems to be > supported as sequence format but it wasnt mentioned as alignment format. The sequence formats are easy to add as alignment formats. Not sure quite how useful that is. You can do this: 1. Create your alignment in a sequence format (FASTA, MSF) 2. Use seqret to convert to nexus format ... or does NEXUS format hold some extra information that would make it a useful alignment format, and that we lose by going through FASTA? Hope this helps, Peter From yann-francois.bizouerne at bayercropscience.com Wed May 21 04:53:19 2003 From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com) Date: Wed, 21 May 2003 10:53:19 +0200 Subject: Use two Emboss package with one database Message-ID: Hello, I am working with 2 diffretns servers on different locations. On each of them a EMBOSS package tools is installed. I need to know if I could configure these 2 EMBOSS in order to work with the same database (which is located on one of the two servers). Is EMBOSS could working this way or does I need to have only one EMBOSS package (tools + databse) installed on one server ? I hope that my question is clear enough. 
Best Regards

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From pmr at ebi.ac.uk Wed May 21 04:59:18 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 21 May 2003 09:59:18 +0100
Subject: Use two Emboss package with one database
References:
Message-ID: <3ECB3FE6.1020906@ebi.ac.uk>

yann-francois.bizouerne at bayercropscience.com wrote:
> Hello,
>
> I am working with 2 diffretns servers on different locations. On each of them a
> EMBOSS package tools is installed.
> I need to know if I could configure these 2 EMBOSS in order to work with the
> same database (which is located on one of the two servers).
> Is EMBOSS could working this way or does I need to have only one EMBOSS package
> (tools + databse) installed on one server ?

Yes ... but you need to do some work.

The EMBOSS package on the same server as the databases is easy.

The second EMBOSS package needs to read from remote databases. I assume
you indexed them with dbiflat (and the other dbi programs).

You can access a remote database by:

SRSWWW if it is on an SRS server
URL if you have a web page to query the database
APP (EXTERNAL) if you have a script that can return an entry

Assuming you don't have them under SRS ...

You can provide a simple web CGI script that runs entret (for whole
entry) or seqret (for sequence only - you can put -osformat on the
command line to get the format of your choice)

You can write a script that will access the databases somehow (possibly
also by talking to a web page - your choice).

Meanwhile, I am working on ways to define EMBOSS web services and data
services that would give an alternative access method, but that is for
later in the year.
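As a rough illustration of the kind of wrapper script mentioned above, here is a minimal Python sketch. The function names and the validation rule are hypothetical, and the qualifiers (-outfile, -outseq, -osformat, -auto) are the ones I believe entret and seqret accept, so please check them against the programs' -help output before relying on this. It is meant to run on the server that holds the indexed databases and to return the entry text to whatever front end (CGI or otherwise) calls it.

```python
import re
import subprocess

# Accept only simple "database:entry" queries (letters, digits, "_", "."),
# so that nothing in a request can smuggle in extra command-line options.
USA_PATTERN = re.compile(r"[A-Za-z0-9_]+:[A-Za-z0-9_.]+")


def build_command(usa, sequence_only=False, osformat="fasta"):
    """Build the argument list for entret (whole entry) or seqret (sequence)."""
    if not USA_PATTERN.fullmatch(usa):
        raise ValueError("unsupported query: %r" % usa)
    if sequence_only:
        # seqret returns the sequence only, in the requested output format
        return ["seqret", usa, "-osformat", osformat,
                "-outseq", "stdout", "-auto"]
    # entret returns the complete database entry as text
    return ["entret", usa, "-outfile", "stdout", "-auto"]


def fetch(usa, **options):
    """Run the command locally and return what it wrote to standard output."""
    result = subprocess.run(build_command(usa, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

A CGI or other web front end would call fetch() with the entry name taken from the request and send the returned text back to the second machine. The strict pattern check is a deliberate choice, since the input arrives over the network.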
Hope this helps, Peter Rice From Marc.Logghe at devgen.com Wed May 21 05:03:00 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Wed, 21 May 2003 11:03:00 +0200 Subject: Use two Emboss package with one database Message-ID: Hi, At our site emboss is installed on every node of a cluster while the databases are installed on only one. The only thing you have to do is mount the database directory/directories in one way or another on every node and adapt the emboss.default files appropriately (if necessary) so that the DB entries are pointing to the correct directories. HTH, Marc From d.m.a.martin at dundee.ac.uk Wed May 21 05:04:24 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Wed, 21 May 2003 10:04:24 +0100 Subject: Use two Emboss package with one database In-Reply-To: Message-ID: On 21/5/03 9:53 am, "yann-francois.bizouerne at bayercropscience.com" wrote: > Hello, > > I am working with 2 diffretns servers on different locations. 
On each of them
> a
> EMBOSS package tools is installed.
> I need to know if I could configure these 2 EMBOSS in order to work with the
> same database (which is located on one of the two servers).
> Is EMBOSS could working this way or does I need to have only one EMBOSS
> package
> (tools + databse) installed on one server ?

There are two options here:

1. Different platforms accessing the same database
2. Same platform (different machines) accessing the same database.

1. Easy. When you do a configure, set the prefix (prefixes) appropriately so
that the executables go to an appropriate place and the databases point to a
shared (NFS or similar) drive containing the config files. Obviously you have
to use the same mountpoint on all your machines for this to work (I use
/site/share/EMBOSS for the config and /site/databases as a root for the
databases. In this case /site/bin is local, not shared, and contains the
appropriate binaries [or can be a symlink to /site/Linux/bin, /site/IRIX/bin,
/site/Solaris/bin, /site/Darwin/bin as appropriate if you are supporting more
than one machine on a particular platform] )

2. Can be done in the same way using a shared drive for the executables.

One gotcha is that EMBOSS, despite all efforts, does not compile statically
so you have to ensure that the library versions are the same across the
various platforms or you will get runtime errors.

Either of these methods will reduce the maintenance load considerably. In my
case I use NFS for the data directories and use a nightly scheduled rsync to
synchronise the executables and config files with the master machine as
these don't take much space and it reduces the network overhead.

Hope this helps.

..d

>
> I hope that my question is clear enough.
> > Best Regards > > > > Yann-Fran?ois BIZOUERNE > BioInformatic Team > BAYER CropScience > 1, rue Pierre Fontaine > 91058 Evry Cedex > FRANCE > Phone: 33-(0) 1-69-47-61-56 > FAX: 33-(0) 1-69-47-61-42 > E-mail: yann-francois.bizouerne at bayercropscience.com > Intranet: http://bioinfo.evry.fr.bayercropscience/ > > > -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From d.m.a.martin at dundee.ac.uk Wed May 21 05:13:25 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Wed, 21 May 2003 10:13:25 +0100 Subject: Use two Emboss package with one database In-Reply-To: <3ECB3FE6.1020906@ebi.ac.uk> Message-ID: On 21/5/03 9:59 am, "Peter Rice" wrote: > yann-francois.bizouerne at bayercropscience.com wrote: >> Hello, >> >> I am working with 2 diffretns servers on different locations. On each of them >> a >> EMBOSS package tools is installed. >> I need to know if I could configure these 2 EMBOSS in order to work with the >> same database (which is located on one of the two servers). >> Is EMBOSS could working this way or does I need to have only one EMBOSS >> package >> (tools + databse) installed on one server ? > > Yes ... but you need to do some work. > > The EMBOSS package on the same server as the databases is easy. > > The second EMBOSS package needs to read from remote databases. I assume > you indexed them with dbiflat (an the other dbi programs). > > You can access a remote database by: > > SRSWWW if it on an SRS server > URL if you have a web page to query the database > APP (EXTERNAL) if yuo have a script that can return an entry > > Assuming you don't have them under SRS ... > > You can provide a simple web CGI script that runs entret (for whole > entry) or seqret (for sequence only - you can put -osformat on the > command line to get the format of your choice)) > > You can write a script that will access the databases somehow (possibly > also by talking to a web page - your choice). 
>
> Meanwhile, I am working on ways to define EMBOSS web services and data
> services that would give an alternative access method, but that is for
> later in the year.
>

What about just using Jemboss (or a variant thereof) to talk to the master
machine? The alternative is shunting lots of data around which isn't really
feasible unless you have a fast network between the machines.

Do you move the data to the problem or the problem to the data? The trade
off is between transport time and execution time.

Does Jemboss make use of a SOAP server? If not it would be really nice to
have a script that could generate a WSDL definition from the ACD files. It's
then one step away from being a Grid service..

..d

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

From maoj at mail.nih.gov Wed May 21 12:21:43 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Wed, 21 May 2003 12:21:43 -0400
Subject: question about databases setup
Message-ID: <038001c31fb5$1171acf0$618a70a5@citjmao>

Hi, I am new to emboss. have question in database setup.

I have a file in the directory /data/maoj/emboss/db/mouse/ called
'test.dat'. this file has 9 entries in embl format. i ran dbiflat,
acnum.hit, acnum.trg, division.lkp, entrynam.idx were generated.
then I setup a .embossrc file in my home dir as follows :

-------------------------------------------------
# Logfile - set this to a file that any user can append to
# and EMBOSS applications will automatically write log information
# SET emboss_logfile /home/db/emboss/tmp/log

DB test [
    type: N
    method: emblcd
    format: embl
    dir: /data/maoj/emboss/db/mouse
    file: "*.dat"
    release: "1"
    comment: "Test DB"
]
-------------------------------------------------

when i run seqret and try to retrieve 1 of the 9 entries, i got following
error:

% seqret
Reads and writes (returns) sequences
Input sequence(s): test:AB001363
Warning: Cannot open division file '' for database 'test'
Warning: seqCdQry failed
Error: Unable to read sequence 'test:AB001363'

Please help. Thank you in advance.

Jean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/emboss/attachments/20030521/34ed2c6a/attachment.html

From pmr at ebi.ac.uk Wed May 21 12:30:43 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 21 May 2003 17:30:43 +0100
Subject: question about databases setup
References: <038001c31fb5$1171acf0$618a70a5@citjmao>
Message-ID: <3ECBA9B3.3070802@ebi.ac.uk>

Jean Mao wrote:
> Hi, I am new to emboss. have question in database setup.
>
> I have a file in the directory /data/maoj/emboss/db/mouse/ called
> 'test.dat'. this file has 9 entries in embl format. i ran dbiflat,
> acnum.hit, acnum.trg, division.lkp, entrynam.idx were generated.

You need to specify where the index files are (indexdir) in the database
definition.

Hope this helps,

Peter Rice

From mathog at mendel.bio.caltech.edu Wed May 21 18:04:02 2003
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Wed, 21 May 2003 15:04:02 -0700
Subject: extractfeat with gff files?
Message-ID:

EMBOSS 2.6.0

I cannot seem to locate the magic incantation that will make gff files
work as desired with fasta files. HELP!

Here's the sort of line I want to extract (sorry about the wrap):

X gadfly translation 1880 3119 . - . genegrp=CG3038; transgrp=CG3038-RB

There are many other lines in the gff for transcription,exon, gene, etc.
which should not be extracted.

The fasta input file currently has entries with names like X,2L,etc. which
correspond to the first column of the gff file. Ideally I'd like to be able
to use one gff file (with X->3R in the first column) to extract from one
fasta file (again, with X->3R for the fasta name), and have the descriptions
of X act only on the sequence X, and so forth. The idea being to be able to
extract features on a genomic level using only one fasta/gff pair, rather
than N (=#of scaffolds) pairs.

First though I tried an input fasta file containing (11 X 10kb entries, the
first being X) and a gff file also starting only with X (but with references
for the whole chromosome, 22101 lines). The following command sat for about
two minutes, burned a lot of CPU time, but emitted nothing:

  extractfeat -sequence=dmel_genome_frag.nfa\
    -ufo=x.gff -type=translation -outseq=x.nfa

When the -type qualifier was removed it went nuts and emitted over 40000
entries (more than there were lines in the gff file!) before I killed it.
Clearly there was no error checking for size of gff entry versus size of
sequence. The input fasta file had 11 entries of 10000 bp each. The first
was X. Yet a bunch of lines like:

>X_12390_12854 [exon] X release:3 length:21780003bp Assembled X chromosome arm sequence md5:f3fbbb4c44f0d30d1effeecc87b5bd18
T

were emitted.

So the fasta file was reduced to just one entry (X, 10kb) and this time the
output fasta file held 22101 entries. As before, those beyond 10kb were
emitted with a single base. So apparently the entire gff description is
applied to each fasta sequence and there's no checking of the first column
against the sequence name.
That's ok - we can live with that for now, but it would be better if the
descriptions could automatically be matched to the sequence names. I'm not
sure though that we can live with it emitting single bp sequences when the
description is outside of the sequence. If the feature is beyond the end of
the input sequence it just isn't there, right?

Just to spite me "translation" was never emitted. There were only lines for
gene,exon,misc_feature,tRNA,snoRNA. So I tried:

  extractfeat -sequence=dmel_x.nfa \
    -ufo=x.gff -outseq=x.nfa -type=gene

And it emitted a single whole gene match at (1488,3280,-) correctly. The
next one at (3445,11463,+) partially (and correctly, ending at the end of
the sequence - a warning would have been nice) and then a slew of (>2000)
single base pair "empty" entries outside of the input sequence. Note also
that there's no indication on the fasta header line in the output of the
strand which was selected.

So, how does one get extractfeat to emit only matches to "translation"?
Please tell me there's some way other than by extracting those lines into a
separate gff file and renaming them all "gene"!

Extractfeat seems to have a predefined set of "features" that it's willing
to work with and doesn't handle others well. To narrow this down a bit more
I made a small gff file containing "fred" where "gene" had been and
specifying positions <10kb. The features were emitted but all were labeled
"misc_feature". Is this documented somewhere? It isn't in an obvious place
in the on line help, as both of these searches come up empty.

  extractfeat -h 2>&1 | grep -i misc
  tfm extractfeat 2>&1 | grep -i misc

It would also be nice if there was some way to get column 9 from the gff
file onto the fasta header line somewhere. (It can then be rearranged to
suit later.) Currently even if one has the gene names lined up with the gene
entries in the gff file the resulting fasta file just says
"X_100_123 [gene]..." without any of the comment info.
You've got the sequence but not the names of the genes. Very painful to work
with if the output is the coding sequences for an entire genome.

Is there a switch (or bug fix) that stops extractfeat from emitting garbage
single bp entries for descriptions outside the sequence?

Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From yann-francois.bizouerne at bayercropscience.com Thu May 22 10:26:36 2003
From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com)
Date: Thu, 22 May 2003 16:26:36 +0200
Subject: creation of new output fasta format
Message-ID:

Hello,

First, thanks a lot for your quick response to my last mail.

Now, I am trying to create a new fasta format. The format I want to obtain :

> dbname:id |accession|organism|description

To do this I created a new function in ajseqwrite.c (seqWriteNewFasta).
I selected the different pieces of information I want to retrieve by using
the other functions as examples.

It is working quite well, except for the Pir and Nrl_3D databases. For
these databases, I have no database name and no organism (taxon):

    /** Database name **/
    if (ajStrLen(outseq->Db))
        (void) ajFmtPrintF (outseq->File, ">%S:", outseq->Db);
    else if (ajStrLen(outseq->Setdb))
        (void) ajFmtPrintF (outseq->File, ">%S:", outseq->Setdb);
    else
        (void) ajFmtPrintF (outseq->File, ">unk:");

    /** Organism **/
    if (ajStrLen(outseq->Tax))
        (void) ajFmtPrintF (outseq->File, "%S|", outseq->Tax);

I tried to find some information about the NBRF format in EMBOSS and the
way to use it, but I could find nothing.

Do you have a clue for me ?
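
The fallback logic described above can be sketched outside EMBOSS. The
following Python is only an illustration and is not ajseqwrite.c code: the
function name fasta_header and its parameters are hypothetical, and it
assumes that a missing organism should leave an empty field rather than
drop the field, so that every header keeps the same number of '|'
separators.

```python
def fasta_header(seq_id, db="", setdb="", accession="", organism="", description=""):
    """Return a header of the form '>dbname:id |accession|organism|description'."""
    # Fall back from the entry's own database name to the set-level name,
    # then to the literal "unk", as in the C fragment above.
    dbname = db or setdb or "unk"
    # Assumption: keep an empty slot for a missing field so that the
    # number of '|' separated fields never changes.
    fields = "|".join([accession, organism, description])
    return ">%s:%s |%s" % (dbname, seq_id, fields)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(fasta_header("ABC_MOUSE", db="swissprot", accession="P00001",
                       organism="Mus musculus", description="example protein"))
    print(fasta_header("A12345", accession="A12345", description="no db, no taxon"))
```

With a fixed field count, a downstream parser can split on '|' without
having to guess which field is absent.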

Best regards

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From peptides at earthlink.net Sat May 24 18:55:53 2003
From: peptides at earthlink.net (David Stephens)
Date: Sat, 24 May 2003 15:55:53 -0700
Subject: Happy Memorial Day
Message-ID: <20030524225553.29CB27D181@mercury.hgmp.mrc.ac.uk>

An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/emboss/attachments/20030524/4408e78b/attachment.html

From henrikki.almusa at helsinki.fi Tue May 27 06:46:13 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Tue, 27 May 2003 13:46:13 +0300
Subject: Graph data handling
Message-ID: <200305271339.51333.henrikki.almusa@helsinki.fi>

Hello,

I'm trying to use graphs in scripts outside of emboss. However i got into
problems with options concerning the graph handling. I found following
options from "banana" tools webpage:

   "-graph" related qualifiers
   -gprompt       boolean    Graph prompting
   -gtitle        string     Graph title
   -gsubtitle     string     Graph subtitle
   -gxtitle       string     Graph x axis title
   -gytitle       string     Graph y axis title
   -goutfile      string     Output file for non interactive displays
   -gdirectory    string     Output directory

I tried to use -goutfile and -gdirectory with banana, but i seem to be
unable to affect the data file(s) or their directories. If i understand
correctly this should work
"banana mRNA.seq -graph data -goutfile /home/hena/banana_data_file -auto"
or then
"banana mRNA.seq -graph data -goutfile banana_data_file -gdirectory /home/hena".
For second i get error: "Died: unknown qualifier -gdirectory" and with first
i get "Created banana_data_file.dat", but no such file is created and no
data file is there. Also if i use "-data" option there, i get multiple
bananaX.dat (in which X is running number) files.

So, my questions are.
How exactly is this supposed to work? And could it be added to webpages
"User documentation" section with other formats. And thirdly is there a
reason, why some programs expect "-data" option and others do not?

TIA
--
Henrikki Almusa

From sebastien.frade at bayercropscience.com Wed May 28 08:38:49 2003
From: sebastien.frade at bayercropscience.com (sebastien.frade at bayercropscience.com)
Date: Wed, 28 May 2003 14:38:49 +0200
Subject: No subject
Message-ID:

Hi,

I'm a new user of EMBOSS and i like to extract some information of EMBL flat
file like clone, strain, tissue ... that are stored in the FT section.
But i don't know how to do that.

I've look for a tool that can extract features, but no one of them extract
these fields.

if a such tool doesn't exist how can i develop it ?

Please help me !!

Thank

Sébastien Frade
BioInformatic Team
BAYER CropScience
1 Rue Pierre FONTAINE
91058 EVRY - France
tel : 33 (0) 1 69 47 61 52
fax : 33 (0) 1 69 47 61 42
mail : sebastien.frade at bayercropscience.com
http://bioinfo.evry.fr.bayercropscience

From pmr at ebi.ac.uk Wed May 28 10:30:43 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 28 May 2003 15:30:43 +0100
Subject:
References:
Message-ID: <3ED4C813.4010900@ebi.ac.uk>

sebastien.frade at bayercropscience.com wrote:
> Hi,
>
> I'm a new user of EMBOSS and i like to extract some information of EMBL flat
> file like clone, strain, tissue ... that are stored in the FT section.
> But i don't know how to do that.
>
> I've look for a tool that can extract features, but no one of them extract
> these fields.

This sounds like a task for SRS :-)

http://srs.ebi.ac.uk/

EMBOSS really works with the sequence data. We can try to extract more of
the other data but it is a non-trivial task.

But ... you could write your own EMBOSS tool, and we can help you to do
that!!!
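
Short of a dedicated EMBOSS application, qualifiers of this kind can be
pulled straight from the flat file. What follows is a minimal sketch in
Python, not an EMBOSS tool: the function name ft_qualifiers, the qualifier
names chosen (clone, strain, tissue_type) and the sample lines are all
assumptions made for illustration, and it only handles qualifiers whose
value fits on a single FT line.

```python
import re
from collections import defaultdict

# Qualifier names to keep; an assumed selection, adjust as needed.
WANTED = ("clone", "strain", "tissue_type")

# Matches FT qualifier lines such as:  FT      /strain="C57BL/6"
QUALIFIER = re.compile(r'^FT\s+/(\w+)="?([^"]*)"?\s*$')


def ft_qualifiers(lines, wanted=WANTED):
    """Collect single-line /name="value" qualifiers from the FT lines of one entry."""
    found = defaultdict(list)
    for line in lines:
        match = QUALIFIER.match(line.rstrip("\n"))
        if match and match.group(1) in wanted:
            found[match.group(1)].append(match.group(2))
    return dict(found)


if __name__ == "__main__":
    # Invented sample lines in EMBL feature-table layout.
    entry = [
        "FT   source          1..120",
        'FT                   /organism="Mus musculus"',
        'FT                   /strain="C57BL/6"',
        'FT                   /clone="IMAGE:12345"',
        'FT                   /tissue_type="liver"',
    ]
    print(ft_qualifiers(entry))
```

Values that continue over several FT lines would need the continuation
lines joined before matching; that step is left out here.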

Hope this helps

Peter Rice

From maoj at mail.nih.gov Fri May 30 10:49:17 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Fri, 30 May 2003 10:49:17 -0400
Subject: question about dbiblast
Message-ID: <0d4401c326ba$a5847600$618a70a5@citjmao>

Hi, I am new in emboss db config. Need some help in indexing blast db.

I have in dir following files:

-rw-rw-r--   1 maoj   Seqdb     190733 May 30 08:19 drosoph.nt.nhr
-rw-rw-r--   1 maoj   Seqdb      14108 May 30 08:19 drosoph.nt.nin
-rw-rw-r--   1 maoj   Seqdb       9360 May 30 08:19 drosoph.nt.nnd
-rw-rw-r--   1 maoj   Seqdb         84 May 30 08:19 drosoph.nt.nni
-rw-rw-r--   1 maoj   Seqdb     174584 May 30 08:19 drosoph.nt.nsd
-rw-rw-r--   1 maoj   Seqdb       3699 May 30 08:19 drosoph.nt.nsi
-rw-rw-r--   1 maoj   Seqdb   31368306 May 30 08:19 drosoph.nt.nsq

i believe these are files from NCBI and generated use formatdb version 2.2.5.

I run dbiblast in this directory:

Index a BLAST database
Database name: drosoph
Database directory [.]:
Wildcard database filename [drosoph]: drosoph.nt.*
Release number [0.0]:
Index date [00/00/00]:
         N : nucleic
         P : protein
         ? : unknown
Sequence type [unknown]: N
         1 : wublast and setdb/pressdb
         2 : formatdb
         0 : unknown
Blast index version [unknown]: 2

then I got many lines of the following message:

Warning: Duplicate ID skipped: '0?0?
?^DROSOPHILA' All hits will point to first ID found

The following new files were generated:

-rw-rw-r--   1 maoj   Seqdb   322 May 30 10:44 division.lkp
-rw-rw-r--   1 maoj   Seqdb   496 May 30 10:44 entrynam.idx
-rw-rw-r--   1 maoj   Seqdb   300 May 30 10:44 acnum.trg
-rw-rw-r--   1 maoj   Seqdb   300 May 30 10:44 acnum.hit

I then edit my ~/.embossrc file by add the following lines:

DB drosoph [
    type: N
    method: blast
    format: ncbi
    dir: /data/maoj/emboss/db/blast/drosoph
    indexdir: /data/maoj/emboss/db/blast/drosoph
    file: "drosoph.nt.*"
    release: "0.0"
    comment: "blast drosoph"
]

Then I run showdb:

% showdb
Displays information on the currently available databases
# Name      Type ID  Qry All Comment
# ====      ==== ==  === === =======
drosoph     N    OK  OK  OK  blast drosoph
test        N    OK  OK  OK  Test DB

The test db is genbank format and was running fine.

Then I run seqret:

% seqret
Reads and writes (returns) sequences
Input sequence(s): drosoph:A*
Error: BLAST Query failed
Error: Unable to read sequence 'drosoph:A*'
Input sequence(s): drosoph:*

   EMBOSS An error in ajseqdb.c at line 4006:
error reading file /data/maoj/emboss/db/blast/drosoph/drosoph.nt.nhr

Please advise what I might did wrong. Thank you very much!!!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/emboss/attachments/20030530/21abd710/attachment.html

From pmr at ebi.ac.uk Fri May 30 14:24:58 2003
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 30 May 2003 19:24:58 +0100 (BST)
Subject: question about dbiblast
In-Reply-To: <0d4401c326ba$a5847600$618a70a5@citjmao>
References: <0d4401c326ba$a5847600$618a70a5@citjmao>
Message-ID: <1189.217.134.86.144.1054319098.squirrel@webmail.ebi.ac.uk>

> Hi, I am new in emboss db config. Need some help in indexing blast db.

This is the long-standing problem of the "new ASN.1 format blast database"

NCBI changed formatdb to create a new index file format, but we have no
documentation on the new format, so we cannot update dbiblast to index it.

We hope to provide the ability to index these blast databases in a future
release, once NCBI release the format specification. I suspect EMBOSS and
FASTA are the only other applications using blast index formats so it is not
an urgent task for them.

Meanwhile, you need to use the 'old' format:

First, you need the original FASTA format file (drosophila.nt)

Then, index it with formatdb but add "-A F" to the command line (to turn off
ASN.1 format).

Hope this helps,

Peter Rice

From calvinwangxi at yahoo.com Sat May 31 07:43:11 2003
From: calvinwangxi at yahoo.com (calvin wang)
Date: Sat, 31 May 2003 04:43:11 -0700 (PDT)
Subject: new moethods
In-Reply-To: <1189.217.134.86.144.1054319098.squirrel@webmail.ebi.ac.uk>
Message-ID: <20030531114311.68280.qmail@web41115.mail.yahoo.com>

I need to use TCOFFE, I understand this is not part of EMBOSS.
Is it possible to include new methods in to EMBOSS? how? is there a guide?
thanks.

> Hi, I am new in emboss db config. Need some help in indexing blast db.

This is the long-standing problem of the "new ASN.1 format blast database"

NCBI changed formatdb to create a new index file format, but we have no
documentation on the new format, so we cannot update dbiblast to index it.

We hope to provide the ability to index these blast databases in a future
release, once NCBI release the format specification. I suspect EMBOSS and
FASTA are the only other applications using blast index formats so it is not
an urgent task for them.

Meanwhile, you need to use the 'old' format:

First, you need the original FASTA format file (drosophila.nt)

Then, index it with formatdb but add "-A F" to the command line (to turn off
ASN.1 format).

Hope this helps,

Peter Rice

---------------------------------
Do you Yahoo!?
Free online calendar with sync to Outlook(TM).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/emboss/attachments/20030531/7dba1b52/attachment.html

From ableasby at hgmp.mrc.ac.uk Fri May 2 06:50:57 2003
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Fri, 2 May 2003 07:50:57 +0100 (BST)
Subject: Preferred isoschizomer ?
Message-ID: <200305020650.h426ovQ24261@bromine.hgmp.mrc.ac.uk>

Eija's analysis is quite correct. In fact the modifications to
remap/showseq (or their equivalent) were made yesterday and passed on to the
original author so they can be tested for any knock-on effects.

It is true that, when the program was written, there were no GUIs for EMBOSS
so the '-limit' confusion didn't arise. Eija's suggestion is a good one and
will be tested

Alan

From bianji at jincao.com Fri May 2 10:18:10 2003
From: bianji at jincao.com (bianji at jincao.com)
Date: Fri, 2 May 2003 18:18:10 +0800
Subject: =?GB2312?B?ufq80rDksry52NPaIrfHteQi1+7QwreowsmhoreoueY=?=
Message-ID: <20030502100834.1E0B37D1A5@mercury.hgmp.mrc.ac.uk>

From peptides at earthlink.net Wed May 7 08:15:47 2003
From: peptides at earthlink.net (David Stephens)
Date: Wed, 7 May 2003 01:15:47 -0700
Subject: Growth In Radiolabeled Peptides
Message-ID: <20030507081548.EB25A7D20A@mercury.hgmp.mrc.ac.uk>

An HTML attachment was scrubbed...
URL:

From Marc.Logghe at devgen.com Wed May 7 10:18:13 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Wed, 7 May 2003 12:18:13 +0200
Subject: dbiflat question
Message-ID:

Hi all,
I feel a little dumb but I'll ask it anyhow. I seem not to succeed in
creating indices for a database using dbiflat.
As a test I just wanted to index the genbank file /data/genbank/gbest226.seq
Ok, I wanted my indices to be in /data/emboss/est
so I have run dbiflat in that folder.

dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq -dbname est

I added this entry to emboss.default

DB est [
    type: N
    format: genbank
    method: emblcd
    directory: /data/emboss/est
]

But, you guessed it, this did not work.
What am I doing wrong here ? What happens with the passed dbname (could not
find any file with that name after running dbiflat) ?
TIA,
marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 83
fax: +32 (0) 9 324 24 25
***********************************************************

From pmr at ebi.ac.uk Wed May 7 10:34:58 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 07 May 2003 11:34:58 +0100
Subject: dbiflat question
References:
Message-ID: <3EB8E152.7090706@ebi.ac.uk>

Marc Logghe wrote:

> I added this entry to emboss.default
> DB est [
> type: N
> format: genbank
> method: emblcd
> directory: /data/emboss/est
> ]

You need:

    directory: /data/genbank
    indexdirectory: /data/emboss/est

EMBOSS needs to find the index files and the data files. Just specifying
"directory" works if both files are there (it becomes the default for
indexdirectory), so your confusion is quite understandable.

Hope this helps,

Peter Rice

From pemberaj at pugh.bip.bham.ac.uk Wed May 7 11:17:31 2003
From: pemberaj at pugh.bip.bham.ac.uk (Tony Pemberton)
Date: Wed, 7 May 2003 12:17:31 +0100
Subject: dbiflat question
In-Reply-To:
References:
Message-ID:

On Wed, 7 May 2003, Marc Logghe wrote:

> Hi all,
> I feel a little dumb but I'll ask it anyhow. I seem not to succeed in
> creating indices for a database using dbiflat.
> As a test I just wanted to index the genbank file /data/genbank/gbest226.seq
> Ok, I wanted my indices to be in /data/emboss/est
> so I have run dbiflat in that folder.
> dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq
> -dbname est
> I added this entry to emboss.default
> DB est [
> type: N
> format: genbank
> method: emblcd
> directory: /data/emboss/est
> ]
>
> But, you guessed it, this did not work.
> What am I doing wrong here ? What happens with the passed dbname (could not
> find any file with that name after running dbiflat) ?
> TIA,
> marc
>
> ***********************************************************
> Marc Logghe, Ph.D.
> Senior Scientist
> Scientific Computing Group
> deVGen
> Technologiepark 9
> 9052 Zwijnaarde
> Belgium
> tel: +32 (0) 9 324 24 83
> fax: +32 (0) 9 324 24 25
> ***********************************************************
>
>
>

Marc,

You need the .seq file also to be in the directory where you run
dbiflat. Or make symbolic links!

You will note that the dialogue of dbiflat asks about the files to
process (*.seq). At this stage, I think I am correct in saying, that
the database directory file emboss.default is not operable. This
merely directs the user programs e.g. seqret to the formatted
database (indeces) as shown by showdb.

Regards,

Tony

*********************************************************************
Mr. A.J.Pemberton               Tel:    +121-414-3388
c/o Dept. Rheumatology,         Fax:    +121-414-6794
Medical School,                 E-mail: A.J.Pemberton at bham.ac.uk
The University of Birmingham,
Birmingham B15 2TT.
U.K.
*********************************************************************

From Marc.Logghe at devgen.com Wed May 7 12:05:25 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Wed, 7 May 2003 14:05:25 +0200
Subject: dbiflat question
Message-ID:

Thanks for the reply !
That is what I have figured out:
when you run

dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq

in the index directory (e.g. /data/emboss/est) has the same effect as
running

dbiflat -idformat genbank -directory /data/genbank -indexdirectory /data/emboss/est -filenames gbest226.seq

meaning, the index files are created in the desired place.
But still the sequences themselves are not accessible using the mentioned
entry in emboss.default ('seqret est -firstonly' gives a segmentation
fault). I suppose the 'directory' key should point to the indexdirectory,
right ? Because, the index itself should be pointing to the correct sequence
path. At least that is what I expect.
And indeed, as suggested by Tony, everything worked fine when putting index
and sequence files in the same directory (indexdirectory and directory are
the same).

OK, just tried something which appears to work now. Switch to the first
scenario again: separate paths for index and sequence files.
When I changed the emboss.default to the following, everything worked fine:

DB est [
    type: N
    format: genbank
    method: emblcd
    indexdirectory: /data/emboss/est
    directory: /data/genbank
]

Apparently you have to set the indexdirectory and directory explicitly in
the configuration file also; pointing to the indexdirectory alone is not
sufficient !

Regards,
Marc

> -----Original Message-----
> From: Tony Pemberton [mailto:pemberaj at pugh.bip.bham.ac.uk]
> Sent: Wednesday, May 07, 2003 1:18 PM
> To: Marc Logghe
> Cc: Emboss (E-mail)
> Subject: Re: dbiflat question
>
>
> On Wed, 7 May 2003, Marc Logghe wrote:
>
> > Hi all,
> > I feel a little dumb but I'll ask it anyhow. I seem not to succeed in
> > creating indices for a database using dbiflat.
> > As a test I just wanted to index the genbank file
> > /data/genbank/gbest226.seq
> > Ok, I wanted my indices to be in /data/emboss/est
> > so I have run dbiflat in that folder.
> > dbiflat -idformat genbank -directory /data/genbank -filenames gbest226.seq
> > -dbname est
> > I added this entry to emboss.default
> > DB est [
> > type: N
> > format: genbank
> > method: emblcd
> > directory: /data/emboss/est
> > ]
> >
> > But, you guessed it, this did not work.
> > What am I doing wrong here ? What happens with the passed dbname (could
> > not find any file with that name after running dbiflat) ?
> > TIA,
> > marc
> >
> > ***********************************************************
> > Marc Logghe, Ph.D.
> > Senior Scientist
> > Scientific Computing Group
> > deVGen
> > Technologiepark 9
> > 9052 Zwijnaarde
> > Belgium
> > tel: +32 (0) 9 324 24 83
> > fax: +32 (0) 9 324 24 25
> > ***********************************************************
> >
> >
> >
>
> Marc,
>
> You need the .seq file also to be in the directory where you run
> dbiflat. Or make symbolic links!
>
> You will note that the dialogue of dbiflat asks about the files to
> process (*.seq). At this stage, I think I am correct in saying, that
> the database directory file emboss.default is not operable. This
> merely directs the user programs e.g. seqret to the formatted
> database (indeces) as shown by showdb.
>
> Regards,
>
> Tony
>
>
> *********************************************************************
> Mr. A.J.Pemberton               Tel:    +121-414-3388
> c/o Dept. Rheumatology,         Fax:    +121-414-6794
> Medical School,                 E-mail: A.J.Pemberton at bham.ac.uk
> The University of Birmingham,
> Birmingham B15 2TT.
> U.K.
> *********************************************************************
>

From Stephan.Hurling at evotecoai.com Thu May 8 12:41:50 2003
From: Stephan.Hurling at evotecoai.com (Stephan.Hurling at evotecoai.com)
Date: Thu, 8 May 2003 14:41:50 +0200
Subject: Problems with dbigcg...
Message-ID:

Hello Everyone,

I would like to use EMBOSS version 2.6.0 together with the GCG Wisconsin
package version 10.3 on a Red Hat 7.2 linux server.
I followed the installation instructions from the administrators guide and
doing the usual

./configure
make
make install

I compiled and installed emboss on my system without any error messages.

But when I want to make indexes from a gcg database I run into troubles.
See the following output of an interactive session:

14:18 [root at kepler] ~/Temp # dbigcg
Index a GCG formatted database
      EMBL : EMBL
     SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
   GENBANK : Genbank, DDBJ
       PIR : NBRF
Entry format [EMBL]:
Database directory [.]: /usr/local/share/EMBOSS/data/GCG_DATABASES/gcgembl
Wildcard database filename [*.seq]:
Database name: embl
Release number [0.0]: 73.0
Index date [00/00/00]: 01/12/02

   EMBOSS An error in embdbi.c at line 590:
Cannot open embl.idsrt for reading

14:21 [root at kepler] ~/Temp # ll
total 252k
drwxr-xr-x    2 root  root  4.0k May  8 14:18 ./
drwxr-x---   24 root  root  4.0k May  8 14:15 ../
-rw-------    1 root  root  1.4M May  8 14:18 core
-rw-r--r--    1 root  root  1.5k May  8 14:18 division.lkp
-rw-r--r--    1 root  root   675 May  8 14:18 embl001.acnum
-rw-r--r--    1 root  root   104 May  8 14:18 embl002.acnum
-rw-r--r--    1 root  root   504 May  8 14:18 embl003.acnum
-rw-r--r--    1 root  root    51 May  8 14:18 embl004.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl005.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl006.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl007.acnum
-rw-r--r--    1 root  root   126 May  8 14:18 embl008.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl009.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl010.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl011.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl012.acnum
-rw-r--r--    1 root  root    36 May  8 14:18 embl013.acnum
-rw-r--r--    1 root  root   25k May  8 14:18 embl014.acnum
-rw-r--r--    1 root  root   161 May  8 14:18 embl015.acnum
-rw-r--r--    1 root  root   850 May  8 14:18 embl016.acnum
-rw-r--r--    1 root  root  1.3k May  8 14:18 embl017.acnum
-rw-r--r--    1 root  root  2.6k May  8 14:18 embl018.acnum
-rw-r--r--    1 root  root   121 May  8 14:18 embl019.acnum
-rw-r--r--    1 root  root   290 May  8 14:18 embl020.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl021.acnum
-rw-r--r--    1 root  root   104 May  8 14:18 embl022.acnum
-rw-r--r--    1 root  root   188 May  8 14:18 embl023.acnum
-rw-r--r--    1 root  root     0 May  8 14:18 embl024.acnum
-rw-r--r--    1 root  root    34 May  8 14:18 embl025.acnum
-rw-r--r--    1 root  root    14 May  8 14:18 embl026.acnum
-rw-r--r--    1 root  root   490 May  8 14:18 embl027.acnum
-rw-r--r--    1 root  root   300 May  8 14:18 entrynam.idx
-rw-------    1 root  root     0 May  8 14:18 sort9YCQfK

Can somebody help me? Have I done something wrong during the compilation
step of emboss?
Any hint would help me.

Thanks in advance...

All the best,

Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ablavier at wanadoo.fr Sun May 11 18:36:24 2003
From: ablavier at wanadoo.fr (André Blavier)
Date: Sun, 11 May 2003 20:36:24 +0200
Subject: EMBOSS for Windows: DLL build
Message-ID: <001e01c317ec$3c9feb60$5ca03551@bach>

EMBOSS for Windows is now built with ajax and nucleus compiled as DLLs, so
the EMBOSS programs are now much smaller, and the distribution as well.
dbiblast is now in the package.

See http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html.

--
André Blavier

From arunanirudhan at yahoo.co.in Mon May 12 08:27:44 2003
From: arunanirudhan at yahoo.co.in (arun anirudhan)
Date: Mon, 12 May 2003 09:27:44 +0100 (BST)
Subject: seqret
Message-ID: <20030512082744.65129.qmail@web8203.mail.in.yahoo.com>

Hello all

How can i use seqret to retrieve sequences from a database like we use in
entrez? For eg: I want to get sequences of all insulin from genbank. What to
give as command?

seqret embl:insulin ?

Arun

Catch all the cricket action. Download Yahoo! Score tracker
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From maoj at mail.nih.gov Mon May 12 14:16:05 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Mon, 12 May 2003 10:16:05 -0400
Subject: about ftp site of EMBOSS Administrators Guide
Message-ID: <00e801c31891$0c341410$618a70a5@citjmao>

Hi, where can I find the pdf version of emboss administrators guide?
the link on the website doesn't work. thanks.
Jean

From gwilliam at hgmp.mrc.ac.uk Mon May 12 14:40:35 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 12 May 2003 15:40:35 +0100
Subject: about ftp site of EMBOSS Administrators Guide
References: <00e801c31891$0c341410$618a70a5@citjmao>
Message-ID: <3EBFB263.95B0DDAB@hgmp.mrc.ac.uk>

There is no PDF version of the current guide.
The link on the web was left there by accident and has now been tidied
away - sorry.

Gary

> Jean Mao wrote:
>
> Hi, where can I find the pdf version of emboss administrators guide?
> the link on the website doesn't work. thanks.
>
> Jean

--
Gary Williams
Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at rfcgr.mrc.ac.uk  http://www.rfcgr.mrc.ac.uk/
Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK

From gwilliam at hgmp.mrc.ac.uk Mon May 12 15:40:10 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 12 May 2003 16:40:10 +0100
Subject: about ftp site of EMBOSS Administrators Guide
References: <00e801c31891$0c341410$618a70a5@citjmao> <3EBFB263.95B0DDAB@hgmp.mrc.ac.uk>
Message-ID: <3EBFC05A.16A96393@hgmp.mrc.ac.uk>

The .ps and .pdf versions of the current guide are now on the web page:
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html

See:
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/admin.ps
and
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/admin.pdf

Gary

> > Jean Mao wrote:
> >
> > Hi, where can I find the pdf version of emboss administrators guide?
> > the link on the website doesn't work. thanks.
--
Gary Williams
Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at rfcgr.mrc.ac.uk  http://www.rfcgr.mrc.ac.uk/
Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK

From arunanirudhan at yahoo.co.in Tue May 13 07:25:13 2003
From: arunanirudhan at yahoo.co.in (arun anirudhan)
Date: Tue, 13 May 2003 08:25:13 +0100 (BST)
Subject: Fwd: seqret
Message-ID: <20030513072513.53308.qmail@web8204.mail.in.yahoo.com>

Note: forwarded message attached.

From peptides at earthlink.net Tue May 13 08:45:09 2003
From: peptides at earthlink.net (David Stephens)
Date: Tue, 13 May 2003 01:45:09 -0700
Subject: Sourcing Information For Amino Acids and Custom Peptides
Message-ID: <20030513084512.363E87D2CC@mercury.hgmp.mrc.ac.uk>

An HTML attachment was scrubbed...

From yann-francois.bizouerne at bayercropscience.com Thu May 15 15:29:46 2003
From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com)
Date: Thu, 15 May 2003 17:29:46 +0200
Subject: Search for organism of entry
Message-ID:

Hello,

I recently installed EMBOSS on our server. I really enjoy the different
tools, but I have a little problem and I can't find the solution anywhere.

I have indexed the SwissProt database with the command line:

dbiflat -idformat SWISS -directory . -filenames sprot.dat -dbname sprot -fields acnum,seqvn,des,keyword,taxon

So after that I can find sequence information when I search by a
particular organism or keyword.
For the moment, what I can retrieve with the accession number is the
following:

>infoseq sprot:P15711
Displays some simple information about sequences
# USA Name Accession Type Length Description
ian-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa microneme-rhoptry antigen.

And when I search by organism I obtain:

>infoseq sprot-org:"*Theileria*" -outfile stdout
Displays some simple information about sequences
# USA Name Accession Type Length Description
sprot-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa microneme-rhoptry antigen.

So now I want to know whether, for one particular entry (sprot:P15711),
I can find the organism (Theileria parva) or not.

Thanks in advance for your answer.

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From pmr at ebi.ac.uk Thu May 15 16:26:31 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 15 May 2003 17:26:31 +0100
Subject: Search for organism of entry
References:
Message-ID: <3EC3BFB7.5000605@ebi.ac.uk>

yann-francois.bizouerne at bayercropscience.com wrote:
> And when I am looking with the organism I obtain :
> >infoseq sprot-org:"*Theileria*" -outfile stdout
> Displays some simple information about sequences
> # USA Name Accession Type Length Description
> sprot-id:104K_THEPA 104K_THEPA P15711 P 924 104 kDa
> microneme-rhoptry antigen.
>
>
> So now I want to know if I could for one particular entry (sprot:P15711) find
> the Organism (Theileria parva) or not ?

EMBOSS can search a database by organism, but reads only the sequence (in
most programs) or the whole entry (entret) ... but I am looking into ways
to parse out more detail, including organism, citation, and features.
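
Because entret returns the whole entry as text, the organism can be read
off the OS line(s) of that text. The following is only a minimal sketch of
that idea in Python, not part of EMBOSS: the function name and the sample
entry excerpt are hypothetical, and it assumes the usual EMBL/Swiss-Prot
flat-file layout in which organism lines start with the two-letter code OS.

```python
def organism_from_entry(entry_text):
    """Return the organism named on the OS line(s) of a flat-file entry.

    Assumes EMBL/Swiss-Prot layout: a two-letter line code ("OS"),
    padding spaces, then the value. Multiple OS lines are joined.
    """
    parts = []
    for line in entry_text.splitlines():
        if line.startswith("OS "):
            parts.append(line[2:].strip())
    # Swiss-Prot terminates the organism name with a full stop.
    return " ".join(parts).rstrip(".")


# Hypothetical excerpt of the entry text saved from entret for sprot:P15711.
SAMPLE_ENTRY = """ID   104K_THEPA     STANDARD;      PRT;   924 AA.
AC   P15711;
DE   104 kDa microneme-rhoptry antigen.
OS   Theileria parva.
"""

if __name__ == "__main__":
    print(organism_from_entry(SAMPLE_ENTRY))
```

The same loop could collect other line codes (for example DE or KW) by
changing the prefix that is tested.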
The database definition would have a list of fields that can be
retrieved, and a program like (for example) entret could check the fields
and let you choose the ones you need.

For now, you can run entret and look for the organism in the text.

Hope this helps,

Peter Rice

From henrikki.almusa at helsinki.fi Tue May 20 06:42:36 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Tue, 20 May 2003 09:42:36 +0300
Subject: Support for nexus in alignment format
Message-ID: <200305200942.36206.henrikki.almusa@helsinki.fi>

Hello

I read through the alignment formats that emboss supports. I was
wondering if nexus is supported as an alignment format (-aformat nexus)?
It seems to be supported as a sequence format but it wasn't mentioned as
an alignment format.

--
Henrikki Almusa

From pmr at ebi.ac.uk Tue May 20 09:29:57 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 20 May 2003 10:29:57 +0100
Subject: Support for nexus in alignment format
References: <200305200942.36206.henrikki.almusa@helsinki.fi>
Message-ID: <3EC9F595.5000008@ebi.ac.uk>

Henrikki Almusa wrote:
> I read through the alignment formats that emboss supports. I was wondering if
> nexus is supported as alignment format (-aformat nexus)? I seems to be
> supported as sequence format but it wasnt mentioned as alignment format.

The sequence formats are easy to add as alignment formats. I am not sure
quite how useful that is. You can do this:

1. Create your alignment in a sequence format (FASTA, MSF)
2. Use seqret to convert to nexus format

... or does NEXUS format hold some extra information that would make it a
useful alignment format, and that we lose by going through FASTA?
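
The two steps above can be sketched as a small Python helper that only
builds the seqret command line for step 2. This is an illustration under
assumptions: the helper name and the file names are hypothetical, and
nothing is executed here, so the command still has to be run on a machine
where EMBOSS is installed.

```python
import shlex


def seqret_convert_cmd(infile, outfile, out_format="nexus"):
    """Build a seqret command that rewrites infile in another format.

    -sequence, -outseq and -osformat are seqret qualifiers; the file
    names passed in are placeholders chosen by the caller.
    """
    return ["seqret", "-sequence", infile,
            "-outseq", outfile, "-osformat", out_format]


if __name__ == "__main__":
    # Hypothetical file names: an alignment saved in MSF, converted to NEXUS.
    cmd = seqret_convert_cmd("alignment.msf", "alignment.nexus")
    print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run() unchanged; keeping it
as a list avoids shell quoting problems with unusual file names.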
Hope this helps,

Peter

From yann-francois.bizouerne at bayercropscience.com Wed May 21 08:53:19 2003
From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com)
Date: Wed, 21 May 2003 10:53:19 +0200
Subject: Use two Emboss package with one database
Message-ID:

Hello,

I am working with 2 different servers at different locations. On each of
them an EMBOSS tools package is installed.
I need to know if I could configure these 2 EMBOSS installations to work
with the same database (which is located on one of the two servers).
Can EMBOSS work this way, or do I need to have only one EMBOSS package
(tools + database) installed on one server?

I hope that my question is clear enough.

Best Regards

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From pmr at ebi.ac.uk Wed May 21 08:59:18 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 21 May 2003 09:59:18 +0100
Subject: Use two Emboss package with one database
References:
Message-ID: <3ECB3FE6.1020906@ebi.ac.uk>

yann-francois.bizouerne at bayercropscience.com wrote:
> Hello,
>
> I am working with 2 diffretns servers on different locations. On each of them a
> EMBOSS package tools is installed.
> I need to know if I could configure these 2 EMBOSS in order to work with the
> same database (which is located on one of the two servers).
> Is EMBOSS could working this way or does I need to have only one EMBOSS package
> (tools + databse) installed on one server ?

Yes ... but you need to do some work.

The EMBOSS package on the same server as the databases is easy.

The second EMBOSS package needs to read from remote databases. I assume
you indexed them with dbiflat (and the other dbi programs).
You can access a remote database by:

SRSWWW          if it is on an SRS server
URL             if you have a web page to query the database
APP (EXTERNAL)  if you have a script that can return an entry

Assuming you don't have them under SRS ...

You can provide a simple web CGI script that runs entret (for whole
entry) or seqret (for sequence only - you can put -osformat on the
command line to get the format of your choice).

You can write a script that will access the databases somehow (possibly
also by talking to a web page - your choice).

Meanwhile, I am working on ways to define EMBOSS web services and data
services that would give an alternative access method, but that is for
later in the year.

Hope this helps,

Peter Rice

From Marc.Logghe at devgen.com Wed May 21 09:03:00 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Wed, 21 May 2003 11:03:00 +0200
Subject: Use two Emboss package with one database
Message-ID:

Hi,
At our site emboss is installed on every node of a cluster while the
databases are installed on only one. The only thing you have to do is
mount the database directory/directories in one way or another on every
node and adapt the emboss.default files appropriately (if necessary) so
that the DB entries are pointing to the correct directories.
HTH,
Marc

> -----Original Message-----
> From: yann-francois.bizouerne at bayercropscience.com
> [mailto:yann-francois.bizouerne at bayercropscience.com]
> Sent: Wednesday, May 21, 2003 10:53 AM
> To: emboss at embnet.org
> Subject: Use two Emboss package with one database
>
>
> Hello,
>
> I am working with 2 diffretns servers on different locations.
> On each of them a
> EMBOSS package tools is installed.
> I need to know if I could configure these 2 EMBOSS in order
> to work with the
> same database (which is located on one of the two servers).
> Is EMBOSS could working this way or does I need to have only
> one EMBOSS package
> (tools + databse) installed on one server ?
>
> I hope that my question is clear enough.
>
> Best Regards
>
>
>
> Yann-François BIZOUERNE
> BioInformatic Team
> BAYER CropScience
> 1, rue Pierre Fontaine
> 91058 Evry Cedex
> FRANCE
> Phone: 33-(0) 1-69-47-61-56
> FAX: 33-(0) 1-69-47-61-42
> E-mail: yann-francois.bizouerne at bayercropscience.com
> Intranet: http://bioinfo.evry.fr.bayercropscience/
>
>

From d.m.a.martin at dundee.ac.uk Wed May 21 09:04:24 2003
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Wed, 21 May 2003 10:04:24 +0100
Subject: Use two Emboss package with one database
In-Reply-To:
Message-ID:

On 21/5/03 9:53 am, "yann-francois.bizouerne at bayercropscience.com" wrote:

> Hello,
>
> I am working with 2 diffretns servers on different locations. On each of them
> a
> EMBOSS package tools is installed.
> I need to know if I could configure these 2 EMBOSS in order to work with the
> same database (which is located on one of the two servers).
> Is EMBOSS could working this way or does I need to have only one EMBOSS
> package
> (tools + databse) installed on one server ?

There are two options here:

1. Different platforms accessing the same database
2. Same platform (different machines) accessing the same database.

1. Easy. When you run configure, set the prefix (prefixes) appropriately
so that the executables go to an appropriate place and the databases
point to a shared (NFS or similar) drive containing the config files.
Obviously you have to use the same mountpoint on all your machines for
this to work. (I use /site/share/EMBOSS for the config and
/site/databases as a root for the databases. In this case /site/bin is
local, not shared, and contains the appropriate binaries [or can be a
symlink to /site/Linux/bin, /site/IRIX/bin, /site/Solaris/bin,
/site/Darwin/bin as appropriate if you are supporting more than one
machine on a particular platform].)

2. Can be done in the same way using a shared drive for the executables.
One gotcha is that EMBOSS, despite all efforts, does not compile
statically so you have to ensure that the library versions are the same
across the various platforms or you will get runtime errors.

Either of these methods will reduce the maintenance load considerably. In
my case I use NFS for the data directories and use a nightly scheduled
rsync to synchronise the executables and config files with the master
machine as these don't take much space and it reduces the network
overhead.

Hope this helps.

..d

>
> I hope that my question is clear enough.
>
> Best Regards
>
>
>
> Yann-François BIZOUERNE
> BioInformatic Team
> BAYER CropScience
> 1, rue Pierre Fontaine
> 91058 Evry Cedex
> FRANCE
> Phone: 33-(0) 1-69-47-61-56
> FAX: 33-(0) 1-69-47-61-42
> E-mail: yann-francois.bizouerne at bayercropscience.com
> Intranet: http://bioinfo.evry.fr.bayercropscience/
>
>
>

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

From d.m.a.martin at dundee.ac.uk Wed May 21 09:13:25 2003
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Wed, 21 May 2003 10:13:25 +0100
Subject: Use two Emboss package with one database
In-Reply-To: <3ECB3FE6.1020906@ebi.ac.uk>
Message-ID:

On 21/5/03 9:59 am, "Peter Rice" wrote:

> yann-francois.bizouerne at bayercropscience.com wrote:
>> Hello,
>>
>> I am working with 2 diffretns servers on different locations. On each of them
>> a
>> EMBOSS package tools is installed.
>> I need to know if I could configure these 2 EMBOSS in order to work with the
>> same database (which is located on one of the two servers).
>> Is EMBOSS could working this way or does I need to have only one EMBOSS
>> package
>> (tools + databse) installed on one server ?
>
> Yes ... but you need to do some work.
>
> The EMBOSS package on the same server as the databases is easy.
>
> The second EMBOSS package needs to read from remote databases.
I assume
> you indexed them with dbiflat (and the other dbi programs).
>
> You can access a remote database by:
>
> SRSWWW          if it is on an SRS server
> URL             if you have a web page to query the database
> APP (EXTERNAL)  if you have a script that can return an entry
>
> Assuming you don't have them under SRS ...
>
> You can provide a simple web CGI script that runs entret (for whole
> entry) or seqret (for sequence only - you can put -osformat on the
> command line to get the format of your choice).
>
> You can write a script that will access the databases somehow (possibly
> also by talking to a web page - your choice).
>
> Meanwhile, I am working on ways to define EMBOSS web services and data
> services that would give an alternative access method, but that is for
> later in the year.
>

What about just using Jemboss (or a variant thereof) to talk to the
master machine? The alternative is shunting lots of data around, which
isn't really feasible unless you have a fast network between the
machines. Do you move the data to the problem or the problem to the data?
The trade-off is between transport time and execution time.

Does Jemboss make use of a SOAP server? If not, it would be really nice
to have a script that could generate a WSDL definition from the ACD
files. It's then one step away from being a Grid service..

..d

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

From maoj at mail.nih.gov Wed May 21 16:21:43 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Wed, 21 May 2003 12:21:43 -0400
Subject: question about databases setup
Message-ID: <038001c31fb5$1171acf0$618a70a5@citjmao>

Hi, I am new to emboss and have a question about database setup.

I have a file in the directory /data/maoj/emboss/db/mouse/ called
'test.dat'. This file has 9 entries in embl format. I ran dbiflat;
acnum.hit, acnum.trg, division.lkp and entrynam.idx were generated.
Then I set up a .embossrc file in my home dir as follows:

-------------------------------------------------
# Logfile - set this to a file that any user can append to
# and EMBOSS applications will automatically write log information
# SET emboss_logfile /home/db/emboss/tmp/log

DB test [
   type: N
   method: emblcd
   format: embl
   dir: /data/maoj/emboss/db/mouse
   file: "*.dat"
   release: "1"
   comment: "Test DB"
]
-------------------------------------------------

When I run seqret and try to retrieve 1 of the 9 entries, I get the
following error:

% seqret
Reads and writes (returns) sequences
Input sequence(s): test:AB001363
Warning: Cannot open division file '' for database 'test'
Warning: seqCdQry failed
Error: Unable to read sequence 'test:AB001363'

Please help. Thank you in advance.

Jean

From pmr at ebi.ac.uk Wed May 21 16:30:43 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 21 May 2003 17:30:43 +0100
Subject: question about databases setup
References: <038001c31fb5$1171acf0$618a70a5@citjmao>
Message-ID: <3ECBA9B3.3070802@ebi.ac.uk>

Jean Mao wrote:
> Hi, I am new to emboss. have question in database setup.
>
> I have a file in the directory /data/maoj/emboss/db/mouse/ called
> 'test.dat'. this file has 9 entries in embl format. i ran dbiflat,
> acnum.hit, acnum.trg, division.lkp, entrynam.idx were generated.

You need to specify where the index files are (indexdir) in the database
definition.

Hope this helps,

Peter Rice

From mathog at mendel.bio.caltech.edu Wed May 21 22:04:02 2003
From: mathog at mendel.bio.caltech.edu (David Mathog)
Date: Wed, 21 May 2003 15:04:02 -0700
Subject: extractfeat with gff files?
Message-ID:

EMBOSS 2.6.0

I cannot seem to locate the magic incantation that will make gff files
work as desired with fasta files. HELP!

Here's the sort of line I want to extract (sorry about the wrap):

X gadfly translation 1880 3119 . - .
genegrp=CG3038; transgrp=CG3038-RB

There are many other lines in the gff for transcription, exon, gene, etc.
which should not be extracted.

The fasta input file currently has entries with names like X, 2L, etc.
which correspond to the first column of the gff file.

Ideally I'd like to be able to use one gff file (with X->3R in the first
column) to extract from one fasta file (again, with X->3R for the fasta
name), and have the descriptions of X act only on the sequence X, and so
forth. The idea is to be able to extract features on a genomic level
using only one fasta/gff pair, rather than N (= # of scaffolds) pairs.

First though I tried an input fasta file containing 11 x 10kb entries
(the first being X) and a gff file also starting only with X (but with
references for the whole chromosome, 22101 lines). The following command
sat for about two minutes, burned a lot of CPU time, but emitted nothing:

extractfeat -sequence=dmel_genome_frag.nfa \
  -ufo=x.gff -type=translation -outseq=x.nfa

When the -type qualifier was removed it went nuts and emitted over 40000
entries (more than there were lines in the gff file!) before I killed it.
Clearly there was no error checking for size of gff entry versus size of
sequence. The input fasta file had 11 entries of 10000 bp each. The first
was X. Yet a bunch of lines like:

>X_12390_12854 [exon] X release:3 length:21780003bp Assembled X chromosome arm sequence md5:f3fbbb4c44f0d30d1effeecc87b5bd18
T

were emitted.

So the fasta file was reduced to just one entry (X, 10kb) and this time
the output fasta file held 22101 entries. As before, those beyond 10kb
were emitted with a single base.

So apparently the entire gff description is applied to each fasta
sequence and there's no checking of the first column against the sequence
name. That's ok - we can live with that for now, but it would be better
if the descriptions could automatically be matched to the sequence names.
I'm not sure though that we can live with it emitting single bp sequences
when the description is outside of the sequence. If the feature is beyond
the end of the input sequence it just isn't there, right?

Just to spite me "translation" was never emitted. There were only lines
for gene, exon, misc_feature, tRNA, snoRNA. So I tried:

extractfeat -sequence=dmel_x.nfa \
  -ufo=x.gff -outseq=x.nfa -type=gene

And it emitted a single whole gene match at (1488,3280,-) correctly, the
next one at (3445,11463,+) partially (and correctly, ending at the end of
the sequence - a warning would have been nice), and then a slew of
(>2000) single base pair "empty" entries outside of the input sequence.
Note also that there's no indication on the fasta header line in the
output of the strand which was selected.

So, how does one get extractfeat to emit only matches to "translation"?
Please tell me there's some way other than by extracting those lines into
a separate gff file and renaming them all "gene"!

Extractfeat seems to have a predefined set of "features" that it's
willing to work with and doesn't handle others well. To narrow this down
a bit more I made a small gff file containing "fred" where "gene" had
been and specifying positions <10kb. The features were emitted but all
were labeled "misc_feature". Is this documented somewhere? It isn't in an
obvious place in the online help, as both of these searches come up
empty:

extractfeat -h 2>&1 | grep -i misc
tfm extractfeat 2>&1 | grep -i misc

It would also be nice if there was some way to get column 9 from the gff
file onto the fasta header line somewhere. (It can then be rearranged to
suit later.) Currently even if one has the gene names lined up with the
gene entries in the gff file the resulting fasta file just says
"X_100_123 [gene]..." without any of the comment info. You've got the
sequence but not the names of the genes. Very painful to work with if the
output is the coding sequences for an entire genome.
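
The selection behaviour asked for above (match column 1 against the
sequence name, keep a single feature type, skip features that lie outside
the sequence, and carry the strand and column 9 into the header) can be
sketched outside extractfeat. This is a hypothetical Python pre-filter,
not how extractfeat works: the function name is made up, tab-separated
GFF columns are assumed, and features that run past the end of the
sequence are simply dropped rather than truncated.

```python
def select_features(gff_lines, seq_name, seq_length, feature_type):
    """Pick GFF features that belong to one sequence and fit inside it.

    Returns (start, end, strand, header) tuples. Assumes tab-separated
    GFF: name, source, type, start, end, score, strand, frame, attributes.
    """
    selected = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete feature line
        name, _source, ftype, start, end, _score, strand, _frame = cols[:8]
        attributes = cols[8] if len(cols) > 8 else ""
        if name != seq_name or ftype != feature_type:
            continue
        start, end = int(start), int(end)
        if start < 1 or end > seq_length:
            continue  # feature is not wholly inside this sequence
        header = f"{name}_{start}_{end} [{ftype}] strand={strand} {attributes}"
        selected.append((start, end, strand, header.strip()))
    return selected


# Example lines; only the first is taken from the thread, the others
# are hypothetical variations used to show what gets filtered out.
EXAMPLE_GFF = [
    "X\tgadfly\ttranslation\t1880\t3119\t.\t-\t.\tgenegrp=CG3038; transgrp=CG3038-RB",
    "X\tgadfly\tgene\t1488\t3280\t.\t-\t.\tgenegrp=CG3038",
    "X\tgadfly\ttranslation\t12390\t12854\t.\t+\t.\tgenegrp=hypothetical",
    "2L\tgadfly\ttranslation\t100\t200\t.\t+\t.\tgenegrp=hypothetical",
]

if __name__ == "__main__":
    for hit in select_features(EXAMPLE_GFF, "X", 10000, "translation"):
        print(hit[3])
```

Writing the surviving lines back out as a smaller GFF file, one per
sequence name, would give extractfeat an input that only describes
features it can actually extract.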
Is there a switch (or bug fix) that stops extractfeat from emitting
garbage single bp entries for descriptions outside the sequence?

Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From yann-francois.bizouerne at bayercropscience.com Thu May 22 14:26:36 2003
From: yann-francois.bizouerne at bayercropscience.com (yann-francois.bizouerne at bayercropscience.com)
Date: Thu, 22 May 2003 16:26:36 +0200
Subject: creation of new output fasta format
Message-ID:

Hello,

First, thanks a lot for your quick response to my last mail.

Now, I am trying to create a new fasta format.
The format I want to obtain:

> dbname:id |accession|organism|description

To do this I created a new function in ajseqwrite.c (seqWriteNewFasta).
I selected the different pieces of information I want to retrieve by
following the examples of other functions.
It is working quite well, except for the Pir and Nrl_3D databases.
Indeed, for these databases I have no database name and no organism
(taxon):

    /** Database name **/
    if (ajStrLen(outseq->Db))
        (void) ajFmtPrintF (outseq->File, ">%S:", outseq->Db);
    else if (ajStrLen(outseq->Setdb))
        (void) ajFmtPrintF (outseq->File, ">%S:", outseq->Setdb);
    else
        (void) ajFmtPrintF (outseq->File, ">unk:");

    /** Organism **/
    if (ajStrLen(outseq->Tax))
        (void) ajFmtPrintF (outseq->File, "%S|", outseq->Tax);

I tried to find some information about the NBRF format in EMBOSS and the
way to use it, but I could find nothing.
Do you have a clue for me?
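
The header layout described above can be mocked up in a few lines of
Python to check what the output should look like when the database name
or organism is missing. This is only an illustrative sketch, not the
ajseqwrite.c code: the function and parameter names are hypothetical.
Unlike the C fragment, it always writes the organism field, even when
empty, so the "|" separated columns stay in the same positions.

```python
def new_fasta_header(seq_id, accession="", organism="", description="",
                     db="", set_db=""):
    """Compose '>dbname:id |accession|organism|description'.

    Falls back from db to set_db to 'unk', mirroring the order of the
    Db / Setdb / 'unk' tests in the C fragment quoted above.
    """
    db_name = db or set_db or "unk"
    return f">{db_name}:{seq_id} |{accession}|{organism}|{description}"


if __name__ == "__main__":
    print(new_fasta_header("104K_THEPA", "P15711", "Theileria parva",
                           "104 kDa microneme-rhoptry antigen.", db="sprot"))
    # An entry with no database name and no organism, as reported for PIR.
    print(new_fasta_header("SOME_ID", "A00001", "", "example description"))
```

Keeping the empty field means a downstream parser can always split the
header on "|" and find the description in the same column.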
Best regards

Yann-François BIZOUERNE
BioInformatic Team
BAYER CropScience
1, rue Pierre Fontaine
91058 Evry Cedex
FRANCE
Phone: 33-(0) 1-69-47-61-56
FAX: 33-(0) 1-69-47-61-42
E-mail: yann-francois.bizouerne at bayercropscience.com
Intranet: http://bioinfo.evry.fr.bayercropscience/

From peptides at earthlink.net Sat May 24 22:55:53 2003
From: peptides at earthlink.net (David Stephens)
Date: Sat, 24 May 2003 15:55:53 -0700
Subject: Happy Memorial Day
Message-ID: <20030524225553.29CB27D181@mercury.hgmp.mrc.ac.uk>

An HTML attachment was scrubbed...

From henrikki.almusa at helsinki.fi Tue May 27 10:46:13 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Tue, 27 May 2003 13:46:13 +0300
Subject: Graph data handling
Message-ID: <200305271339.51333.henrikki.almusa@helsinki.fi>

Hello,

I'm trying to use graphs in scripts outside of emboss. However I ran into
problems with the options concerning graph handling. I found the
following options on the "banana" tool's webpage:

   "-graph" related qualifiers
   -gprompt       boolean    Graph prompting
   -gtitle        string     Graph title
   -gsubtitle     string     Graph subtitle
   -gxtitle       string     Graph x axis title
   -gytitle       string     Graph y axis title
   -goutfile      string     Output file for non interactive displays
   -gdirectory    string     Output directory

I tried to use -goutfile and -gdirectory with banana, but I seem to be
unable to affect the data file(s) or their directories. If I understand
correctly this should work:

"banana mRNA.seq -graph data -goutfile /home/hena/banana_data_file -auto"

or else:

"banana mRNA.seq -graph data -goutfile banana_data_file -gdirectory /home/hena"

For the second I get the error "Died: unknown qualifier -gdirectory", and
with the first I get "Created banana_data_file.dat", but no such file is
created and no data file is there. Also, if I use the "-data" option
there, I get multiple bananaX.dat files (in which X is a running number).

So, my questions are: How exactly is this supposed to work?
And could it be added to the webpage's "User documentation" section with
the other formats? And thirdly, is there a reason why some programs
expect the "-data" option and others do not?

TIA
--
Henrikki Almusa

From sebastien.frade at bayercropscience.com Wed May 28 12:38:49 2003
From: sebastien.frade at bayercropscience.com (sebastien.frade at bayercropscience.com)
Date: Wed, 28 May 2003 14:38:49 +0200
Subject: No subject
Message-ID:

Hi,

I'm a new user of EMBOSS and I would like to extract some information
from an EMBL flat file, like clone, strain, tissue ... that is stored in
the FT section. But I don't know how to do that.

I've looked for a tool that can extract features, but none of them
extracts these fields.
If such a tool doesn't exist, how can I develop it?

Please help me !!

Thanks

Sébastien Frade
BioInformatic Team
BAYER CropScience
1 Rue Pierre FONTAINE
91058 EVRY - France
tel : 33 (0) 1 69 47 61 52
fax : 33 (0) 1 69 47 61 42
mail : sebastien.frade at bayercropscience.com
http://bioinfo.evry.fr.bayercropscience

From pmr at ebi.ac.uk Wed May 28 14:30:43 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 28 May 2003 15:30:43 +0100
Subject:
References:
Message-ID: <3ED4C813.4010900@ebi.ac.uk>

sebastien.frade at bayercropscience.com wrote:
> Hi,
>
> I'm a new user of EMBOSS and i like to extract some information of EMBL flat
> file like clone, strain, tissue ... that are stored in the FT section.
> But i don't know how to do that.
>
> I've look for a tool that can extract features, but no one of them extract these
> fields.

This sounds like a task for SRS :-) http://srs.ebi.ac.uk/

EMBOSS really works with the sequence data. We can try to extract more of
the other data but it is a non-trivial task.

But ... you could write your own EMBOSS tool, and we can help you to do
that!!!
Hope this helps

Peter Rice

From maoj at mail.nih.gov Fri May 30 14:49:17 2003
From: maoj at mail.nih.gov (Jean Mao)
Date: Fri, 30 May 2003 10:49:17 -0400
Subject: question about dbiblast
Message-ID: <0d4401c326ba$a5847600$618a70a5@citjmao>

Hi, I am new to emboss db config and need some help indexing a blast db.

I have the following files in the directory:

-rw-rw-r-- 1 maoj Seqdb 190733 May 30 08:19 drosoph.nt.nhr
-rw-rw-r-- 1 maoj Seqdb 14108 May 30 08:19 drosoph.nt.nin
-rw-rw-r-- 1 maoj Seqdb 9360 May 30 08:19 drosoph.nt.nnd
-rw-rw-r-- 1 maoj Seqdb 84 May 30 08:19 drosoph.nt.nni
-rw-rw-r-- 1 maoj Seqdb 174584 May 30 08:19 drosoph.nt.nsd
-rw-rw-r-- 1 maoj Seqdb 3699 May 30 08:19 drosoph.nt.nsi
-rw-rw-r-- 1 maoj Seqdb 31368306 May 30 08:19 drosoph.nt.nsq

I believe these are files from NCBI, generated using formatdb version
2.2.5.

I ran dbiblast in this directory:

Index a BLAST database
Database name: drosoph
Database directory [.]:
Wildcard database filename [drosoph]: drosoph.nt.*
Release number [0.0]:
Index date [00/00/00]:
         N : nucleic
         P : protein
         ? : unknown
Sequence type [unknown]: N
         1 : wublast and setdb/pressdb
         2 : formatdb
         0 : unknown
Blast index version [unknown]: 2

Then I got many lines of the following message:

Warning: Duplicate ID skipped: '0?0?
?^DROSOPHILA' All hits will point to first ID found

The following new files were generated:

-rw-rw-r-- 1 maoj Seqdb 322 May 30 10:44 division.lkp
-rw-rw-r-- 1 maoj Seqdb 496 May 30 10:44 entrynam.idx
-rw-rw-r-- 1 maoj Seqdb 300 May 30 10:44 acnum.trg
-rw-rw-r-- 1 maoj Seqdb 300 May 30 10:44 acnum.hit

I then edited my ~/.embossrc file by adding the following lines:

DB drosoph [
   type: N
   method: blast
   format: ncbi
   dir: /data/maoj/emboss/db/blast/drosoph
   indexdir: /data/maoj/emboss/db/blast/drosoph
   file: "drosoph.nt.*"
   release: "0.0"
   comment: "blast drosoph"
]

Then I ran showdb:

% showdb
Displays information on the currently available databases
# Name     Type ID  Qry All Comment
# ====     ==== ==  === === =======
drosoph    N    OK  OK  OK  blast drosoph
test       N    OK  OK  OK  Test DB

The test db is genbank format and was running fine.

Then I ran seqret:

% seqret
Reads and writes (returns) sequences
Input sequence(s): drosoph:A*
Error: BLAST Query failed
Error: Unable to read sequence 'drosoph:A*'
Input sequence(s): drosoph:*

   EMBOSS An error in ajseqdb.c at line 4006:
error reading file /data/maoj/emboss/db/blast/drosoph/drosoph.nt.nhr

Please advise what I might have done wrong.

Thank you very much!!!

From pmr at ebi.ac.uk Fri May 30 18:24:58 2003
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 30 May 2003 19:24:58 +0100 (BST)
Subject: question about dbiblast
In-Reply-To: <0d4401c326ba$a5847600$618a70a5@citjmao>
References: <0d4401c326ba$a5847600$618a70a5@citjmao>
Message-ID: <1189.217.134.86.144.1054319098.squirrel@webmail.ebi.ac.uk>

> Hi, I am new in emboss db config. Need some help in indexing blast db.

This is the long-standing problem of the "new ASN.1 format blast
database".

NCBI changed formatdb to create a new index file format, but we have no
documentation on the new format, so we cannot update dbiblast to index it.
We hope to provide the ability to index these blast databases in a future
release, once NCBI release the format specification. I suspect EMBOSS and
FASTA are the only other applications using blast index formats so it is
not an urgent task for them.

Meanwhile, you need to use the 'old' format:

First, you need the original FASTA format file (drosophila.nt).

Then, index it with formatdb but add "-A F" to the command line (to turn
off ASN.1 format).

Hope this helps,

Peter Rice

From calvinwangxi at yahoo.com Sat May 31 11:43:11 2003
From: calvinwangxi at yahoo.com (calvin wang)
Date: Sat, 31 May 2003 04:43:11 -0700 (PDT)
Subject: new methods
In-Reply-To: <1189.217.134.86.144.1054319098.squirrel@webmail.ebi.ac.uk>
Message-ID: <20030531114311.68280.qmail@web41115.mail.yahoo.com>

I need to use T-COFFEE; I understand this is not part of EMBOSS. Is it
possible to include new methods in EMBOSS? How? Is there a guide? Thanks.

> Hi, I am new in emboss db config. Need some help in indexing blast db.

This is the long-standing problem of the "new ASN.1 format blast
database".

NCBI changed formatdb to create a new index file format, but we have no
documentation on the new format, so we cannot update dbiblast to index it.

We hope to provide the ability to index these blast databases in a future
release, once NCBI release the format specification. I suspect EMBOSS and
FASTA are the only other applications using blast index formats so it is
not an urgent task for them.

Meanwhile, you need to use the 'old' format:

First, you need the original FASTA format file (drosophila.nt).

Then, index it with formatdb but add "-A F" to the command line (to turn
off ASN.1 format).

Hope this helps,

Peter Rice
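
The formatdb workaround quoted above can be written down as a command
line. The sketch below, in Python, only assembles the argument list and
does not run anything. It rests on assumptions beyond the thread: the
helper name is hypothetical, and -i (input file) and -p (protein T/F) are
formatdb's usual options for naming the FASTA file and its sequence type;
only "-A F" comes from the advice itself.

```python
import shlex


def formatdb_old_format_cmd(fasta_file, protein=False):
    """Build a formatdb command that writes the old (non-ASN.1) index.

    "-A F" is the switch named in the thread; "-i" and "-p" are assumed
    here from formatdb's standard options.
    """
    return ["formatdb", "-i", fasta_file,
            "-p", "T" if protein else "F",
            "-A", "F"]


if __name__ == "__main__":
    # drosophila.nt is the FASTA file name used in the reply above.
    print(shlex.join(formatdb_old_format_cmd("drosophila.nt")))
```

After formatdb has rebuilt the database in the old format, dbiblast would
be re-run in the same directory as in the session shown earlier.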