From irv at midalink.net Mon Sep 3 00:04:09 2001
From: irv at midalink.net (Irv Edelman)
Date: Sun, 2 Sep 2001 22:04:09 -0600
Subject: prima application
Message-ID:

Hi,

I don't work for GCG any longer, but, when I did, I wrote a fair
number of application programs. I was just looking at the
description of the prima program on the EMBOSS web site and
noticed that it seemed remarkably familiar. Large sections of the
description seem to have been taken, verbatim, from the manual
entry I wrote for the GCG Prime program. The methods used in the
program, the program parameters, and the program function itself,
seem to be remarkably similar to the Prime program I wrote for
GCG. Yet the prima program is completely attributed to someone at
HGMP. Is that really so? Just curious.

Cheers,
Irv Edelman

From gwilliam at hgmp.mrc.ac.uk Mon Sep 3 04:15:24 2001
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 03 Sep 2001 09:15:24 +0100
Subject: prima application
References:
Message-ID: <3B933C1C.FC5F6366@hgmp.mrc.ac.uk>

Irv Edelman wrote:
>
> Hi,
>
> I don't work for GCG any longer, but, when I did, I wrote a fair
> number of application programs. I was just looking at the
> description of the prima program on the EMBOSS web site and
> noticed that it seemed remarkably familiar. Large sections of the
> description seem to have been taken, verbatim, from the manual
> entry I wrote for the GCG Prime program. The methods used in the
> program, the program parameters, and the program function itself,
> seem to be remarkably similar to the Prime program I wrote for
> GCG. Yet the prima program is completely attributed to someone at
> HGMP. Is that really so? Just curious.

Sorry - the 'prima' documentation is incorrect - I had been doing bulk
copies of documentation from the old EGCG programs into the
corresponding EMBOSS documentation and this one slipped through by
mistake.

'prima' has no code in common with the GCG 'prime' program.
I will change the documentation.

Gary

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From cutler at tularik.com Tue Sep 4 20:59:59 2001
From: cutler at tularik.com (Gene Cutler)
Date: Tue, 4 Sep 2001 17:59:59 -0700
Subject: drawing trees
In-Reply-To: <3B8B507D.B7639FE5@bioss.ac.uk>
References: <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID:

I finally got around to trying this, using protdist and neighbor as
suggested below. That worked, but gave me an ascii tree. Is there any
way to get a tree in postscript format?

>Gene Cutler asked:
>
>>Hello, all. I have a question about phylogenetic-type trees for
>>sequences. I haven't quite figured out how to do this using
>>emboss/phylip. This is how I have been doing this with gcg:
>>
>>run gcg program distances on the msf file
>>run gcg program growtree on the distances file
>>
>>How would I do this with PHYLIP instead?
>
>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>the DNADIST/PROTDIST and Neighbor programs in PHYLIP. In other words,
>they allow phylogenetic trees to be constructed using "distance-based"
>methods, but do not allow maximum likelihood or parsimony methods to be
>used. They also don't do bootstrapping tests, tree comparisons, and
>lots of other things.

From mikep at entigen.com Wed Sep 5 14:49:00 2001
From: mikep at entigen.com (Michael Poidinger)
Date: Wed, 05 Sep 2001 11:49:00 -0700
Subject: drawing trees
In-Reply-To:
References: <3B8B507D.B7639FE5@bioss.ac.uk> <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID: <5.0.2.1.0.20010905114303.02210eb0@mail.au.int.en-bio.com>

The Phylip programs drawgram and drawtree will produce postscript,
depending on whether you want rooted or unrooted trees respectively.

I tend to use drawgram, changing tree type to phenogram, grows
horizontally, angle of labels = 90

or, for interactive phylip options:

L N 1 2 P 4 90 y

At 05:59 PM 9/4/2001 -0700, you wrote:
>I finally got around to trying this, using protdist and neighbor as
>suggested below.
>That worked, but gave me an ascii tree. Is there any way to get a tree in
>postscript format?
>
>>Gene Cutler asked:
>>
>>>Hello, all. I have a question about phylogenetic-type trees for
>>>sequences. I haven't quite figured out how to do this using
>>>emboss/phylip. This is how I have been doing this with gcg:
>>>
>>>run gcg program distances on the msf file
>>>run gcg program growtree on the distances file
>>>
>>>How would I do this with PHYLIP instead?
>>
>>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>>the DNADIST/PROTDIST and Neighbor programs in PHYLIP. In other words,
>>they allow phylogenetic trees to be constructed using "distance-based"
>>methods, but do not allow maximum likelihood or parsimony methods to be
>>used. They also don't do bootstrapping tests, tree comparisons, and
>>lots of other things.
>

From seb at i112pc09.vu-wien.ac.at Tue Sep 11 04:59:45 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 10:59:45 +0200 (CEST)
Subject: Seq retrieval tool Announce
Message-ID:

Dear EMBOSS users,

I have been using the EMBOSS program suite for a couple of months and I
do not have local/direct access to the embl/genbank databases.
Since I wanted to use the programs without downloading the embl/genbank
entries by search/click/save every time, I have written a little PERL
program that can be easily used by most emboss programs as a
database/USA/app, external resource using the NIH Entrez server.

Maybe this program is useful for other people, too. Links to the
homepage of gbwget and the project/download pages are at the end of
this email.

I hope I did not waste your bandwidth!

Thanks,
Sebastian

Announce:

gbwget is a nucleic/protein sequence search and retrieval program to be
used mainly by users of the EMBOSS sequence analysis suite who do not
have local access to the huge genbank, embl or swissprot sequence
databases. It allows users to use most (?) of the EMBOSS programs
directly, without having to retrieve and store sequences manually
through web interfaces.

With most programs of the EMBOSS suite one can give Uniform Sequence
Addresses to directly access database entries to perform different
tasks. For instance, to quickly check for single restriction enzyme
sites in one or more cloning vectors (Example: pGEX-5x3 vector from
Amersham/Pharmacia, genbank ID is in the catalog) you only have to do:

restrict -single ::gb:U13858

and you have the list of enzymes. But only if you have direct access to
the database. Otherwise you have to open a web browser, go to
http://www.ncbi.nlm.nih.gov, choose nucleotide, search for U13858, save
the data file, and then run restrict on the file. And if you want to
check the other 9 pGEX vectors ??

My program 'gbwget' enables EMBOSS users to do exactly that without
local access to the db's, and much more. An alternative might be to
install the SRS program suite, but it's quite a large package and won't
compile on linux (at least for me and others).

I have written this program for me personally and use it now for my own
research in the field of molecular biology.
About:

gbwget is a command line/screen oriented tool to search in nucleotide
or protein databases and to view or retrieve database entries using the
Entrez server at http://www.ncbi.nlm.nih.gov. It is intended as a
sequence retrieval method for EMBOSS (The European Molecular Biology
Open Software Suite, see:
http://www.uk.embnet.org/Software/EMBOSS/index.html), an alternative to
the gcg sequence analysis suite. gbwget can also be used standalone,
but web-based retrieval systems might be more comfortable.

LICENSE: GPL

Homepage and Download:
http://gbwget.sourceforge.net and/or
http://sourceforge.net/projects/gbwget

Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at

From bauer at genprofile.com Tue Sep 11 05:57:45 2001
From: bauer at genprofile.com (David Bauer)
Date: Tue, 11 Sep 2001 11:57:45 +0200
Subject: Seq retrieval tool Announce
References:
Message-ID: <3B9DE019.6F13B542@genprofile.com>

Hi,

this is a nice remote entrez client.
But why don't you use the url method to retrieve entries from ncbi and
embl?

In emboss.default I have:

#####################################
DB gb [
   type: N
   method: url
   format: gb
   url: "http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=s&form=6&dopt=g&html=no&uid=%s"
   comment: "GenBank via Entrez WWW Server"
]

DB embldb [
   type: N
   method: url
   format: embl
   url: "http://www.ebi.ac.uk/htbin/emblfetch?%s"
   comment: "EMBL via EBI WWW Server"
]
##############################################

Then I can use "entret gb:U13858" to get the full entry or
"seqret gb:U13858" to get just the fasta formatted sequence without
header information. Same with embldb:U13858.

Ciao,
David.
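
For readers following the two definitions above, the role of the %s placeholder can be illustrated with a short shell sketch. It assumes that the url method does nothing more than substitute the entry ID from the USA into the template; that is an assumption made for illustration, not something taken from the EMBOSS source. The function name expand_db_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: expand a "%s" URL template by substituting the
# entry ID, which is what the url access method is ASSUMED to do here.
expand_db_url() {
    template=$1
    entry_id=$2
    # printf treats the template as a format string, so %s becomes the ID.
    # This is only safe when the single % sequence in the template is %s.
    printf "$template\n" "$entry_id"
}

# Show the address that "embldb:U13858" would map to under this assumption.
expand_db_url 'http://www.ebi.ac.uk/htbin/emblfetch?%s' U13858
```

Pasting the printed address into a browser or a command-line fetch tool is a quick way to check whether the remote server still answers in the format named in the DB definition.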

From seb at i112pc09.vu-wien.ac.at Tue Sep 11 06:47:35 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 12:47:35 +0200 (CEST)
Subject: Seq retrieval tool Announce
In-Reply-To: <3B9DE019.6F13B542@genprofile.com>
Message-ID:

On Tue, 11 Sep 2001, David Bauer wrote:

> Hi,
>
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?
>

That's right, thanks for the tip!

I wrote this program some time ago, before I even knew EMBOSS. The main
purpose was to have these "selection" lists to fetch entries in bulk. I
did not include any changes for the use in EMBOSS - it simply worked.

But you're right - it's somehow useless ;-)

Ciao,
Sebastian

Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at

From peter.rice at uk.lionbioscience.com Tue Sep 11 06:59:58 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 11 Sep 2001 11:59:58 +0100
Subject: Seq retrieval tool Announce
References: <3B9DE019.6F13B542@genprofile.com>
Message-ID: <3B9DEEAE.F5BD3834@uk.lionbioscience.com>

David Bauer wrote:
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?

True, but ...

The EMBOSS url method has already needed C source code changes when
Entrez output (and SRS output) changed. It can be very useful to have a
script to process these sites.

It is also a great help to have a script like this as a model for how
to use the external application access method. The original external
application was the ACEDB 'efetch' utility, no longer needed because
EMBOSS now uses (and creates) 'efetch' index files to index databases.

You can also use GCG's typedata as an external application, to save
reindexing a GCG database.

External applications normally read one entry at a time, but if gbwget
can read more than one entry and return them in an EMBOSS-friendly
format then it will do something URL access does not. It could also
produce more helpful error messages when access fails.

Peter

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From johann at egenetics.com Tue Sep 18 04:04:42 2001
From: johann at egenetics.com (Johann Visagie)
Date: Tue, 18 Sep 2001 10:04:42 +0200
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
Message-ID: <20010918100442.B25228@fling.sanbi.ac.za>

The following arrived in my personal mailbox for some reason. I'm not
sure I'm best qualified to assist this fellow.

-- Johann

----- Forwarded message from "Selvarani.P" -----

> From: "Selvarani.P"
> To: johann at egenetics.com
> Date: Tue, 18 Sep 2001 12:36:38 +0530 (IST)
>
>
> Respected Sir,
>
> It was a great time for us to know about your package EMBOSS. The
> stage is so set now, that we have installed the software and the software
> works fine with the test data and now we plan to update the database. But
> the file formats .seq, .ref, .numbers , .offset, .names couldn't be
> retrieved by us that were found in the "PIR directory of TEST" and so with
> other files found in the directories within "TEST". We want the updated
> copy of these databases. I would be grateful to you if you could arrange
> the same for me.
>
> from Selvarani P.
>

----- End forwarded message -----

From uma at avesthagen.com Tue Sep 18 05:11:28 2001
From: uma at avesthagen.com (Uma Maheswari)
Date: Tue, 18 Sep 2001 14:41:28 +0530 (IST)
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
In-Reply-To: <20010918100442.B25228@fling.sanbi.ac.za>
Message-ID:

I think u are referring to "indexing the database for EMBOSS"...if u
have a set of seq. (database) and u want EMBOSS prog. to use that, just
index the database for EMBOSS...The seq.
given in the test folder is just a sample one and u need not update it.
Check the application called dbiflat in EMBOSS...
http://www.uk.embnet.org/Software/EMBOSS/Apps/dbiflat.html

hth
uma.

On Tue, 18 Sep 2001, Johann Visagie wrote:

> The following arrived in my personal mailbox for some reason. I'm not sure
> I'm best qualified to assist this fellow.
>
> -- Johann
>
>
>
> ----- Forwarded message from "Selvarani.P" -----
>
> > From: "Selvarani.P"
> > To: johann at egenetics.com
> > Date: Tue, 18 Sep 2001 12:36:38 +0530 (IST)
> >
> >
> > Respected Sir,
> >
> > It was a great time for us to know about your package EMBOSS. The
> > stage is so set now, that we have installed the software and the software
> > works fine with the test data and now we plan to update the database. But
> > the file formats .seq, .ref, .numbers , .offset, .names couldn't be
> > retrieved by us that were found in the "PIR directory of TEST" and so with
> > other files found in the directories within "TEST". We want the updated
> > copy of these databases. I would be grateful to you if you could arrange
> > the same for me.
> >
> > from Selvarani P.
> >
> >
> > ----- End forwarded message -----
>

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S.UmaMaheswari,
Avesthagen Technologies Ltd,     Web  : http://www.avesthagen.com
Unit III,9th Floor,Discoverer,   Email: umasairam at rediffmail.com
ITPL,WhiteField Road,                   uma6666 at yahoo.com
Banglore-560 066.                Tel  : 080-8411665 ext.110
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From MarcL at DEVGEN.com Tue Sep 25 08:55:49 2001
From: MarcL at DEVGEN.com (Marc Logghe)
Date: Tue, 25 Sep 2001 14:55:49 +0200
Subject: passing two sequences to application with pipe
Message-ID:

Hi,

I know you can pipe a sequence (without needing to save it to a file
first) into an EMBOSS application using the -filter argument, e.g.

fastacmd -d nr -s p38398 | extractseq -filter -sformat ncbi -regions '10-110'

But what if your EMBOSS application expects two input sequences, like
e.g. diffseq ? As far as I know, -filter takes only the first sequence;
the second is lost. I tried something (silly) like

diffseq -filter -sformat ncbi -filter -sformat ncbi

or even numbering the arguments like

diffseq -filter1 -filter2

but nothing worked out. Is something like this possible anyhow ?

Marc

From tchiang at bioinfo.sickkids.on.ca Tue Sep 25 14:49:28 2001
From: tchiang at bioinfo.sickkids.on.ca (Ted Chiang)
Date: Tue, 25 Sep 2001 14:49:28 -0400 (EDT)
Subject: question about "PROFIT"
Message-ID:

Hi

I have a question about emboss' PROFIT. Could someone explain the
algorithm of how it uses a frequency matrix to scan a sequence to
determine whether that sequence is a match based on satisfying the
threshold percentage? The description documenting PROFIT seems a bit
confusing. Any lit. references?

-Ted

=====================================
Ted Chiang
Bioinformatics Supercomputing Centre
Hospital for Sick Children, Toronto
ext. 7028
tchiang at bioinfo.sickkids.on.ca

From Alain.Empain at ulg.ac.be Fri Sep 28 05:27:09 2001
From: Alain.Empain at ulg.ac.be (Alain EMPAIN)
Date: Fri, 28 Sep 2001 11:27:09 +0200
Subject: Problem to debug the 'external' database link
Message-ID: <01092811270909.12447@kwak>

Hi !

I am trying to link EMBOSS 2.0.1 tools to an internal database and I
cannot find a way to debug the error.

For ex. I replaced the app expression by

app: "echo %s > /tmp/log"

to at least take a look at what is passed, but nothing is written to
/tmp/log ??
---------------------------------------------------------
alain at kwak:/work/genbase/db$ seqret
Reads and writes (returns) sequences
Input sequence(s): app:essai
An error has been found: option -sequence: Unable to read sequence 'app:essai'
-----------------------
==> normal error because there is nothing returned, but the /tmp/log is not created
===========================================

Here is a real try, working well from the shell :

look 'AGLA13' /work/genbase/db/sequence.str | g_fasta-io -f

my .embossrc :
(...)
DB gmol [
   method: app
   format: fasta
   app: "look '%s' /work/genbase/db/sequence.str | g_fasta-io -f"
   type: P
   comment: "Genbase/db/sequence.str"
]
(...)

Thanks for any information,
Alain

+--------------------------------------------------------------------------------------
| Dr Alain EMPAIN Bioinformatique, Génétique Moléculaire B43,
| Fac. Méd. Vétérinaire, Univ. de Liège, Sart-Tilman / B-4000 Liège
| Alain.EMPAIN at ulg.ac.be
| WORK:+32 4 366 3821 Fax: +32 4 366 4122 GSM:+32 497 701764
| HOME:+32 85 512341 -- Rue des Martyrs,7 B-4550 Nandrin
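
One way to see what is actually passed to an external application is to point the app entry at a small wrapper script instead of an inline command. The sketch below is only an illustration: the script name gmol_fetch.sh, the log location and the GMOL_LOG/GMOL_DB variables are hypothetical, and it rests on the unconfirmed assumption that the app string may be run without a shell, in which case ">" and "|" would not be interpreted. Nothing in this thread confirms that this is the cause.

```shell
#!/bin/sh
# gmol_fetch.sh: hypothetical debugging wrapper for an EMBOSS "app" database.
# The log path and database path are illustrative defaults, not EMBOSS settings.
GMOL_LOG=${GMOL_LOG:-/tmp/gmol_fetch.log}
GMOL_DB=${GMOL_DB:-/work/genbase/db/sequence.str}

# Append one line per invocation, with each argument in brackets so that
# empty or unexpectedly split arguments are visible in the log.
log_call() {
    {
        printf 'called with %d argument(s):' "$#"
        for arg in "$@"; do
            printf ' [%s]' "$arg"
        done
        printf '\n'
    } >> "$GMOL_LOG"
}

# Run the pipeline from the message above, this time inside a real shell.
fetch_entry() {
    look "$1" "$GMOL_DB" | g_fasta-io -f
}

# Only act when an entry name was passed on the command line.
if [ "$#" -gt 0 ]; then
    log_call "$@"
    fetch_entry "$1"
fi
```

With the script made executable, the DB definition would then use something like app: "/path/to/gmol_fetch.sh %s" (the path is a placeholder). If the log file stays empty after running seqret, the wrapper was never started, which narrows the problem to the database definition or to how the USA is being resolved.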