From irv at midalink.net  Mon Sep  3 00:04:09 2001
From: irv at midalink.net (Irv Edelman)
Date: Sun, 2 Sep 2001 22:04:09 -0600
Subject: prima application
Message-ID: <NEBBJAPOEMOKCLJADLLKCEAJCMAA.irv@midalink.net>

Hi,

I don't work for GCG any longer, but, when I did, I wrote a fair
number of application programs.  I was just looking at the
description of the prima program on the EMBOSS web site and
noticed that it seemed remarkably familiar.  Large sections of the
description seem to have been taken, verbatim, from the manual
entry I wrote for the GCG Prime program.  The methods used in the
program, the program parameters, and the program function itself,
seem to be remarkably similar to the Prime program I wrote for
GCG.  Yet the prima program is completely attributed to someone at
HGMP.  Is that really so?  Just curious.

Cheers,
    Irv Edelman


From gwilliam at hgmp.mrc.ac.uk  Mon Sep  3 04:15:24 2001
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 03 Sep 2001 09:15:24 +0100
Subject: prima application
References: <NEBBJAPOEMOKCLJADLLKCEAJCMAA.irv@midalink.net>
Message-ID: <3B933C1C.FC5F6366@hgmp.mrc.ac.uk>

Irv Edelman wrote:
> 
> Hi,
> 
> I don't work for GCG any longer, but, when I did, I wrote a fair
> number of application programs.  I was just looking at the
> description of the prima program on the EMBOSS web site and
> noticed that it seemed remarkably familiar.  Large sections of the
> description seem to have been taken, verbatim, from the manual
> entry I wrote for the GCG Prime program.  The methods used in the
> program, the program parameters, and the program function itself,
> seem to be remarkably similar to the Prime program I wrote for
> GCG.  Yet the prima program is completely attributed to someone at
> HGMP.  Is that really so?  Just curious.

Sorry - the 'prima' documentation is incorrect - I had been doing bulk
copies of documentation from the old EGCG programs into the
corresponding EMBOSS documentation and this one slipped through by
mistake. 

'prima' has no code in common with the GCG 'prime' program.

I will change the documentation.

Gary

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


From cutler at tularik.com  Tue Sep  4 20:59:59 2001
From: cutler at tularik.com (Gene Cutler)
Date: Tue, 4 Sep 2001 17:59:59 -0700
Subject: drawing trees
In-Reply-To: <3B8B507D.B7639FE5@bioss.ac.uk>
References: <a05101006b7b041c61d1e@[192.168.50.41]>
 <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID: <a05101003b7bb29023af1@[192.168.50.41]>


I finally got around to trying this, using protdist and neighbor as
suggested below.
That worked, but gave me an ASCII tree.  Is there any way to get a
tree in PostScript format?


>Gene Cutler asked:
>
>>Hello, all.  I have a question about phylogenetic-type trees for
>>sequences.  I haven't quite figured out how to do this using
>>emboss/phylip.  This is how I have been doing this with gcg:
>>
>>run gcg program distances on the msf file
>>run gcg program growtree on the distances file
>>
>>How would I do this with PHYLIP instead?
>
>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>the DNADIST/PROTDIST and Neighbor programs in PHYLIP.  In other words,
>they allow phylogenetic trees to be constructed using "distance-based"
>methods, but do not allow maximum likelihood or parsimony methods to be
>used.  They also don't do bootstrapping tests, tree comparisons, and
>lots of other things.


From mikep at entigen.com  Wed Sep  5 14:49:00 2001
From: mikep at entigen.com (Michael Poidinger)
Date: Wed, 05 Sep 2001 11:49:00 -0700
Subject: drawing trees
In-Reply-To: <a05101003b7bb29023af1@[192.168.50.41]>
References: <3B8B507D.B7639FE5@bioss.ac.uk>
 <a05101006b7b041c61d1e@[192.168.50.41]>
 <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID: <5.0.2.1.0.20010905114303.02210eb0@mail.au.int.en-bio.com>

The PHYLIP programs drawgram and drawtree will produce PostScript,
depending on whether you want rooted or unrooted trees, respectively.


I tend to use drawgram, changing the tree type to phenogram, growing
horizontally, with angle of labels = 90

or for interactive phylip options:

L
N
1
2
P
4
90
y
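
For scripted use, the menu answers listed above can be fed to drawgram on
standard input. What follows is only a minimal Python sketch. It assumes the
program is installed under the name `drawgram` and reads its menu answers
from standard input in the order given. Menu letters differ between PHYLIP
versions, and the program may first ask for a tree file or font file if it
does not find them under their default names, so check the sequence against
your own installation. The function names are illustrative.

```python
import shutil
import subprocess

# Menu answers quoted from the message above, in the order they are typed.
DRAWGRAM_ANSWERS = ["L", "N", "1", "2", "P", "4", "90", "y"]


def menu_input(answers):
    """Join menu answers into the text an interactive program reads from stdin."""
    return "".join(answer + "\n" for answer in answers)


def run_drawgram(answers=DRAWGRAM_ANSWERS, program="drawgram"):
    """Run drawgram non-interactively.

    Returns None when the program is not on PATH (assumed name: 'drawgram'),
    otherwise the CompletedProcess so the caller can inspect stdout/stderr.
    """
    if shutil.which(program) is None:
        return None
    return subprocess.run(
        [program],
        input=menu_input(answers),
        text=True,
        capture_output=True,
        check=False,
    )
```

Calling `run_drawgram()` in the directory that holds the tree file is roughly
equivalent to typing the answers by hand.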


At 05:59 PM 9/4/2001 -0700, you wrote:

>I finally got around to trying this, using protdist and neighbor as 
>suggested below.
>That worked, but gave me an ascii tree.  Is there any way to get a tree in 
>postscript
>format?
>
>
>
>>Gene Cutler asked:
>>
>>>Hello, all.  I have a question about phylogenetic-type trees for
>>>sequences.  I haven't quite figured out how to do this using
>>>emboss/phylip.  This is how I have been doing this with gcg:
>>>
>>>run gcg program distances on the msf file
>>>run gcg program growtree on the distances file
>>>
>>>How would I do this with PHYLIP instead?
>>
>>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>>the DNADIST/PROTDIST and Neighbor programs in PHYLIP.  In other words,
>>they allow phylogenetic trees to be constructed using "distance-based"
>>methods, but do not allow maximum likelihood or parsimony methods to be
>>used.  They also don't do bootstrapping tests, tree comparisons, and
>>lots of other things.
>


From seb at i112pc09.vu-wien.ac.at  Tue Sep 11 04:59:45 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 10:59:45 +0200 (CEST)
Subject: Seq retrieval tool Announce
Message-ID: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at>

Dear EMBOSS users,

I have been using the EMBOSS program suite for a couple of months and I do not
have local/direct access to the EMBL/GenBank databases. Since I wanted to use
the programs without downloading the EMBL/GenBank entries every time by
search/click/save ... I have written a little Perl program that can easily be
used by most EMBOSS programs as a database/USA/app, external resource using
the NIH Entrez server. Maybe this program is useful for other people, too.

Links to the homepage of gbwget and the project/download pages are at the end
of this email. I hope I did not waste your bandwidth!

Thanks,
Sebastian


Announce:
gbwget is a nucleic/protein sequence search and retrieval program to be used
mainly by users of the EMBOSS sequence analysis suite who do not have
local access to the huge GenBank, EMBL or SWISS-PROT sequence databases. It
allows users to use (most of) the EMBOSS programs directly, without
having to retrieve and store sequences manually through web interfaces. With
most programs of the EMBOSS suite one can give Uniform Sequence Addresses to
directly access database entries to perform different tasks. For instance, to
quickly check for single restriction enzyme sites in one or more cloning
vectors (example: the pGEX-5x3 vector from Amersham/Pharmacia; the GenBank ID
is in the catalog) you only have to run: restrict -single ::gb:U13858 and you
have the list of enzymes. But only if you have direct access to the database.
Otherwise you have to open a web browser, go to http://www.ncbi.nlm.nih.gov,
choose nucleotide, search for U13858, save the data file, and then run
restrict on the file. And what if you want to check the other 9 pGEX vectors?
My program 'gbwget' enables EMBOSS users to do exactly that without local
access to the databases, and much more. An alternative might be to install the
SRS program suite, but it is quite a large package and won't compile on Linux
(at least for me and others).

I have written this program for me personally and use it now for my own
research in the field of molecular biology.

About:
gbwget is a command line/screen oriented tool to search in nucleotide
or protein databases and to view or retrieve database entries using
the Entrez server at http://www.ncbi.nlm.nih.gov. It is intended as
a sequence retrieval method for EMBOSS (the European Molecular Biology
Open Software Suite, see:
http://www.uk.embnet.org/Software/EMBOSS/index.html), an alternative to
the GCG sequence analysis suite. gbwget can also be used standalone, but
web-based retrieval systems might be more comfortable.
LICENSE: GPL

Homepage and Download:
http://gbwget.sourceforge.net
and/or
http://sourceforge.net/projects/gbwget


Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at


From bauer at genprofile.com  Tue Sep 11 05:57:45 2001
From: bauer at genprofile.com (David Bauer)
Date: Tue, 11 Sep 2001 11:57:45 +0200
Subject: Seq retrieval tool Announce
References: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at>
Message-ID: <3B9DE019.6F13B542@genprofile.com>

Hi,

this is a nice remote entrez client.
But why don't you use the url method to retrieve entries from ncbi and
embl?

In emboss.default I have:
#####################################
DB gb [ type: N method: url format: gb
  url:
"http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=s&form=6&dopt=g&html=no&uid=%s"
  comment: "GenBank via Entrez WWW Server" ]
DB embldb [ type: N method: url format: embl
  url: "http://www.ebi.ac.uk/htbin/emblfetch?%s"
  comment: "EMBL via EBI WWW Server" ]
##############################################

Then I can use "entret gb:U13858" to get the full entry or "seqret
gb:U13858" to get just the FASTA-formatted sequence without header
information.
The same works with embldb:U13858.
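
To make the role of the %s placeholder concrete, here is a small Python
sketch of how a "database:entry" address could be expanded into a request
URL using the two templates above. This only illustrates the substitution
idea. The function name `expand_usa` is hypothetical, and the assumption that
EMBOSS simply replaces %s with the entry name is mine, not something taken
from the EMBOSS source.

```python
from urllib.parse import quote

# URL templates copied from the emboss.default entries quoted above.
URL_TEMPLATES = {
    "gb": "http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query"
          "?db=s&form=6&dopt=g&html=no&uid=%s",
    "embldb": "http://www.ebi.ac.uk/htbin/emblfetch?%s",
}


def expand_usa(usa):
    """Turn 'database:entry' into a URL by filling the %s placeholder."""
    db, sep, entry = usa.partition(":")
    if not sep or not entry:
        raise ValueError("expected 'database:entry', got %r" % usa)
    template = URL_TEMPLATES[db]  # KeyError for an unknown database name
    # Replace only the first %s, URL-quoting the entry name.
    return template.replace("%s", quote(entry), 1)


print(expand_usa("embldb:U13858"))
# prints http://www.ebi.ac.uk/htbin/emblfetch?U13858
```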

Ciao, 
David.


From seb at i112pc09.vu-wien.ac.at  Tue Sep 11 06:47:35 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 12:47:35 +0200 (CEST)
Subject: Seq retrieval tool Announce
In-Reply-To: <3B9DE019.6F13B542@genprofile.com>
Message-ID: <Pine.LNX.4.33.0109111242100.9512-100000@i112pc09.vu-wien.ac.at>

On Tue, 11 Sep 2001, David Bauer wrote:

> Hi,
>
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?
>
That's right, thanks for the tip! I wrote this program some time ago,
before I even knew of EMBOSS. The main purpose was to have these "selection"
lists to fetch entries in bulk. I did not make any changes for use with
EMBOSS - it simply worked.

But you're right - it's somehow useless ;-)

Ciao, Sebastian


Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at


From peter.rice at uk.lionbioscience.com  Tue Sep 11 06:59:58 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 11 Sep 2001 11:59:58 +0100
Subject: Seq retrieval tool Announce
References: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at> <3B9DE019.6F13B542@genprofile.com>
Message-ID: <3B9DEEAE.F5BD3834@uk.lionbioscience.com>

David Bauer wrote:
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?

True, but ...

The EMBOSS url method has already needed C source code changes when Entrez
output (and SRS output) changed. It can be very useful to have a script to
process these sites.

It is also a great help to have a script like this as a model for how to
use the external application access method.

The original external application was the ACEDB 'efetch' utility, no longer
needed because EMBOSS now uses (and creates) 'efetch' index files to index
databases.

You can also use GCG's typedata as an external application, to save
reindexing a GCG database.

External applications normally read one entry at a time, but if gbwget can
read more than one entry and return them in an EMBOSS-friendly format then
it will do something URL access does not.

It could also produce more helpful error messages when access fails.
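
As a rough model of such an external application, the Python sketch below
takes one entry name, writes the entry in FASTA format to standard output,
and reports failures on standard error with a non-zero return code. It is a
sketch under stated assumptions, not how gbwget works: the in-memory
`DEMO_ENTRIES` table stands in for a real server or index lookup, and all
names in it are hypothetical.

```python
import sys

# Hypothetical stand-in for a real lookup (server query, index file, ...).
DEMO_ENTRIES = {"DEMO1": "ACGTACGTAC"}


def fetch_fasta(entry_id, entries=DEMO_ENTRIES):
    """Return one entry as FASTA text, or raise LookupError with a clear message."""
    key = entry_id.strip().upper()
    if key not in entries:
        raise LookupError("entry %r not found" % entry_id)
    return ">%s\n%s\n" % (key, entries[key])


def main(argv, out=sys.stdout, err=sys.stderr):
    """Command-line entry point: returns 0 on success, 1 or 2 on failure."""
    if len(argv) != 2:
        err.write("usage: %s ENTRY_ID\n" % argv[0])
        return 2
    try:
        out.write(fetch_fasta(argv[1]))
    except LookupError as exc:
        # A specific message here is what helps when access fails.
        err.write("fetch failed: %s\n" % exc)
        return 1
    return 0
```

To use it as a script, end the file with `sys.exit(main(sys.argv))`.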

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From johann at egenetics.com  Tue Sep 18 04:04:42 2001
From: johann at egenetics.com (Johann Visagie)
Date: Tue, 18 Sep 2001 10:04:42 +0200
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
Message-ID: <20010918100442.B25228@fling.sanbi.ac.za>

The following arrived in my personal mailbox for some reason.  I'm not sure
I'm best qualified to assist this fellow.

-- Johann


----- Forwarded message from "Selvarani.P" <bioproj at physics.iisc.ernet.in> -----

> From: "Selvarani.P" <bioproj at physics.iisc.ernet.in>
> To: johann at egenetics.com
> Date: Tue, 18 Sep 2001 12:36:38 +0530 (IST)
> 
> 
> Respected Sir,
> 
>         It was a great time for us to know about your package EMBOSS. The
> stage is so set now, that we have installed the software and the software
> works fine with the test data and now we plan to update the database. But
> the file formats .seq, .ref, .numbers , .offset, .names couldn't be
> retrieved by us that were found in the "PIR directory of TEST" and so with
> other files found in the directories within "TEST". We want the updated
> copy of these databases. I would be grateful to you if you could arrange
> the same for me.
> 
> from Selvarani P.
> 
> 

----- End forwarded message -----


From uma at avesthagen.com  Tue Sep 18 05:11:28 2001
From: uma at avesthagen.com (Uma Maheswari)
Date: Tue, 18 Sep 2001 14:41:28 +0530 (IST)
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
In-Reply-To: <20010918100442.B25228@fling.sanbi.ac.za>
Message-ID: <Pine.LNX.4.33.0109181426290.11432-100000@mail.avesthagen.com>


I think you are referring to "indexing the database for EMBOSS". If you have
a set of sequences (a database) and you want EMBOSS programs to use it, just
index the database for EMBOSS. The sequences given in the test folder are
just a sample and you need not update them.

Check the application called dbiflat in EMBOSS:

http://www.uk.embnet.org/Software/EMBOSS/Apps/dbiflat.html

hth
uma.

On Tue, 18 Sep 2001, Johann Visagie wrote:

> The following arrived in my personal mailbox for some reason.  I'm not sure
> I'm best qualified to assist this fellow.
>
> -- Johann
>
>
>
> ----- Forwarded message from "Selvarani.P" <bioproj at physics.iisc.ernet.in> -----
>
> > From: "Selvarani.P" <bioproj at physics.iisc.ernet.in>
> > To: johann at egenetics.com
> > Date: Tue, 18 Sep 2001 12:36:38 +0530 (IST)
> >
> >
> > Respected Sir,
> >
> >         It was a great time for us to know about your package EMBOSS. The
> > stage is so set now, that we have installed the software and the software
> > works fine with the test data and now we plan to update the database. But
> > the file formats .seq, .ref, .numbers , .offset, .names couldn't be
> > retrieved by us that were found in the "PIR directory of TEST" and so with
> > other files found in the directories within "TEST". We want the updated
> > copy of these databases. I would be grateful to you if you could arrange
> > the same for me.
> >
> > from Selvarani P.
> >
> >
>
> ----- End forwarded message -----
>

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S.UmaMaheswari,
Avesthagen Technologies Ltd,            Web  : http://www.avesthagen.com
Unit III,9th Floor,Discoverer,          Email:  umasairam at rediffmail.com
ITPL,WhiteField Road,                           uma6666 at yahoo.com
Banglore-560 066.                       Tel  : 080-8411665 ext.110
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


From MarcL at DEVGEN.com  Tue Sep 25 08:55:49 2001
From: MarcL at DEVGEN.com (Marc Logghe)
Date: Tue, 25 Sep 2001 14:55:49 +0200
Subject: passing two sequences to application with pipe
Message-ID: <B78EB1B2063ED4119B6100B0D03D0BA1317FD9@morelia.be.devgen.com>

Hi,
I know you can pipe a sequence (without needing to save it to a file
first) into an EMBOSS application using the -filter argument, e.g.

  fastacmd -d nr -s p38398 | extractseq -filter -sformat ncbi -regions '10-110'

But what if your EMBOSS application expects two input sequences, like
diffseq?
As far as I know, -filter takes only the first sequence; the second is lost.
I tried something (silly) like

  diffseq -filter -sformat ncbi -filter -sformat ncbi

or even numbering the arguments, like

  diffseq -filter1 -filter2

but nothing worked.
Is something like this possible at all?
Marc


From tchiang at bioinfo.sickkids.on.ca  Tue Sep 25 14:49:28 2001
From: tchiang at bioinfo.sickkids.on.ca (Ted Chiang)
Date: Tue, 25 Sep 2001 14:49:28 -0400 (EDT)
Subject: question about "PROFIT"
Message-ID: <Pine.GSO.4.05.10109251441371.6656-100000@kenny.bioinfo.sickkids.on.ca>


Hi 

I have a question about EMBOSS's PROFIT.  Could someone explain the
algorithm it uses to scan a sequence with a frequency matrix and
decide whether that sequence is a match, based on satisfying the
threshold percentage?

The documentation for PROFIT seems a bit confusing.  Are there any
literature references?

-Ted


=====================================
Ted Chiang
Bioinformatics Supercomputing Centre
Hospital for Sick Children, Toronto
ext. 7028
tchiang at bioinfo.sickkids.on.ca


From Alain.Empain at ulg.ac.be  Fri Sep 28 05:27:09 2001
From: Alain.Empain at ulg.ac.be (Alain EMPAIN)
Date: Fri, 28 Sep 2001 11:27:09 +0200
Subject: Problem to debug the 'external' database link
Message-ID: <01092811270909.12447@kwak>

Hi !

I am trying to link EMBOSS 2.0.1 tools to an internal database and I cannot
find a way to debug the error.

For example, I replaced the app expression with
	app: "echo %s > /tmp/log"
to at least take a look at what is passed, but nothing
is written to /tmp/log.
 
---------------------------------------------------------
alain at kwak:/work/genbase/db$ seqret
Reads and writes (returns) sequences
Input sequence(s): app:essai
   An error has been found: option -sequence: 
	Unable to read sequence 'app:essai'
-----------------------
==> normal error because there is nothing returned, 
	but the /tmp/log is not created


===========================================
Here is a real try, working well from the shell :
look 'AGLA13' /work/genbase/db/sequence.str | g_fasta-io -f

my .embossrc :
(...)
DB  gmol  [
	method: app
	format: fasta
	app: "look '%s' /work/genbase/db/sequence.str | g_fasta-io -f"
	type: P
	comment: "Genbase/db/sequence.str"
]
(...)
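
One thing that may be worth ruling out is whether the app command is handed
to a shell at all. This is an assumption, not something confirmed in this
message. If the command is executed directly, `>` and `|` arrive as literal
arguments instead of acting as a redirection or a pipe, which would be
consistent with /tmp/log never being created. The Python sketch below only
illustrates that difference; `as_argv` is a hypothetical helper and it says
nothing about how EMBOSS itself launches the command.

```python
import shlex


def as_argv(command_template, entry):
    """Fill in %s and split the command as it would look without a shell."""
    return shlex.split(command_template.replace("%s", entry))


# Without a shell, '>' and '|' are just ordinary arguments.
print(as_argv("echo %s > /tmp/log", "essai"))
# prints ['echo', 'essai', '>', '/tmp/log']

print(as_argv("look '%s' /work/genbase/db/sequence.str | g_fasta-io -f", "AGLA13"))
# prints ['look', 'AGLA13', '/work/genbase/db/sequence.str', '|', 'g_fasta-io', '-f']
```

If that turns out to be the cause, putting the pipeline into a small wrapper
script and naming that script in app: would avoid the problem.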


	Thanks for any information,

	Alain
+--------------------------------------------------------------------------------------
|  Dr Alain EMPAIN      Bioinformatique, Génétique Moléculaire B43,
|  Fac. Méd. Vétérinaire, Univ. de Liège, Sart-Tilman / B-4000 Liège  
|       Alain.EMPAIN at ulg.ac.be
|       WORK:+32 4 366 3821 Fax: +32 4 366 4122   GSM:+32 497 701764
|       HOME:+32 85 512341  -- Rue des Martyrs,7  B-4550 Nandrin


From irv at midalink.net  Mon Sep  3 04:04:09 2001
From: irv at midalink.net (Irv Edelman)
Date: Sun, 2 Sep 2001 22:04:09 -0600
Subject: prima application
Message-ID: <NEBBJAPOEMOKCLJADLLKCEAJCMAA.irv@midalink.net>

Hi,

I don't work for GCG any longer, but, when I did, I wrote a fair
number of application programs.  I was just looking at the
description of the prima program on the EMBOSS web site and
noticed that it seemed remarkably familiar.  Large sections of the
description seem to have been taken, verbatim, from the manual
entry I wrote for the GCG Prime program.  The methods used in the
program, the program parameters, and the program function itself,
seem to be remarkably similar to the Prime program I wrote for
GCG.  Yet the prima program is completely attributed to someone at
HGMP.  Is that really so?  Just curious.

Cheers,
    Irv Edelman


From gwilliam at hgmp.mrc.ac.uk  Mon Sep  3 08:15:24 2001
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 03 Sep 2001 09:15:24 +0100
Subject: prima application
References: <NEBBJAPOEMOKCLJADLLKCEAJCMAA.irv@midalink.net>
Message-ID: <3B933C1C.FC5F6366@hgmp.mrc.ac.uk>

Irv Edelman wrote:
> 
> Hi,
> 
> I don't work for GCG any longer, but, when I did, I wrote a fair
> number of application programs.  I was just looking at the
> description of the prima program on the EMBOSS web site and
> noticed that it seemed remarkably familiar.  Large sections of the
> description seem to have been taken, verbatim, from the manual
> entry I wrote for the GCG Prime program.  The methods used in the
> program, the program parameters, and the program function itself,
> seem to be remarkably similar to the Prime program I wrote for
> GCG.  Yet the prima program is completely attributed to someone at
> HGMP.  Is that really so?  Just curious.

Sorry - the 'prima' documentation is incorrect - I had been doing bulk
copies of documentation from the old EGCG programs into the
corresponding EMBOSS documentation and this one slipped through by
mistake. 

'prima' has no code in common with the GCG 'prime' program.

I will change the documentation.

Gary

-- 
Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK


From cutler at tularik.com  Wed Sep  5 00:59:59 2001
From: cutler at tularik.com (Gene Cutler)
Date: Tue, 4 Sep 2001 17:59:59 -0700
Subject: drawing trees
In-Reply-To: <3B8B507D.B7639FE5@bioss.ac.uk>
References: <a05101006b7b041c61d1e@[192.168.50.41]>
 <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID: <a05101003b7bb29023af1@[192.168.50.41]>


I finally got around to trying this, using protdist and neighbor as 
suggested below.
That worked, but gave me an ascii tree.  Is there any way to get a 
tree in postscript
format?


>Gene Cutler asked:
>
>>Hello, all.  I have a question about phylogenetic-type trees for
>>sequences.  I haven't quite figured out how to do this using
>>emboss/phylip.  This is how I have been doing this with gcg:
>>
>>run gcg program distances on the msf file
>>run gcg program growtree on the distances file
>>
>>How would I do this with PHYLIP instead?
>
>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>the DNADIST/PROTDIST and Neighbor programs in PHYLIP.  In other words,
>they allow phylogenetic trees to be constructed using "distance-based"
>methods, but do not allow maximum likelihood or parsimony methods to be
>used.  They also don't do bootstrapping tests, tree comparisons, and
>lots of other things.


From mikep at entigen.com  Wed Sep  5 18:49:00 2001
From: mikep at entigen.com (Michael Poidinger)
Date: Wed, 05 Sep 2001 11:49:00 -0700
Subject: drawing trees
In-Reply-To: <a05101003b7bb29023af1@[192.168.50.41]>
References: <3B8B507D.B7639FE5@bioss.ac.uk>
 <a05101006b7b041c61d1e@[192.168.50.41]>
 <3B8B507D.B7639FE5@bioss.ac.uk>
Message-ID: <5.0.2.1.0.20010905114303.02210eb0@mail.au.int.en-bio.com>

The Phylip programs drawgram and drawtree will produce postscript, 
depedning on whether you want rooted or unrooted trees respectively


I tend to use drawgram, changing tree type to phenogram, grows 
horizontally, angle of labes = 90

or for interactive phylip options:

L
N
1
2
P
4
90
y


At 05:59 PM 9/4/2001 -0700, you wrote:

>I finally got around to trying this, using protdist and neighbor as 
>suggested below.
>That worked, but gave me an ascii tree.  Is there any way to get a tree in 
>postscript
>format?
>
>
>
>>Gene Cutler asked:
>>
>>>Hello, all.  I have a question about phylogenetic-type trees for
>>>sequences.  I haven't quite figured out how to do this using
>>>emboss/phylip.  This is how I have been doing this with gcg:
>>>
>>>run gcg program distances on the msf file
>>>run gcg program growtree on the distances file
>>>
>>>How would I do this with PHYLIP instead?
>>
>>The GCG DISTANCES program and GCG GROWTREE programs are very similar to
>>the DNADIST/PROTDIST and Neighbor programs in PHYLIP.  In other words,
>>they allow phylogenetic trees to be constructed using "distance-based"
>>methods, but do not allow maximum likelihood or parsimony methods to be
>>used.  They also don't do bootstrapping tests, tree comparisons, and
>>lots of other things.
>


From seb at i112pc09.vu-wien.ac.at  Tue Sep 11 08:59:45 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 10:59:45 +0200 (CEST)
Subject: Seq retrieval tool Announce
Message-ID: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at>

Dear EMBOSS users,

I'm using the EMBOSS program suite for a couple of months and I do not have
local/direct access to the embl/genbank databases. Since I wanted to use the
programs w/out every time downloading the embl/genbank entries by
search/click/save ... I have written a litte PERL program that can be easily
used by most emboss programs as a database/USA/app,external resource using
the NIH Entrez server. Maybe this program is useful for other people, too.

Links to the Homepage of gbwget and the project/download pages is at the end
of this email. I hope I did not waste your bandwith!

Thanks,
Sebastian


Announce:
gbwget is a nucleic/protein sequence search and retrieval program to be used
mainly by users of the EMBOSS sequence anylsis suite that do not have a
local access to the huge genbank, embl or swissprot sequence databases. It
allows users to directly use the (most ? of the) EMBOSS programs without
having to retrieve and store sequences manually through web interfaces. With
most programs of the EMBOSS suite one can give Uniform Sequence Addresses to
directly access database entries to perform different tasks. For instace to
quickly check for single restriction enzyme sites in one or more cloning
vectors (Example: pGEX-5x3 vector from Amersham/Pharmacia, genbank ID is in
the catalog) you only have to do: restrict -single ::gb:U13858 and you have
the list of enzymes. But only if you have direct access to the database.
Otherwise you have to open a webbrowser, go to http://www.ncbi.nlm.nih.gov,
choose nucleotide, search for U13858, save the data file, and the run
restrict on the file. And if you want to check the other 9 pGEX vectors ??
My program 'dbwget' enables EMBOSS users to do exactly that without local
access to the db's and much more. An alternative might be to install the SRS
program suite, but it's a quite large package and won't compile on linux (at
least for me and others).

I have written this program for me personally and use it now for my own
research in the field of molecular biology.

About:
gbwget is a command line/screen oriented tool to search in nucleotide
or protein databases and to view or retrieve database entries using
the Entrez server at http://www.ncbi.nlm.nih.gov. It is intended as
a sequence retrieval method for the EMBOSS (The European Molecular Biology
Open Software Suite, see:
http://www.uk.embnet.org/Software/EMBOSS/index.html) an alternative for
the gcg sequence analysis suite. gbwget can also be used standalone, but
web-based retrieval systems might be more comfortable.
LICENSE: GPL

Homepage and Download:
http://gbwget.sourceforge.net
and/or
http://sourceforge.net/projects/gbwget


Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at


From bauer at genprofile.com  Tue Sep 11 09:57:45 2001
From: bauer at genprofile.com (David Bauer)
Date: Tue, 11 Sep 2001 11:57:45 +0200
Subject: Seq retrieval tool Announce
References: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at>
Message-ID: <3B9DE019.6F13B542@genprofile.com>

Hi,

this is a nice remote entrez client.
But why don't you use the url method to retrieve entries from ncbi and
embl?

In emboss.default I have:
#####################################
DB gb [ type: N method: url format: gb
  url:
"http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=s&form=6&dopt=g&html=no&uid=%s"
  comment: "GenBank via Entrez WWW Server" ]
DB embldb [ type: N method: url format: embl
  url: "http://www.ebi.ac.uk/htbin/emblfetch?%s"
  comment: "EMBL via EBI WWW Server" ]
##############################################

Then I can use "entret gb:U13858" to get the full entry or just "seqret
gb:U13858" to get just the fasta formated sequence without header
information.
Same with embldb:U13858.

Ciao, 
David.


From seb at i112pc09.vu-wien.ac.at  Tue Sep 11 10:47:35 2001
From: seb at i112pc09.vu-wien.ac.at (Sebastian Bunka)
Date: Tue, 11 Sep 2001 12:47:35 +0200 (CEST)
Subject: Seq retrieval tool Announce
In-Reply-To: <3B9DE019.6F13B542@genprofile.com>
Message-ID: <Pine.LNX.4.33.0109111242100.9512-100000@i112pc09.vu-wien.ac.at>

On Tue, 11 Sep 2001, David Bauer wrote:

> Hi,
>
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?
>
That's right, thanks for the tip! I have written this program some time ago
before I even knew EMBOSS. The main purpose was to have this "selection"
lists to fetch entries in bulk. I did not include any changes for the use in
EMBOSS - it simply worked.

But you're right - it's somehow useless ;-)

Ciao, Sebastian


Sebastian Bunka, Dr. med. vet.
Inst. Med. Chemistry, Vet. University Vienna
Ph. +43-1-250 77 ext. 4208, Fax: ext. 4290
e-mail: Sebastian.Bunka at vu-wien.ac.at


From peter.rice at uk.lionbioscience.com  Tue Sep 11 10:59:58 2001
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Tue, 11 Sep 2001 11:59:58 +0100
Subject: Seq retrieval tool Announce
References: <Pine.LNX.4.33.0109111057120.9013-100000@i112pc09.vu-wien.ac.at> <3B9DE019.6F13B542@genprofile.com>
Message-ID: <3B9DEEAE.F5BD3834@uk.lionbioscience.com>

David Bauer wrote:
> this is a nice remote entrez client.
> But why don't you use the url method to retrieve entries from ncbi and
> embl?

True, but ...

The EMBOSS url method has already needed C source code changes when Entrez
output (and SRS output) changed. It can be very useful to have a script to
process these sites.

It is also a great help to have a script like this as a model for how to
use the external application access method.

The original external application was the ACEDB 'efetch' utility, no longer
needed because EMBOSS now uses (and creates) 'efetch' index files to index
databases.

You can also use GCG's typedata as an external application, to save
reindexing a GCG database.

External applications normally read one entry at a time, but if gbwget can
read more than one entry and return them in an EMBOSS-friendly format then
it will do something URL access does not.

It could also produce more helpful error messages when access fails.

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723


From johann at egenetics.com  Tue Sep 18 08:04:42 2001
From: johann at egenetics.com (Johann Visagie)
Date: Tue, 18 Sep 2001 10:04:42 +0200
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
Message-ID: <20010918100442.B25228@fling.sanbi.ac.za>

The following arrived in my personal mailbox for some reason.  I'm not sure
I'm best qualified to assist this fellow.

-- Johann


----- Forwarded message from "Selvarani.P" <bioproj at physics.iisc.ernet.in> -----

> From: "Selvarani.P" <bioproj at physics.iisc.ernet.in>
> To: johann at egenetics.com
> Date: Tue, 18 Sep 2001 12:36:38 +0530 (IST)
> 
> 
> Respected Sir,
> 
>         It was a great time for us to know about your package EMBOSS. The
> stage is so set now, that we have installed the software and the software
> works fine with the test data and now we plan to update the database. But
> the file formats .seq, .ref, .numbers , .offset, .names couldn't be
> retrieved by us that were found in the "PIR directory of TEST" and so with
> other files found in the directories within "TEST". We want the updated
> copy of these databases. I would be grateful to you if you could arrange
> the same for me.
> 
> from Selvarani P.
> 
> 

----- End forwarded message -----


From uma at avesthagen.com  Tue Sep 18 09:11:28 2001
From: uma at avesthagen.com (Uma Maheswari)
Date: Tue, 18 Sep 2001 14:41:28 +0530 (IST)
Subject: [bioproj@physics.iisc.ernet.in: ] (fwd)
In-Reply-To: <20010918100442.B25228@fling.sanbi.ac.za>
Message-ID: <Pine.LNX.4.33.0109181426290.11432-100000@mail.avesthagen.com>


I think you are referring to "indexing the database for EMBOSS". If you have
a set of sequences (a database) and you want EMBOSS programs to use it, just
index the database for EMBOSS. The sequences given in the test folder are
just a sample and you need not update them.

Check the application called dbiflat in EMBOSS:

http://www.uk.embnet.org/Software/EMBOSS/Apps/dbiflat.html

hth
uma.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
S.UmaMaheswari,
Avesthagen Technologies Ltd,            Web  : http://www.avesthagen.com
Unit III,9th Floor,Discoverer,          Email:  umasairam at rediffmail.com
ITPL,WhiteField Road,                           uma6666 at yahoo.com
Banglore-560 066.                       Tel  : 080-8411665 ext.110
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


From MarcL at DEVGEN.com  Tue Sep 25 12:55:49 2001
From: MarcL at DEVGEN.com (Marc Logghe)
Date: Tue, 25 Sep 2001 14:55:49 +0200
Subject: passing two sequences to application with pipe
Message-ID: <B78EB1B2063ED4119B6100B0D03D0BA1317FD9@morelia.be.devgen.com>

Hi,
I know you can pipe a sequence (without needing to save it to a file
first) into an EMBOSS application using the -filter argument, e.g.

fastacmd -d nr -s p38398 | extractseq -filter -sformat ncbi -regions '10-110'

But what if your EMBOSS application expects two input sequences, as e.g.
diffseq does?
As far as I know, -filter takes only the first sequence; the second is lost.
I tried something (silly) like

diffseq -filter -sformat ncbi -filter -sformat ncbi

or even numbering the arguments, like

diffseq -filter1 -filter2

but nothing worked.
Is something like this possible at all?
Marc


From tchiang at bioinfo.sickkids.on.ca  Tue Sep 25 18:49:28 2001
From: tchiang at bioinfo.sickkids.on.ca (Ted Chiang)
Date: Tue, 25 Sep 2001 14:49:28 -0400 (EDT)
Subject: question about "PROFIT"
Message-ID: <Pine.GSO.4.05.10109251441371.6656-100000@kenny.bioinfo.sickkids.on.ca>


Hi 

I have a question about EMBOSS's PROFIT.  Could someone explain the
algorithm by which it uses a frequency matrix to scan a sequence and
determine whether that sequence is a match, based on satisfying the
threshold percentage?

The documentation describing PROFIT seems a bit confusing.  Any literature
references?

-Ted


=====================================
Ted Chiang
Bioinformatics Supercomputing Centre
Hospital for Sick Children, Toronto
ext. 7028
tchiang at bioinfo.sickkids.on.ca


From Alain.Empain at ulg.ac.be  Fri Sep 28 09:27:09 2001
From: Alain.Empain at ulg.ac.be (Alain EMPAIN)
Date: Fri, 28 Sep 2001 11:27:09 +0200
Subject: Problem to debug the 'external' database link
Message-ID: <01092811270909.12447@kwak>

Hi !

I am trying to link EMBOSS 2.0.1 tools to an internal database and I cannot
find a way to debug the error.

For example, I replaced the app expression with
	app: "echo %s > /tmp/log"
to at least take a look at what is passed, but nothing
is written to /tmp/log.
 
---------------------------------------------------------
alain at kwak:/work/genbase/db$ seqret
Reads and writes (returns) sequences
Input sequence(s): app:essai
   An error has been found: option -sequence: 
	Unable to read sequence 'app:essai'
-----------------------
==> normal error because there is nothing returned, 
	but the /tmp/log is not created
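
A slightly richer probe than the plain echo can be sketched as a small shell
function. This is only an illustration, assuming (as the app: line in this
message expects) that %s is replaced by the entry name before the command is
run. The function name app_debug, the variable APP_DEBUG_LOG and the default
log path are hypothetical names chosen for the example, not EMBOSS conventions.

```shell
#!/bin/sh
# Hypothetical debugging stand-in for the command named in "app:".
# app_debug, APP_DEBUG_LOG and the default log path are illustrative
# names only; none of them is an EMBOSS convention.
LOG="${APP_DEBUG_LOG:-/tmp/app_debug.log}"

app_debug() {
    # Append the time, the argument count and every argument (bracketed,
    # so empty or space-containing values remain visible) to the log.
    {
        printf 'called at %s with %d argument(s)\n' "$(date)" "$#"
        for arg in "$@"; do
            printf '  arg: [%s]\n' "$arg"
        done
    } >> "$LOG"
    # Print a dummy FASTA record on standard output, named after the
    # first argument, so that the caller has something to parse.
    printf '>%s dummy entry\nACDEFGHIKLMNPQRSTVWY\n' "${1:-unknown}"
}

# Example call: logs one argument and prints a two-line FASTA record.
app_debug AGLA13
```

Saved as an executable script whose last line is app_debug "$@" instead of
the fixed example call, it could be named in the app: line in place of the
real command. If the log is still empty after running seqret, that would
suggest the command is never started at all, rather than started with
unexpected arguments.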


===========================================
Here is a real attempt, which works well from the shell:
look 'AGLA13' /work/genbase/db/sequence.str | g_fasta-io -f

my .embossrc :
(...)
DB  gmol  [
	method: app
	format: fasta
	app: "look '%s' /work/genbase/db/sequence.str | g_fasta-io -f"
	type: P
	comment: "Genbase/db/sequence.str"
]
(...)


	Thanks for any information,

	Alain
+--------------------------------------------------------------------------------------
|  Dr Alain EMPAIN      Bioinformatique, Génétique Moléculaire B43,
|  Fac. Méd. Vétérinaire, Univ. de Liège, Sart-Tilman / B-4000 Liège  
|       Alain.EMPAIN at ulg.ac.be
|       WORK:+32 4 366 3821 Fax: +32 4 366 4122   GSM:+32 497 701764
|       HOME:+32 85 512341  -- Rue des Martyrs,7  B-4550 Nandrin