From shrish at ccmb.res.in  Tue Jan  9 05:36:17 2007
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Tue, 9 Jan 2007 16:06:17 +0530 (IST)
Subject: [EMBOSS] (no subject)
Message-ID: <18119340.1168338977168.JavaMail.root@mailserver>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20070109/d1681de2/attachment.pl 

From Squig at web.de  Mon Jan 15 06:19:13 2007
From: Squig at web.de (Squig at web.de)
Date: Mon, 15 Jan 2007 12:19:13 +0100
Subject: [EMBOSS] EMBOSS 4.0 and libnucleus.so.4
Message-ID: <1056759432@web.de>

Hello,

I just installed EMBOSS 4.0 on my system and wanted to run a few tests to check that everything is working correctly.
Every tool I tried ends with the following message:

splitter: error while loading shared libraries: libnucleus.so.4: cannot open shared object file: No such file or directory

The binaries are located in "/usr/local/bin" and the libraries in "/usr/local/lib".

There are also these "libnucleus" files and symlinks:

libnucleus.a
libnucleus.la
libnucleus.so -> libnucleus.so.4.0.0
libnucleus.so.4 -> libnucleus.so.4.0.0
libnucleus.so.4.0.0


Am I overlooking some symlinks that still need to be added?

Any hint or help would be much appreciated.


With kind regards

Stefan Kesberg




From Squig at web.de  Mon Jan 15 08:21:23 2007
From: Squig at web.de (Squig at web.de)
Date: Mon, 15 Jan 2007 14:21:23 +0100
Subject: [EMBOSS] EMBOSS 4.0 and libnucleus.so.4
Message-ID: <1057011473@web.de>

Hello,

Running "ldd" showed that some dynamic libraries could not be found.
I added their path to "ld.so.conf" and restarted the system.

Now everything works fine :)

Thank you.


With kind regards

Stefan Kesberg


From pmr at ebi.ac.uk  Mon Jan 15 15:52:03 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Mon, 15 Jan 2007 20:52:03 -0000 (GMT)
Subject: [EMBOSS] emboss-bug list and Debian 2.0 on IBM T22
In-Reply-To: <20070115192119.907C087049@webmail223.herald.ox.ac.uk>
References: <20070115192119.907C087049@webmail223.herald.ox.ac.uk>
Message-ID: <3884.86.133.34.142.1168894323.squirrel@webmail.ebi.ac.uk>

Dear Robert,

> NB. My email to: emboss_bug at emboss.open-bio.org did not get transmitted,
> isn't
> there anyone there anymore? Anyway, I hope I can get help at this address

Yes we are here. The list address is emboss-bug (dash, not underscore).

But we have had very few messages on the emboss-bug list in the past
month. Has anyone else had error messages (or not had a reply from us)
from an emboss-bug message?

> I got a rather disagreeable rejection message when I sent this to
> emboss at emboss.open-bio.org

Hmm .... this one did get through to the emboss list.

> Dear Emboss_support
>
> I have tried and failed to install Emboss (on an IBM T22 laptop running
> under
> Debian 2.0).
>
> The config* files and the error messages from 'make' are attached.

The files seem to be corrupted ... only the config.log file looked right,
so I cannot see the error message(s).

> PS. I will also attempt to compile under cygwin on another machine. Wish
> me
> luck!

good luck!

regards,

Peter Rice


From shrish at ccmb.res.in  Tue Jan 16 06:56:55 2007
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Tue, 16 Jan 2007 17:26:55 +0530 (IST)
Subject: [EMBOSS] extracting 3' UTRs
Message-ID: <12502499.1168948615113.JavaMail.root@mailserver>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://lists.open-bio.org/pipermail/emboss/attachments/20070116/734a50ae/attachment.pl 

From jison at ebi.ac.uk  Tue Jan 16 09:49:53 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 16 Jan 2007 14:49:53 -0000 (GMT)
Subject: [EMBOSS] extracting 3' UTRs
In-Reply-To: <12502499.1168948615113.JavaMail.root@mailserver>
References: <12502499.1168948615113.JavaMail.root@mailserver>
Message-ID: <49980.84.92.187.247.1168958993.squirrel@webmail.ebi.ac.uk>

Hi Shrish

So far as I know, not directly, but it's easily done using a combination
of e.g. coderet, getorf, plotorf and seqret.

Should be obvious from the documentation, e.g.
http://emboss.sourceforge.net/apps/cvs/index.html
http://emboss.sourceforge.net/docs/emboss_tutorial/node4.html

If you envisage a single tool for your task, please let us know (at
emboss-bug at emboss.open-bio.org, please).

Cheers

Jon


> Hi!
> Is there a way to extract 3' UTRs using EMBOSS programs?
> Shrish
> Dr. Shrish Tiwari
> E503, Centre for Cellular and Molecular Biology
> Uppal Road, Hyderabad - 500 007, INDIA
> Phone: 91-40-27192777
> Alternate email: shrish.geo at yahoo.com
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


From David.Bauer at SCHERING.DE  Tue Jan 16 10:44:23 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Tue, 16 Jan 2007 16:44:23 +0100
Subject: [EMBOSS] extracting 3' UTRs
In-Reply-To: <49980.84.92.187.247.1168958993.squirrel@webmail.ebi.ac.uk>
Message-ID: <OF38A5EF46.2078413B-ONC1257265.0054DF9A-C1257265.00567631@schering.de>

Hi Shrish,

In principle this would be an easy task for 'extractfeat', because the EMBL
feature table definition also contains the feature key "3'UTR".
But almost nobody uses this feature key in practice.
So I would use coderet to look for the end of a CDS and then extract the
remaining part with seqret or extractseq.
It will be straightforward for single mRNA entries with one CDS.
If you want to do this at the genome level, you should take a look at
Ensembl (www.ensembl.org) and the Mart interface. There you can extract
3' UTRs.
But from my experience the annotation of UTRs is very incomplete so don't
expect to get something comprehensive with these methods.
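
For a single mRNA entry with one CDS, the coderet-then-extract approach boils
down to slicing the sequence after the last CDS base. A minimal Python sketch
of that one step, assuming the CDS end coordinate (1-based and inclusive, as in
a feature table location) has already been read from the entry. The function
name and the toy sequence are hypothetical, and this is not an EMBOSS program:

```python
def three_prime_utr(mrna: str, cds_end: int) -> str:
    """Return everything downstream of the CDS.

    cds_end is the 1-based, inclusive position of the last CDS base
    (the final base of the stop codon), as written in a feature table
    location such as 4..12.
    """
    if not 0 <= cds_end <= len(mrna):
        raise ValueError("cds_end lies outside the sequence")
    # Python slices are 0-based and half-open, so the base that follows
    # 1-based position cds_end sits at index cds_end.
    return mrna[cds_end:]


# Toy mRNA: 3-base 5' UTR, CDS at 4..12 (ATG AAA TAA), then the 3' UTR.
toy = "GGGATGAAATAAGCTTCC"
print(three_prime_utr(toy, 12))  # GCTTCC
```

An empty result simply means the entry ends at the stop codon, which is common
given how incomplete UTR annotation is.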

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 16/01/2007 15:49:53:

> Hi Shrish
>
> So far as I know, not directly, but it's easily done using a combination
> of e.g. coderet, getorf, plotorf and seqret.
>
> Should be obvious from the documentation, e.g.
> http://emboss.sourceforge.net/apps/cvs/index.html
> http://emboss.sourceforge.net/docs/emboss_tutorial/node4.html
>
> If you envisage a single tool for your task, please let us know to
> emboss-bug at emboss.open-bio.org please)
>
> Cheers
>
> Jon
>
>
>
>
> > Hi!
> > Is there a way to extract 3' UTRs using EMBOSS programs?
> > Shrish
> > Dr. Shrish Tiwari
> > E503, Centre for Cellular and Molecular Biology
> > Uppal Road, Hyderabad - 500 007, INDIA
> > Phone: 91-40-27192777
> > Alternate email: shrish.geo at yahoo.com


From maoj at helix.nih.gov  Tue Jan 16 13:14:15 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Tue, 16 Jan 2007 13:14:15 -0500
Subject: [EMBOSS] bug in restrict?
Message-ID: <000001c7399a$222f71f0$be4de780@CIT.NIH.GOV>

I used est:af436075 and ran the 'restrict' program in EMBOSS. One of the enzymes
that cut this sequence is called 'Tth111II'. When I use the 'redata' program
to look up this enzyme, the result shows that it comes from unpublished
observations. Since by default 'restrict' searches only enzymes that are
commercially available, I think the appearance of 'Tth111II' is a bug.
Please advise. Thanks.

Jean Mao

From maoj at helix.nih.gov  Tue Jan 16 13:41:43 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Tue, 16 Jan 2007 13:41:43 -0500
Subject: [EMBOSS] Bug in 'remap' program?
Message-ID: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>

Hi, I used the genbank:A00006 sequence to run the 'remap' program in EMBOSS. Among
the enzymes that cut,

BmgT120I, FmuI, PabI, TspRI and UnbI do not show any isoschizomers.
However, they all have isoschizomers according to the 'redata' program. One
thing they have in common is that they do not appear in the embossre.equ file.
All of them appear in the withrefm file. All of them except 'TspRI' appear in
the proto file. How can I make the rebaseextract program work so that
their isoschizomers are shown if they exist?

Thank you.

Jean Mao

From gbottu at ben.vub.ac.be  Wed Jan 17 03:59:24 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 17 Jan 2007 09:59:24 +0100
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
Message-ID: <20070117085924.GA2027@bigben.ulb.ac.be>

	Dear Jean,

The program remap by default only outputs one representative member (the 
prototype) of a series of isoschizomers and it only considers enzymes 
that have a commercial provider. If you want to see all enzymes you must 
run remap with parameters -nolimit -nocommercial.

	Regards,
	Guy Bottu,
	Belgian EMBnet Node


From maoj at helix.nih.gov  Wed Jan 17 08:30:45 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Wed, 17 Jan 2007 08:30:45 -0500
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <20070117085924.GA2027@bigben.ulb.ac.be>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
Message-ID: <000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>

 Hi Guy,

There is an inconsistency in the results I get. You may want to run remap with
the sequence I provide below to see the problem.

On the command line, I run:

% remap -opt

and accept all the defaults, using the file provided below. Part of the output
file looks like this (which I don't see when not using the -opt flag, why?):

===============================================================
# Enzymes that cut  Frequency   Isoschizomers
      AluI          1   MltI
      ApaI          1   PpeI
      AsuI          2   AspS9I,AvcI,Bac36I,Bal228I,BavAII,BavBII,Bce22I,BshKI,BsiZI,Bsp1894I,BspBII,BspF4I,Bsu54I,CcuI,Cfr13I,MaeK81II,Nsp7121I,NspIV,Pde12I,PspPI,Sau96I
      BfiI          1   BmrI,BmuI
  BmgT120I          2   
     BseSI          1   Bme1580I
     BsiYI          1   BflI,Bsc107I,Bsc4I,BseLI,AfiI,BslI,Bst22I
   Bsp120I          1   PspOMI
      BsrI          1   BseNI,Bse1I,BsrSI,Bst11I,Tsp1I
     Csp6I          1   CviQI,CviRII
     CviJI          2   CviKI,CviKI-1
     CviRI          1   HpyCH4V,HpyF44III
     DraII          1   EcoO109I
      FmuI          2   
    HaeIII          1   BecAII,Bim19II,Bme361I,BseQI,BshFI,BshI,BsnI,Bsp211I,BspANI,BspBRI,BspKI,BspRI,BsuRI,BteI,CltI,DsaII,EsaBC4I,FnuDI,BanAI,MchAII,MfoAI,NgoPII,NspLKI,PalI,Pde133I,PflKI,PhoI,PlaI,Pru2I,SbvI,SfaI,SuaI
    HgiJII          1   BpuI,Bsp519I,Bsu1854I,BvuI,Eco24I,Eco75KI,EcoT38I,FriOI,BanII,KoxII,PaeHI,SacNI
     Hpy8I          1   HpyBII
   Kaz48kI          1   PssI
     NlaIV          1   BmiI,BscBI,BspLI,AspNI,PspN4I
      PabI          1   
      RsaI          1   HpyBI,PlaAII,AfaI
      SduI          1   BmyI,BsoCI,Bsp1286I,BspLS2I,MhlI,NspII,AocII
     TspRI          1   
      UnbI          2   
==========================================================================


As you can see, 6 enzymes show NO isoschizomers. I assume all of them have
commercial supplier(s), since I accepted the default settings. However, when I
run the 'redata' program in EMBOSS on these 6 enzymes, some of them DO have
isoschizomers, yet the field was left blank. In addition, some of them have NO
suppliers listed, so they are not supposed to appear when I use the default
settings, are they?

Thank you in advance.


================= Sequence I Used =============================================


!!NA_SEQUENCE 1.0
LOCUS       A00006                    26 bp    DNA     linear   PAT 10-FEB-1993
DEFINITION  Artificial oligonucleotide sequence (Fra 3), sequence 5 from patent
            application EP0238993.
ACCESSION   A00006
VERSION     A00006.1  GI:57973
KEYWORDS    .
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
REFERENCE   1  (bases 1 to 26)
  AUTHORS   Auerswald,E.A., Schroeder,W., Schnabel,E., Bruns,W., Reinhardt,G.
            and Kotick,M.
  TITLE     Aprotinin homologues produced by genetic engineering
  JOURNAL   Patent: EP 0238993-A 5 30-SEP-1987;
            BAYER AG
FEATURES             Location/Qualifiers
     source          1..26
                     /organism="synthetic construct"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:32630"
ORIGIN

  A00006  Length: 26  January 11, 2007 09:44  Type: N  Check: 4746  ..

       1  CGCCGTACAC TGGGCCCTGC AAAGCT


-----Original Message-----
From: Guy Bottu [mailto:gbottu at ben.vub.ac.be] 
Sent: 17 January 2007 3:59
To: Jean Mao
Cc: emboss at lists.open-bio.org
Subject: Re: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
version 

	Dear Jean,

The program remap by default only outputs one representative member (the
prototype) of a series of isoschizomers and it only considers enzymes that
have a commercial provider. If you want to see all enzymes you must run
remap with parameters -nolimit -nocommercial.

	Regards,
	Guy Bottu,
	Belgian EMBnet Node

From gbottu at ben.vub.ac.be  Thu Jan 18 04:05:27 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 18 Jan 2007 10:05:27 +0100
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
	<000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>
	<20070117164311.GA8769@bigben.ulb.ac.be>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
Message-ID: <20070118090527.GA22252@bigben.ulb.ac.be>

On Wed, Jan 17, 2007 at 12:07:31PM -0500, Mao, Jean (NIH/CIT) [E] wrote:
> How about BmgT120I, based on the 'redata' program, it has isoschizomers, but non was listed in my output.
> UnbI has isoschizomers also and has NO commercial provider listed.

You have indeed pinpointed a bug or misfeature. The
problem might be that the prototype enzyme is AsuI,
but AsuI has no commercial providers.
This is easier to see in our MRS server
than with redata:
http://bendisk.ulb.ac.be/mrs/cgi-bin/mrs.cgi?db=rebase&query=BmgT120I
So several isoschizomers of AsuI are displayed in
the output instead of just one enzyme.

Could Alan Bleasby comment on this?

	Guy Bottu,
	BEN


From ajb at ebi.ac.uk  Thu Jan 18 04:47:20 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 18 Jan 2007 09:47:20 -0000 (GMT)
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
 version
In-Reply-To: <20070118090527.GA22252@bigben.ulb.ac.be>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
	<000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>
	<20070117164311.GA8769@bigben.ulb.ac.be>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
	<20070118090527.GA22252@bigben.ulb.ac.be>
Message-ID: <56655.81.98.244.247.1169113640.squirrel@webmail.ebi.ac.uk>

Hi Jean, Guy,

> Could Alan Bleasby comment about this ?

I'm currently looking at that area of the code (for restrict). I suspect
that it is just a problem with the positioning of the commercial
availability test.

Alan


From maoj at helix.nih.gov  Thu Jan 18 09:14:20 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 18 Jan 2007 09:14:20 -0500
Subject: [EMBOSS] question using 'matpatmotifs'
Message-ID: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>

Hi, 

I used 'swissprot:hair_drome' as the input sequence, ran 'patmatmotifs' in
EMBOSS and got 0 hits. However, when I used the same input sequence with
InterProScan, the result
(http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
shows that it contains the basic helix-loop-helix motif, which has
ID PS50888 in the PROSITE database. Is this a bug or did I do something wrong? I
also ran the same sequence through the 'motifs' program in the GCG package.
Again no hit was found.

Thank you.

Jean Mao

From ajb at ebi.ac.uk  Thu Jan 18 09:42:44 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 18 Jan 2007 14:42:44 -0000 (GMT)
Subject: [EMBOSS] question using 'matpatmotifs'
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
References: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <39856.81.98.244.247.1169131364.squirrel@webmail.ebi.ac.uk>

Hi Jean,

It is more like a feature. PS50888 is a matrix and patmatmotifs
doesn't deal with that type of PROSITE entry - it just compares
the pattern string entries.
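
To illustrate what a "pattern string" entry is, here is a minimal Python sketch
that turns a PROSITE pattern into a regular expression and scans a sequence
with it. It is only an illustration and not how patmatmotifs is implemented;
the function name and the test peptides are made up, and it ignores rarer
syntax such as '>' inside square brackets. A matrix entry such as PS50888 has
no pattern string at all, only position-specific scores, so nothing like this
can be applied to it:

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern string into a Python regular expression.

    Simplified: '>' inside square brackets is not handled.
    """
    translation = {
        "-": "",    # element separator
        "x": ".",   # any residue
        "{": "[^",  # {P} means "any residue except P"
        "}": "]",
        "(": "{",   # (2,4) is a repeat count
        ")": "}",
        "<": "^",   # pattern anchored at the N-terminus
        ">": "$",   # pattern anchored at the C-terminus
    }
    body = pattern.strip().rstrip(".")  # drop the terminating period
    return "".join(translation.get(ch, ch) for ch in body)


# PS00001, the N-glycosylation site pattern.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)                                # N[^P][ST][^P]
print(bool(re.search(regex, "MKNGSAL")))    # True
print(bool(re.search(regex, "MKNPSAL")))    # False
```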

Alan


> Hi,
>
> I used 'swissprot:hair_drome' as input sequence and run 'matpatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequnce on
> interproscan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-200
> 70118-14025926) show that it contains basic Helix-loop-helix motif which
> is
> ID PS50888 in prosite database. Is this a bug or did I do something wrong?
> I
> also run the same sequence against the 'motifs' program in GCG package.
> Again no hit was found.
>
> Thank you.
>
> Jean Mao


From gbottu at ben.vub.ac.be  Thu Jan 18 10:04:31 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 18 Jan 2007 16:04:31 +0100
Subject: [EMBOSS] question using 'matpatmotifs' - Checked by AntiVir
	DEMO ve
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
References: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <20070118150431.GA28210@bigben.ulb.ac.be>

On Thu, Jan 18, 2007 at 09:14:20AM -0500, Jean Mao wrote:
> I used 'swissprot:hair_drome' as input sequence and run 'matpatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequnce on
> interproscan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-200
> 70118-14025926) show that it contains basic Helix-loop-helix motif which is
> ID PS50888 in prosite database. Is this a bug or did I do something wrong? I
> also run the same sequence against the 'motifs' program in GCG package.
> Again no hit was found. 

The reason is that GCG motifs and EMBOSS patmatmotifs search only the 
PROSITE entries of type "pattern", while PS50888 is of type "matrix". If 
you want to search the complete PROSITE (patterns+matrices+rules), you can 
download the ps_scan script from ftp://ftp.expasy.org/databases/prosite/tools/ps_scan/sources
and the pftools package from 
ftp://ftp.isrec.isb-sib.ch/pub/sib-isrec/pftools/pft2.3
You can run this under EMBOSS with the wrappers4EMBOSS package 
(http://wemboss.sourceforge.net/).

	Hope this helps,
	Guy Bottu,
	BEN


From David.Bauer at SCHERING.DE  Thu Jan 18 09:58:09 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 18 Jan 2007 15:58:09 +0100
Subject: [EMBOSS] Antwort:  question using 'matpatmotifs'
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <OF5E94A951.C4B0ECE1-ONC1257267.0051FEB6-C1257267.00523BB1@schering.de>

Hi Jean,

This is an old problem with patmatmotifs.
The program uses only the traditional PROSITE patterns and
unfortunately cannot handle the newer PROSITE matrix entries.

Cheers,
David.

emboss-bounces at lists.open-bio.org wrote on 18/01/2007 15:14:20:

> Hi,
>
> I used 'swissprot:hair_drome' as input sequence and run 'matpatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequnce on
> interproscan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-200
> 70118-14025926) show that it contains basic Helix-loop-helix motif which is
> ID PS50888 in prosite database. Is this a bug or did I do something wrong? I
> also run the same sequence against the 'motifs' program in GCG package.
> Again no hit was found.
>
> Thank you.
>
> Jean Mao


From maoj at helix.nih.gov  Thu Jan 18 14:05:51 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 18 Jan 2007 14:05:51 -0500
Subject: [EMBOSS] about 'helixturnhelix'e
Message-ID: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>

Now that I know the reason why patmatmotifs can't find the HTH in my hair_drome
input sequence, I would like to know which program in the EMBOSS package CAN find
it in my sequence. I tried helixturnhelix but it still couldn't find it.
Does helixturnhelix use a matrix or a motif? It looks like a matrix to me from the
documentation. Please advise.

Thank you very much!

Jean

From pmr at ebi.ac.uk  Thu Jan 18 18:02:15 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Thu, 18 Jan 2007 23:02:15 -0000 (GMT)
Subject: [EMBOSS] about 'helixturnhelix'e
In-Reply-To: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>
References: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>
Message-ID: <1771.86.141.183.176.1169161335.squirrel@webmail.ebi.ac.uk>

Hi Jean,

> Now that I know the reason why patmatmotifs can't find HTH in my
> hair_drome
> input sequence, I would like to know what program in EMBOSS package CAN
> find
> it in my sequence. I tried helixturnhelix but still it couldn't find it.
> Does helixturnhelix using matrix or motif? looks like matrix to me in the
> documentation. Please advise.

helixturnhelix uses a matrix, but it is quite an old one from the days
when there were only about 20 HTH examples (and 2 of them were wrong).

We will look at ways to use the prosite matrix entries.

regards,

Peter


From pmr at ebi.ac.uk  Fri Jan 19 10:49:04 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 19 Jan 2007 15:49:04 -0000
Subject: [EMBOSS] Question regarding dbxflat entry number processed
In-Reply-To: <000a01c71946$94beb600$be4de780@CIT.NIH.GOV>
References: <000a01c71946$94beb600$be4de780@CIT.NIH.GOV>
Message-ID: <13132.193.173.109.1.1165419569.squirrel@webmail.ebi.ac.uk>

Hi Jean,

> Hi, I am using dbxflat to index a database. I would like to find out how
> many entries were processed. In the index file database.pxid, there is a
> line :
>
> Count      123456
>
> which is very close to the number of entries in the database file but not
> exact the same. Is there a way to find out? Thank you very much.

The count should be the number of IDs found. Do you perhaps have some
duplicate IDs?
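
One quick way to check for duplicates outside EMBOSS is to count the ID lines
directly. A minimal Python sketch, assuming an EMBL/UniProt-style flat file in
which each entry starts with an "ID   " line whose first field is the entry
name; the function name and the demo data are hypothetical:

```python
from collections import Counter
import io


def id_counts(handle) -> Counter:
    """Count how often each entry name occurs on 'ID' lines."""
    counts = Counter()
    for line in handle:
        if line.startswith("ID   "):
            # The entry name is the first token after the line code;
            # strip the trailing ';' used in newer EMBL ID lines.
            counts[line.split()[1].rstrip(";")] += 1
    return counts


# Hypothetical three-entry file in which one ID occurs twice.
demo = io.StringIO(
    "ID   AAA_HUMAN   Reviewed;  10 AA.\n//\n"
    "ID   BBB_HUMAN   Reviewed;  12 AA.\n//\n"
    "ID   AAA_HUMAN   Reviewed;  10 AA.\n//\n"
)
counts = id_counts(demo)
entries = sum(counts.values())
duplicates = {name: n for name, n in counts.items() if n > 1}
print(entries, len(counts), duplicates)  # 3 2 {'AAA_HUMAN': 2}
```

If the number of distinct IDs is lower than the number of entries, the
difference is a candidate explanation for a Count value that falls slightly
short of the entry total.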

regards,

Peter


From rls at ebi.ac.uk  Fri Jan 19 10:49:45 2007
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Fri, 19 Jan 2007 15:49:45 -0000
Subject: [EMBOSS] Output from seqret in fastaformat.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
Message-ID: <4576C594.3080609@ebi.ac.uk>

Hi,

Use -osdbname UNIPROT in the command line.

R:)

JK (Jesper Agerbo Krogh) wrote:
> Hi.. 
> 
> I've godt dbxflat to index the swissprot database.. but I'd like to have the output 
> formatted with the USA as the fasta ID. 
> 
> Current..:
> 
> seqret UNIPROT:Q12345
> Reads and writes (returns) sequences
> output sequence(s) [ies3_yeast.fasta]:
> 
>> IES3_YEAST Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
> 
> .. but I'd like.. 
> 
>> UNIPROT:Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
> 
> Is that possible? 
> 
> 

From pmr at ebi.ac.uk  Fri Jan 19 10:54:41 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 19 Jan 2007 15:54:41 -0000
Subject: [EMBOSS] Output from seqret in fastaformat.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
Message-ID: <14850.193.173.109.1.1165419775.squirrel@webmail.ebi.ac.uk>

Hi Jesper,

> I've godt dbxflat to index the swissprot database.. but I'd like to have
> the output
> formatted with the USA as the fasta ID.
>
> Current..:
>
> seqret UNIPROT:Q12345
> Reads and writes (returns) sequences
> output sequence(s) [ies3_yeast.fasta]:
>
>>IES3_YEAST Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
>
> .. but I'd like..
>
>>UNIPROT:Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
>
> Is that possible?

Tricky to do. Q12345 is not the sequence ID, it is only the accession
number. There are ways to rewrite UniProt as a FASTA format file and index
with dbxfasta but that loses the rest of the information in the entries.

A simple perl script to rearrange the ID lines is your easiest solution.
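
The same rearrangement can be sketched in Python instead of perl. This is only
an illustration, assuming header lines of the form ">ENTRYNAME ACCESSION
description" as in the seqret output quoted above, and lines that have already
had their newline stripped; the function name and the default database name
are assumptions:

```python
import re

# Matches a header of the form ">ENTRYNAME ACCESSION description".
HEADER = re.compile(r"^>(\S+)\s+(\S+)(.*)$")


def usa_header(line: str, dbname: str = "UNIPROT") -> str:
    """Rewrite '>ID ACC desc' as '>DBNAME:ACC desc'; pass other lines through."""
    match = HEADER.match(line)
    if match is None:
        return line  # sequence line, or a header without an accession
    _entry_name, accession, rest = match.groups()
    return f">{dbname}:{accession}{rest}"


for line in [">IES3_YEAST Q12345 Ino eighty subunit 3.", "MKFEDLLATNKQVQF"]:
    print(usa_header(line))
# >UNIPROT:Q12345 Ino eighty subunit 3.
# MKFEDLLATNKQVQF
```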

Alternatively, you could invent a new EMBOSS output format that uses the
DBname and accession to create the ID. But EMBOSS would still want to
write to a file called "ies3_yeast.*" because it uses the ID to make up
the default filename.

If you insist, you can try:

seqret UNIPROT:Q12345 -sid Q12345 -osdbname UNIPROT

which gives me the result you expect with the current developers' code (I
am away from the office today, and there have been changes to the way
database names are propagated to the output so release 4.0.0 may behave
slightly differently).

Hope that helps

Peter


From ajb at ebi.ac.uk  Fri Jan 19 11:22:07 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Fri, 19 Jan 2007 16:22:07 -0000 (GMT)
Subject: [EMBOSS] Explanation: time-warped messages
Message-ID: <40703.81.98.244.247.1169223727.squirrel@webmail.ebi.ac.uk>

Apologies for all the time-warped messages that have been
appearing on this list over the last day or two.

There has been a long-standing problem (from early December) with the
EBI's email setup in that it didn't always respond correctly to the
anti-spam mechanisms on the open-bio email lists. This was fixed by the
EBI Systems group this week.

So, some messages sent by EBI staff are only now getting through.

Alan


From mathog at caltech.edu  Fri Jan 19 11:19:44 2007
From: mathog at caltech.edu (David Mathog)
Date: Fri, 19 Jan 2007 08:19:44 -0800
Subject: [EMBOSS] Output from seqret in fastaformat
Message-ID: <E1H7wSu-0002dv-Qg@mendel.bio.caltech.edu>

Peter Rice wrote:

> "JK (Jesper Agerbo Krogh)" <JK at novozymes.com,<pmr at ebi.ac.uk>

I'm with Peter on this one.  There are way too many possible formats
for fasta comment lines for any software to support all of them.
This command line reformatting is exactly the sort of task my
'extract' program was written to handle (having faced the same
task myself more times than I can count).  Example:

% cat >foo.pfa <<EOD
>IES3_YEAST Q12345 Ino eighty subunit 3.
MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
NKNGLLENIL
EOD
% cat foo.pfa | extract -if '>' -mt -cols 'UNIPROT:[2,]'
UNIPROT:Q12345 Ino eighty subunit 3.
MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
NKNGLLENIL

So you can process the whole thing in a pipe or in two stages through
a temporary file. Your choice.

Extract is part of drm_tools (these have nothing to do with
"digital rights management", they were my initials long before drm
took on its current common meaning) from here:

ftp://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/drm_tools.tar.gz


The man page is here:

  http://saf.caltech.edu/saf_manuals/extract.html

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From uludag at ebi.ac.uk  Mon Jan 22 10:42:32 2007
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Mon, 22 Jan 2007 15:42:32 +0000
Subject: [EMBOSS] workflow ideas
Message-ID: <1169480552.4118.80.camel@emboss2.ebi.ac.uk>

Hi,

We have recently extended the EBI Soaplab server with new web services for
EMBOSS 4.0 applications, including the EMBASSY applications, and made it
publicly available at the following address.

   http://www.ebi.ac.uk/soaplab/emboss4/index.html

We are now in the process of building Taverna workflows to demonstrate
use cases for these services, and we need ideas for these use cases. If you
have ideas for EMBOSS services that you or your colleagues would use in
the future, and would like us to prepare workflow(s) for them, please
email me a brief description and I will prepare the workflow(s) for
your use case(s). These workflows will later be published in a public
repository. We are also interested in use cases that combine other web
services with the EMBOSS services, to demonstrate the interoperability
of the services.

Regards,
Mahmut


From maoj at helix.nih.gov  Mon Jan 22 16:52:33 2007
From: maoj at helix.nih.gov (jean mao)
Date: Mon, 22 Jan 2007 16:52:33 -0500
Subject: [EMBOSS] question about seqret
Message-ID: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>

Hi, I have an accession number but do not know which database(s) it
belongs to. Is there a way to search all available databases for it?
Can I configure this in the emboss.default file?

Thank you very much in advance.

Jean

From ajb at ebi.ac.uk  Mon Jan 22 17:54:49 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 22 Jan 2007 22:54:49 -0000 (GMT)
Subject: [EMBOSS] question about seqret
In-Reply-To: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>
References: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>
Message-ID: <39293.81.98.244.247.1169506489.squirrel@webmail.ebi.ac.uk>

Hello Jean,

The EMBOSS application 'whichdb' should do that.

Alan


> Hi, I would like to know is there a way I can search all databases
> available for one accession number I have which I don't know what
> database(s) it belongs to? May I do some configuration in the
> emboss.default file for that?
>
> Thank you very much in advance.
>
> Jean


From maoj at helix.nih.gov  Mon Jan 22 19:51:09 2007
From: maoj at helix.nih.gov (jean mao)
Date: Mon, 22 Jan 2007 19:51:09 -0500
Subject: [EMBOSS] question about seqret
Message-ID: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>

Thank you for your reply. I solved it by using the app method and the
emboss farm script by Simon Andrews.

Jean Mao.

From jison at ebi.ac.uk  Tue Jan 23 04:47:03 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 23 Jan 2007 09:47:03 -0000 (GMT)
Subject: [EMBOSS] question about seqret
In-Reply-To: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>
References: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>
Message-ID: <1186.84.92.187.247.1169545623.squirrel@webmail.ebi.ac.uk>

> Thank you for your reply. I solved it by using the app method and the
> emboss farm script by Simon Andrews.

FYI see
http://emboss.sourceforge.net/docs/themes/emboss_farm.script

Jon


From fangw at CLEMSON.EDU  Tue Jan 23 10:54:09 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Tue, 23 Jan 2007 10:54:09 -0500 (EST)
Subject: [EMBOSS] question!
Message-ID: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>

Dear EMBOSS people:

I am a Ph.D. student at Clemson University in the USA, and I am using your
EMBOSS software to extract the 5' upstream regions of a list of genes.
However, I have run into some problems:

I installed EMBOSS 2.10.0 on a Windows XP PC. However, the command
"extractfeat genbank:*" does not work. The error message is
"Error: Unable to read sequence 'genbank:4101655', Died: extractfeat
terminated: Bad value for '-sequence' and no prompt". But it works fine
with "extractfeat embl:AK222810". Do you know the reason?

Is there any way to access the Ensembl database? Is there a newer version of
EMBOSS that supports more databases and can be installed on
Windows XP?

Are all the databases that EMBOSS connects to the latest versions? I ask
because some databases do not give the same results as I get from the
database directly.

Thanks!

I am looking forward to your reply.

FANG WANG
Department of Genetics and Biochemistry
Clemson University


From jison at ebi.ac.uk  Tue Jan 23 12:31:02 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 23 Jan 2007 17:31:02 -0000 (GMT)
Subject: [EMBOSS] question!
In-Reply-To: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
References: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
Message-ID: <48163.84.92.187.247.1169573462.squirrel@webmail.ebi.ac.uk>

Dear FANG

The error message suggests that EMBOSS has not been configured to work with "genbank".
Every database you intend to use must be defined in one of the EMBOSS configuration
files "emboss.default" or ".embossrc".

"emboss.default" lives in the top-level emboss directory (e.g. /home/auser/emboss/emboss.default)
and is used for site-wide databases.

".embossrc" lives in your personal home directory and is used for your own databases (or for testing).

Please read the documentation which describes how to configure database access in these files:
http://emboss.sourceforge.net/docs/themes/Databases.html
http://emboss.sourceforge.net/admin/

Or ask your sysadmin to set up access for you (a better route if the database is a shared resource).

So far as I know, EMBOSS cannot read ensembl directly.

The answer to your last question is "It depends on which databases your installation is configured
to use" (see "emboss.default" and ".embossrc").

Good luck !

Cheers

Jon


> Dear EMBOSS people:
>
> I am a Ph.D. student at Clemson University in the USA, and I am using your
> EMBOSS software to extract the 5' upstream regions of a list of genes.
> However, I have run into some problems:
>
> I installed EMBOSS 2.10.0 on a Windows XP PC. However, the command
> "extractfeat genbank:*" does not work. The error message is
> "Error: Unable to read sequence 'genbank:4101655', Died: extractfeat
> terminated: Bad value for '-sequence' and no prompt". But it works fine
> with "extractfeat embl:AK222810". Do you know the reason?
>
> Is there any way to access the Ensembl database? Is there a newer version of
> EMBOSS that supports more databases and can be installed on
> Windows XP?
>
> Are all the databases that EMBOSS connects to the latest versions? I ask
> because some databases do not give the same results as I get from the
> database directly.
>
> Thanks!
>
> I am looking forward to your reply.
>
> FANG WANG
> Department of Genetics and Biochemistry
> Clemson University


From pmr at ebi.ac.uk  Tue Jan 23 12:45:22 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Jan 2007 17:45:22 +0000
Subject: [EMBOSS] question!
In-Reply-To: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
References: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
Message-ID: <45B649B2.3030000@ebi.ac.uk>

Dear Fang,

> I installed EMBOSS 2.10.0 on a Windows XP PC. However, the command
> "extractfeat genbank:*" does not work. The error message is "Error: Unable
> to read sequence 'genbank:4101655', Died: extractfeat terminated: Bad value for
> '-sequence' and no prompt". But it works fine with "extractfeat
> embl:AK222810". Do you know the reason?

If you used the database definitions provided with EMBOSS ... your genbank is
possibly pointing to the CBR server in Canada, which has now closed.

There is also a problem with the way SRS servers define the GI number - there
are now servers that index it, but as "gid", not as "gi" which EMBOSS
anticipated. We will change the field name in the next release of EMBOSS.


To test whether your genbank definition works, you could try the ID
We are now at release 4.0.0 which allows "gi" as a search field. Earlier 
versions only had "sv" (sequence version) ... whether that is indexed depends on 
the database provider. Indexing GenBank in EMBOSS does allow GI searches.

> Is there any way to access the Ensembl database? Is there a newer version of
> EMBOSS that supports more databases and can be installed on Windows XP?

Ah, you are running EMBOSS under windows? embosswin was provided by Andre 
Blavier up to EMBOSS 2.10.0. We now provide a beta release of EMBOSS 4.0.0 for 
windows (nobody did version 3.0.0 for windows).

Hmmmm ... we need to make that more obvious on the EMBOSS website. EMBOSSWIN is 
available by FTP from emboss.open-bio.org/pub/EMBOSS/windows/ ... only a few 
brave people have tested it so far, but they report that it is working.

> Are all the databases that EMBOSS connects to the latest versions? I ask
> because some databases do not give the same results as I get from the
> database directly.

That depends on where the databases are. There is a list of SRS servers you can 
check for the number of entries and the date they were indexed:

http://downloads.biowisdomsrs.com/publicsrs.html

for example:

DB genbank [ type: N method: srswww format: genbank
    url: "http://iubio.bio.indiana.edu/srsbin/cgi-bin/wgetz"
    dbalias: "genbankrelease"
    fields: "gi sv des org key"
    comment: "Genbank IDs" ]

You can also try Entrez databases in EMBOSS 4.0.0 ... I wonder how many users 
have been using entrez as an access method?

Hope that helps

Peter Rice


From fangw at CLEMSON.EDU  Tue Jan 23 13:42:25 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Tue, 23 Jan 2007 13:42:25 -0500 (EST)
Subject: [EMBOSS] question
Message-ID: <4339.130.127.150.224.1169577745.squirrel@wm.clemson.edu>

Dear EMBOSS people:

I have a list of genes and would like to extract only the 2000 bp 5' upstream
of each gene. I chose "extractfeat", but it did not give me any
result. Has anyone met the same problem before?  Looking forward to your
reply. Thanks!

Nice day,
Fang Wang


From golharam at umdnj.edu  Tue Jan 23 14:23:28 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Tue, 23 Jan 2007 14:23:28 -0500
Subject: [EMBOSS] transeq changes sequence id
Message-ID: <FFD943C623B048C8A779AB874CE49F27@PICO>

I'm using transeq to translate a bunch of sequences for me and noticed that
upon translation, it adds a '_1' to the seqid.  For example:
 
I give it a file with
>myseq
ATG...TAG
 
After translation, the resulting file contains:
>myseq_1
M...
 
Is there a way to prevent transeq from manipulating the FASTA header and
just translate the sequence?
 
Ryan
 

From golharam at umdnj.edu  Wed Jan 24 00:22:37 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 24 Jan 2007 00:22:37 -0500
Subject: [EMBOSS] need or want to support grid comping?
Message-ID: <6328897A6BB8418CAF90745B21F9C738@PICO>

Does anyone use (or need) EMBOSS tools to be supported in a web environment
with grid support, i.e. using EMBOSS-Explorer and having the programs execute
on a grid instead of the web server?
 
Is anyone currently doing this?  
 
Ryan
 

From David.Bauer at SCHERING.DE  Wed Jan 24 02:06:17 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 24 Jan 2007 08:06:17 +0100
Subject: [EMBOSS] Antwort:  transeq changes sequence id
In-Reply-To: <FFD943C623B048C8A779AB874CE49F27@PICO>
Message-ID: <OF6E108E75.FE8D6B76-ONC125726D.00252423-C125726D.00270755@schering.de>


Hi,

the _1 is there to indicate the frame which was used for translation.
You can use
transeq myseq.fa -frame 1,2
and this would give a fasta file with two protein sequences.
That's where the added number makes sense: it prevents the creation of
protein sequences which all have the same ID.

So much for the philosophy behind this number ;-)

And now a solution for your problem:

transeq test.fa | descseq -filter -name `infoseq -nohead -only -name test.fa`

This works only if you have just one sequence in the input file. If you
have a multiple sequence fasta file, you can use seqretsplit to create
individual sequence files for each sequence.
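
For multi-sequence files, another option is to strip the frame suffix from the FASTA headers after translation. This is only a sketch under some assumptions: a single translation frame (so removing the suffix cannot create duplicate IDs) and a `sed` that supports `-E`. The function name `strip_frame_suffix` is made up for illustration and is not part of EMBOSS.

```shell
# Remove a trailing _1 .. _6 frame suffix from the ID (first word) of each
# FASTA header line; sequence lines pass through unchanged.
strip_frame_suffix() {
    sed -E '/^>/ s/^(>[^[:space:]]+)_[1-6]([[:space:]]|$)/\1\2/'
}

# Hypothetical usage (file names are placeholders):
#   transeq -sequence test.fa -outseq stdout -auto | strip_frame_suffix > test.pep
```

IDs that merely contain an underscore and digit in the middle (e.g. `my_2_seq`) are left alone, because the pattern only matches at the end of the ID.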

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 23/01/2007 20:23:28:

> I'm using transeq to translate a bunch of sequences for me and noticed that
> upon translation, it adds a '_1' to the seqid.  For example:
>
> I give it a file with
> >myseq
> ATG...TAG
>
> After translation, the resulting file contains:
> >myseq_1
> M...
>
> Is there a way to prevent transeq from manipulating the FASTA header and
> just translate the sequence?
>
> Ryan


From David.Bauer at SCHERING.DE  Wed Jan 24 02:45:56 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 24 Jan 2007 08:45:56 +0100
Subject: [EMBOSS] Antwort:  question
In-Reply-To: <4339.130.127.150.224.1169577745.squirrel@wm.clemson.edu>
Message-ID: <OF6EF0246A.8A5EB468-ONC125726D.002A5E02-C125726D.002AA884@schering.de>


Hi Fang,

what about this:

extractfeat myseq.embl -type mRNA -join -before 2000 | seqret -filter -send 2000
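
As I read it, this works because `-before 2000` prepends the 2000 bases upstream of each feature, and `-send 2000` then keeps only the first 2000 positions of that result, i.e. the upstream part. The same coordinates can be worked out by hand. A minimal sketch for a forward-strand feature follows; `upstream_range` is a made-up helper name, and clipping at the start of the sequence is my assumption about the desired behaviour.

```shell
# Print "begin-end" (1-based, inclusive) for the region of up to LEN bases
# immediately upstream of a forward-strand feature starting at START.
# The range is clipped at position 1; nothing is printed if START is 1.
upstream_range() {
    start=$1
    len=$2
    [ "$start" -gt 1 ] || return 0
    begin=$((start - len))
    if [ "$begin" -lt 1 ]; then begin=1; fi
    echo "${begin}-$((start - 1))"
}

# Hypothetical usage with extractseq (file name and coordinate are placeholders):
#   extractseq myseq.embl -regions "$(upstream_range 5321 2000)" -outseq upstream.fa
```

Features on the reverse strand would need the mirror-image calculation plus a reverse complement, which this sketch does not attempt.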

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 23/01/2007 19:42:25:

> Dear EMBOSS people:
>
> I have a list of genes and would like to extract only the 2000 bp 5' upstream
> of each gene. I chose "extractfeat", but it did not give me any
> result. Has anyone met the same problem before?  Looking forward to your
> reply. Thanks!
>
> Nice day,
> Fang Wang


From jrvalverde at cnb.uam.es  Wed Jan 24 04:54:21 2007
From: jrvalverde at cnb.uam.es (Jose R. Valverde)
Date: Wed, 24 Jan 2007 10:54:21 +0100
Subject: [EMBOSS] need or want to support grid comping?
In-Reply-To: <6328897A6BB8418CAF90745B21F9C738@PICO>
References: <6328897A6BB8418CAF90745B21F9C738@PICO>
Message-ID: <20070124105421.35375d32@veda.cnb.uam.es>

On Wed, 24 Jan 2007 00:22:37 -0500
"Ryan Golhar" <golharam at umdnj.edu> wrote:
> Does anyone use (or need) EMBOSS tools to be supported in a web environment
> with grid support?  ie using EMBOSS-Explorer and have the programs execute
> on a grid instead of the web server?
>  
> Is anyone currently doing this?  
>  
> Ryan


Several people are working on this that I can think of off
the top of my head:

	- ourselves: we are looking into making the jEMBOSS batch facility
	  use the Grid (EGEE)
	- the EMBnet node in Mexico is also looking into gridifying EMBOSS
	  using EELA (on top of EGEE)
	- the Argentinian EMBnet node, in cooperation with the Belgian EMBnet
	  node, may already have solved the problem by porting wEMBOSS over
	  DRMAA. We want to look into their approach to see if it works over
	  the GridWay DRMAA implementation, which would mean they had already
	  solved it for EGEE and Globus at least
	- other initiatives may be ongoing within EMBRACE, but I believe they
	  are currently more interested in Web Services.

As far as I am aware, most of these projects are progressing slowly for a host
of fortuitous reasons:
	- we are now busy organizing courses and conferences, which delays
	  our work
	- MX is still building up development steam
	- AR's Martin Sarachu is fighting leukaemia

That said, any new hands are welcome. If you are interested we can provide
all the help and assistance needed, and it may turn out to be a trivial
task starting from Martin's work. Just let us know.

Otherwise, I expect this to be ready sometime this year in one way or another
(or all of them).

				j

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	Artificial Intelligence is of no use when the Natural kind is lacking


From Tim.Troup at ed.ac.uk  Wed Jan 24 12:16:16 2007
From: Tim.Troup at ed.ac.uk (Tim Troup)
Date: Wed, 24 Jan 2007 17:16:16 +0000
Subject: [EMBOSS] EMBOSS FTP Site Down?
Message-ID: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>

Hi,

Is the EMBOSS FTP site down? It keeps timing out for me.

ftp://emboss.open-bio.org/pub/EMBOSS

Tim

From dag at sonsorol.org  Wed Jan 24 13:05:03 2007
From: dag at sonsorol.org (Chris Dagdigian)
Date: Wed, 24 Jan 2007 13:05:03 -0500
Subject: [EMBOSS] EMBOSS FTP Site Down?
In-Reply-To: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
References: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
Message-ID: <AB5BFDDC-0A67-462A-952C-1F7E652628B2@sonsorol.org>

FTP is working OK for me.

Server side we look OK (open-bio FTP server).

I'm also monitoring the IDS and Firewall alerts in real time due to  
some other non-OBF related issues -- the only real FTP issue is that  
the intrusion detection appliance thinks that it has seen a few  
FTP:EXPLOIT:BOUNCE-ATTACK incursions against the open-bio servers  
today. I'm 99% certain that this is a false alarm, but I have not
disabled that particular attack signature yet. If your client is
trying odd things and redirections (perhaps to get past a NAT gateway
or firewall) then maybe you are getting caught in this trap.

Random side note:  if anyone tries non-anonymous FTP for longer than  
2 minutes with more than 5 failed login attempts then they are  
candidates for instant inclusion into a "drop all packets from this  
IP" list maintained within the firewall.

Tim - if you send me the IP address of where you are trying to  
connect from I can see if there are any messages on our end.

-Chris
open-bio.org


On Jan 24, 2007, at 12:16 PM, Tim Troup wrote:

> Hi,
>
> Is the EMBOSS FTP site down? It keeps timing out for me.
>
> ftp://emboss.open-bio.org/pub/EMBOSS
>


From arareko at campus.iztacala.unam.mx  Wed Jan 24 13:14:08 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 24 Jan 2007 12:14:08 -0600
Subject: [EMBOSS] EMBOSS FTP Site Down?
In-Reply-To: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
References: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
Message-ID: <45B7A1F0.1090605@campus.iztacala.unam.mx>

It's working ok here. Maybe your DNS can't resolve it.

Mauricio.

Tim Troup wrote:
> Hi,
> 
> Is the EMBOSS FTP site down? It keeps timing out for me.
> 
> ftp://emboss.open-bio.org/pub/EMBOSS
> 
> Tim

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From fangw at CLEMSON.EDU  Wed Jan 24 16:05:47 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Wed, 24 Jan 2007 16:05:47 -0500 (EST)
Subject: [EMBOSS] question!
Message-ID: <1100.130.127.150.224.1169672747.squirrel@wm.clemson.edu>

Hi, All:

Does anyone know which database version EMBOSS 2.10.0 connects to?
When I use EMBOSS to extract the 5' upstream sequence
of some genes, there are a lot of NNNN at the beginning of the output
file, which differs from the result I get manually from Ensembl
BioMart. BioMart gives the full sequence matching my request, but it must
be done manually.
Looking forward to your reply! ^_^

Nice day,
Fang Wang

From maoj at helix.nih.gov  Thu Jan 25 08:41:55 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 25 Jan 2007 08:41:55 -0500
Subject: [EMBOSS] question about display double-stranded DNA
Message-ID: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>

Hi,

When using remap, I prefer to use the '-noreverse' flag so that the
translation of my DNA is located closer to my DNA strand. However, using
this flag also removes the complementary strand of my DNA from the output, which
is less convenient when designing primers. Is there a way in remap to display
double-stranded DNA but turn off the restriction sites of the complementary
strand?

If not, is there a program in EMBOSS which can retrieve the sequence from
a database, select start/end points and display both strands? I tried seqret
but failed.

Thank you  in advance.

Jean Mao

From pmr at ebi.ac.uk  Thu Jan 25 09:23:00 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 25 Jan 2007 14:23:00 +0000
Subject: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
Message-ID: <45B8BD44.9050509@ebi.ac.uk>

Hi Jean,

> When using remap, I prefer to use the '-noreverse' flag so that the
> translation of my DNA is located closer to my DNA strand. However, using
> this flag also removes the complementary strand of my DNA from the output, which
> is less convenient when designing primers. Is there a way in remap to display
> double-stranded DNA but turn off the restriction sites of the complementary
> strand?

I am looking at remap changes at the moment, I will see what I can do.

> If not, is there a program in EMBOSS which can retrieve the sequence from
> a database, select start/end points and display both strands? I tried seqret
> but failed.

Showseq does that.

It has a bug at present (I noticed it this week - fixed in the next release) 
that makes it show additional bases up to the end of the last line.

regards,

Peter

From pmr at ebi.ac.uk  Thu Jan 25 09:39:12 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 25 Jan 2007 14:39:12 +0000
Subject: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <416B9A34D7CA1C4C9ED58354E75101BB021CB85A@NIHCESMLBX3.nih.gov>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
	<45B8BD44.9050509@ebi.ac.uk>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB85A@NIHCESMLBX3.nih.gov>
Message-ID: <45B8C110.6040608@ebi.ac.uk>

Hi Jean,

> Peter, Thanks for reply. seqret can retrieve entry and select start/end
> points. But seqret does NOT display both strands. Does it?

Right. Seqret returns a sequence, so it can only report one strand at a time.

> Showseq does that.
> 
> It has a bug at present (I noticed it this week - fixed in the next release)
> that makes it show additional bases up to the end of the last line.

Oops. Spoke too soon. showseq uses the same display functions as remap and has
the same limitations.

I will see what we can do for the next release.

regards,

Peter

From gbottu at ben.vub.ac.be  Thu Jan 25 10:39:15 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 25 Jan 2007 16:39:15 +0100
Subject: [EMBOSS] question about display double-stranded DNA - Checked
	by An
In-Reply-To: <45B8BD44.9050509@ebi.ac.uk>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
	<45B8BD44.9050509@ebi.ac.uk>
Message-ID: <20070125153915.GA30474@bigben.ulb.ac.be>

On Thu, Jan 25, 2007 at 02:23:00PM +0000, Peter Rice wrote:
> I am looking at remap changes at the moment, I will see what I can do.

Could you consider an option to reject restriction enzymes that cut within
a certain range (or ranges)? This feature existed in GCG and is really
something we would like to have (back). It allows e.g. selecting enzymes that
cut around the gene you want to clone, but not inside.

	Guy Bottu,
	BEN


From golharam at umdnj.edu  Thu Jan 25 12:31:22 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Thu, 25 Jan 2007 12:31:22 -0500
Subject: [EMBOSS] request for a useful addition
Message-ID: <08780548D8B54EF086EAEEB74AE41C2D@PICO>

The program dottup (and other dotplot tools) takes the two sequences given
and displays a dotplot.  It would be useful if you could give it the option
to reverse complement one of the sequences then perform a dotplot.
 
I was comparing a mRNA with a genomic sequence and wasn't seeing the
similarity between the sequences.  Then I realized I needed to rev-comp the
mRNA and it showed up fine.  Of course, one can do this using revseq, but to
have dottup do it for you would be even better. 
 
Ryan
 

From David.Bauer at SCHERING.DE  Fri Jan 26 01:46:43 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Fri, 26 Jan 2007 07:46:43 +0100
Subject: [EMBOSS] Antwort:  request for a useful addition
In-Reply-To: <08780548D8B54EF086EAEEB74AE41C2D@PICO>
Message-ID: <OF3B055E59.308DA05C-ONC125726F.0024FD44-C125726F.00253DF6@schering.de>

This can be done with the option -sreverse1 or -sreverse2 to use the
reverse complement of the first or second sequence used as input for e.g.
dottup. It is a standard option available to all emboss programs. You can
get a list of those options with -help -verbose.
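
For illustration, a command line using this option might look like the commented line below. The file names are placeholders. The small shell function is not EMBOSS code; it only mimics the reverse-complement step on a plain sequence string so the effect can be checked without EMBOSS (it assumes the `rev` utility is installed and handles only the bases A, C, G, T).

```shell
# Hypothetical invocation (file names are placeholders):
#   dottup -asequence mrna.fa -bsequence genomic.fa -sreverse1 -graph ps

# Stand-in for the reverse-complement step: reads bare sequence lines
# (no FASTA header) on stdin and writes their reverse complement.
revcomp() {
    rev | tr 'ACGTacgt' 'TGCAtgca'
}
```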

David.

emboss-bounces at lists.open-bio.org wrote on 25/01/2007 18:31:22:

> The program dottup (and other dotplot tools) takes the two sequences given
> and displays a dotplot.  It would be useful if you could give it the option
> to reverse complement one of the sequences then perform a dotplot.
>
> I was comparing a mRNA with a genomic sequence and wasn't seeing the
> similarity between the sequences.  Then I realized I needed to rev-comp the
> mRNA and it showed up fine.  Of course, one can do this using revseq, but to
> have dottup do it for you would be even better.
>
> Ryan


From golharam at umdnj.edu  Fri Jan 26 12:19:53 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Fri, 26 Jan 2007 12:19:53 -0500
Subject: [EMBOSS] Antwort:  request for a useful addition
In-Reply-To: <A41E1A663B11424AA975D12187E3211C@PICO>
Message-ID: <92EAFE9D771342E99D302B254E3CC6F3@PICO>

Thanks.  I didn't see this in the list of options in 
EMBOSS-Explorer.  Luke, perhaps this can be added as an 
option to the interface?

Ryan


> > -----Original Message-----
> > From: David.Bauer at SCHERING.DE [mailto:David.Bauer at SCHERING.DE]
> > Sent: Friday, January 26, 2007 1:47 AM
> > To: golharam at umdnj.edu
> > Cc: emboss at lists.open-bio.org; emboss-bounces at lists.open-bio.org
> > Subject: Antwort: [EMBOSS] request for a useful addition
> > 
> > 
> > This can be done with the option -sreverse1 or -sreverse2 to
> > use the reverse complement of the first or second sequence
> > used as input for e.g. dottup. It is a standard option
> > available to all emboss programs. You can get a list of those
> > options with -help -verbose.
> > 
> > David.
> > 
> > emboss-bounces at lists.open-bio.org wrote on 25/01/2007 18:31:22:
> >
> > > The program dottup (and other dotplot tools) takes the two sequences
> > > given and displays a dotplot.  It would be useful if you could give it
> > > the option to reverse complement one of the sequences then perform a
> > > dotplot.
> > >
> > > I was comparing a mRNA with a genomic sequence and wasn't seeing the
> > > similarity between the sequences.  Then I realized I needed to
> > > rev-comp the mRNA and it showed up fine.  Of course, one can do this
> > > using revseq, but to have dottup do it for you would be even better.
> > >
> > > Ryan


From gbottu at ben.vub.ac.be  Tue Jan 30 04:31:35 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Tue, 30 Jan 2007 10:31:35 +0100
Subject: [EMBOSS] [dksamuel@gmail.com: Re: question about display
	double-strande
Message-ID: <20070130093135.GA10496@bigben.ulb.ac.be>

----- Forwarded message from Duleep Samuel <dksamuel at gmail.com> -----

Date: Sat, 27 Jan 2007 10:25:55 +0530
From: "Duleep Samuel" <dksamuel at gmail.com>
To: "Guy Bottu" <gbottu at ben.vub.ac.be>
Subject: Re: [EMBOSS] question about display double-stranded DNA - Checked by An
In-Reply-To: <20070125153915.GA30474 at bigben.ulb.ac.be>
X-AntiVirus: checked by AntiVir Milter 1.0.6; AVE 7.3.0.32; VDF 6.37.0.228

This will be useful; please add it if possible. Regards, Samuel

On 1/25/07, Guy Bottu <gbottu at ben.vub.ac.be> wrote:
>On Thu, Jan 25, 2007 at 02:23:00PM +0000, Peter Rice wrote:
>> I am looking at remap changes at the moment, I will see what I can do.
>
>Could you consider an option to reject restriction enzymes that cut within
>a certain range (or ranges)? This feature existed in GCG and is really
>something we would like to have (back). It allows e.g. selecting enzymes that
>cut around the gene you want to clone, but not inside.
>
>        Guy Bottu,
>        BEN
>

----- End forwarded message -----

From pmr at ebi.ac.uk  Mon Jan 15 20:52:03 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Mon, 15 Jan 2007 20:52:03 -0000 (GMT)
Subject: [EMBOSS] emboss-bug list and Debian 2.0 on IBM T22
In-Reply-To: <20070115192119.907C087049@webmail223.herald.ox.ac.uk>
References: <20070115192119.907C087049@webmail223.herald.ox.ac.uk>
Message-ID: <3884.86.133.34.142.1168894323.squirrel@webmail.ebi.ac.uk>

Dear Robert,

> NB. My email to: emboss_bug at emboss.open-bio.org did not get transmitted,
> isn't there anyone there anymore? Anyway, I hope I can get help at this
> address
Yes we are here. The list address is emboss-bug (dash, not underscore).

But we have had very few messages on the emboss-bug list in the past
month. Has anyone else had error messages (or not had a reply from us)
from an emboss-bug message?

> I got a rather disagreeable rejection message when I sent this to
> emboss at emboss.open-bio.org

Hmm .... this one did get through to the emboss list.

> Dear Emboss_support
>
> I have tried and failed to install Emboss (on an IBM T22 laptop running
> under Debian 2.0).
>
> The config* files and the error messages from 'make' are attached.

The files seem to be corrupted ... only the config.log file looked right,
so I cannot see the error message(s).

> PS. I will also attempt to compile under cygwin on another machine. Wish
> me luck!

good luck!

regards,

Peter Rice


From shrish at ccmb.res.in  Tue Jan 16 11:56:55 2007
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Tue, 16 Jan 2007 17:26:55 +0530 (IST)
Subject: [EMBOSS] extracting 3' UTRs
Message-ID: <12502499.1168948615113.JavaMail.root@mailserver>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20070116/734a50ae/attachment.ksh>

From jison at ebi.ac.uk  Tue Jan 16 14:49:53 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 16 Jan 2007 14:49:53 -0000 (GMT)
Subject: [EMBOSS] extracting 3' UTRs
In-Reply-To: <12502499.1168948615113.JavaMail.root@mailserver>
References: <12502499.1168948615113.JavaMail.root@mailserver>
Message-ID: <49980.84.92.187.247.1168958993.squirrel@webmail.ebi.ac.uk>

Hi Shrish

So far as I know, not directly, but it's easily done using a combination
of e.g. coderet, getorf, plotorf and seqret.

Should be obvious from the documentation, e.g.
http://emboss.sourceforge.net/apps/cvs/index.html
http://emboss.sourceforge.net/docs/emboss_tutorial/node4.html

If you envisage a single tool for your task, please let us know (at
emboss-bug at emboss.open-bio.org, please).

Cheers

Jon


> Hi!
> Is there a way to extract 3' UTRs using EMBOSS programs?
> Shrish
> Dr. Shrish Tiwari
> E503, Centre for Cellular and Molecular Biology
> Uppal Road, Hyderabad - 500 007, INDIA
> Phone: 91-40-27192777
> Alternate email: shrish.geo at yahoo.com


From David.Bauer at SCHERING.DE  Tue Jan 16 15:44:23 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Tue, 16 Jan 2007 16:44:23 +0100
Subject: [EMBOSS] extracting 3' UTRs
In-Reply-To: <49980.84.92.187.247.1168958993.squirrel@webmail.ebi.ac.uk>
Message-ID: <OF38A5EF46.2078413B-ONC1257265.0054DF9A-C1257265.00567631@schering.de>

Hi Shrish,

in principle this would be an easy task for 'extractfeat' because the EMBL
feature table definition also contains the feature key "3'UTR".
But almost nobody uses this feature key in practice.
So I would use coderet to look for the end of a CDS and then extract the
remaining part with seqret or extractseq.
It will be straightforward for single mRNA entries with one CDS.
If you want to do this on a genome level, you should take a look at
Ensembl (www.ensembl.org) and the Mart interface. There you can extract
3'UTR.
But from my experience the annotation of UTRs is very incomplete so don't
expect to get something comprehensive with these methods.
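
As a rough illustration of the "find the end of the CDS, then take the
rest" idea for a single mRNA entry with one CDS, here is a minimal Python
sketch. It does not call coderet or seqret; the function name, the toy
sequence and the CDS coordinates are all made up for the example, and it
assumes the CDS end is given as a 1-based position that includes the stop
codon, as in an EMBL/GenBank feature location.

```python
def three_prime_utr(mrna: str, cds_end: int) -> str:
    """Return everything downstream of the CDS.

    cds_end is the 1-based position of the last CDS base (stop codon
    included), e.g. 18 for a feature annotated as "CDS 4..18".
    """
    if not 0 <= cds_end <= len(mrna):
        raise ValueError("cds_end lies outside the sequence")
    # 1-based inclusive end == 0-based exclusive end, so slice from cds_end.
    return mrna[cds_end:]


# Hypothetical 30-base mRNA with a CDS annotated at 4..18 (ATG ... TAA).
mrna = "GGCATGAAACCCGGGTAAGCTTTTAATAAA"
utr = three_prime_utr(mrna, 18)
print(utr)
```

For entries with several CDS features or with joined locations this is
not enough; the choice of which CDS end to use would have to be made
explicitly.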

HTH,
David.

emboss-bounces at lists.open-bio.org schrieb am 16/01/2007 15:49:53:

> Hi Shrish
>
> So far as I know, not directly, but it's easily done using a combination
> of e.g. coderet, getorf, plotorf and seqret.
>
> Should be obvious from the documentation, e.g.
> http://emboss.sourceforge.net/apps/cvs/index.html
> http://emboss.sourceforge.net/docs/emboss_tutorial/node4.html
>
> If you envisage a single tool for your task, please let us know (at
> emboss-bug at emboss.open-bio.org).
>
> Cheers
>
> Jon
>
>
>
>
> > Hi!
> > Is there a way to extract 3' UTRs using EMBOSS programs?
> > Shrish
> > Dr. Shrish Tiwari
> > E503, Centre for Cellular and Molecular Biology
> > Uppal Road, Hyderabad - 500 007, INDIA
> > Phone: 91-40-27192777
> > Alternate email: shrish.geo at yahoo.com
> > _______________________________________________
> > EMBOSS mailing list
> > EMBOSS at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/emboss
> >
>
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss


From maoj at helix.nih.gov  Tue Jan 16 18:14:15 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Tue, 16 Jan 2007 13:14:15 -0500
Subject: [EMBOSS] bug in restrict?
Message-ID: <000001c7399a$222f71f0$be4de780@CIT.NIH.GOV>

I used est:af436075 and ran the 'restrict' program in EMBOSS. One of the enzymes
that cut this sequence is called 'Tth111II'. When I use the 'redata' program
to look up this enzyme, the result shows that it is from unpublished
observations. Since the default of 'restrict' is to search only enzymes that are
commercially available, I think the appearance of 'Tth111II' is a bug.
Please advise. Thanks.

Jean Mao


From maoj at helix.nih.gov  Tue Jan 16 18:41:43 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Tue, 16 Jan 2007 13:41:43 -0500
Subject: [EMBOSS] Bug in 'remap' program?
Message-ID: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>

Hi, I used the genbank:A00006 sequence to run the 'remap' program in EMBOSS. Among
the enzymes that cut,

BmgT120I, FmuI, PabI, TspRI and UnbI do not show any isoschizomers.
However, they all have isoschizomers according to the 'redata' program. One
thing they have in common is that they don't exist in the embossre.equ file.
All of them exist in the withrefm file. All of them except 'TspRI' exist in
the proto file. How can I make the rebaseextract program work so that they
show their isoschizomers, if any exist?

Thank you.

Jean Mao


From gbottu at ben.vub.ac.be  Wed Jan 17 08:59:24 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 17 Jan 2007 09:59:24 +0100
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
Message-ID: <20070117085924.GA2027@bigben.ulb.ac.be>

	Dear Jean,

The program remap by default only outputs one representative member (the 
prototype) of a series of isoschizomers and it only considers enzymes 
that have a commercial provider. If you want to see all enzymes you must 
run remap with parameters -nolimit -nocommercial.

	Regards,
	Guy Bottu,
	Belgian EMBnet Node


From maoj at helix.nih.gov  Wed Jan 17 13:30:45 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Wed, 17 Jan 2007 08:30:45 -0500
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <20070117085924.GA2027@bigben.ulb.ac.be>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
Message-ID: <000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>

 Hi Guy,

There is an inconsistency in the results I get. You may want to run remap with
the sequence I provide below to see the problem.

On the command line, I run:

% remap -opt

and accept all the defaults, using the file provided below. Part of the output
file looks like this (which I don't see when not using the -opt flag, why?):

===============================================================
# Enzymes that cut  Frequency   Isoschizomers
      AluI          1   MltI
      ApaI          1   PpeI
      AsuI          2   AspS9I,AvcI,Bac36I,Bal228I,BavAII,BavBII,Bce22I,BshKI,BsiZI,Bsp1894I,BspBII,BspF4I,Bsu54I,CcuI,Cfr13I,MaeK81II,Nsp7121I,NspIV,Pde12I,PspPI,Sau96I
      BfiI          1   BmrI,BmuI
  BmgT120I          2   
     BseSI          1   Bme1580I
     BsiYI          1   BflI,Bsc107I,Bsc4I,BseLI,AfiI,BslI,Bst22I
   Bsp120I          1   PspOMI
      BsrI          1   BseNI,Bse1I,BsrSI,Bst11I,Tsp1I
     Csp6I          1   CviQI,CviRII
     CviJI          2   CviKI,CviKI-1
     CviRI          1   HpyCH4V,HpyF44III
     DraII          1   EcoO109I
      FmuI          2   
    HaeIII          1   BecAII,Bim19II,Bme361I,BseQI,BshFI,BshI,BsnI,Bsp211I,BspANI,BspBRI,BspKI,BspRI,BsuRI,BteI,CltI,DsaII,EsaBC4I,FnuDI,BanAI,MchAII,MfoAI,NgoPII,NspLKI,PalI,Pde133I,PflKI,PhoI,PlaI,Pru2I,SbvI,SfaI,SuaI
    HgiJII          1   BpuI,Bsp519I,Bsu1854I,BvuI,Eco24I,Eco75KI,EcoT38I,FriOI,BanII,KoxII,PaeHI,SacNI
     Hpy8I          1   HpyBII
   Kaz48kI          1   PssI
     NlaIV          1   BmiI,BscBI,BspLI,AspNI,PspN4I
      PabI          1   
      RsaI          1   HpyBI,PlaAII,AfaI
      SduI          1   BmyI,BsoCI,Bsp1286I,BspLS2I,MhlI,NspII,AocII
     TspRI          1   
      UnbI          2   
==========================================================================


As you can see, 6 enzymes show NO isoschizomers. I assume all of them have
commercial supplier(s) since I accepted the default settings. However, according to
the 'redata' program in EMBOSS, some of these 6 enzymes DO have
isoschizomers but the field was left blank. In addition, some of them have NO
suppliers listed, so they are not supposed to appear when I use the default
settings, are they?

Thank you in advance.


================= Sequence I Used =============================================


!!NA_SEQUENCE 1.0
LOCUS       A00006                    26 bp    DNA     linear   PAT 10-FEB-1993
DEFINITION  Artificial oligonucleotide sequence (Fra 3), sequence 5 from patent
            application EP0238993.
ACCESSION   A00006
VERSION     A00006.1  GI:57973
KEYWORDS    .
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
REFERENCE   1  (bases 1 to 26)
  AUTHORS   Auerswald,E.A., Schroeder,W., Schnabel,E., Bruns,W., Reinhardt,G.
            and Kotick,M.
  TITLE     Aprotinin homologues produced by genetic engineering
  JOURNAL   Patent: EP 0238993-A 5 30-SEP-1987;
            BAYER AG
FEATURES             Location/Qualifiers
     source          1. .26
                     /organism="synthetic construct"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:32630"
ORIGIN

  A00006  Length: 26  January 11, 2007 09:44  Type: N  Check: 4746  ..

       1  CGCCGTACAC TGGGCCCTGC AAAGCT


-----Original Message-----
From: Guy Bottu [mailto:gbottu at ben.vub.ac.be] 
Sent: 17 January 2007 3:59
To: Jean Mao
Cc: emboss at lists.open-bio.org
Subject: Re: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
version 

	Dear Jean,

The program remap by default only outputs one representative member (the
prototype) of a series of isoschizomers and it only considers enzymes that
have a commercial provider. If you want to see all enzymes you must run
remap with parameters -nolimit -nocommercial.

	Regards,
	Guy Bottu,
	Belgian EMBnet Node


From gbottu at ben.vub.ac.be  Thu Jan 18 09:05:27 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 18 Jan 2007 10:05:27 +0100
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
	version
In-Reply-To: <416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
	<000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>
	<20070117164311.GA8769@bigben.ulb.ac.be>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
Message-ID: <20070118090527.GA22252@bigben.ulb.ac.be>

On Wed, Jan 17, 2007 at 12:07:31PM -0500, Mao, Jean (NIH/CIT) [E] wrote:
> How about BmgT120I? According to the 'redata' program it has isoschizomers, but none was listed in my output.
> UnbI has isoschizomers also and has NO commercial provider listed.

You have indeed pinpointed a bug or misfeature. The
problem might be that the prototype enzyme is AsuI,
but AsuI has no commercial providers.
It is easier to see this in our MRS server
than with redata:
http://bendisk.ulb.ac.be/mrs/cgi-bin/mrs.cgi?db=rebase&query=BmgT120I
So, several isoschizomers of AsuI are displayed in
the output instead of just one enzyme.

Could Alan Bleasby comment about this ?

	Guy Bottu,
	BEN


From ajb at ebi.ac.uk  Thu Jan 18 09:47:20 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 18 Jan 2007 09:47:20 -0000 (GMT)
Subject: [EMBOSS] Bug in 'remap' program? - Checked by AntiVir DEMO
 version
In-Reply-To: <20070118090527.GA22252@bigben.ulb.ac.be>
References: <000501c7399d$f7da5c40$be4de780@CIT.NIH.GOV>
	<20070117085924.GA2027@bigben.ulb.ac.be>
	<000a01c73a3b$b15dbb60$be4de780@CIT.NIH.GOV>
	<20070117164311.GA8769@bigben.ulb.ac.be>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB2EC@NIHCESMLBX3.nih.gov>
	<20070118090527.GA22252@bigben.ulb.ac.be>
Message-ID: <56655.81.98.244.247.1169113640.squirrel@webmail.ebi.ac.uk>

Hi Jean, Guy,

> Could Alan Bleasby comment about this ?

I'm currently looking at that area of the code (for restrict). I suspect
that it is just a problem with the positioning of the commercial
availability test.

Alan


From maoj at helix.nih.gov  Thu Jan 18 14:14:20 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 18 Jan 2007 09:14:20 -0500
Subject: [EMBOSS] question using 'matpatmotifs'
Message-ID: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>

Hi, 

I used 'swissprot:hair_drome' as the input sequence, ran 'patmatmotifs' in
EMBOSS and got 0 hits. However, when I used the same input sequence with
InterProScan, the result
(http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
shows that it contains the basic helix-loop-helix motif, which is
ID PS50888 in the PROSITE database. Is this a bug or did I do something wrong? I
also ran the same sequence against the 'motifs' program in the GCG package.
Again no hit was found.

Thank you.

Jean Mao


From ajb at ebi.ac.uk  Thu Jan 18 14:42:44 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 18 Jan 2007 14:42:44 -0000 (GMT)
Subject: [EMBOSS] question using 'matpatmotifs'
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
References: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <39856.81.98.244.247.1169131364.squirrel@webmail.ebi.ac.uk>

Hi Jean,

It is more like a feature. PS50888 is a matrix and patmatmotifs
doesn't deal with that type of PROSITE entry - it just compares
the pattern string entries.
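
To make the pattern/matrix distinction concrete: a PROSITE "pattern"
entry is essentially a regular expression written in PROSITE's own
syntax, so it can be matched as a string, whereas a matrix (profile)
entry needs position-specific scoring. The following Python sketch only
illustrates the pattern side. It is not how patmatmotifs is implemented,
the function name is invented for this example, it handles only the
common syntax elements, and the protein fragment is made up.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Handles x (any residue), [..] (allowed), {..} (forbidden), (n) and
    (n,m) repeats, and the < and > anchors. Rarer constructs, such as
    '>' inside square brackets, are not covered by this sketch.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # forbidden set
        element = element.replace("(", "{").replace(")", "}")   # repeat counts
        element = element.replace("x", ".")                     # any residue
        element = element.replace("<", "^").replace(">", "$")   # terminus anchors
        parts.append(element)
    return "".join(parts)


# N-glycosylation site pattern, written in PROSITE syntax.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
hit = re.search(regex, "MKNGSAP")  # made-up protein fragment
print(regex, hit.start(), hit.group())
```

An entry such as PS50888 has no pattern string of this kind at all, which
is why a pattern-only scanner reports no hit for it.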

Alan


> Hi,
>
> I used 'swissprot:hair_drome' as the input sequence, ran 'patmatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequence with
> InterProScan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
> shows that it contains the basic helix-loop-helix motif, which is
> ID PS50888 in the PROSITE database. Is this a bug or did I do something wrong? I
> also ran the same sequence against the 'motifs' program in the GCG package.
> Again no hit was found.
>
> Thank you.
>
> Jean Mao
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


From gbottu at ben.vub.ac.be  Thu Jan 18 15:04:31 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 18 Jan 2007 16:04:31 +0100
Subject: [EMBOSS] question using 'matpatmotifs' - Checked by AntiVir
	DEMO ve
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
References: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <20070118150431.GA28210@bigben.ulb.ac.be>

On Thu, Jan 18, 2007 at 09:14:20AM -0500, Jean Mao wrote:
> I used 'swissprot:hair_drome' as the input sequence, ran 'patmatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequence with
> InterProScan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
> shows that it contains the basic helix-loop-helix motif, which is
> ID PS50888 in the PROSITE database. Is this a bug or did I do something wrong? I
> also ran the same sequence against the 'motifs' program in the GCG package.
> Again no hit was found.

The reason is that GCG motifs and EMBOSS patmatmotifs search only the 
PROSITE entries of type "pattern", while PS50888 is of type "matrix". If 
you want to search the complete PROSITE (patterns+matrices+rules), you can 
download the ps_scan script from ftp://ftp.expasy.org/databases/prosite/tools/ps_scan/sources
and the pftools package from 
ftp://ftp.isrec.isb-sib.ch/pub/sib-isrec/pftools/pft2.3
You can run this under EMBOSS with the wrappers4EMBOSS package 
(http://wemboss.sourceforge.net/).

	Hope this helps,
	Guy Bottu,
	BEN


From David.Bauer at SCHERING.DE  Thu Jan 18 14:58:09 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 18 Jan 2007 15:58:09 +0100
Subject: [EMBOSS] Antwort:  question using 'matpatmotifs'
In-Reply-To: <000b01c73b0a$f2bd6a40$be4de780@CIT.NIH.GOV>
Message-ID: <OF5E94A951.C4B0ECE1-ONC1257267.0051FEB6-C1257267.00523BB1@schering.de>

Hi Jean,

this is an old problem with patmatmotifs.
The program uses only the traditional PROSITE patterns and
unfortunately cannot handle the newer PROSITE matrix entries.

Cheers,
David.

emboss-bounces at lists.open-bio.org schrieb am 18/01/2007 15:14:20:

> Hi,
>
> I used 'swissprot:hair_drome' as the input sequence, ran 'patmatmotifs' in
> EMBOSS and got 0 hits. However, when I used the same input sequence with
> InterProScan, the result
> (http://www.ebi.ac.uk/cgi-bin/iprscan/iprscan?tool=iprscan&jobid=iprscan-20070118-14025926)
> shows that it contains the basic helix-loop-helix motif, which is
> ID PS50888 in the PROSITE database. Is this a bug or did I do something wrong? I
> also ran the same sequence against the 'motifs' program in the GCG package.
> Again no hit was found.
>
> Thank you.
>
> Jean Mao
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss


From maoj at helix.nih.gov  Thu Jan 18 19:05:51 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 18 Jan 2007 14:05:51 -0500
Subject: [EMBOSS] about 'helixturnhelix'e
Message-ID: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>

Now that I know why patmatmotifs can't find the HTH in my hair_drome
input sequence, I would like to know which program in the EMBOSS package CAN find
it in my sequence. I tried helixturnhelix but it couldn't find it either.
Does helixturnhelix use a matrix or a motif? It looks like a matrix to me from the
documentation. Please advise.

Thank you very much!

Jean


From pmr at ebi.ac.uk  Thu Jan 18 23:02:15 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Thu, 18 Jan 2007 23:02:15 -0000 (GMT)
Subject: [EMBOSS] about 'helixturnhelix'e
In-Reply-To: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>
References: <001001c73b33$abca20f0$be4de780@CIT.NIH.GOV>
Message-ID: <1771.86.141.183.176.1169161335.squirrel@webmail.ebi.ac.uk>

Hi Jean,

> Now that I know why patmatmotifs can't find the HTH in my hair_drome
> input sequence, I would like to know which program in the EMBOSS package CAN
> find it in my sequence. I tried helixturnhelix but it couldn't find it either.
> Does helixturnhelix use a matrix or a motif? It looks like a matrix to me from
> the documentation. Please advise.

helixturnhelix uses a matrix, but it is quite an old one from the days
when there were only about 20 HTH examples (and 2 of them were wrong).

We will look at ways to use the prosite matrix entries.

regards,

Peter


From pmr at ebi.ac.uk  Fri Jan 19 15:49:04 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 19 Jan 2007 15:49:04 -0000
Subject: [EMBOSS] Question regarding dbxflat entry number processed
In-Reply-To: <000a01c71946$94beb600$be4de780@CIT.NIH.GOV>
References: <000a01c71946$94beb600$be4de780@CIT.NIH.GOV>
Message-ID: <13132.193.173.109.1.1165419569.squirrel@webmail.ebi.ac.uk>

Hi Jean,

> Hi, I am using dbxflat to index a database. I would like to find out how
> many entries were processed. In the index file database.pxid, there is a
> line :
>
> Count      123456
>
> which is very close to the number of entries in the database file but not
> exact the same. Is there a way to find out? Thank you very much.

The count should be the number of IDs found. Do you perhaps have some
duplicate IDs?
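
One way to check for duplicate IDs outside EMBOSS is to count the entry
ID lines in the flat file directly. The sketch below assumes an
EMBL/UniProt-style flat file in which every entry starts with a line
beginning "ID   "; the function name and the three-entry sample are
invented for illustration, and whether the index count corresponds to
the number of distinct IDs is only my guess from the reply above.

```python
from collections import Counter
import io


def count_entry_ids(handle):
    """Count how often each entry ID occurs in an EMBL/UniProt-style flat file."""
    ids = Counter()
    for line in handle:
        if line.startswith("ID   "):
            # The ID is the first token after the line code.
            ids[line[5:].split()[0].rstrip(";")] += 1
    return ids


# Made-up sample with one repeated ID; use open("yourfile.dat") for real data.
sample = io.StringIO(
    "ID   AAA_HUMAN  Reviewed; 10 AA.\n//\n"
    "ID   BBB_HUMAN  Reviewed; 12 AA.\n//\n"
    "ID   AAA_HUMAN  Reviewed; 10 AA.\n//\n"
)
ids = count_entry_ids(sample)
total = sum(ids.values())
duplicates = {name: n for name, n in ids.items() if n > 1}
print(total, len(ids), duplicates)
```

If the total number of ID lines matches the number of entries but the
number of distinct IDs is lower, duplicates would explain the difference.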

regards,

Peter


From rls at ebi.ac.uk  Fri Jan 19 15:49:45 2007
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Fri, 19 Jan 2007 15:49:45 -0000
Subject: [EMBOSS] Output from seqret in fastaformat.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
Message-ID: <4576C594.3080609@ebi.ac.uk>

Hi,

Use -osdbname UNIPROT in the command line.

R:)

JK (Jesper Agerbo Krogh) wrote:
> Hi.. 
> 
> I've got dbxflat to index the swissprot database.. but I'd like to have the output 
> formatted with the USA as the fasta ID. 
> 
> Current..:
> 
> seqret UNIPROT:Q12345
> Reads and writes (returns) sequences
> output sequence(s) [ies3_yeast.fasta]:
> 
>> IES3_YEAST Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
> 
> .. but I'd like.. 
> 
>> UNIPROT:Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
> 
> Is that possible? 
> 
> 


From pmr at ebi.ac.uk  Fri Jan 19 15:54:41 2007
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 19 Jan 2007 15:54:41 -0000
Subject: [EMBOSS] Output from seqret in fastaformat.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196190B84C672@NZT0004E.dknz.nzcorp.net>
Message-ID: <14850.193.173.109.1.1165419775.squirrel@webmail.ebi.ac.uk>

Hi Jesper,

> I've got dbxflat to index the swissprot database.. but I'd like to have
> the output
> formatted with the USA as the fasta ID.
>
> Current..:
>
> seqret UNIPROT:Q12345
> Reads and writes (returns) sequences
> output sequence(s) [ies3_yeast.fasta]:
>
>>IES3_YEAST Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
>
> .. but I'd like..
>
>>UNIPROT:Q12345 Ino eighty subunit 3.
> MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
> ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
> KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
> QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
> NKNGLLENIL
>
> Is that possible?

Tricky to do. Q12345 is not the sequence ID, it is only the accession
number. There are ways to rewrite UniProt as a FASTA format file and index
it with dbxfasta, but that loses the rest of the information in the entries.

A simple perl script to rearrange the ID lines is your easiest solution.
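
As a sketch of what such a script could look like (written in Python here
rather than perl), the following rewrites each FASTA header from
"ID accession description" to "DBNAME:accession description". It assumes
the header layout shown in the seqret output above; the function name is
invented for this example, and headers that do not have at least two
fields are passed through unchanged.

```python
def relabel_fasta(lines, dbname):
    """Yield FASTA lines with headers rewritten as '>DBNAME:ACCESSION description'."""
    for line in lines:
        if line.startswith(">"):
            # Expected layout: >ID ACCESSION free-text description
            fields = line[1:].rstrip("\n").split(None, 2)
            if len(fields) >= 2:
                rest = " " + fields[2] if len(fields) == 3 else ""
                yield f">{dbname}:{fields[1]}{rest}\n"
                continue
        yield line  # sequence lines and unexpected headers are left as they are


records = [
    ">IES3_YEAST Q12345 Ino eighty subunit 3.\n",
    "MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD\n",
    "NKNGLLENIL\n",
]
relabelled = list(relabel_fasta(records, "UNIPROT"))
print("".join(relabelled), end="")
```

On a real file the same generator can be fed an open file handle and its
output written to a second file, so the seqret output does not need to be
held in memory.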

Alternatively, you could invent a new EMBOSS output format that uses the
DBname and accession to create the ID. But EMBOSS would still want to
write to a file called "ies3_yeast.*" because it uses the ID to make up
the default filename.

If you insist, you can try:

seqret UNIPROT:Q12345 -sid Q12345 -osdbname UNIPROT

which gives me the result you expect with the current developers' code (I
am away from the office today, and there have been changes to the way
database names are propagated to the output so release 4.0.0 may behave
slightly differently).

Hope that helps

Peter


From ajb at ebi.ac.uk  Fri Jan 19 16:22:07 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Fri, 19 Jan 2007 16:22:07 -0000 (GMT)
Subject: [EMBOSS] Explanation: time-warped messages
Message-ID: <40703.81.98.244.247.1169223727.squirrel@webmail.ebi.ac.uk>

Apologies for all the time-warped messages that have been
appearing on this list over the last day or two.

There has been a long-standing problem (from early December) with the
EBI's email setup in that it didn't always respond correctly to the
anti-spam
mechanisms on the open-bio email lists. This was fixed by the
EBI Systems group this week.

So, some messages sent by EBI staff are just getting through.

Alan


From mathog at caltech.edu  Fri Jan 19 16:19:44 2007
From: mathog at caltech.edu (David Mathog)
Date: Fri, 19 Jan 2007 08:19:44 -0800
Subject: [EMBOSS] Output from seqret in fastaformat
Message-ID: <E1H7wSu-0002dv-Qg@mendel.bio.caltech.edu>

Peter Rice wrote:

> "JK (Jesper Agerbo Krogh)" <JK at novozymes.com,<pmr at ebi.ac.uk>

I'm with Peter on this one.  There are way too many possible formats
for fasta comment lines for any software to support all of them.
This command line reformatting is exactly the sort of task my
'extract' program was written to handle (having faced the same
task myself more times than I can count).  Example:

% cat >foo.pfa <<EOD
>IES3_YEAST Q12345 Ino eighty subunit 3.
MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
NKNGLLENIL
EOD
% cat foo.pfa | extract -if '>' -mt -cols 'UNIPROT:[2,]'
UNIPROT:Q12345 Ino eighty subunit 3.
MKFEDLLATNKQVQFAHAATQHYKSVKTPDFLEKDPHHKKFHNADGLNQQGSSTPSTATD
ANAASTASTHTNTTTFKRHIVAVDDISKMNYEMIKNSPGNVITNANQDEIDISTLKTRLY
KDNLYAMNDNFLQAVNDQIVTLNAAEQDQETEDPDLSDDEKIDILTKIQENLLEEYQKLS
QKERKWFILKELLLDANVELDLFSNRGRKASHPIAFGAVAIPTNVNANSLAFNRTKRRKI
NKNGLLENIL

So you can process the whole thing in a pipe or in two stages through
a temporary file. Your choice.

Extract is part of drm_tools (these have nothing to do with
"digital rights management", they were my initials long before drm
took on its current common meaning) from here:

ftp://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/drm_tools.tar.gz


The man page is here:

  http://saf.caltech.edu/saf_manuals/extract.html

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


From uludag at ebi.ac.uk  Mon Jan 22 15:42:32 2007
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Mon, 22 Jan 2007 15:42:32 +0000
Subject: [EMBOSS] workflow ideas
Message-ID: <1169480552.4118.80.camel@emboss2.ebi.ac.uk>

Hi,

We have recently extended the EBI Soaplab server with new web services for
EMBOSS 4.0 applications, including the EMBASSY applications, and made it
publicly available at the following address.

   http://www.ebi.ac.uk/soaplab/emboss4/index.html

We are now building Taverna workflows to demonstrate
use cases for these services, and we need ideas for these use cases. If you
or your colleagues have a use case for EMBOSS services that you expect to
use in the future and would like us to prepare workflow(s) for,
please email me a brief description and I will prepare
the workflow(s). These workflows will later be
published in a public repository. We are also interested in use cases
that combine other web services with the EMBOSS services,
to demonstrate the interoperability of the services.

Regards,
Mahmut


From maoj at helix.nih.gov  Mon Jan 22 21:52:33 2007
From: maoj at helix.nih.gov (jean mao)
Date: Mon, 22 Jan 2007 16:52:33 -0500
Subject: [EMBOSS] question about seqret
Message-ID: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>

Hi, is there a way to search all available databases
for an accession number when I don't know which
database(s) it belongs to? Can I configure this in the
emboss.default file?

Thank you very much in advance.

Jean


From ajb at ebi.ac.uk  Mon Jan 22 22:54:49 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 22 Jan 2007 22:54:49 -0000 (GMT)
Subject: [EMBOSS] question about seqret
In-Reply-To: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>
References: <Pine.SGI.4.63.0701221642460.28149926@helix.nih.gov>
Message-ID: <39293.81.98.244.247.1169506489.squirrel@webmail.ebi.ac.uk>

Hello Jean,

The EMBOSS application 'whichdb' should do that.

Alan


> Hi, is there a way to search all available databases
> for an accession number when I don't know which
> database(s) it belongs to? Can I configure this in the
> emboss.default file?
>
> Thank you very much in advance.
>
> Jean
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


From maoj at helix.nih.gov  Tue Jan 23 00:51:09 2007
From: maoj at helix.nih.gov (jean mao)
Date: Mon, 22 Jan 2007 19:51:09 -0500
Subject: [EMBOSS] question about seqret
Message-ID: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>

Thank you for your reply. I solved it
by using the app method and the emboss farm script by Simon Andrews.

Jean Mao.


From jison at ebi.ac.uk  Tue Jan 23 09:47:03 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 23 Jan 2007 09:47:03 -0000 (GMT)
Subject: [EMBOSS] question about seqret
In-Reply-To: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>
References: <Pine.SGI.4.63.0607071500360.34044944@helix.nih.gov>
Message-ID: <1186.84.92.187.247.1169545623.squirrel@webmail.ebi.ac.uk>

> Thank you for your reply. I solved it
> by using app method and the emboss farm script by Simon Andrews.

FYI see
http://emboss.sourceforge.net/docs/themes/emboss_farm.script

Jon


From fangw at CLEMSON.EDU  Tue Jan 23 15:54:09 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Tue, 23 Jan 2007 10:54:09 -0500 (EST)
Subject: [EMBOSS] question!
Message-ID: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>

Dear EMBOSS people:

I am a Ph.D. student at Clemson University in the USA, and I am using your
EMBOSS software to extract the 5' upstream regions of a list of genes. However, I
ran into some problems when using EMBOSS:

I installed EMBOSS 2.10.0 on a Windows XP PC. However, when I use the command
"extractfeat genbank:*", it does not work. The error message is
"Error: Unable to read sequence 'genbank:4101655', Died: extractfeat
terminated: Bad value for '-sequence' and no prompt". But it works fine
with "extractfeat embl:AK222810". Do you know the reason?

Is there any way to access the Ensembl database? Is there a newer version of
EMBOSS that supports more databases and can be installed on
Windows XP?

Are all the databases that EMBOSS connects to the latest versions? I ask because
I found that some databases do not give the same results as I get from the
database directly.

Thanks!

I am looking forward to your reply.

FANG WANG
Department of Genetics and Biochemistry
Clemson University


From jison at ebi.ac.uk  Tue Jan 23 17:31:02 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 23 Jan 2007 17:31:02 -0000 (GMT)
Subject: [EMBOSS] question!
In-Reply-To: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
References: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
Message-ID: <48163.84.92.187.247.1169573462.squirrel@webmail.ebi.ac.uk>

Dear FANG

The error message suggests that EMBOSS has not been configured to work with "genbank".
Every database you intend to use must be defined in one of the EMBOSS configuration
files "emboss.default" or ".embossrc".

"emboss.default" lives in the top-level emboss directory (e.g. /home/auser/emboss/emboss.default)
and is used for site-wide databases.

".embossrc" lives in your personal home directory and is used for your own databases (or for testing).

Please read the documentation which describes how to configure database access in these files:
http://emboss.sourceforge.net/docs/themes/Databases.html
http://emboss.sourceforge.net/admin/

Or ask your sysadmin to setup access for you (a better route if the database is a shared resource).

So far as I know, EMBOSS cannot read ensembl directly.

The answer to your last question is "It depends on which databases your installation is configured
to use" (see "emboss.default" and ".embossrc").

Good luck !

Cheers

Jon


> Dear EMBOSS people:
>
> I am a Ph.D. student at Clemson University in the USA, and I am using your
> EMBOSS software to extract the 5' upstream regions of a list of genes. However, I
> ran into some problems when using EMBOSS:
>
> I installed EMBOSS 2.10.0 on a Windows XP PC. However, when I use the command
> "extractfeat genbank:*", it does not work. The error message is
> "Error: Unable to read sequence 'genbank:4101655', Died: extractfeat
> terminated: Bad value for '-sequence' and no prompt". But it works fine
> with "extractfeat embl:AK222810". Do you know the reason?
>
> Is there any way to access the Ensembl database? Is there a newer version of
> EMBOSS that supports more databases and can be installed on
> Windows XP?
>
> Are all the databases that EMBOSS connects to the latest versions? I ask because
> I found that some databases do not give the same results as I get from the
> database directly.
>
> Thanks!
>
> I am looking forward to your reply.
>
> FANG WANG
> Department of Genetics and Biochemistry
> Clemson University
>
>
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


From pmr at ebi.ac.uk  Tue Jan 23 17:45:22 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Jan 2007 17:45:22 +0000
Subject: [EMBOSS] question!
In-Reply-To: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
References: <3195.130.127.150.224.1169567649.squirrel@wm.clemson.edu>
Message-ID: <45B649B2.3030000@ebi.ac.uk>

Dear Fang,

> I installed EMBOSS 2.10.0 on a Windows XP PC. However, when I use the command
> "extractfeat genbank:*", it does not work. The error message is "Error: Unable
> to read sequence 'genbank:4101655', Died: extractfeat terminated: Bad value for
> '-sequence' and no prompt". But it works fine with "extractfeat
> embl:AK222810". Do you know the reason?

If you used the database definitions provided with EMBOSS ... your genbank is
possibly pointing to the CBR server in Canada, which has now closed.

There is also a problem with the way SRS servers define the GI number - there
are now servers that index it, but as "gid" not as "gi" which EMBOSS
anticipated. We will change the field name in the next release of EMBOSS.


To test whether your genbank definition works, you could try the ID
We are now at release 4.0.0 which allows "gi" as a search field. Earlier 
versions only had "sv" (sequence version) ... whether that is indexed depends on 
the database provider. Indexing GenBank in EMBOSS does allow GI searches.

> Is there any way to access the Ensembl database? Is there a newer version of
> EMBOSS that supports more databases and can be installed on Windows XP?

Ah, you are running EMBOSS under windows? embosswin was provided by Andre 
Blavier up to EMBOSS 2.10.0. We now provide a beta release of EMBOSS 4.0.0 for 
windows (nobody did version 3.0.0 for windows).

Hmmmm ... we need to make that more obvious on the EMBOSS website. EMBOSSWIN is 
available by FTP from emboss.open-bio.org/pub/EMBOSS/windows/ ... only a few 
brave people have tested it so far, but they report that it is working.

> Are all the databases that EMBOSS connects to the latest versions? I
> found that some databases do not give the same results as what I get from the
> database directly.

That depends on where the databases are. There is a list of SRS servers you can 
check for the number of entries and the date they were indexed:

http://downloads.biowisdomsrs.com/publicsrs.html

for example:

DB genbank [ type: N method: srswww format: genbank
    url: "http://iubio.bio.indiana.edu/srsbin/cgi-bin/wgetz"
    dbalias: "genbankrelease"
    fields: "gi sv des org key"
    comment: "Genbank IDs" ]

You can also try Entrez databases in EMBOSS 4.0.0 ... I wonder how many users 
have been using entrez as an access method?

Hope that helps

Peter Rice


From fangw at CLEMSON.EDU  Tue Jan 23 18:42:25 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Tue, 23 Jan 2007 13:42:25 -0500 (EST)
Subject: [EMBOSS] question
Message-ID: <4339.130.127.150.224.1169577745.squirrel@wm.clemson.edu>

Dear EMBOSS people:

I have a list of genes and would like to extract only the 2000 bp 5' upstream
of each gene. I chose "extractfeat", but it did not give me any
result. Has anyone met the same problem before?  Looking forward to your
reply. Thanks!

Nice day,
Fang Wang


From golharam at umdnj.edu  Tue Jan 23 19:23:28 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Tue, 23 Jan 2007 14:23:28 -0500
Subject: [EMBOSS] transeq changes sequence id
Message-ID: <FFD943C623B048C8A779AB874CE49F27@PICO>

I'm using transeq to translate a bunch of sequences for me and noticed that
upon translation, it adds a '_1' to the seqid.  For example:
 
I give it a file with
>myseq
ATG...TAG
 
After translation, the resulting file contains:
>myseq_1
M...
 
Is there a way to prevent transeq from manipulating the FASTA header and
just translate the sequence?
 
Ryan
 

From golharam at umdnj.edu  Wed Jan 24 05:22:37 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 24 Jan 2007 00:22:37 -0500
Subject: [EMBOSS] need or want to support grid comping?
Message-ID: <6328897A6BB8418CAF90745B21F9C738@PICO>

Does anyone use (or need) EMBOSS tools to be supported in a web environment
with grid support, i.e. using EMBOSS-Explorer and having the programs execute
on a grid instead of the web server?
 
Is anyone currently doing this?  
 
Ryan
 

From David.Bauer at SCHERING.DE  Wed Jan 24 07:06:17 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 24 Jan 2007 08:06:17 +0100
Subject: [EMBOSS] Antwort:  transeq changes sequence id
In-Reply-To: <FFD943C623B048C8A779AB874CE49F27@PICO>
Message-ID: <OF6E108E75.FE8D6B76-ONC125726D.00252423-C125726D.00270755@schering.de>


Hi,

the _1 is there to indicate the frame which was used for translation.
You can use
transeq myseq.fa -frame 1,2
and this would give a fasta file with two protein sequences.
And that's where the added number makes sense; to prevent the creation of
protein sequences which all have the same ID.

So much for the philosophy of this number ;-)

And now a solution for your problem:

transeq test.fa | descseq -filter -name `infoseq -nohead -only -name test.fa`

This works only if you have just one sequence in the input file. If you
have a multiple sequence fasta file, you can use seqretsplit to create
individual sequence files for each sequence.
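
For a multiple-sequence file, another option is to post-process the transeq
output instead of splitting the input. The following is only a minimal sketch
in Python, not part of EMBOSS: the function name strip_frame_suffix is
hypothetical, and it assumes the frame suffix is always a trailing
"_<digits>" on the ID. It will also strip a genuine "_<digits>" ending from
an original ID, and if several frames were translated the resulting IDs will
no longer be unique.

```python
import re

# A FASTA header whose ID (first whitespace-delimited token) ends in "_<digits>".
# Group 1 is the ID without the suffix, group 2 the optional description.
_FRAME_SUFFIX = re.compile(r"^(>\S+?)_\d+(\s.*)?$")


def strip_frame_suffix(fasta_text):
    """Remove a trailing "_<digits>" from every FASTA header ID.

    Assumption: the suffix was appended by the translation step and the
    original IDs do not themselves end in "_<digits>".
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            match = _FRAME_SUFFIX.match(line)
            if match:
                line = match.group(1) + (match.group(2) or "")
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    import sys

    # Reads FASTA on stdin and writes the cleaned FASTA to stdout.
    sys.stdout.write(strip_frame_suffix(sys.stdin.read()))
```

Saved under a name of your choice, the script reads the translated FASTA on
standard input and writes the cleaned file to standard output.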

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 23/01/2007 20:23:28:

> I'm using transeq to translate a bunch of sequences for me and noticed that
> upon translation, it adds a '_1' to the seqid.  For example:
>
> I give it a file with
> >myseq
> ATG...TAG
>
> After translation, the resulting file contains:
> >myseq_1
> M...
>
> Is there a way to prevent transeq from manipulating the FASTA header and
> just translate the sequence?
>
> Ryan


From David.Bauer at SCHERING.DE  Wed Jan 24 07:45:56 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Wed, 24 Jan 2007 08:45:56 +0100
Subject: [EMBOSS] Antwort:  question
In-Reply-To: <4339.130.127.150.224.1169577745.squirrel@wm.clemson.edu>
Message-ID: <OF6EF0246A.8A5EB468-ONC125726D.002A5E02-C125726D.002AA884@schering.de>


Hi Fang,

what about this:

extractfeat myseq.embl -type mRNA -join -before 2000 | seqret -filter -send 2000
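
To make the coordinates explicit, here is a minimal sketch in Python of what
"the 2000 bp upstream of a feature" means for a forward-strand feature. It is
not how extractfeat is implemented; the function name upstream and its
parameters are hypothetical, and reverse-strand features are not handled.

```python
def upstream(seq, feature_start, length=2000):
    """Return up to `length` bases immediately 5' of a forward-strand feature.

    seq           -- the full sequence as a string
    feature_start -- 1-based position of the first base of the feature
    length        -- number of upstream bases wanted

    If the feature lies closer than `length` bases to the start of the
    sequence, the result is simply shorter.
    """
    if feature_start < 1:
        raise ValueError("feature_start is 1-based and must be >= 1")
    end = feature_start - 1           # 0-based index of the feature's first base
    begin = max(0, end - length)      # clip at the start of the sequence
    return seq[begin:end]


if __name__ == "__main__":
    # Toy sequence: the feature starts at position 7 (the first G).
    print(upstream("AAACCCGGGTTT", 7, 3))
```

The clipping at the sequence start means that a feature near the beginning of
an entry yields fewer than the requested number of bases.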

HTH,
David.

emboss-bounces at lists.open-bio.org wrote on 23/01/2007 19:42:25:

> Dear EMBOSS people:
>
> I have a list of genes and would like to extract only the 2000 bp 5' upstream
> of each gene. I chose "extractfeat", but it did not give me any
> result. Has anyone met the same problem before?  Looking forward to your
> reply. Thanks!
>
> Nice day,
> Fang Wang


From jrvalverde at cnb.uam.es  Wed Jan 24 09:54:21 2007
From: jrvalverde at cnb.uam.es (Jose R. Valverde)
Date: Wed, 24 Jan 2007 10:54:21 +0100
Subject: [EMBOSS] need or want to support grid comping?
In-Reply-To: <6328897A6BB8418CAF90745B21F9C738@PICO>
References: <6328897A6BB8418CAF90745B21F9C738@PICO>
Message-ID: <20070124105421.35375d32@veda.cnb.uam.es>

On Wed, 24 Jan 2007 00:22:37 -0500
"Ryan Golhar" <golharam at umdnj.edu> wrote:
> Does anyone use (or need) EMBOSS tools to be supported in a web environment
> with grid support?  ie using EMBOSS-Explorer and have the programs execute
> on a grid instead of the web server?
>  
> Is anyone currently doing this?  
>  
> Ryan


There are various people working on this that I can think of off the top of
my head:

	- ourselves: we are looking into making the jEMBOSS batch facility
	  use the Grid (EGEE)
	- the EMBnet node in Mexico is also looking into gridifying EMBOSS
	  using EELA (on top of EGEE)
	- the Argentinian EMBnet node, in cooperation with the Belgian EMBnet
	  node, may already have solved the problem by porting wEMBOSS over
	  DRMAA. We want to look into their approach to see if it works over
	  the GridWay DRMAA implementation, which would mean they have already
	  solved it for EGEE and Globus at least
	- other initiatives may be ongoing within EMBRACE, but I believe they
	  are currently more interested in Web Services.

As far as I am aware, most of these projects are going slowly for a host
of fortuitous reasons:
	- we are now busy organizing courses and conferences, which delays
	  our work
	- MX is still building up development steam
	- AR's Martin Sarachu is fighting leukaemia

That said, any new hands are welcome. If you are interested, we can provide
all the help and assistance needed, and it may turn out to be a trivial
task starting from Martin's work. Just let us know.

Otherwise, I expect this to be ready sometime this year in one way or another
(or all of them).

				j

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	Artificial Intelligence is of no use when the natural kind is lacking


From Tim.Troup at ed.ac.uk  Wed Jan 24 17:16:16 2007
From: Tim.Troup at ed.ac.uk (Tim Troup)
Date: Wed, 24 Jan 2007 17:16:16 +0000
Subject: [EMBOSS] EMBOSS FTP Site Down?
Message-ID: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>

Hi,

Is the EMBOSS FTP site down? It keeps timing out for me.

ftp://emboss.open-bio.org/pub/EMBOSS

Tim


From dag at sonsorol.org  Wed Jan 24 18:05:03 2007
From: dag at sonsorol.org (Chris Dagdigian)
Date: Wed, 24 Jan 2007 13:05:03 -0500
Subject: [EMBOSS] EMBOSS FTP Site Down?
In-Reply-To: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
References: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
Message-ID: <AB5BFDDC-0A67-462A-952C-1F7E652628B2@sonsorol.org>

FTP is working OK for me.

Server side we look OK (open-bio FTP server).

I'm also monitoring the IDS and Firewall alerts in real time due to  
some other non-OBF related issues -- the only real FTP issue is that  
the intrusion detection appliance thinks that it has seen a few  
FTP:EXPLOIT:BOUNCE-ATTACK incursions against the open-bio servers  
today. I'm 99% certain that this is a false alarm but I have not  
disabled that particular attack signature yet, if your client is  
trying odd things and redirections (perhaps to get past a NAT gateway  
or firewall) then maybe you are getting caught in this trap.

Random side note:  if anyone tries non-anonymous FTP for longer than  
2 minutes with more than 5 failed login attempts then they are  
candidates for instant inclusion into a "drop all packets from this  
IP" list maintained within the firewall.

Tim - if you send me the IP address of where you are trying to  
connect from I can see if there are any messages on our end.

-Chris
open-bio.org


On Jan 24, 2007, at 12:16 PM, Tim Troup wrote:

> Hi,
>
> Is the EMBOSS FTP site down? It keeps timing out for me.
>
> ftp://emboss.open-bio.org/pub/EMBOSS
>


From arareko at campus.iztacala.unam.mx  Wed Jan 24 18:14:08 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 24 Jan 2007 12:14:08 -0600
Subject: [EMBOSS] EMBOSS FTP Site Down?
In-Reply-To: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
References: <6B6F31BB-0826-4C1C-B463-4B598B68C513@ed.ac.uk>
Message-ID: <45B7A1F0.1090605@campus.iztacala.unam.mx>

It's working ok here. Maybe your DNS can't resolve it.

Mauricio.

Tim Troup wrote:
> Hi,
> 
> Is the EMBOSS FTP site down? It keeps timing out for me.
> 
> ftp://emboss.open-bio.org/pub/EMBOSS
> 
> Tim

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From fangw at CLEMSON.EDU  Wed Jan 24 21:05:47 2007
From: fangw at CLEMSON.EDU (fangw at CLEMSON.EDU)
Date: Wed, 24 Jan 2007 16:05:47 -0500 (EST)
Subject: [EMBOSS] question!
Message-ID: <1100.130.127.150.224.1169672747.squirrel@wm.clemson.edu>

Hi, All:

Does anyone know which database version EMBOSS 2.10.0 connects to?
When I use EMBOSS to extract the 5' upstream sequence
of some genes, there are a lot of NNNN at the beginning of the output
file, which differs from the result I get manually from Ensembl
BioMart. BioMart gives the full sequence matching my request, but it must
be done manually.
Looking forward to your reply! ^_^

Nice day,
Fang Wang


From maoj at helix.nih.gov  Thu Jan 25 13:41:55 2007
From: maoj at helix.nih.gov (Jean Mao)
Date: Thu, 25 Jan 2007 08:41:55 -0500
Subject: [EMBOSS] question about display double-stranded DNA
Message-ID: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>

Hi,

When using remap, I prefer to use the '-noreverse' flag so that the
translation of my DNA is located closer to my DNA strand. However, using
this flag also removes the complementary strand of my DNA from the output, which
is less convenient when designing primers. Is there a way in remap to display
double-stranded DNA but turn off the restriction sites of the complementary
strand?

If not, is there a program in EMBOSS which can retrieve the sequence from a
database, select start/end points and display both strands? I tried seqret
but failed.

Thank you  in advance.

Jean Mao


From pmr at ebi.ac.uk  Thu Jan 25 14:23:00 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 25 Jan 2007 14:23:00 +0000
Subject: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
Message-ID: <45B8BD44.9050509@ebi.ac.uk>

Hi Jean,

> When using remap, I prefer to use the '-noreverse' flag so that the
> translation of my DNA is located closer to my DNA strand. However, using
> this flag also removes the complementary strand of my DNA from the output, which
> is less convenient when designing primers. Is there a way in remap to display
> double-stranded DNA but turn off the restriction sites of the complementary
> strand?

I am looking at remap changes at the moment, I will see what I can do.

> If not, is there a program in EMBOSS which can retrieve the sequence from a
> database, select start/end points and display both strands? I tried seqret
> but failed.

Showseq does that.

It has a bug at present (I noticed it this week - fixed in the next release) 
that makes it show additional bases up to the end of the last line.

regards,

Peter


From pmr at ebi.ac.uk  Thu Jan 25 14:39:12 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 25 Jan 2007 14:39:12 +0000
Subject: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <416B9A34D7CA1C4C9ED58354E75101BB021CB85A@NIHCESMLBX3.nih.gov>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
	<45B8BD44.9050509@ebi.ac.uk>
	<416B9A34D7CA1C4C9ED58354E75101BB021CB85A@NIHCESMLBX3.nih.gov>
Message-ID: <45B8C110.6040608@ebi.ac.uk>

Hi Jean,

> Peter, Thanks for reply. seqret can retrieve entry and select start/end
> points. But seqret does NOT display both strands. Does it?

Right. Seqret returns a sequence, so it can only report one strand at a time.

> Showseq does that.
> 
> It has a bug at present (I noticed it this week - fixed in the next release)
> that makes it show additional bases up to the end of the last line.

Oops. Spoke too soon. showseq uses the same display functions as remap and has
the same limitations.

I will see what we can do for the next release.

regards,

Peter


From gbottu at ben.vub.ac.be  Thu Jan 25 15:39:15 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Thu, 25 Jan 2007 16:39:15 +0100
Subject: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <45B8BD44.9050509@ebi.ac.uk>
References: <001501c74086$9413a820$be4de780@CIT.NIH.GOV>
	<45B8BD44.9050509@ebi.ac.uk>
Message-ID: <20070125153915.GA30474@bigben.ulb.ac.be>

On Thu, Jan 25, 2007 at 02:23:00PM +0000, Peter Rice wrote:
> I am looking at remap changes at the moment, I will see what I can do.

Could you consider an option to reject restriction enzymes that cut within
a certain range (or ranges)? This feature existed in GCG and is really
something we would like to have (back). It allows you, e.g., to select
enzymes that cut around the gene you want to clone, but not inside.
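
To illustrate the selection rule being requested, here is a minimal sketch
in Python. It is not an existing remap option: the function name, the data
layout (enzyme name mapped to a list of 1-based cut positions) and the
positions in the example are all made up for illustration.

```python
def enzymes_not_cutting_within(cut_sites, excluded_ranges):
    """Return, sorted by name, the enzymes with no cut inside any excluded range.

    cut_sites       -- dict: enzyme name -> list of 1-based cut positions
    excluded_ranges -- list of (start, end) tuples, both ends inclusive
    """
    def inside(position):
        return any(start <= position <= end for start, end in excluded_ranges)

    return sorted(
        name
        for name, positions in cut_sites.items()
        if not any(inside(p) for p in positions)
    )


if __name__ == "__main__":
    # Made-up cut positions; the gene to protect spans 500..2000.
    sites = {"EcoRI": [120, 2450], "BamHI": [900], "HindIII": [50]}
    print(enzymes_not_cutting_within(sites, [(500, 2000)]))
```

In this toy case BamHI is rejected because it cuts at 900, inside the
protected range, while EcoRI and HindIII only cut outside it.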

	Guy Bottu,
	BEN


From golharam at umdnj.edu  Thu Jan 25 17:31:22 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Thu, 25 Jan 2007 12:31:22 -0500
Subject: [EMBOSS] request for a useful addition
Message-ID: <08780548D8B54EF086EAEEB74AE41C2D@PICO>

The program dottup (and other dotplot tools) takes the two sequences given
and displays a dotplot.  It would be useful if you could give it the option
to reverse complement one of the sequences and then perform a dotplot.
 
I was comparing an mRNA with a genomic sequence and wasn't seeing the
similarity between the sequences.  Then I realized I needed to rev-comp the
mRNA and it showed up fine.  Of course, one can do this using revseq, but to
have dottup do it for you would be even better. 
 
Ryan
 

From David.Bauer at SCHERING.DE  Fri Jan 26 06:46:43 2007
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Fri, 26 Jan 2007 07:46:43 +0100
Subject: [EMBOSS] Antwort:  request for a useful addition
In-Reply-To: <08780548D8B54EF086EAEEB74AE41C2D@PICO>
Message-ID: <OF3B055E59.308DA05C-ONC125726F.0024FD44-C125726F.00253DF6@schering.de>

This can be done with the option -sreverse1 or -sreverse2 to use the
reverse complement of the first or second sequence used as input for e.g.
dottup. It is a standard option available to all emboss programs. You can
get a list of those options with -help -verbose.
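
For readers unsure what the reverse complement of a sequence is, here is a
minimal sketch in Python. It is independent of EMBOSS, the function name is
hypothetical, and it only handles A, C, G, T and N (IUPAC ambiguity codes
other than N are left unchanged).

```python
# Translation table for the unambiguous DNA bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the whole string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice gives back the original sequence, which is a
quick sanity check for any reverse-complement routine.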

David.

emboss-bounces at lists.open-bio.org wrote on 25/01/2007 18:31:22:

> The program dottup (and other dotplot tools) takes the two sequences given
> and displays a dotplot.  It would be useful if you could give it the option
> to reverse complement one of the sequences and then perform a dotplot.
>
> I was comparing an mRNA with a genomic sequence and wasn't seeing the
> similarity between the sequences.  Then I realized I needed to rev-comp the
> mRNA and it showed up fine.  Of course, one can do this using revseq, but to
> have dottup do it for you would be even better.
>
> Ryan


From golharam at umdnj.edu  Fri Jan 26 17:19:53 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Fri, 26 Jan 2007 12:19:53 -0500
Subject: [EMBOSS] Antwort:  request for a useful addition
In-Reply-To: <A41E1A663B11424AA975D12187E3211C@PICO>
Message-ID: <92EAFE9D771342E99D302B254E3CC6F3@PICO>

Thanks.  I didn't see this in the list of options in 
EMBOSS-Explorer.  Luke, perhaps this can be added as an 
option to the interface?

Ryan




From gbottu at ben.vub.ac.be  Tue Jan 30 09:31:35 2007
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Tue, 30 Jan 2007 10:31:35 +0100
Subject: [EMBOSS] [dksamuel@gmail.com: Re: question about display
	double-strande
Message-ID: <20070130093135.GA10496@bigben.ulb.ac.be>

----- Forwarded message from Duleep Samuel <dksamuel at gmail.com> -----

Date: Sat, 27 Jan 2007 10:25:55 +0530
From: "Duleep Samuel" <dksamuel at gmail.com>
To: "Guy Bottu" <gbottu at ben.vub.ac.be>
Subject: Re: [EMBOSS] question about display double-stranded DNA
In-Reply-To: <20070125153915.GA30474 at bigben.ulb.ac.be>

This would be useful; please add it if possible. Regards, Samuel

On 1/25/07, Guy Bottu <gbottu at ben.vub.ac.be> wrote:
>On Thu, Jan 25, 2007 at 02:23:00PM +0000, Peter Rice wrote:
>> I am looking at remap changes at the moment, I will see what I can do.
>
>Could you consider an option to reject restriction enzymes that cut within
>a certain range (or ranges)? This feature existed in GCG and is really
>something we would like to have (back). It allows you, e.g., to select
>enzymes that cut around the gene you want to clone, but not inside.
>
>        Guy Bottu,
>        BEN

----- End forwarded message -----