From fernan at iib.unsam.edu.ar  Tue Oct  2 13:54:05 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Tue, 2 Oct 2007 14:54:05 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
Message-ID: <20071002175405.GA62945@iib.unsam.edu.ar>

Hi,

I've installed TrEMBL in EMBOSS and it seems like I'm having some
problems ... 

I've run dbiflat as follows:

dbiflat -dbname trembl -idformat EMBL -directory . \
  -filenames uniprot_trembl.dat -release '37.0' -date '24/07/07' \
  -fields sv,acc,des,key,org

I've put an entry in my emboss.default configuration
file and the db is listed by showdb.

Also, the db seems to work fine with, for example,
'textsearch':

[fernan@alfa ~]$ textsearch trembl:* 'cyclase'
Search sequence documentation. Slow, use SRS and Entrez!
Output file [a0b532_mettp.textsearch]: stdout
# Search for: cyclase
trembl-id:A0B532_METTP  A0B532_METTP  A0B532	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A1RWP7_THEPD  A1RWP7_THEPD  A1RWP7	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A2SR85_METLZ  A2SR85_METLZ  A2SR85    Cyclase family protein.
trembl-id:A3H5Q9_9CREN  A3H5Q9_9CREN  A3H5Q9	Magnesium-protoporphyrin IX monomethyl ester (Oxidative) cyclase (EC 1.14.13.81).
trembl-id:A3H7Y6_9CREN  A3H7Y6_9CREN  A3H7Y6	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A6URB1_METVA  A6URB1_METVA  A6URB1    Cyclase family protein.
...

First, I got a number of warnings when running dbiflat.
Because all of them were about null IDs ('') I just
ignored them ... I mention one here just in case:
Warning: Duplicate ID skipped: '' All hits will point to first ID found

Now, when using seqret, it seems like I'm not getting the
records I expect, for example if I search for the first ID
in the example above (A0B532), I get A0BDZ0 instead:

[fernan@alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences
output sequence(s) [a0bdz0_parte.fasta]: stdout
>A0BDZ0_PARTE A0BDZ0 Chromosome undetermined scaffold_101, whole genome shotgun sequence.
MLNFPQNARDHFSCDCDPCEFAITHGEEIMPKRVPPQKPIQQIQDKDLGLLLRKLQAPNK
LTRSVRIRIPETCVCNEGEIKFIAYYDESEGFIKFIQKPTFQQTKQFLNERRPPDSLAVI
IKYIDNNMQVMTDMEFTILMMKRKIDPIWSQILYIQNFNSNKNYELQHYEFKHSFDSKYP
EFDLARIEILILNGEIARASSDFVPMVREEAYENSLSQDQYCRYMVYKMVHYADVFGGIQ
ITEGKFSFHKKTFISMEKMEYTDLDRKALFDSEILLRKKKMIDEDMFQFQKLIDQNVKKE
REYALKVYREILDMDNGLDQQSHLLKNKLSVIGYDLKKYSQSIQSNFQQVMVSKDPASTL
KELVIEQKVNEEKLTSILKPKKGEKTKKKM

But if I search for A0B532_METTP I get nothing:
[fernan@alfa ~]$ seqret trembl:A0B532_METTP
Reads and writes (returns) sequences
Error: Unable to read sequence 'trembl:A0B532_METTP'
Died: seqret terminated: Bad value for '-sequence' and no prompt


Now, if I search for A0BDZ0, I get A0BL81 instead:

[fernan@alfa ~]$ seqret trembl:A0BDZ0
Reads and writes (returns) sequences
output sequence(s) [a0bl81_parte.fasta]: stdout
>A0BL81_PARTE A0BL81 Chromosome undetermined scaffold_113, whole genome shotgun sequence.
MKQISESAHILQKVYNPNRMNKLFMTTHYQLQNETDLIFDKYMLMPLFGLSVANGISSNC
IKPKYLCSEYKKQELYDCNLILILSAYSDQAVYRSKTMYEKRNGLEQIFKYLASPNYTYN
IHISLLSYFVPQRVFYKQVLQALNIFELIDQKQIEELTKSSSIINQSVGEDNLDSILFKN
QEFIDYQKWRRMLKNNTIINLKTLHQHQLSQQIFCQYFLRYHYYQGCEEEINKLNKFLVD
DFDMFKFRSRLEHNEKKMKFYFLRMLKYFKLNEKLEIFLKFSFKSYSLDWNKELLREMKN
SLNQYKKQ

Any idea about what is wrong? I also have swissprot
installed (pretty much in the same way) and it works OK with
seqret, using either ACs (Q4U9M9) or IDs (104K_THEAN).

This is on a Linux cluster (Rocks 4.2, with EMBOSS installed from the
Bio roll)

[fernan@alfa ~]$ embossversion
Writes the current EMBOSS version number
4.0.0

Thanks in advance for any pointer,

Fernan


From simon.andrews at bbsrc.ac.uk  Wed Oct  3 03:37:53 2007
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Wed, 3 Oct 2007 08:37:53 +0100
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071002175405.GA62945@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
Message-ID: <CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>


On 2 Oct 2007, at 18:54, Fernan Aguero wrote:

> Hi,
>
> I've installed TrEMBL in EMBOSS and it seems like I'm having some
> problems ...
>
> I've run dbiflat as follows:
[snip]
>
> Now, when using seqret, it seems like I'm not getting the
> records I expect, for example if I search for the first ID
> in the example above (A0B532), I get A0BDZ0 instead:

I suspect your problem is that your trembl file is larger than 2 GB.
Above this size dbiflat won't work properly and will give wacky
results such as the ones you've shown.  This won't be a problem with
uniprot_sprot.dat, as this is still only about 1.1 GB.

Your choices are therefore:

1) You could split your trembl file into multiple files, each smaller
than 2 GB.  This ends up being a complete pain, and you probably don't
want to do it this way.

2) Use the newer dbx* family of indexing programs which can cope with  
larger file sizes.  In your case you'd use dbxflat instead of  
dbiflat.  There are some configuration differences between the two so  
you should read 'tfm dbxflat' first, but they work pretty much the  
same as the old versions.  We use the dbx programs for all of our  
databases and they work fine.

Hope this helps

Simon.
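
As a rough illustration of option 1 above, the sketch below splits an EMBL/Swiss-Prot-style flat file into parts that each stay under a size limit, cutting only after the `//` line that ends a record. The function name, the `.partN` naming scheme and the exact byte limit are assumptions made for illustration; the message above only says that files larger than 2 GB are a problem for dbiflat.

```python
import os

# Assumed limit: just under 2 GiB. The thread only says ">2Gb" breaks dbiflat.
DEFAULT_LIMIT = 2**31 - 1


def split_flatfile(path, limit=DEFAULT_LIMIT, out_dir=None):
    """Split a flat file into parts of at most `limit` bytes each.

    Records are assumed to end with a line starting with '//'.
    A single record larger than `limit` still gets a part of its own.
    Returns the list of part-file paths (hypothetical naming: <name>.partN).
    """
    out_dir = out_dir or os.path.dirname(os.path.abspath(path))
    base = os.path.basename(path)
    parts = []
    out = None
    written = 0
    record = []
    record_size = 0

    def flush_record():
        nonlocal out, written, record, record_size
        if not record:
            return
        # Start a new part if this record would push the current one over.
        if out is None or (written and written + record_size > limit):
            if out is not None:
                out.close()
            part_path = os.path.join(out_dir, f"{base}.part{len(parts) + 1}")
            parts.append(part_path)
            out = open(part_path, "wb")
            written = 0
        out.writelines(record)
        written += record_size
        record = []
        record_size = 0

    with open(path, "rb") as src:
        for line in src:
            record.append(line)
            record_size += len(line)
            if line.startswith(b"//"):
                flush_record()
    flush_record()  # trailing lines without a final '//'
    if out is not None:
        out.close()
    return parts
```

Each resulting part would then have to be indexed and listed in the database definition, which is part of why this route is more work than switching to the dbx* programs.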


From gbottu at vub.ac.be  Thu Oct  4 06:01:45 2007
From: gbottu at vub.ac.be (Guy Bottu)
Date: Thu, 04 Oct 2007 12:01:45 +0200
Subject: [EMBOSS] Question about acidify
Message-ID: <4704BA09.1000905@vub.ac.be>

	Dear Peter, dear Alan, dear all,

I remember that there had been talk of implementing a tool called 
acidify that would allow for the easy integration of software under 
EMBOSS (with the help of an ACD file but without an elaborate EMBOSS 
"wrapper" program). Can someone tell me how far this has gone? I ask 
because my colleagues on the SIMDAT project have expressed 
their interest.

	Guy Bottu,
	BEN

From pmr at ebi.ac.uk  Thu Oct  4 06:40:48 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 04 Oct 2007 11:40:48 +0100
Subject: [EMBOSS] Question about acidify
In-Reply-To: <4704BA09.1000905@vub.ac.be>
References: <4704BA09.1000905@vub.ac.be>
Message-ID: <4704C330.6070102@ebi.ac.uk>

Guy Bottu wrote:
> I remember that there had been question of implementing a tool called 
> acidify that would allow for the easy integration of software under 
> EMBOSS (with the help of an ACD file but without elaborate EMBOSS 
> "wrapper" program). Can someone tell me how far this has gone? I ask this 
> question because my colleagues of the SIMDAT project have expressed 
> their interest.

We are working on making this easier in ACD. I added some functions when Alan 
was writing wrappers for MIRA.

We already have ACD extensions for SoapLab to provide additional definitions for 
external applications. These are used to generate the XML definitions used by 
SoapLab for non-EMBOSS applications, but can be generally useful.

Do you have examples of the ACD files that would be useful for SIMDAT? Are any 
new datatypes involved?

regards,

Peter

From fernan at iib.unsam.edu.ar  Thu Oct  4 10:08:22 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Thu, 4 Oct 2007 11:08:22 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
Message-ID: <20071004140822.GA96432@iib.unsam.edu.ar>


Simon,

thanks for your suggestions. I've been waiting for dbxflat
to finish before replying ... thus the delay.

You mention that there are some configuration
differences between db(x|i)flat ... I guess I've run into those
now ... even after reading tfm for dbxflat, it seems I just
can't set it up right.

===> Configuration
DB trembl [
        type: P
        comment: "TrEMBL 37.0"
        method: emblcd
        format: embl
        dbalias: trembl
        dir: /share/bio/emboss/trembl/
        file: uniprot_trembl.dat
        indexdirectory: /share/bio/emboss/trembl
]

With this configuration, I get this error:
[fernan@alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences
Warning: Cannot open division file '<null>' for database 'trembl'
Warning: seqCdQry failed
Error: Unable to read sequence 'trembl:A0B532'
Died: seqret terminated: Bad value for '-sequence' and no prompt

If I change the 'method' to 'method: emboss'
as per the example in the dbxflat docs, I get this error:

[fernan@alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences

   EMBOSS An error in ajindex.c at line 3028:
Cannot open param file /share/bio/emboss/trembl/trembl.pxid

This file does not exist (see result of indexing below):

===> Indexing
[root@alfa trembl]# dbxflat -dbname trembl -idformat EMBL \
  -directory . -filenames uniprot_trembl.dat -release "37.0" \
  -date "24/07/07" -fields sv,acc,des,key,org
Database b+tree indexing for flat file databases
Resource name: embl
Processing file ./uniprot_trembl.dat
[root@alfa trembl]# du -hc *
4.0K    dbxflat.command
4.0K    trembl.ent
4.0K    trembl.pxac
4.0K    trembl.pxde
4.0K    trembl.pxkw
4.0K    trembl.pxsv
4.0K    trembl.pxtx
572M    trembl.xac
4.2G    trembl.xde
381M    trembl.xkw
4.0K    trembl.xsv
3.0G    trembl.xtx
11G     uniprot_trembl.dat
19G     total
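
The listing above can be compared against the file that seqret asked for. The sketch below is only a convenience check. It assumes, from the file names visible in this thread (`trembl.pxid`, `trembl.xac`, `trembl.pxac`, and so on), that each indexed field has a `.x<field>` index file and a `.px<field>` parameter file. The list of field suffixes and the function name are assumptions, not taken from the EMBOSS documentation.

```python
import os

# Field suffixes seen in this thread's listing and error message (assumed complete).
FIELDS = ("id", "ac", "sv", "de", "kw", "tx")


def missing_index_files(index_dir, dbname, fields=FIELDS):
    """Return the expected-but-absent index and param files for `dbname`."""
    expected = []
    for field in fields:
        expected.append(f"{dbname}.x{field}")    # index file, e.g. trembl.xac
        expected.append(f"{dbname}.px{field}")   # param file, e.g. trembl.pxid
    present = set(os.listdir(index_dir))
    return [name for name in expected if name not in present]
```

Run against a directory holding the files listed above, a call such as `missing_index_files("/share/bio/emboss/trembl", "trembl")` would report the two ID-index files as absent, which matches the `trembl.pxid` error.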

I've also tried other combinations of 'method' (emboss,
emblcd) and 'format' (swiss, embl) without success ...

Am I indexing the db with the right incantation for dbxflat?
If so, what am I missing in my configuration?

Thanks again for any pointer,

Fernan

PS: this is on emboss-4.0.0 running on a Rocks Cluster (4.2,
CentOS)


From georgios at biotek.uio.no  Thu Oct  4 10:53:38 2007
From: georgios at biotek.uio.no (George Magklaras)
Date: Thu, 04 Oct 2007 16:53:38 +0200
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071004140822.GA96432@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>	<CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
	<20071004140822.GA96432@iib.unsam.edu.ar>
Message-ID: <4704FE72.1090206@biotek.uio.no>

Maybe you are missing the resource record in the emboss.default file for 
the trembl databank, and you have passed the wrong arguments to dbxflat. 
You should choose the emboss method in the DB entry. The 
emboss.default file should then also contain a resource entry for trembl:

RES trembl [
    type: Index
    idlen:  15
    acclen: 15
    svlen:  20
    keylen: 30
    deslen: 25
    orglen: 25
]

From the dbxflat output you quote, I can see that the command points to 
the embl resource:

[root@alfa trembl]# dbxflat -dbname trembl -idformat EMBL <--- Why EMBL?
-directory . -filenames uniprot_trembl.dat -release "37.0"
-date "24/07/07" -fields sv,acc,des,key,org
Database b+tree indexing for flat file databases
Resource name: embl  <--- That should say trembl. Why did you choose 
embl here?


When dbxflat asked you for a resource name, you really 
should have had a trembl RES entry to give it, and I am not sure that 
your idformat (EMBL) is correct.


GM


-- 
--
George Magklaras

Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/

EMBnet Norway:	http://www.no.embnet.org/




From fernan at iib.unsam.edu.ar  Thu Oct  4 18:41:44 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Thu, 4 Oct 2007 19:41:44 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <4704FE72.1090206@biotek.uio.no>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<4704FE72.1090206@biotek.uio.no>
Message-ID: <20071004224144.GA98760@iib.unsam.edu.ar>

George, 

thanks for your points.

| Maybe you are missing the resource record in the emboss.default file for 
| the trembl databank and you have passed the wrong arguments to dbxflat. 

I have this resource record in my emboss.default conf

RES embl [
  type: Index
  idlen:  15
  acclen: 15
  svlen:  15
  keylen: 25
  deslen: 25
  orglen: 25
]

|   You should choose the emboss method in the DB entry. 

OK

| Then, the 
| emboss.default file should contain also a resource entry for trembl:
| 
| RES trembl [
|     type: Index
|     idlen:  15
|     acclen: 15
|     svlen:  20
|     keylen: 30
|     deslen: 25
|     orglen: 25
| ]

Does the name of the resource matter? Mine is named 'embl' ...

|  From your dbxflat output you quote I can see that the command points to 
| the embl resource:
| 
| [root@alfa trembl]# dbxflat -dbname trembl -idformat EMBL <--- Why EMBL?

What other options are there? SWISS? GCG? GENBANK? This is AFAIK an
EMBL formatted file. But maybe I'm wrong ...

| -directory . -filenames uniprot_trembl.dat -release "37.0"
| -date "24/07/07" -fields sv,acc,des,key,org
| Database b+tree indexing for flat file databases
| Resource name: embl  <--- That should say trembl, Why did you choose 
| embl here?

Because the resource in my emboss.default file is named 'embl'.

| 
| When the dbxflat command asked you for a resource name, you really 
| should have a trembl RES entry and I am not sure that your idformat 
| (EMBL) is correct.
|
| GM
| --
| George Magklaras

Mmm ... maybe it's SWISS then?

From the dbxflat docs:
      EMBL : EMBL
     SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
        GB : Genbank, DDBJ
    REFSEQ : Refseq
Entry format [SWISS]: 

Thanks for your questions and pointers. I'm running dbxflat
overnight again to see if this makes any difference
(-idformat SWISS -resource trembl, with a new trembl RES
entry added to emboss.default). But so far, only 6 trembl.*
files have been produced and none of them is called
trembl.pxid (the file named in the error I reported earlier
in this thread).

[root@alfa trembl]# ls trembl.*
trembl.ent  trembl.xac  trembl.xde  trembl.xkw  trembl.xsv trembl.xtx

Fernan

PS: this is the first entry in my uniprot_trembl.dat file

[fernan@alfa trembl]$ head -45 uniprot_trembl.dat
ID   A0B532_METTP            Unreviewed;       337 AA.
AC   A0B532;
DT   28-NOV-2006, integrated into UniProtKB/TrEMBL.
DT   28-NOV-2006, sequence version 1.
DT   24-JUL-2007, entry version 6.
DE   RNA-3'-phosphate cyclase (EC 6.5.1.4).
GN   OrderedLocusNames=Mthe_0003;
OS   Methanosaeta thermophila (strain DSM 6194 / PT) (Methanothrix
OS   thermophila (strain DSM 6194 / PT)).
OC   Archaea; Euryarchaeota; Methanomicrobia; Methanosarcinales;
OC   Methanosaetaceae; Methanosaeta.
OX   NCBI_TaxID=349307;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RG   US DOE Joint Genome Institute;
RA   Copeland A., Lucas S., Lapidus A., Barry K., Detter J.C.,
RA   Glavina del Rio T., Hammon N., Israni S., Pitluck S., Chain P.,
RA   Malfatti S., Shin M., Vergez L., Schmutz J., Larimer F., Land M.,
RA   Hauser L., Kyrpides N., Kim E., Smith K.S., Ingram-Smith C.,
RA   Richardson P.;
RT   "Complete sequence of Methanosaeta thermophila PT.";
RL   Submitted (OCT-2006) to the EMBL/GenBank/DDBJ databases.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; CP000477; ABK13806.1; -; Genomic_DNA.
DR   GenomeReviews; CP000477_GR; Mthe_0003.
DR   GO; GO:0003963; F:RNA-3'-phosphate cyclase activity; IEA:InterPro.
DR   InterPro; IPR000228; RNA3'_term_phos_cycl.
DR   InterPro; IPR013796; RNA3'_term_phos_cycl_insert.
DR   PANTHER; PTHR11096; RNA3'_term_phos_cycl; 1.
DR   Pfam; PF01137; RTC; 1.
DR   Pfam; PF05189; RTC_insert; 1.
DR   PROSITE; PS01287; RTC; 1.
PE   4: Predicted;
KW   Complete proteome; Ligase.
SQ   SEQUENCE   337 AA;  36340 MW;  69F26755A1B8DA03 CRC64;
     MNKPQMIEID GSYGEGGGQI VRTSVALSTL TGIPVRIKNI RRNRPRPGLA AQHVRAIEAL
     AQISRAETRG VHLGSEEIEF IPGRISAGSY DVDIGTAGSV TLLIQCLLPA LTAAEGPVTV
     TVRGGTDVRW SPTVDYLEHV ALPAMHLFGV TATFRCERRG YYPRGGGVVV LSTRPSRLRP
     ARLELIEEGI CGISHCGSLP EHVARRQADA ALELLKEKGY DARIDIQTMS SSSPGSGITL
     WSGFRGSSAL GERGVRAEDV GREAAKALID ELKSKASVDV HLADQLIPYI ALAGGEYTTR
     EISSHTRTNI WTAQRILRCR IDIDEGEVFR IHSTGSG
//
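
Since the lookups earlier in the thread use both the accession (`trembl:A0B532`) and the entry name (`trembl:A0B532_METTP`), it can help to see which identifiers a record like the one above carries. This is a minimal sketch, assuming the two-letter line codes shown above (`ID`, `AC`, `//`). The function name is hypothetical and this is not how EMBOSS itself parses entries.

```python
def entry_identifiers(lines):
    """Yield (entry_name, [accessions]) for each record in a flat file.

    Assumes 'ID' lines carry the entry name as their first token, 'AC'
    lines carry ';'-separated accessions, and '//' ends a record.
    """
    name, accessions = None, []
    for line in lines:
        code, rest = line[:2], line[5:].strip()
        if code == "ID":
            tokens = rest.split()
            name = tokens[0] if tokens else None
        elif code == "AC":
            accessions.extend(a.strip() for a in rest.split(";") if a.strip())
        elif code == "//":
            yield name, accessions
            name, accessions = None, []
```

Passing an open file handle on the flat file as `lines` and printing the first few results is a quick way to confirm what the ID and AC fields of the index should contain.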




From sum732 at mail.usask.ca  Fri Oct  5 19:38:01 2007
From: sum732 at mail.usask.ca (Sudeep Mehrotra)
Date: Fri, 05 Oct 2007 17:38:01 -0600
Subject: [EMBOSS] Seqret and searching a database with entries in a file
Message-ID: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>

Hello,
I am wondering whether I can use "seqret" from EMBOSS to perform the
following action.

I have a database and a file that contains a list of protein IDs. I
want to use seqret to look up each entry listed in the file in the
given database and write the results to another file.
For example:
seqret "path to the database":AAT37944.1
When I run the above command on the command line, I get the entry
(protein name, protein sequence, etc.) in FASTA format. Instead of
giving one entry, I want to give the whole file, which consists of
similar entries.

Can someone help me here?
Thanks
Sudeep

From david.bauer at bayerhealthcare.com  Sat Oct  6 15:13:34 2007
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Sat, 6 Oct 2007 21:13:34 +0200
Subject: [EMBOSS] Seqret and searching a database with entries in a file
In-Reply-To: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
Message-ID: <OFDF8329EB.08CB3542-ONC125736C.0068FB65-C125736C.00699D5A@schering.de>

Hi Sudeep,

if you add a "@" character in front of a filename, EMBOSS interprets this 
as a "file of filenames".
So you can put all your IDs including the database name into a file (e.g. 
myseqs.fof).
Then you run "seqret @myseqs.fof".

Cheers,
David.



From gbottu at vub.ac.be  Mon Oct  8 03:12:18 2007
From: gbottu at vub.ac.be (Guy Bottu)
Date: Mon, 08 Oct 2007 09:12:18 +0200
Subject: [EMBOSS] Seqret and searching a database with entries in a file
In-Reply-To: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
References: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
Message-ID: <4709D852.20007@vub.ac.be>

Sudeep Mehrotra wrote:
> I have a database and I have a file which consists of list of protein  
> IDs. I want use seqret to search each entry (in the given file) in  
> the given database and output the search into another file.

	Dear Sudeep,

If you can, using some script, transform your file into the format:

xxx:AC3355
xxx:CG6754
xxx:AV6754

with xxx the name of the databank (you might have to use bare accession 
numbers rather than version numbers), then it is easy, just run

seqret list::File

If you want the original entries rather than the entries in fastA 
format, use entret instead of seqret.
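
The transformation described above could be scripted along these lines. This is a sketch under assumptions: one identifier per input line, a trailing `.N` version suffix that should be dropped (as in `AAT37944.1` from the original question), and a databank name `xxx` standing in for whatever is defined locally. The function name is hypothetical.

```python
import re


def to_list_file_lines(ids, dbname, strip_version=True):
    """Turn raw identifiers into 'dbname:ID' lines for an EMBOSS list file."""
    out = []
    for raw in ids:
        ident = raw.strip()
        if not ident or ident.startswith("#"):
            continue  # skip blank lines and comments
        if strip_version:
            # Drop a trailing '.<digits>' version, e.g. AAT37944.1 -> AAT37944
            ident = re.sub(r"\.\d+$", "", ident)
        out.append(f"{dbname}:{ident}")
    return out
```

Writing the returned lines, one per line, to a file gives the input for the `seqret list::File` command mentioned above.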

	Guy Bottu,
	Belgian EMBnet Node


From charles-listes-emboss at plessy.org  Mon Oct  8 02:30:50 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Mon, 8 Oct 2007 15:30:50 +0900
Subject: [EMBOSS] About the EMBOSS quick guide.
Message-ID: <20071008063047.GB9819@kunpuu.plessy.org>

Dear EMBOSS developers,

I am a member of the packaging team that takes care of integrating EMBOSS in
Debian. I just realised today that the Quick Guide to EMBOSS is
released under a "noncommercial" licence.

file:///usr/share/EMBOSS/doc/manuals/emboss_qg.pdf

Debian puts a strong emphasis on not mixing programs which do not meet
the "Debian Free Software Guidelines" (DFSG) with the ones which do. In
our case, EMBOSS is free according to the DFSG, but the Quick Guide is
not, as restrictions on commercial use do not comply with guideline
number 6:

  No Discrimination Against Fields of Endeavor

  The license must not restrict anyone from making use of the program in a
  specific field of endeavor. For example, it may not restrict the program
  from being used in a business, or from being used for genetic research.

From my point of view as a packager, the simplest way to solve this problem
would be for you to relicence the Quick Guide under a licence that is free
according to the DFSG, such as BSD or GPL. Unfortunately,
the guide's author, David Martin, left EMBnet and I do not know how to
contact him.

Importantly, the DFSG also require the sources of works distributed in
Debian to be available. If it is possible to relicence the Quick Guide,
could somebody send me its sources? Debian integrates a bug reporting
and tracking system, and having the sources available in Debian could
bring opportunities to receive patches.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team.
http://www.debian.org/devel/debian-med
Wako, Saitama, Japan

From georgios at biotek.uio.no  Mon Oct  8 04:59:56 2007
From: georgios at biotek.uio.no (George Magklaras)
Date: Mon, 08 Oct 2007 10:59:56 +0200
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071004224144.GA98760@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<4704FE72.1090206@biotek.uio.no>
	<20071004224144.GA98760@iib.unsam.edu.ar>
Message-ID: <4709F18C.2070304@biotek.uio.no>

Hi Fernan,

Fernan Aguero wrote:
> George, 

> 
> Does the name of the resource matter? Mine is named 'embl' ...
> 
If you plan to have the same values for all databases, no. But I tend to 
choose different length values for different databanks, so in that case, 
I have a different RES entry for each databank.


> What other options are there SWISS? GCG? GENBANK? This is AFAIK an
> EMBL formatted file. But maybe I'm wrong ...
>
I believe that TrEMBL should be formatted with the SWISS entry format in 
dbxflat (-idformat SWISS).


-- 
--
George Magklaras

Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/

EMBnet Norway:	http://www.no.embnet.org/


From charles-listes-emboss at plessy.org  Mon Oct  8 19:38:28 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Tue, 9 Oct 2007 08:38:28 +0900
Subject: [EMBOSS] Bug in degapseq ?
Message-ID: <20071008233828.GA32069@kunpuu.plessy.org>

Dear developers,

If I use degapseq on a file, it prompts me for the name of the output, but if
the data comes from stdin, degapseq crashes. This does not happen if the name of
the output is given.

chouca?~?$ cat toto
>Xenopus-1a
-----MVLLKCEYRDEEEDLTS---ASPCSV--TSSFRSPAT----QTCSSDDEQLLSPT
SP--------------GQHQGEE---NS----------------------------PRCR
RSRGRA-QGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGD-----P-VHRS--AS-----TPAAAI---LV---QDSSS
SQSP-----SWS--CSSSPSS-----S-------CCSFS--PASP----ASST--SDSIE
SWQ---PSELHLNPFMSASSA---FI----
>Xenopus-1b
-----MVLLKCEYRDEVSELTS---VSPCSVSSSSSHPSPAM----QTCSSDDEQLHSPT
SPTL-------THLQQGRDQGEE---NS----------------------------PRCR
RSRAR------GDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAKLTKI
ETLRFAHNYIWALSETLRLAD-----Q-LHGS--TS-----TPAAAI---LV---QDSYP
SLSP-----SWS--CSSSPSS----NS-------CDSFS--PTSP----ASST--SDSIE
YWQ---PSELRLNPFMSAL-----------
>Gallus-2
------MPVKAESPAPAAEDE--L-LLLRLASPAPSASLP-------SSAGEEDEDEEDG
RP-------------RRLQEGA----------------------------------RRAG
RQRGPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKI
ETLRFAHNYIWALTETLRL----AGAARLGGA--AD-A---APGAA-----A---EG-SP
SPAS-----SWS--GGASPAP-----SA---SPYACTLS--PGSP----AGSA--SD-AE
HW---PPPRGRFAPPPPPHR----CL----

chouca?~?$ cat toto | degapseq stdin
Removes gap characters from sequences
output sequence(s) [xenopus-1a.fasta]: 
   EMBOSS An error in ajmess.c at line 1662:
END-OF-FILE reading from user

chouca?~?$ cat toto | degapseq stdin stdout
Removes gap characters from sequences
>Xenopus-1a
MVLLKCEYRDEEEDLTSASPCSVTSSFRSPATQTCSSDDEQLLSPTSPGQHQGEENSPRC
RRSRGRAQGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGDPVHRSASTPAAAILVQDSSSSQSPSWSCSSSPSSSCCSF
SPASPASSTSDSIESWQPSELHLNPFMSASSAFI
>Xenopus-1b
MVLLKCEYRDEVSELTSVSPCSVSSSSSHPSPAMQTCSSDDEQLHSPTSPTLTHLQQGRD
QGEENSPRCRRSRARGDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAK
LTKIETLRFAHNYIWALSETLRLADQLHGSTSTPAAAILVQDSYPSLSPSWSCSSSPSSN
SCDSFSPTSPASSTSDSIEYWQPSELRLNPFMSAL
>Gallus-2
MPVKAESPAPAAEDELLLLRLASPAPSASLPSSAGEEDEDEEDGRPRRLQEGARRAGRQR
GPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKIETL
RFAHNYIWALTETLRLAGAARLGGAADAAPGAAAEGSPSPASSWSGGASPAPSASPYACT
LSPGSPAGSASDAEHWPPPRGRFAPPPPPHRCL

chouca?~?$ degapseq toto
Removes gap characters from sequences
output sequence(s) [xenopus-1a.fasta]: stdout
>Xenopus-1a
MVLLKCEYRDEEEDLTSASPCSVTSSFRSPATQTCSSDDEQLLSPTSPGQHQGEENSPRC
RRSRGRAQGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGDPVHRSASTPAAAILVQDSSSSQSPSWSCSSSPSSSCCSF
SPASPASSTSDSIESWQPSELHLNPFMSASSAFI
>Xenopus-1b
MVLLKCEYRDEVSELTSVSPCSVSSSSSHPSPAMQTCSSDDEQLHSPTSPTLTHLQQGRD
QGEENSPRCRRSRARGDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAK
LTKIETLRFAHNYIWALSETLRLADQLHGSTSTPAAAILVQDSYPSLSPSWSCSSSPSSN
SCDSFSPTSPASSTSDSIEYWQPSELRLNPFMSAL
>Gallus-2
MPVKAESPAPAAEDELLLLRLASPAPSASLPSSAGEEDEDEEDGRPRRLQEGARRAGRQR
GPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKIETL
RFAHNYIWALTETLRLAGAARLGGAADAAPGAAAEGSPSPASSWSGGASPAPSASPYACT
LSPGSPAGSASDAEHWPPPRGRFAPPPPPHRCL

Have a nice day,

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan

From david at compbio.dundee.ac.uk  Tue Oct  9 11:56:57 2007
From: david at compbio.dundee.ac.uk (David Martin)
Date: Tue, 09 Oct 2007 16:56:57 +0100
Subject: [EMBOSS] Updating the Quick Guide
Message-ID: <C3316359.2C38C%david@compbio.dundee.ac.uk>

Prompted by Charles' request yesterday, I am in the process of updating the
EMBOSS quick guide. It was last touched about 8 years ago, so comments and
suggestions on what is new and what should be dropped would be much
appreciated.

..d


From andrespinzon at gmail.com  Tue Oct  9 12:32:09 2007
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Tue, 9 Oct 2007 11:32:09 -0500
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <C3316359.2C38C%david@compbio.dundee.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
Message-ID: <8968fc7e0710090932g63b77a9k7d83bea25c176349@mail.gmail.com>

David,
I am in the process of writing an EMBOSS book called "Análisis de
secuencias usando EMBOSS" ("Molecular sequence analysis using
EMBOSS" in English). It will be released under a CC license (and, of
course, as Open Source), so maybe some of the book's content can be used.
If you need help updating the "old" quick guide, please let me
know; I'll be more than glad to help.

Regards,

On 10/9/07, David Martin <david at compbio.dundee.ac.uk> wrote:
> Prompted by charles' request yesterday I am in the process of updating the
> EMBOSS quick guide. it was last touched about 8 years ago so comments and
> suggestions on what is new, and what should be dropped would be much
> appreciated.
>
> ..d
>
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


-- 
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768


From michael.watson at bbsrc.ac.uk  Wed Oct 10 09:02:49 2007
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Wed, 10 Oct 2007 14:02:49 +0100
Subject: [EMBOSS] XFree86 vs xorg
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>

Hi

My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
see this is part of XFree86-devel.  However, I am a Red Hat Enterprise
Linux 4 user, and my X Window installation seems to be the x.org branch
rather than XFree86.

So, is there a workaround, or should I overwrite my xorg libraries with
XFree86 ones?

Thanks
Mick

The information contained in this message may be confidential or legally
privileged and is intended solely for the addressee. If you have
received this message in error please delete it & notify the originator
immediately.
Unauthorised use, disclosure, copying or alteration of this message is
forbidden & may be unlawful. 
The contents of this e-mail are the views of the sender and do not
necessarily represent the views of the Institute. 
This email and associated attachments has been checked locally for
viruses but we can accept no responsibility once it has left our
systems.
Communications on Institute computers are monitored to secure the
effective operation of the systems and for other lawful purposes. 


From dalloliogm at gmail.com  Wed Oct 10 09:23:01 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Wed, 10 Oct 2007 15:23:01 +0200
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <C3316359.2C38C%david@compbio.dundee.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
Message-ID: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>

You should update the guide on how to install emboss.

In particular, explain how to use the .deb and .rpm packages, since a
lot of people still try to install emboss by compiling it, and it is a
pain.


2007/10/9, David Martin <david at compbio.dundee.ac.uk>:
> Prompted by charles' request yesterday I am in the process of updating the
> EMBOSS quick guide. it was last touched about 8 years ago so comments and
> suggestions on what is new, and what should be dropped would be much
> appreciated.
>
> ..d
>
>


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com

From ajb at ebi.ac.uk  Wed Oct 10 09:30:09 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 10 Oct 2007 14:30:09 +0100 (BST)
Subject: [EMBOSS] XFree86 vs xorg
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <50101.81.98.241.17.1192023009.squirrel@webmail.ebi.ac.uk>

Hello Mick,

For xorg all you need to do is to install the xorg-x11-proto-devel RPM
and then, in EMBOSS-5.0.0, do a 'make clean' and configure again.

You might want to install the gd-devel RPM at the same time (to get PNG
support). If you install them both using 'yum' then all the dependencies
will be pulled-in.
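
The steps above can be sketched as a short script. This is an illustration under assumptions: the yum step needs root, the source tree is taken to be ./EMBOSS-5.0.0 (override with SRC), and the run helper is a made-up dry-run wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
# Sketch of the rebuild after installing the missing headers.
# Assumptions: yum is configured, and SRC points at the unpacked
# EMBOSS source tree (hypothetical default below).
SRC=${SRC:-EMBOSS-5.0.0}
DRY_RUN=${DRY_RUN:-1}

# Print each command; execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then "$@"; fi
}

run yum install xorg-x11-proto-devel gd-devel &&
run cd "$SRC" &&
run make clean &&
run ./configure &&
run make
```

Running it once with the default DRY_RUN=1 shows the commands without changing anything; DRY_RUN=0 executes them in order and stops at the first failure.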

HTH

Alan


> Hi
>
> My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
> see this is part of XFree86-devel.  However, as a red hat enterprise
> linux 4 user, my X windows seems to be the x.org branch rather than
> XFree86....
>
> So, is there a workaround, or should I overwrite my xorg libraries with
> XFree86 ones?
>
> Thanks
> Mick


From ajb at ebi.ac.uk  Wed Oct 10 09:39:26 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 10 Oct 2007 14:39:26 +0100 (BST)
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
Message-ID: <42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>

> You should update the guide on how to install emboss.
>
> In particular, explain how to use the .deb and .rpm packages, since a
> lot of people still try to install emboss by compiling it, and it is a
> pain.

I'll leave that up to David to decide but the information is in the new
FAQ which, yesterday, I submitted to my colleagues for approval and
will then appear in CVS and later online. There was already some RPM
info around but no .deb stuff. The info will also be in the books
which Jon mentioned recently.

Alan


From charles-listes-emboss at plessy.org  Wed Oct 10 09:24:08 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 10 Oct 2007 22:24:08 +0900
Subject: [EMBOSS] XFree86 vs xorg
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <20071010132408.GJ990@kunpuu.plessy.org>

On Wed, Oct 10, 2007 at 02:02:49PM +0100, michael watson (IAH-C) wrote:
> Hi
> 
> My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
> see this is part of XFree86-devel.  However, as a red hat enterprise
> linux 4 user, my X windows seems to be the x.org branch rather than
> XFree86....

In Xorg, the libraries have been split into individual packages. I
think that you can find Xlib.h in a package named libx11-devel, or
something similar.
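
Before installing anything, it can help to check whether Xlib.h is already somewhere on the system. The sketch below is an illustration only: find_header is a made-up helper, and the two directories are common X11 include locations assumed here, not taken from this thread.

```shell
# Sketch: look for a header file under one or more include directories.
# find_header NAME DIR... prints every matching path, or nothing.
find_header() {
    name=$1
    shift
    # Missing directories are ignored rather than reported.
    find "$@" -name "$name" -print 2>/dev/null
    return 0
}

# Assumed search locations; adjust for your distribution.
find_header Xlib.h /usr/include /usr/X11R6/include
```

On an RPM-based system, 'rpm -qf' followed by a path that was found reports which installed package owns that file.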

Have a nice day,

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan

From dalloliogm at gmail.com  Wed Oct 10 10:06:16 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Wed, 10 Oct 2007 16:06:16 +0200
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
Message-ID: <5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>

2007/10/10, ajb at ebi.ac.uk <ajb at ebi.ac.uk>:
> There was already some RPM
> info around but no .deb stuff. The info will also be in the books
> which Jon mentioned recently.
>

Hi,
there is an emboss 5.0 package in Debian Sid.

You just have to add something like this:
"""
If you are a Debian/Ubuntu user, you can install emboss with the command:
>>> sudo aptitude install emboss
"""

Actually, this would work only for Debian Sid, but I believe the
package will also be included in Ubuntu 7.10 and in Debian Etch
shortly.


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com

From charles-listes-emboss at plessy.org  Wed Oct 10 10:55:35 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 10 Oct 2007 23:55:35 +0900
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
Message-ID: <20071010145535.GK990@kunpuu.plessy.org>

On Wed, Oct 10, 2007 at 04:06:16PM +0200, Giovanni Marco Dall'Olio wrote:
> hi,
> there is an emboss 5.0 package in debian sid.
> 
> Actually, this would work only for Debian Sid, but I believe the
> package will be included also in Ubuntu 7/10 and in debian etch in the
> short time.

Dear Giovanni,

Because Debian Etch is the stable version, it does not receive new
packages unless they fix security issues or grave bugs. The emboss
package for Debian will never be part of Etch nor its updates.

However, some Debian developers provide a separate repository to which
only official developers upload recent packages recompiled for Etch. The
site is called backports.org.

If you or another reader is interested, we can prepare such a backport
for Etch.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan

From david at compbio.dundee.ac.uk  Wed Oct 10 10:30:24 2007
From: david at compbio.dundee.ac.uk (David Martin)
Date: Wed, 10 Oct 2007 15:30:24 +0100
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
Message-ID: <C332A090.2C3D0%david@compbio.dundee.ac.uk>

On 10/10/07 14:23, "Giovanni Marco Dall'Olio" <dalloliogm at gmail.com> wrote:

> You should update the guide on how to install emboss.
> 
> In particular, explain how to use the .deb and .rpm packages, since a
> lot of people still try to install emboss by compiling it, and it is a
> pain.
> 
> 
> 2007/10/9, David Martin <david at compbio.dundee.ac.uk>:
>> Prompted by charles' request yesterday I am in the process of updating the
>> EMBOSS quick guide. it was last touched about 8 years ago so comments and
>> suggestions on what is new, and what should be dropped would be much
>> appreciated.
>> 
>> ..d


The aim of the Quick Guide is to provide a quick reference, on one sheet of
A4 (two sides), to the common programs and command line arguments that are
used with EMBOSS. I found it very useful when teaching as an aide-memoire
for myself and the students.

Explaining how to install EMBOSS on each architecture is NOT the aim - for
that read the admin guide, the maintenance of which Alan and others have
taken off my hands. I will however reference the admin guide for
installation info.

If you haven't seen the quick guide, a somewhat dated PDF is available in
emboss/docs/manuals/emboss_qg.pdf

regards

..d
  

From Veronique.Martin at jouy.inra.fr  Thu Oct 11 03:39:44 2007
From: Veronique.Martin at jouy.inra.fr (Veronique.Martin at jouy.inra.fr)
Date: Thu, 11 Oct 2007 09:39:44 +0200 (CEST)
Subject: [EMBOSS] prosextract option?
Message-ID: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>


Hi,

I want to run prosextract, but I would like to build the PROSITE motif
files in a directory of my choice. At present the only possible location
is this path: emboss/share/EMBOSS/data/PROSITE
Is it possible to have an option for choosing the output directory?

I tried using the .embossrc file, but for this database (PROSITE) only,
that file is not taken into account; prosextract used the
emboss/share/EMBOSS/emboss.default file.

Regards,

VM

-------------------------------------------------
Véronique MARTIN
INRA - Unité Mathématique, Informatique et Génome
78352 Jouy-en Josas cedex
tel.: 01 34 65 29 74
-------------------------------------------------

From dalloliogm at gmail.com  Thu Oct 11 04:36:04 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Thu, 11 Oct 2007 10:36:04 +0200
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <20071010145535.GK990@kunpuu.plessy.org>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
	<20071010145535.GK990@kunpuu.plessy.org>
Message-ID: <5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>

2007/10/10, Charles Plessy <charles-listes-emboss at plessy.org>:
>
> Because Debian Etch is the stable version, it does not receive new
> packages unless they fix security issues or grave bugs. The emboss
> package for Debian will never be part of Etch nor its updates.
>

Really?
I didn't know emboss had grave bugs.

Are you saying they can't be fixed?
I can't find many references to bugs in emboss, but maybe you are
referring to bugs like this:
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=427439 ?


> However, some Debian developpers provides a separate repository in which
> only official developers upload recent packages recompiled for Etch. The
> site is called backports.org.
>
> If you or another reader is interested, we can prepare such a backport
> for Etch.

Thank you very much: I think many people are interested, especially
in the Ubuntu user community.
Emboss is seen as an educational package for learning bioinformatics, so
it would be better if people could install it easily by themselves
instead of asking a system manager.
 Maybe you can just add the link to Debian backports in the help page.


> Have a nice day,
>

and to you, too!

> --
> Charles Plessy
> Debian-Med packaging team
> Wako, Saitama, Japan


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com

From charles-listes-emboss at plessy.org  Thu Oct 11 05:06:13 2007
From: charles-listes-emboss at plessy.org (charles-listes-emboss at plessy.org)
Date: Thu, 11 Oct 2007 18:06:13 +0900
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
	<20071010145535.GK990@kunpuu.plessy.org>
	<5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>
Message-ID: <20071011090613.GA31072@kunpuu.plessy.org>

On Thu, Oct 11, 2007 at 10:36:04AM +0200, Giovanni Marco Dall'Olio wrote:
> 2007/10/10, Charles Plessy <charles-listes-emboss at plessy.org>:
> >
> > Because Debian Etch is the stable version, it does not receive new
> > packages unless they fix security issues or grave bugs. The emboss
> > package for Debian will never be part of Etch nor its updates.
> >
> 
> Really?
> I didn't know emboss had grave bugs.

Dear Giovanni,

I have been unclear. The reason why EMBOSS is not in Debian Etch is
that its Debian package was not ready when Etch was released.
Furthermore, it is the policy of Debian to accept changes to a stable
release only when they relate to security or grave bugs. Therefore,
Debian Etch will never contain the Debian packages we prepared for EMBOSS.

I will announce on this list when the package is available through
backports.org.


> Are you saying they can't be fixed?
> I can't find many references to bugs in emboss, but maybe you are
> referring to bugs like this:
> - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=427439 ?

Yes, the current package still has some quality issues. However, all the
ones reported so far are solved in our SVN repository. I hope that I can
update the Debian package of EMBOSS in Debian Sid soon.

http://svn.debian.org/wsvn/pkg-emboss/emboss/trunk/debian/changelog?op=file&rev=0&sc=0

(Anyone who explores this repository a bit can get a glimpse of what we
have in the pipeline...).


> Thank you very much: I think many people are interested, expecially
> from the Ubuntu users community.

By the way, if you ask someone from MOTU Science, I think that it is
possible to fast-track the emboss packages into Ubuntu...


>  Maybe you can just add the link to debian backports in the help page.

The new packages.debian.org website advertises the backports. See for
example the page for OpenOffice.org: http://packages.debian.org/openoffice.org


Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team.
Wako, Saitama, Japan

From Laurence.Amilhat at toulouse.inra.fr  Thu Oct 11 05:44:40 2007
From: Laurence.Amilhat at toulouse.inra.fr (Laurence Amilhat)
Date: Thu, 11 Oct 2007 11:44:40 +0200
Subject: [EMBOSS] plcore.c error when compiling
Message-ID: <470DF088.4020303@toulouse.inra.fr>

Dear Emboss users,


I am trying to install EMBOSS on Linux Ubuntu 7.04 Feisty Fawn.
I downloaded the following tar.gz: EMBOSS-5.0.0.tar.gz 
<ftp://emboss.open-bio.org/pub/EMBOSS/EMBOSS-5.0.0.tar.gz>

I ran ./configure (I have the graphics libs z, png and gd),
but when I launch make, I get the following message.
Does anyone have an idea why? Did I miss a lib or something?

Thank you for your help,

Best regards,

Laurence


plcore.c: In function 'int text2fci(const char*, unsigned char*, 
unsigned char*)':
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c: In function 'void difilt(PLINT*, PLINT*, PLINT, PLINT*, 
PLINT*, PLINT*, PLINT*)':
plcore.c:887: warning: converting to 'int' from 'PLFLT'
plcore.c:888: warning: converting to 'int' from 'PLFLT'
plcore.c:897: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:899: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:909: warning: converting to 'int' from 'PLFLT'
plcore.c:910: warning: converting to 'int' from 'PLFLT'
plcore.c:919: warning: converting to 'int' from 'PLFLT'
plcore.c:920: warning: converting to 'int' from 'PLFLT'
plcore.c: In function 'void sdifilt(short int*, short int*, PLINT, 
PLINT*, PLINT*, PLINT*, PLINT*)':
plcore.c:946: warning: converting to 'short int' from 'PLFLT'
plcore.c:947: warning: converting to 'short int' from 'PLFLT'
plcore.c:955: warning: converting to 'short int' from 'PLFLT'
plcore.c:956: warning: converting to 'short int' from 'PLFLT'
plcore.c:966: warning: converting to 'short int' from 'PLFLT'
plcore.c:967: warning: converting to 'short int' from 'PLFLT'
plcore.c:976: warning: converting to 'short int' from 'PLFLT'
plcore.c:977: warning: converting to 'short int' from 'PLFLT'
plcore.c: In function 'void pldid2pc(PLFLT*, PLFLT*, PLFLT*, PLFLT*)':
plcore.c:1079: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1080: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c:1081: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1082: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c: In function 'void pldip2dc(PLFLT*, PLFLT*, PLFLT*, PLFLT*)':
plcore.c:1125: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1126: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c:1127: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1128: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c: In function 'void calc_didev()':
plcore.c:1345: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1346: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1347: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1348: warning: converting to 'PLINT' from 'PLFLT'
plcore.c: In function 'void plP_setpxl(PLFLT, PLFLT)':
plcore.c:3264: warning: converting to 'PLINT' from 'double'
plcore.c:3265: warning: converting to 'PLINT' from 'double'
make[2]: *** [plcore.lo] Erreur 1
make[2]: quittant le répertoire « /tmp/EMBOSS-5.0.0/plplot »
make[1]: *** [all-recursive] Erreur 1
make[1]: quittant le répertoire « /tmp/EMBOSS-5.0.0/plplot »
make: *** [all-recursive] Erreur 1
Exit 2

-- 
====================================================================
= Laurence Amilhat    INRA Toulouse 31326 Castanet-Tolosan     	   = 
= Tel: 33 5 61 28 53 34   Email: laurence.amilhat at toulouse.inra.fr =
====================================================================


From jison at ebi.ac.uk  Thu Oct 11 08:16:19 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Thu, 11 Oct 2007 13:16:19 +0100 (BST)
Subject: [EMBOSS] prosextract option?
In-Reply-To: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>
References: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>
Message-ID: <48865.84.92.187.247.1192104979.squirrel@webmail.ebi.ac.uk>

Hi Veronique

prosextract is indeed hard-coded to write to the EMBOSS data directory
(defined by the EMBOSS environment variable EMBOSS_DATA).

You could always copy the file to your current working directory or into
a directory called ".embossdata" in either your home or current working
directory and the file could still be read by EMBOSS.
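
The copy described above might be scripted roughly as follows. This is a sketch under assumptions: EMBOSS_DATA is set to the EMBOSS data directory, prosextract has already written its PROSITE directory there, and copy_prosite is a made-up helper name. Whether a given EMBOSS program picks up the PROSITE subdirectory from the private copy should be checked on your own installation.

```shell
# Sketch: keep a private copy of the PROSITE data written by prosextract.
# copy_prosite SRC DEST copies SRC/PROSITE into DEST (created if needed).
copy_prosite() {
    src=$1    # EMBOSS data directory, e.g. "$EMBOSS_DATA"
    dest=$2   # private data directory, e.g. "$HOME/.embossdata"
    mkdir -p "$dest" && cp -R "$src/PROSITE" "$dest/"
}

# Only copy when the system-wide PROSITE directory actually exists.
if [ -d "${EMBOSS_DATA:-}/PROSITE" ]; then
    copy_prosite "$EMBOSS_DATA" "$HOME/.embossdata"
fi
```

Using ".embossdata" in the current working directory instead of the home directory works the same way; only the second argument changes.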

If that doesn't help an option to write to any specified directory could
easily be added - please advise.

Cheers

Jon


>
> Hi,
>
> I want to run prosextract, but I would like build prosite motif in
> directory of my choice. Now the only possibility is in this path :
> emboss/share/EMBOSS/data/PROSITE
> Is it possbile to have got an option for choosing the output directory?
>
> I had tried by using the .embossrc file but only for this database
> (prosite) this file is not considered, prosextract used  the
> emboss/share/EMBOSS/emboss.default file.
>
> Regards,
>
> VM
>
> -------------------------------------------------
> Véronique MARTIN
> INRA - Unité Mathématique, Informatique et Génome
> 78352 Jouy-en Josas cedex
> tel.: 01 34 65 29 74
> -------------------------------------------------


From pmr at ebi.ac.uk  Wed Oct 24 04:07:21 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 24 Oct 2007 09:07:21 +0100
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <20071008233828.GA32069@kunpuu.plessy.org>
References: <20071008233828.GA32069@kunpuu.plessy.org>
Message-ID: <471EFD39.5060202@ebi.ac.uk>

Charles Plessy wrote:
> If I use degapseq on a file, it prompts me for the name of the output, but if
> the data comes from stdin, degapseq crashes. It does not happen if the name of
> the output is given.

> chouca?~?$ cat toto | degapseq stdin
> Removes gap characters from sequences
> output sequence(s) [xenopus-1a.fasta]: 
>    EMBOSS An error in ajmess.c at line 1662:
> END-OF-FILE reading from user

This is because you are reading from stdin, but then degapseq tries to 
read the output filename from stdin.

You do need to specify the output filename, or use -auto to accept the 
default (or -filter to use stdout and to read from stdin).

With -auto and -filter the program will no longer be using stdin for 
user replies.
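
The two qualifiers mentioned above can be tried as follows. This is a minimal sketch that assumes EMBOSS is installed and that the gapped FASTA file from the original report is available as 'toto'; degap_from_stdin is a name made up for this illustration.

```shell
# Read gapped sequences on stdin and write degapped sequences to stdout.
# -filter makes degapseq read from stdin, write to stdout and not prompt.
degap_from_stdin() {
    degapseq -filter
}

# Usage: only attempted when degapseq and the example file are present.
if command -v degapseq >/dev/null 2>&1 && [ -f toto ]; then
    cat toto | degap_from_stdin

    # Alternative: keep 'stdin' as the input name and let -auto accept
    # the default output file name instead of prompting for it.
    cat toto | degapseq stdin -auto
fi
```

With -filter the result can be piped straight into another program; with -auto it lands in the default output file suggested by the prompt.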

Hmmm ... maybe we could catch these cases ... tricky though as really it 
is an explicit search for "stdin" as an input file/sequence. I could 
invent examples where we would guess wrongly.

Hope that helps,

Peter

From charles-listes-emboss at plessy.org  Wed Oct 24 10:37:06 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 24 Oct 2007 23:37:06 +0900
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <471EFD39.5060202@ebi.ac.uk>
References: <20071008233828.GA32069@kunpuu.plessy.org>
	<471EFD39.5060202@ebi.ac.uk>
Message-ID: <20071024143706.GB24491@kunpuu.plessy.org>

On Wed, Oct 24, 2007 at 09:07:21AM +0100, Peter Rice wrote:
> 
> You do need to specify the output filename, or use -auto to accept the 
> default (or -filter to use stdout and to read from stdin).
> 
> With -auto and -filter the program will no longer be using stdin for 
> user replies.

Oh, I completely overlooked the fact that the emboss programs can take
their user replies from stdin. Maybe the most straightforward way to
inform users of this mistake would be to change the error message to
something like: "Error: could not open file '...............'", in which
the file name would be truncated at the end of the line. The user
would quickly understand if the file name is something like AGTCCAGGTA...

Have a nice day,

-- 
Charles Plessy
Wako, Saitama, Japan

From pmr at ebi.ac.uk  Wed Oct 24 12:53:27 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 24 Oct 2007 17:53:27 +0100
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <20071024143706.GB24491@kunpuu.plessy.org>
References: <20071008233828.GA32069@kunpuu.plessy.org>	<471EFD39.5060202@ebi.ac.uk>
	<20071024143706.GB24491@kunpuu.plessy.org>
Message-ID: <471F7887.5050004@ebi.ac.uk>

Charles Plessy wrote:

> Oh, I completely overlooked the fact that the emboss programs can take
> their user replies from stdin. Maybe then the most straightforward to
> inform users from this mistake would be to change the error message to
> something like : "Error: could not open file '...............', in which
> the name of the file would be truncated to the end of the line. The user
> would quickly understand if the file name is someting like AGTCCAGGTA...

Or perhaps they would not quickly understand ... because it took me a 
few runs before I realised that was the problem :-)

I think we can keep track of stdin being opened in EMBOSS and refuse to 
prompt for input.
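
The idea can be illustrated with a small shell analogue. This is not EMBOSS code and says nothing about how EMBOSS would implement it; open_input, prompt_allowed and stdin_used are names invented for the sketch, which simply remembers that stdin has been claimed as a data source and refuses to prompt afterwards.

```shell
# Illustration only: track whether stdin is already in use for data.
stdin_used=0

# Hypothetical helper: note when the input name is literally "stdin".
open_input() {
    if [ "$1" = "stdin" ]; then stdin_used=1; fi
    return 0
}

# Prompting is only safe while stdin has not been claimed for data.
prompt_allowed() {
    [ "$stdin_used" -eq 0 ]
}

open_input stdin
if prompt_allowed; then
    printf 'output sequence(s): '
else
    echo "Error: stdin is already used for input; name the output on the command line" >&2
fi
```

The point of the design is that the user gets an explicit message instead of an END-OF-FILE error from a prompt that can never be answered.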

regards,

Peter

From staffa at niehs.nih.gov  Wed Oct 24 13:21:37 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 24 Oct 2007 13:21:37 -0400
Subject: [EMBOSS] GUI interfaces
Message-ID: <C344F761.674B%staffa@niehs.nih.gov>

Friends
    We are preparing for the day GCG goes away by seriously promoting EMBOSS
to our users. 
This page
http://emboss.sourceforge.net/interfaces/
lists 15 GUIs.
Apparently ColiMate is not an existing GUI for EMBOSS
but a development tool.
Please tell me:
Which of the 15 GUIs listed are complete and available?
Which do you think is best?

Thank you
 
Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: Roy W. Reter (reter at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From andrespinzon at gmail.com  Wed Oct 24 14:11:18 2007
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Wed, 24 Oct 2007 13:11:18 -0500
Subject: [EMBOSS] GUI interfaces
In-Reply-To: <C344F761.674B%staffa@niehs.nih.gov>
References: <C344F761.674B%staffa@niehs.nih.gov>
Message-ID: <8968fc7e0710241111odff847dge2d0d16889c16e32@mail.gmail.com>

In my experience, wEMBOSS [1] and EMBOSS-Explorer are really easy to
configure and provide different user experiences that complement each
other.

[1] http://bioinf.ibun.unal.edu.co/wEMBOSS/

Regards,




-- 
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Mycology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768


From golharam at umdnj.edu  Wed Oct 24 13:58:08 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 24 Oct 2007 13:58:08 -0400
Subject: [EMBOSS] GUI interfaces
In-Reply-To: <C344F761.674B%staffa@niehs.nih.gov>
References: <C344F761.674B%staffa@niehs.nih.gov>
Message-ID: <471F87B0.8030308@umdnj.edu>

Hi Nick,

We (UMDNJ) migrated off of GCG several years ago.  We found most of our 
users prefer the command-line interface for shell scripting or a web 
interface for GUI access from their own computers.

We use EMBOSS-Explorer for the web interface.  It's (much) cleaner and 
faster than SeqWeb ever was and doesn't rely on the server storing user 
data.  We no longer have to back up user data, because storage moved 
from the server to the users themselves.  There are no issues with 
user account management (usernames/passwords) with this system either.
With GCG, we would have at least 1 or 2 user issues per month.  Since 
the switch, I can honestly say our user issues are maybe 1 or 2 per year.

If you have any questions about this, feel free to email me,

Ryan

----------------
Ryan Golhar, PhD
golharam at umdnj.edu
Computational Biologist
Informatics Institute at UMDNJ



From kann.vearasilp at mu.edu  Thu Oct 25 15:07:37 2007
From: kann.vearasilp at mu.edu (Kann Vearasilp)
Date: Thu, 25 Oct 2007 14:07:37 -0500
Subject: [EMBOSS] Cannot open division file
Message-ID: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>

Hello everyone,

I have just finished indexing a GenBank database for my lab using the
dbiflat command. I set up an emboss.default file based on the
emboss.default.template that was provided. I used "seqret" to test the
system, and it seems that EMBOSS cannot find the division file.

I can see from the archive that this kind of problem has also occurred
with the test database provided with EMBOSS
(http://emboss.open-bio.org/pipermail/emboss/2005-November/002323.html).
However, I am pretty sure that I pointed the path to my database
correctly. Here is my configuration.

The system is Mac OS 10.4

1. EMBOSS was installed from fink at /sw/share/EMBOSS

2. All databases were installed in /lab/data/databases/genbank/*.seq

3. Index files are in /lab/data/indices/genbank/??? Here is an
example of one of the index directories from my lab.

xxx at yyy/lab/data/indices/genbank/mam:
acnum.hit     des.trg       keyword.hit   seqvn.hit     taxon.trg
acnum.trg     division.lkp  keyword.trg   seqvn.trg
des.hit       entrynam.idx  mam.dbiflat   taxon.hit

4. Here is an excerpt from my emboss.default file:

# Set location of acd files that describe each program
SET emboss_acdroot /sw/share/EMBOSS/acd


# Set location of Genbank flatfiles in protein
SET  emboss_database_dir /lab/data/databases

# Set location of Genbank flatfiles indices in protein
set emboss_index_dir /lab/data/indices

# Set a log file that user can append their records and EMBOSS
# automatically write log information
SET emboss_logfile /sw/share/EMBOSS/log/log

# Set Paper size of disc page and is required by the 'dbx' indexing
# program and 'method: "emblcd" emboss'
# Recommended value is 2048
SET PAGESIZE 2048

# Set Caches size required for 'dbx' indexing and 'method emboss'.
# It is a page size number to cache. Recommended value is 200
SET CACHESIZE 200

# Set parameter for flat file indices that we have created in
# /lab/data/indices/genbank
.
.
.
.
.
DB gbmam [
# required parameters
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
# optional parameters
    fields: "sv des key org"
    release: "161.0"
    comment: "Genbank database for mam sequences"
    indexdir: "\$emboss_index_dir/genbank/mam"
]

5. I ran this seqret command to test the system, but it throws an
error, as you can see:

xxx at yyy~:seqret gbmam:BC102801
Reads and writes (returns) sequences
Warning: Cannot open division file '<null>' for database 'gbmam'
Warning: seqCdQry failed
Error: Unable to read sequence 'gbmam:BC102801'
Died: seqret terminated: Bad value for '-sequence' and no prompt

6. I also ran the seqret command in debug mode, and this is the log
it produced.

Debug file seqret.dbg buffered:No
ajAcdInitP pgm 'seqret' package ''
ajFileNewIn '/sw/share/EMBOSS/acd/seqret.acd'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/seqret.acd
closing file '/sw/share/EMBOSS/acd/seqret.acd'
ajFileNewIn '/sw/share/EMBOSS/acd/codes.english'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/codes.english
closing file '/sw/share/EMBOSS/acd/codes.english'
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajFileNewIn '/sw/share/EMBOSS/acd/knowntypes.standard'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/knowntypes.standard
closing file '/sw/share/EMBOSS/acd/knowntypes.standard'
Set acdprotein value '$(sequence.protein)'
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed

It seems that EMBOSS cannot find the division file. I still
don't know what the problem is. Do you have any recommendations?

Thank you so much in advance for any help!

Kann


From ajb at ebi.ac.uk  Thu Oct 25 16:22:18 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 25 Oct 2007 21:22:18 +0100 (BST)
Subject: [EMBOSS] Cannot open division file
In-Reply-To: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
References: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
Message-ID: <33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>

Dear Kann,

One major problem is your DB entry:

DB gbmam [
# required parameters
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
# optional parameters
    fields: "sv des key org"
    release: "161.0"
    comment: "Genbank database for mam sequences"
    indexdir: "\$emboss_index_dir/genbank/mam"
]

You should remove the two backslash characters before the '$'
characters. I believe they mistakenly appeared in some documentation
in the past (possibly as a result of some automatic formatting).
It'd be useful if you'd email me off-list and tell me which documentation
contained the error (if my guess is correct).
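
If the same mistake has crept into several DB definitions, a small script can point them out. The following is only a sketch in Python, not part of EMBOSS: the function names are hypothetical, and it assumes the configuration file has been read into a string.

```python
import re

# A backslash directly before a '$', as in "\$emboss_index_dir/genbank/mam".
ESCAPED_VAR = re.compile(r"\\\$")


def find_escaped_variables(config_text):
    """Return (line_number, line) pairs for non-comment lines containing '\\$'."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comment lines are ignored
        if ESCAPED_VAR.search(stripped):
            hits.append((number, stripped))
    return hits


def strip_escapes(config_text):
    """Return the text with every '\\$' replaced by a plain '$'."""
    return ESCAPED_VAR.sub("$", config_text)
```

Run against the DB entry above, find_escaped_variables() would report the dir: and indexdir: lines, and strip_escapes() would turn them into "$emboss_database_dir/genbank" and "$emboss_index_dir/genbank/mam".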


Alan




From kann.vearasilp at mu.edu  Thu Oct 25 18:06:01 2007
From: kann.vearasilp at mu.edu (Kann Vearasilp)
Date: Thu, 25 Oct 2007 17:06:01 -0500
Subject: [EMBOSS] Cannot open division file
In-Reply-To: <33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>
References: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
	<33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>
Message-ID: <FBE50134-81F9-4A02-8655-2A0904A5D3D9@mu.edu>

Hello Alan,

Thank you so much for the fast response! It seems that these backslashes
caused all my problems. Once I removed them, the program worked
flawlessly. :)

Kann

PS. I can find the document and will mail you once I know the version
of this EMBOSS tutorial.



From kertib at linuxlap.hu  Tue Oct 30 06:25:36 2007
From: kertib at linuxlap.hu (kerti Balázs Gábor)
Date: Tue, 30 Oct 2007 11:25:36 +0100
Subject: [EMBOSS] make error
Message-ID: <1193739936.5962.28.camel@genotech>

Hello,

I have a problem building EMBOSS. "configure" ran fine, with no errors
or missing components, but "make" stops with the messages in the
attached make.err file.

How can I solve the problem?

Thank you!

Balazs Kerti
Szent Istvan University,
Institute of Genetics and Biotechnology
HUN-2103 Godollo, Pater Karoly u. 1.
-------------- next part --------------
Making all in plplot
make[1]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot'
Making all in lib
make[2]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot/lib'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot/lib'
make[2]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot'
/bin/bash ../libtool --tag=CC   --mode=compile gcc -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"EMBOSS\" -DVERSION=\"5.0.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DX_DISPLAY_MISSING=1 -DHAVE_DIRENT_H=1 -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DGETPGRP_VOID=1 -DHAVE_STRFTIME=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_VPRINTF=1 -DHAVE_MEMMOVE=1 -DHAVE_LIBM=1 -I.  -I./ -I/usr/include/gd -DPREFIX=\"/usr/local\" -DBUILD_DIR=\".\" -DDRV_DIR=\".\" -DEMBOSS_TOP=\"/usr/src/EMBOSS-5.0.0\"  -DAJ_LinuxLF -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -DLENDIAN -DNO_AUTH  -O2 -MT xwin.lo -MD -MP -MF .deps/xwin.Tpo -c -o xwin.lo xwin.c
 gcc -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"EMBOSS\" -DVERSION=\"5.0.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DX_DISPLAY_MISSING=1 -DHAVE_DIRENT_H=1 -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DGETPGRP_VOID=1 -DHAVE_STRFTIME=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_VPRINTF=1 -DHAVE_MEMMOVE=1 -DHAVE_LIBM=1 -I. -I./ -I/usr/include/gd -DPREFIX=\"/usr/local\" -DBUILD_DIR=\".\" -DDRV_DIR=\".\" -DEMBOSS_TOP=\"/usr/src/EMBOSS-5.0.0\" -DAJ_LinuxLF -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DLENDIAN -DNO_AUTH -O2 -MT xwin.lo -MD -MP -MF .deps/xwin.Tpo -c xwin.c  -fPIC -DPIC -o .libs/xwin.o
make[2]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot'
make[1]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot'

From jerome.laroche at bioinfo.ulaval.ca  Wed Oct 31 16:46:50 2007
From: jerome.laroche at bioinfo.ulaval.ca (Jérôme Laroche)
Date: Wed, 31 Oct 2007 16:46:50 -0400
Subject: [EMBOSS] dbxflat and size of index files
Message-ID: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>

Hello,

I use dbxflat to index the UniProt (sprot and trembl) flat files, whose
sizes are 1.2 G for sprot and 11 G for trembl. The resulting index
files are amazingly huge: 11 G. Is this normal?

Another example with GenBank flat files: the division gbsts has a
size of 3.3 G. Indexing with dbxflat gives 6.8 G of index files, but
indexing with dbiflat gives only 199 M. I know it's not necessary
to index GenBank flat files with dbxflat, because no individual file
is bigger than 300 M. I did this just for the demonstration.

Apart from this, everything is working very well.

Thank you in advance.


Jérôme Laroche

Centre de bioinformatique et de biologie computationnelle
Université Laval


From ajb at ebi.ac.uk  Wed Oct 31 18:07:24 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 31 Oct 2007 22:07:24 -0000 (GMT)
Subject: [EMBOSS] dbxflat and size of index files
In-Reply-To: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>
References: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>
Message-ID: <33217.81.98.241.17.1193868444.squirrel@webmail.ebi.ac.uk>

Hello Jérôme,

Yes, it is normal. It is a combination of three things. First, it is a
tree structure, secondly the tree isn't tightly packed and thirdly
64-bit pointers are used throughout. The first will
allow on-the-fly updating of the index, the second is for speed of
construction/updating and the third is obvious. Another
consideration is that, in some cases, the indexes are trees-of-trees
to allow duplicate codes to be indexed (e.g. keywords).
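
To make the effect of loose packing and 64-bit pointers concrete, here is a rough back-of-envelope sketch in Python. It is not how EMBOSS lays out or sizes its indexes: the function name, the 2048-byte page, the 50% fill factor and the key length are all assumptions chosen for illustration, and only the leaf level of a generic paged tree is counted.

```python
import math


def estimate_index_bytes(n_keys, key_bytes, pointer_bytes=8,
                         page_bytes=2048, fill=0.5):
    """Rough size in bytes of the leaf level of a paged tree index.

    All parameters are illustrative assumptions, not EMBOSS internals.
    """
    entry_bytes = key_bytes + pointer_bytes
    # Only a fraction 'fill' of each page is used, leaving room for updates.
    entries_per_page = int(page_bytes * fill) // entry_bytes
    if entries_per_page < 1:
        raise ValueError("page too small for a single entry")
    pages = math.ceil(n_keys / entries_per_page)
    return pages * page_bytes


# Half-full pages with 8-byte pointers versus full pages with 4-byte pointers.
loose = estimate_index_bytes(1000, 12)
tight = estimate_index_bytes(1000, 12, pointer_bytes=4, fill=1.0)
```

Under these made-up numbers the loosely packed 64-bit variant comes out at more than twice the size of the tightly packed 32-bit one, before any interior tree nodes or secondary trees for duplicate keys are counted.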

Coincidentally I'm on the lookout for new indexing algorithms at the
moment so, if you have a favourite one then we're always open
for suggestions.

Alan




From fernan at iib.unsam.edu.ar  Tue Oct  2 17:54:05 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Tue, 2 Oct 2007 14:54:05 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
Message-ID: <20071002175405.GA62945@iib.unsam.edu.ar>

Hi,

I've installed TrEMBL in EMBOSS and it seems like I'm having some
problems ... 

I've run dbiflat as follows:

dbiflat -dbname trembl -idformat EMBL -directory .
-filenames uniprot_trembl.dat -release '37.0' -date '24/07/07' 
-fields sv,acc,des,key,org

I've put an entry in my emboss.default configuration
file and the db is listed by showdb.

Also the db seems to work fine with, for example,
'textsearch':

[fernan at alfa ~]$ textsearch trembl:* 'cyclase'
Search sequence documentation. Slow, use SRS and Entrez!
Output file [a0b532_mettp.textsearch]: stdout
# Search for: cyclase
trembl-id:A0B532_METTP  A0B532_METTP  A0B532	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A1RWP7_THEPD  A1RWP7_THEPD  A1RWP7	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A2SR85_METLZ  A2SR85_METLZ  A2SR85    Cyclase family protein.
trembl-id:A3H5Q9_9CREN  A3H5Q9_9CREN  A3H5Q9	Magnesium-protoporphyrin IX monomethyl ester (Oxidative) cyclase (EC 1.14.13.81).
trembl-id:A3H7Y6_9CREN  A3H7Y6_9CREN  A3H7Y6	RNA-3'-phosphate cyclase (EC 6.5.1.4).
trembl-id:A6URB1_METVA  A6URB1_METVA  A6URB1    Cyclase family protein.
...

First, I've got a number of warnings when running dbiflat.
Because all of them were about null IDs ('') I've just
ignored them ... I mention it just in case,
Warning: Duplicate ID skipped: '' All hits will point to first ID found

Now, when using seqret, it seems like I'm not getting the
records I expect, for example if I search for the first ID
in the example above (A0B532), I get A0BDZ0 instead:

[fernan at alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences
output sequence(s) [a0bdz0_parte.fasta]: stdout
>A0BDZ0_PARTE A0BDZ0 Chromosome undetermined scaffold_101, whole genome shotgun sequence.
MLNFPQNARDHFSCDCDPCEFAITHGEEIMPKRVPPQKPIQQIQDKDLGLLLRKLQAPNK
LTRSVRIRIPETCVCNEGEIKFIAYYDESEGFIKFIQKPTFQQTKQFLNERRPPDSLAVI
IKYIDNNMQVMTDMEFTILMMKRKIDPIWSQILYIQNFNSNKNYELQHYEFKHSFDSKYP
EFDLARIEILILNGEIARASSDFVPMVREEAYENSLSQDQYCRYMVYKMVHYADVFGGIQ
ITEGKFSFHKKTFISMEKMEYTDLDRKALFDSEILLRKKKMIDEDMFQFQKLIDQNVKKE
REYALKVYREILDMDNGLDQQSHLLKNKLSVIGYDLKKYSQSIQSNFQQVMVSKDPASTL
KELVIEQKVNEEKLTSILKPKKGEKTKKKM

But if I search for A0B532_METTP I get nothing:
[fernan at alfa ~]$ seqret trembl:A0B532_METTP
Reads and writes (returns) sequences
Error: Unable to read sequence 'trembl:A0B532_METTP'
Died: seqret terminated: Bad value for '-sequence' and no prompt


Now, if I search for A0BDZ0, I get A0BL81 instead:

[fernan at alfa ~]$ seqret trembl:A0BDZ0
Reads and writes (returns) sequences
output sequence(s) [a0bl81_parte.fasta]: stdout
>A0BL81_PARTE A0BL81 Chromosome undetermined scaffold_113, whole genome shotgun sequence.
MKQISESAHILQKVYNPNRMNKLFMTTHYQLQNETDLIFDKYMLMPLFGLSVANGISSNC
IKPKYLCSEYKKQELYDCNLILILSAYSDQAVYRSKTMYEKRNGLEQIFKYLASPNYTYN
IHISLLSYFVPQRVFYKQVLQALNIFELIDQKQIEELTKSSSIINQSVGEDNLDSILFKN
QEFIDYQKWRRMLKNNTIINLKTLHQHQLSQQIFCQYFLRYHYYQGCEEEINKLNKFLVD
DFDMFKFRSRLEHNEKKMKFYFLRMLKYFKLNEKLEIFLKFSFKSYSLDWNKELLREMKN
SLNQYKKQ

Any idea about what is wrong? I also have swissprot
installed (pretty much in the same way) and it works OK with
seqret, both using ACs (Q4U9M9) or IDs (104K_THEAN).

This is on a Linux cluster (Rocks 4.2, with EMBOSS installed from the
Bio roll)

[fernan at alfa ~]$ embossversion 
Writes the current EMBOSS version number
4.0.0

Thanks in advance for any pointer,

Fernan


From simon.andrews at bbsrc.ac.uk  Wed Oct  3 07:37:53 2007
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Wed, 3 Oct 2007 08:37:53 +0100
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071002175405.GA62945@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
Message-ID: <CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>


On 2 Oct 2007, at 18:54, Fernan Aguero wrote:

> Hi,
>
> I've installed TrEMBL in EMBOSS and it seems like I'm having some
> problems ...
>
> I've run dbiflat as follows:
[snip]
>
> Now, when using seqret, it seems like I'm not getting the
> records I expect, for example if I search for the first ID
> in the example above (A0B532), I get A0BDZ0 instead:

I suspect your problem is that your trembl file is >2Gb in size.   
Above this size dbiflat won't work properly and will give wacky  
results such as the ones you've shown.  This won't be a problem with  
uniprot_sprot.dat as this is still only about 1.1Gb.

Your choices are therefore:

1) You could split your trembl file into multiple files, each smaller  
than 2Gb.  This ends up being a complete pain, and you probably don't  
want to do it this way.

2) Use the newer dbx* family of indexing programs which can cope with  
larger file sizes.  In your case you'd use dbxflat instead of  
dbiflat.  There are some configuration differences between the two so  
you should read 'tfm dbxflat' first, but they work pretty much the  
same as the old versions.  We use the dbx programs for all of our  
databases and they work fine.
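
If you do go the splitting route (option 1), the cut points have to fall
on entry boundaries. A minimal Python sketch of that, not part of EMBOSS:
it assumes entries end with a line starting with '//' (as in EMBL and
Swiss-Prot style flat files) and that 2**31 - 1 bytes per part is a safe
limit for dbiflat. The function name split_flatfile and the partNNN.dat
naming are made up for illustration.

```python
from pathlib import Path


def split_flatfile(src, out_dir, max_bytes=2**31 - 1, prefix="part"):
    """Split a flat file into parts of at most max_bytes without cutting entries.

    An entry is taken to end at a line starting with '//'. A single entry
    larger than max_bytes still goes into a part of its own.
    Returns the list of part paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts, out, written = [], None, 0
    entry, entry_size = [], 0

    def write_entry():
        nonlocal out, written, entry, entry_size
        # Start a new part if there is none yet or this entry would overflow it.
        if out is None or (written and written + entry_size > max_bytes):
            if out is not None:
                out.close()
            parts.append(out_dir / f"{prefix}{len(parts) + 1:03d}.dat")
            out = open(parts[-1], "wb")
            written = 0
        out.writelines(entry)
        written += entry_size
        entry, entry_size = [], 0

    with open(src, "rb") as fh:  # read bytes, so the sizes are exact
        for line in fh:
            entry.append(line)
            entry_size += len(line)
            if line.startswith(b"//"):
                write_entry()
    if entry:  # trailing lines without a '//' terminator
        write_entry()
    if out is not None:
        out.close()
    return parts
```

Only one entry is held in memory at a time, so this should cope with an
11G input. The parts could then be indexed together by giving dbiflat a
wildcard such as 'part*.dat' for -filenames.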

Hope this helps

Simon.


From gbottu at vub.ac.be  Thu Oct  4 10:01:45 2007
From: gbottu at vub.ac.be (Guy Bottu)
Date: Thu, 04 Oct 2007 12:01:45 +0200
Subject: [EMBOSS] Question about acidify
Message-ID: <4704BA09.1000905@vub.ac.be>

	Dear Peter, dear Alan, dear all,

I remember that there had been talk of implementing a tool called 
acidify that would allow for the easy integration of software under 
EMBOSS (with the help of an ACD file but without an elaborate EMBOSS 
"wrapper" program). Can someone tell me how far this has gone? I ask 
because my colleagues of the SIMDAT project have expressed 
their interest.

	Guy Bottu,
	BEN


From pmr at ebi.ac.uk  Thu Oct  4 10:40:48 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 04 Oct 2007 11:40:48 +0100
Subject: [EMBOSS] Question about acidify
In-Reply-To: <4704BA09.1000905@vub.ac.be>
References: <4704BA09.1000905@vub.ac.be>
Message-ID: <4704C330.6070102@ebi.ac.uk>

Guy Bottu wrote:
> I remember that there had been question of implementing a tool called 
> acidify that would allow for the easy integration of software under 
> EMBOSS (with the help of an ACD file but without elaborate EMBOSS 
> "wrapper" program). Can someone tell me how far this has gone? I ask this 
> question because my colleagues of the SIMDAT project have expressed 
> their interest.

We are working on making this easier in ACD. I added some functions when Alan 
was writing wrappers for MIRA.

We already have ACD extensions for SoapLab to provide additional definitions for 
external applications. These are used to generate the XML definitions used by 
SoapLab for non-EMBOSS applications, but can be generally useful.

Do you have examples of the ACD files that would be useful for SIMDAT? Are any 
new datatypes involved?

regards,

Peter


From fernan at iib.unsam.edu.ar  Thu Oct  4 14:08:22 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Thu, 4 Oct 2007 11:08:22 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
Message-ID: <20071004140822.GA96432@iib.unsam.edu.ar>


| On 2 Oct 2007, at 18:54, Fernan Aguero wrote:
| 
| > Hi,
| >
| > I've installed TrEMBL in EMBOSS and it seems like I'm having some
| > problems ...
| >
| > I've run dbiflat as follows:
| [snip]
| >
| > Now, when using seqret, it seems like I'm not getting the
| > records I expect, for example if I search for the first ID
| > in the example above (A0B532), I get A0BDZ0 instead:
| 
| I suspect your problem is that your trembl file is >2Gb in size.   
| Above this size dbiflat won't work properly and will give wacky  
| results such as the ones you've shown.  This won't be a problem with  
| uniprot_sprot.dat as this is still only about 1.1Gb.
| 
| Your choices are therefore:
| 
| 1) You could split your trembl file into multiple files, each smaller  
| than 2Gb.  This ends up being a complete pain, and you probably don't  
| want to do it this way.
| 
| 2) Use the newer dbx* family of indexing programs which can cope with  
| larger file sizes.  In your case you'd use dbxflat instead of  
| dbiflat.  There are some configuration differences between the two so  
| you should read 'tfm dbxflat' first, but they work pretty much the  
| same as the old versions.  We use the dbx programs for all of our  
| databases and they work fine.
| 
| Hope this helps
| 
| Simon.
 
Simon,

thanks for your suggestions. I've been waiting for dbxflat
to finish before replying ... thus the delay.

You mention that there are some configuration
differences between db(x|i)flat ... I guess I've run into those
now ... even after reading tfm for dbxflat, it seems I just can't
set it up right:

===> Configuration
DB trembl [
        type: P
        comment: "TrEMBL 37.0"
        method: emblcd
        format: embl
        dbalias: trembl
        dir: /share/bio/emboss/trembl/
        file: uniprot_trembl.dat
        indexdirectory: /share/bio/emboss/trembl
]

With this configuration, I get this error:
[fernan at alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences
Warning: Cannot open division file '<null>' for database 'trembl'
Warning: seqCdQry failed
Error: Unable to read sequence 'trembl:A0B532'
Died: seqret terminated: Bad value for '-sequence' and no prompt

If I change the 'method' to 'method: emboss'
as per the example in the dbxflat docs, I get this error:

[fernan at alfa ~]$ seqret trembl:A0B532
Reads and writes (returns) sequences

   EMBOSS An error in ajindex.c at line 3028:
Cannot open param file /share/bio/emboss/trembl/trembl.pxid

This file does not exist (see result of indexing below):

===> Indexing
[root at alfa trembl]# dbxflat -dbname trembl -idformat EMBL
-directory . -filenames uniprot_trembl.dat -release "37.0"
-date "24/07/07" -fields sv,acc,des,key,org
Database b+tree indexing for flat file databases
Resource name: embl
Processing file ./uniprot_trembl.dat
[root at alfa trembl]# du -hc *
4.0K    dbxflat.command
4.0K    trembl.ent
4.0K    trembl.pxac
4.0K    trembl.pxde
4.0K    trembl.pxkw
4.0K    trembl.pxsv
4.0K    trembl.pxtx
572M    trembl.xac
4.2G    trembl.xde
381M    trembl.xkw
4.0K    trembl.xsv
3.0G    trembl.xtx
11G     uniprot_trembl.dat
19G     total

I've also tried other combinations of 'method' (emboss,
emblcd) and 'format' (swiss, embl) without success ...

Am I indexing the db with the right incantation for dbxflat?
If so, what am I missing in my configuration?

Thanks again for any pointer,

Fernan

PS: this is on emboss-4.0.0 running on a Rocks Cluster (4.2,
CentOS)


From georgios at biotek.uio.no  Thu Oct  4 14:53:38 2007
From: georgios at biotek.uio.no (George Magklaras)
Date: Thu, 04 Oct 2007 16:53:38 +0200
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071004140822.GA96432@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>	<CB748C86-38DF-4402-B677-79C174B734C9@bbsrc.ac.uk>
	<20071004140822.GA96432@iib.unsam.edu.ar>
Message-ID: <4704FE72.1090206@biotek.uio.no>

Maybe you are missing the resource record in the emboss.default file for 
the trembl databank and you have passed the wrong arguments to dbxflat. 
  You should choose the emboss method in the DB entry. Then, the 
emboss.default file should contain also a resource entry for trembl:

RES trembl [
    type: Index
    idlen:  15
    acclen: 15
    svlen:  20
    keylen: 30
    deslen: 25
    orglen: 25
]

From the dbxflat output you quote I can see that the command points to 
the embl resource:

[root at alfa trembl]# dbxflat -dbname trembl -idformat EMBL <--- Why EMBL?
-directory . -filenames uniprot_trembl.dat -release "37.0"
-date "24/07/07" -fields sv,acc,des,key,org
Database b+tree indexing for flat file databases
Resource name: embl  <--- That should say trembl, Why did you choose 
embl here?


When the dbxflat command asked you for a resource name, you really 
should have a trembl RES entry and I am not sure that your idformat 
(EMBL) is correct.


GM


-- 
--
George Magklaras

Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/

EMBnet Norway:	http://www.no.embnet.org/


From fernan at iib.unsam.edu.ar  Thu Oct  4 22:41:44 2007
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Thu, 4 Oct 2007 19:41:44 -0300
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <4704FE72.1090206@biotek.uio.no>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<4704FE72.1090206@biotek.uio.no>
Message-ID: <20071004224144.GA98760@iib.unsam.edu.ar>

George, 

thanks for your points.

| Maybe you are missing the resource record in the emboss.default file for 
| the trembl databank and you have passed the wrong arguments to dbxflat. 

I have this resource record in my emboss.default conf

RES embl [ type: Index
  idlen:  15
  acclen: 15
  svlen:  15
  keylen: 25
  deslen: 25
  orglen: 25
]

|   You should choose the emboss method in the DB entry. 

OK

| Then, the 
| emboss.default file should contain also a resource entry for trembl:
| 
| RES trembl [
|     type: Index
|     idlen:  15
|     acclen: 15
|     svlen:  20
|     keylen: 30
|     deslen: 25
|     orglen: 25
| ]

Does the name of the resource matter? Mine is named 'embl' ...

|  From your dbxflat output you quote I can see that the command points to 
| the embl resource:
| 
| [root at alfa trembl]# dbxflat -dbname trembl -idformat EMBL <--- Why EMBL?

What other options are there? SWISS? GCG? GENBANK? This is AFAIK an
EMBL formatted file. But maybe I'm wrong ...

| -directory . -filenames uniprot_trembl.dat -release "37.0"
| -date "24/07/07" -fields sv,acc,des,key,org
| Database b+tree indexing for flat file databases
| Resource name: embl  <--- That should say trembl, Why did you choose 
| embl here?

Because the resource in my emboss.default file is named 'embl'.

| 
| When the dbxflat command asked you for a resource name, you really 
| should have a trembl RES entry and I am not sure that your idformat 
| (EMBL) is correct.
|
| GM
| --
| George Magklaras

Mmm ... maybe it's SWISS then?

From the dbxflat docs:
      EMBL : EMBL
     SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
        GB : Genbank, DDBJ
    REFSEQ : Refseq
Entry format [SWISS]: 

Thanks for your questions and pointers. I'm running dbxflat
overnight again to see if this makes any difference
(-idformat SWISS -resource trembl, with a new trembl RES
line added to emboss.default). But so far, only 6 trembl.*
files are being produced and none of them is called
trembl.pxid (as per the error in my original message, see
below).

[root at alfa trembl]# ls trembl.*
trembl.ent  trembl.xac  trembl.xde  trembl.xkw  trembl.xsv trembl.xtx
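
The same check can be scripted. A rough Python sketch, under stated
assumptions: the suffix 'id' is taken from the seqret error message
(trembl.pxid) and the others from the listings in this thread; that every
field gets both an x* and a px* file is a guess based on the earlier du
output, not something taken from the dbxflat docs. The name
missing_index_files is made up.

```python
from pathlib import Path

# Suffixes seen in this thread; the x*/px* pairing is an assumption.
SUFFIXES = ("id", "ac", "sv", "de", "kw", "tx")


def missing_index_files(index_dir, dbname, suffixes=SUFFIXES):
    """Return the expected index file names that are absent from index_dir."""
    index_dir = Path(index_dir)
    expected = [f"{dbname}.ent"]
    for suffix in suffixes:
        expected.append(f"{dbname}.x{suffix}")
        expected.append(f"{dbname}.px{suffix}")
    return [name for name in expected if not (index_dir / name).is_file()]
```

Called as missing_index_files("/share/bio/emboss/trembl", "trembl"), it
would list trembl.pxid among the missing files for the directory shown
above.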

Fernan

PS: this is the first entry in my uniprot_trembl.dat file

[fernan at alfa trembl]$ head -45 uniprot_trembl.dat 
ID   A0B532_METTP            Unreviewed;       337 AA.
AC   A0B532;
DT   28-NOV-2006, integrated into UniProtKB/TrEMBL.
DT   28-NOV-2006, sequence version 1.
DT   24-JUL-2007, entry version 6.
DE   RNA-3'-phosphate cyclase (EC 6.5.1.4).
GN   OrderedLocusNames=Mthe_0003;
OS   Methanosaeta thermophila (strain DSM 6194 / PT) (Methanothrix
OS   thermophila (strain DSM 6194 / PT)).
OC   Archaea; Euryarchaeota; Methanomicrobia; Methanosarcinales;
OC   Methanosaetaceae; Methanosaeta.
OX   NCBI_TaxID=349307;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RG   US DOE Joint Genome Institute;
RA   Copeland A., Lucas S., Lapidus A., Barry K., Detter J.C.,
RA   Glavina del Rio T., Hammon N., Israni S., Pitluck S., Chain P.,
RA   Malfatti S., Shin M., Vergez L., Schmutz J., Larimer F., Land M.,
RA   Hauser L., Kyrpides N., Kim E., Smith K.S., Ingram-Smith C.,
RA   Richardson P.;
RT   "Complete sequence of Methanosaeta thermophila PT.";
RL   Submitted (OCT-2006) to the EMBL/GenBank/DDBJ databases.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; CP000477; ABK13806.1; -; Genomic_DNA.
DR   GenomeReviews; CP000477_GR; Mthe_0003.
DR   GO; GO:0003963; F:RNA-3'-phosphate cyclase activity; IEA:InterPro.
DR   InterPro; IPR000228; RNA3'_term_phos_cycl.
DR   InterPro; IPR013796; RNA3'_term_phos_cycl_insert.
DR   PANTHER; PTHR11096; RNA3'_term_phos_cycl; 1.
DR   Pfam; PF01137; RTC; 1.
DR   Pfam; PF05189; RTC_insert; 1.
DR   PROSITE; PS01287; RTC; 1.
PE   4: Predicted;
KW   Complete proteome; Ligase.
SQ   SEQUENCE   337 AA;  36340 MW;  69F26755A1B8DA03 CRC64;
     MNKPQMIEID GSYGEGGGQI VRTSVALSTL TGIPVRIKNI RRNRPRPGLA AQHVRAIEAL
     AQISRAETRG VHLGSEEIEF IPGRISAGSY DVDIGTAGSV TLLIQCLLPA LTAAEGPVTV
     TVRGGTDVRW SPTVDYLEHV ALPAMHLFGV TATFRCERRG YYPRGGGVVV LSTRPSRLRP
     ARLELIEEGI CGISHCGSLP EHVARRQADA ALELLKEKGY DARIDIQTMS SSSPGSGITL
     WSGFRGSSAL GERGVRAEDV GREAAKALID ELKSKASVDV HLADQLIPYI ALAGGEYTTR
     EISSHTRTNI WTAQRILRCR IDIDEGEVFR IHSTGSG
//


From sum732 at mail.usask.ca  Fri Oct  5 23:38:01 2007
From: sum732 at mail.usask.ca (Sudeep Mehrotra)
Date: Fri, 05 Oct 2007 17:38:01 -0600
Subject: [EMBOSS] Seqret and searching a database with entries in a file
Message-ID: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>

Hello,
I am wondering if I can use "seqret" from EMBOSS to perform the  
following action.

I have a database and a file which consists of a list of protein  
IDs. I want to use seqret to look up each entry (in the given file) in  
the given database and write the results to another file.
For example:
seqret "path to the database":AAT37944.1.
If I use the above command on the command line, I get the  
output (protein name, protein sequence etc.) in fasta format  
for that entry. What I want to do is, instead of giving one  
entry, give the whole file, which consists of similar entries.

Can someone help me here?
Thanks
Sudeep


From david.bauer at bayerhealthcare.com  Sat Oct  6 19:13:34 2007
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Sat, 6 Oct 2007 21:13:34 +0200
Subject: [EMBOSS] Seqret and searching a database with entries in a file
In-Reply-To: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
Message-ID: <OFDF8329EB.08CB3542-ONC125736C.0068FB65-C125736C.00699D5A@schering.de>

Hi Sudeep,

if you add a "@" character in front of a filename, EMBOSS interprets this 
as a "file of filenames".
So you can put all your IDs including the database name into a file (e.g. 
myseqs.fof).
Then you run "seqret @myseqs.fof".

Cheers,
David.


From gbottu at vub.ac.be  Mon Oct  8 07:12:18 2007
From: gbottu at vub.ac.be (Guy Bottu)
Date: Mon, 08 Oct 2007 09:12:18 +0200
Subject: [EMBOSS] Seqret and searching a database with entries in a file
In-Reply-To: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
References: <986A5EE0-8709-4657-B7CB-84A43513D308@mail.usask.ca>
Message-ID: <4709D852.20007@vub.ac.be>

Sudeep Mehrotra wrote:
> I have a database and I have a file which consists of list of protein  
> IDs. I want use seqret to search each entry (in the given file) in  
> the given database and output the search into another file.

	Dear Sudeep,

If you can, using some script, transform your file into the format:

xxx:AC3355
xxx:CG6754
xxx:AV6754

with xxx the name of the databank (you might have to use bare accession 
numbers rather than version numbers), then it is easy, just run

seqret list::File

If you want the original entries rather than the entries in fastA 
format, use entret instead of seqret.
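
The transformation described above can be sketched in a few lines of
Python. This is only an illustration: the function name make_list_lines
is made up, 'xxx' stands for the databank name as above, and cutting at
the first '.' assumes that the IDs carry a version suffix such as
AAT37944.1.

```python
def make_list_lines(ids, dbname, strip_version=True):
    """Turn raw IDs into 'dbname:ID' lines for an EMBOSS list file."""
    lines = []
    for raw in ids:
        ident = raw.strip()
        if not ident or ident.startswith("#"):
            continue  # skip blank lines and comments
        if strip_version:
            # AAT37944.1 -> AAT37944 (bare accession, no version number)
            ident = ident.split(".", 1)[0]
        lines.append(f"{dbname}:{ident}")
    return lines
```

Writing the result to a file, one line per entry, gives something that
can be passed to seqret as list::File.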

	Guy Bottu,
	Belgian EMBnet Node


From charles-listes-emboss at plessy.org  Mon Oct  8 06:30:50 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Mon, 8 Oct 2007 15:30:50 +0900
Subject: [EMBOSS] About the EMBOSS quick guide.
Message-ID: <20071008063047.GB9819@kunpuu.plessy.org>

Dear EMBOSS developers,

I am a member of a packaging team that takes care of integrating EMBOSS in
Debian. I just realised today that the Quick Guide to EMBOSS is
released under a "noncommercial" licence.

file:///usr/share/EMBOSS/doc/manuals/emboss_qg.pdf

Debian puts a strong emphasis on not mixing programs which do not meet
the "Debian Free Software Guidelines" (DFSG) with the ones which do. In
our case, EMBOSS is free according to the DFSG, but not the Quick Guide,
as restrictions on commercial use do not comply with the guideline
number 6:

  No Discrimination Against Fields of Endeavor

  The license must not restrict anyone from making use of the program in a
  specific field of endeavor. For example, it may not restrict the program
  from being used in a business, or from being used for genetic research.

From my packager's point of view, the simplest way to solve this problem
would be that you relicence the Quick Guide under a free licence
according to the DFSG, such as BSD or GPL for instance. Unfortunately,
the guide's author, David Martin, left EMBnet and I do not know how to
contact him.

Importantly, the DFSG also require the sources of works distributed in
Debian to be available. If it is possible to relicence the Quick Guide,
could somebody send me its sources? Debian integrates a bug reporting
and tracking system, and having the sources available in Debian could
bring opportunities to receive patches.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team.
http://www.debian.org/devel/debian-med
Wako, Saitama, Japan


From georgios at biotek.uio.no  Mon Oct  8 08:59:56 2007
From: georgios at biotek.uio.no (George Magklaras)
Date: Mon, 08 Oct 2007 10:59:56 +0200
Subject: [EMBOSS] problems installing/using TrEMBL
In-Reply-To: <20071004224144.GA98760@iib.unsam.edu.ar>
References: <20071002175405.GA62945@iib.unsam.edu.ar>
	<4704FE72.1090206@biotek.uio.no>
	<20071004224144.GA98760@iib.unsam.edu.ar>
Message-ID: <4709F18C.2070304@biotek.uio.no>

Hi Fernan,

Fernan Aguero wrote:
> George, 

> 
> Does the name of the resource matter? Mine is named 'embl' ...
> 
If you plan to have the same values for all databases, no. But I tend to 
choose different length values for different databanks, so in that case, 
I have a different RES entry for each databank.


> What other options are there SWISS? GCG? GENBANK? This is AFAIK an
> EMBL formatted file. But maybe I'm wrong ...
>
I believe that TrEMBL should be formatted with the SWISS entry format in 
dbxflat (-idformat SWISS).


-- 
--
George Magklaras

Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/

EMBnet Norway:	http://www.no.embnet.org/


From charles-listes-emboss at plessy.org  Mon Oct  8 23:38:28 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Tue, 9 Oct 2007 08:38:28 +0900
Subject: [EMBOSS] Bug in degapseq ?
Message-ID: <20071008233828.GA32069@kunpuu.plessy.org>

Dear developers,

If I use degapseq on a file, it prompts me for the name of the output, but if
the data comes from stdin, degapseq crashes. It does not happen if the name of
the output is given.

chouca?~?$ cat toto
>Xenopus-1a
-----MVLLKCEYRDEEEDLTS---ASPCSV--TSSFRSPAT----QTCSSDDEQLLSPT
SP--------------GQHQGEE---NS----------------------------PRCR
RSRGRA-QGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGD-----P-VHRS--AS-----TPAAAI---LV---QDSSS
SQSP-----SWS--CSSSPSS-----S-------CCSFS--PASP----ASST--SDSIE
SWQ---PSELHLNPFMSASSA---FI----
>Xenopus-1b
-----MVLLKCEYRDEVSELTS---VSPCSVSSSSSHPSPAM----QTCSSDDEQLHSPT
SPTL-------THLQQGRDQGEE---NS----------------------------PRCR
RSRAR------GDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAKLTKI
ETLRFAHNYIWALSETLRLAD-----Q-LHGS--TS-----TPAAAI---LV---QDSYP
SLSP-----SWS--CSSSPSS----NS-------CDSFS--PTSP----ASST--SDSIE
YWQ---PSELRLNPFMSAL-----------
>Gallus-2
------MPVKAESPAPAAEDE--L-LLLRLASPAPSASLP-------SSAGEEDEDEEDG
RP-------------RRLQEGA----------------------------------RRAG
RQRGPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKI
ETLRFAHNYIWALTETLRL----AGAARLGGA--AD-A---APGAA-----A---EG-SP
SPAS-----SWS--GGASPAP-----SA---SPYACTLS--PGSP----AGSA--SD-AE
HW---PPPRGRFAPPPPPHR----CL----

chouca?~?$ cat toto | degapseq stdin
Removes gap characters from sequences
output sequence(s) [xenopus-1a.fasta]: 
   EMBOSS An error in ajmess.c at line 1662:
END-OF-FILE reading from user

chouca?~?$ cat toto | degapseq stdin stdout
Removes gap characters from sequences
>Xenopus-1a
MVLLKCEYRDEEEDLTSASPCSVTSSFRSPATQTCSSDDEQLLSPTSPGQHQGEENSPRC
RRSRGRAQGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGDPVHRSASTPAAAILVQDSSSSQSPSWSCSSSPSSSCCSF
SPASPASSTSDSIESWQPSELHLNPFMSASSAFI
>Xenopus-1b
MVLLKCEYRDEVSELTSVSPCSVSSSSSHPSPAMQTCSSDDEQLHSPTSPTLTHLQQGRD
QGEENSPRCRRSRARGDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAK
LTKIETLRFAHNYIWALSETLRLADQLHGSTSTPAAAILVQDSYPSLSPSWSCSSSPSSN
SCDSFSPTSPASSTSDSIEYWQPSELRLNPFMSAL
>Gallus-2
MPVKAESPAPAAEDELLLLRLASPAPSASLPSSAGEEDEDEEDGRPRRLQEGARRAGRQR
GPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKIETL
RFAHNYIWALTETLRLAGAARLGGAADAAPGAAAEGSPSPASSWSGGASPAPSASPYACT
LSPGSPAGSASDAEHWPPPRGRFAPPPPPHRCL

chouca?~?$ degapseq toto
Removes gap characters from sequences
output sequence(s) [xenopus-1a.fasta]: stdout
>Xenopus-1a
MVLLKCEYRDEEEDLTSASPCSVTSSFRSPATQTCSSDDEQLLSPTSPGQHQGEENSPRC
RRSRGRAQGKSGETVLKIKKTRRVKANNRERNRMHNLNSALDSLREVLPSLPEDAKLTKI
ETLRFAYNYIWALSETLRLGDPVHRSASTPAAAILVQDSSSSQSPSWSCSSSPSSSCCSF
SPASPASSTSDSIESWQPSELHLNPFMSASSAFI
>Xenopus-1b
MVLLKCEYRDEVSELTSVSPCSVSSSSSHPSPAMQTCSSDDEQLHSPTSPTLTHLQQGRD
QGEENSPRCRRSRARGDTVLKIKKTRRVKANNRERNRMHHLNYALDSLREVLPSLPEDAK
LTKIETLRFAHNYIWALSETLRLADQLHGSTSTPAAAILVQDSYPSLSPSWSCSSSPSSN
SCDSFSPTSPASSTSDSIEYWQPSELRLNPFMSAL
>Gallus-2
MPVKAESPAPAAEDELLLLRLASPAPSASLPSSAGEEDEDEEDGRPRRLQEGARRAGRQR
GPPRAARTAETAQRIKRSRRLKANNRERNRMHNLNAALDALRDVLPTFPEDAKLTKIETL
RFAHNYIWALTETLRLAGAARLGGAADAAPGAAAEGSPSPASSWSGGASPAPSASPYACT
LSPGSPAGSASDAEHWPPPRGRFAPPPPPHRCL

Have a nice day,

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan


From david at compbio.dundee.ac.uk  Tue Oct  9 15:56:57 2007
From: david at compbio.dundee.ac.uk (David Martin)
Date: Tue, 09 Oct 2007 16:56:57 +0100
Subject: [EMBOSS] Updating the Quick Guide
Message-ID: <C3316359.2C38C%david@compbio.dundee.ac.uk>

Prompted by Charles' request yesterday, I am in the process of updating the
EMBOSS quick guide. It was last touched about 8 years ago, so comments and
suggestions on what is new and what should be dropped would be much
appreciated.

..d


From andrespinzon at gmail.com  Tue Oct  9 16:32:09 2007
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Tue, 9 Oct 2007 11:32:09 -0500
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <C3316359.2C38C%david@compbio.dundee.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
Message-ID: <8968fc7e0710090932g63b77a9k7d83bea25c176349@mail.gmail.com>

David,
I am in the process of writing an EMBOSS book called "Análisis de
secuencias usando EMBOSS" ("Molecular sequence analysis using
EMBOSS" in English). It will be released under a CC license (and of
course Open Source), so maybe some of the book content can be used.
If you need help updating the "old" quick guide, please let me
know; I'll be more than glad to help.

Regards,

On 10/9/07, David Martin <david at compbio.dundee.ac.uk> wrote:
> Prompted by charles' request yesterday I am in the process of updating the
> EMBOSS quick guide. it was last touched about 8 years ago so comments and
> suggestions on what is new, and what should be dropped would be much
> appreciated.
>
> ..d


-- 
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768


From michael.watson at bbsrc.ac.uk  Wed Oct 10 13:02:49 2007
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Wed, 10 Oct 2007 14:02:49 +0100
Subject: [EMBOSS] XFree86 vs xorg
Message-ID: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>

Hi

My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
see this is part of XFree86-devel.  However, as a Red Hat Enterprise
Linux 4 user, my X Windows seems to be the X.Org branch rather than
XFree86....

So, is there a workaround, or should I overwrite my xorg libraries with
XFree86 ones?

Thanks
Mick

The information contained in this message may be confidential or legally
privileged and is intended solely for the addressee. If you have
received this message in error please delete it & notify the originator
immediately.
Unauthorised use, disclosure, copying or alteration of this message is
forbidden & may be unlawful. 
The contents of this e-mail are the views of the sender and do not
necessarily represent the views of the Institute. 
This email and associated attachments has been checked locally for
viruses but we can accept no responsibility once it has left our
systems.
Communications on Institute computers are monitored to secure the
effective operation of the systems and for other lawful purposes. 


From dalloliogm at gmail.com  Wed Oct 10 13:23:01 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Wed, 10 Oct 2007 15:23:01 +0200
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <C3316359.2C38C%david@compbio.dundee.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
Message-ID: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>

You should update the guide on how to install EMBOSS.

In particular, explain how to use the .deb and .rpm packages, since a
lot of people still try to install EMBOSS by compiling it, and it is a
pain.


2007/10/9, David Martin <david at compbio.dundee.ac.uk>:
> Prompted by charles' request yesterday I am in the process of updating the
> EMBOSS quick guide. it was last touched about 8 years ago so comments and
> suggestions on what is new, and what should be dropped would be much
> appreciated.
>
> ..d


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com


From ajb at ebi.ac.uk  Wed Oct 10 13:30:09 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 10 Oct 2007 14:30:09 +0100 (BST)
Subject: [EMBOSS] XFree86 vs xorg
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <50101.81.98.241.17.1192023009.squirrel@webmail.ebi.ac.uk>

Hello Mick,

For xorg all you need to do is to install the xorg-x11-proto-devel RPM
and then, in EMBOSS-5.0.0, do a 'make clean' and configure again.

You might want to install the gd-devel RPM at the same time (to get PNG
support). If you install them both using 'yum' then all the dependencies
will be pulled in.
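
A minimal sketch of the steps above, for anyone following along. The
package names and the 'make clean' / configure steps are the ones given
in this message; the header_present helper and the two include
directories it searches are assumptions for illustration and are not
part of EMBOSS.

```shell
#!/bin/sh
# Hypothetical helper: print the first location of X11/Xlib.h found
# under the directories given as arguments.
header_present() {
    for dir in "$@"; do
        if [ -f "$dir/X11/Xlib.h" ]; then
            echo "$dir/X11/Xlib.h"
            return 0
        fi
    done
    return 1
}

# Assumed include locations; adjust for your system.
if header_present /usr/include /usr/X11R6/include; then
    echo "Xlib.h found: run 'make clean' and ./configure again in EMBOSS-5.0.0"
else
    echo "Xlib.h missing: as root, try 'yum install xorg-x11-proto-devel gd-devel'"
fi
```

Checking for the header first simply avoids another failed configure
run; it does not replace the install step itself.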

HTH

Alan


> Hi
>
> My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
> see this is part of XFree86-devel.  However, as a red hat enterprise
> linux 4 user, my X windows seems to be the x.org branch rather than
> XFree86....
>
> So, is there a workaround, or should I overwrite my xorg libraries with
> XFree86 ones?
>
> Thanks
> Mick


From ajb at ebi.ac.uk  Wed Oct 10 13:39:26 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 10 Oct 2007 14:39:26 +0100 (BST)
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
Message-ID: <42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>

> You should update the guide on how to install emboss.
>
> In particular, explain how to use the .deb and .rpm packages, since a
> lot of people still try to install emboss by compiling it, and it is a
> pain.

I'll leave that up to David to decide but the information is in the new
FAQ which, yesterday, I submitted to my colleagues for approval and
will then appear in CVS and later online. There was already some RPM
info around but no .deb stuff. The info will also be in the books
which Jon mentioned recently.

Alan


From charles-listes-emboss at plessy.org  Wed Oct 10 13:24:08 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 10 Oct 2007 22:24:08 +0900
Subject: [EMBOSS] XFree86 vs xorg
In-Reply-To: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
References: <8975119BCD0AC5419D61A9CF1A923E9505A4F2EA@iahce2ksrv1.iah.bbsrc.ac.uk>
Message-ID: <20071010132408.GJ990@kunpuu.plessy.org>

On Wed, Oct 10, 2007 at 02:02:49PM +0100, michael watson (IAH-C) wrote:
> Hi
> 
> My EMBOSS 5.0 install failed as it couldn't find Xlib.h.  On googling, I
> see this is part of XFree86-devel.  However, as a red hat enterprise
> linux 4 user, my X windows seems to be the x.org branch rather than
> XFree86....

In Xorg, the libraries have been split into individual packages. I
think that you can find Xlib.h in a package named libx11-devel, or
something like that.

Have a nice day,

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan


From dalloliogm at gmail.com  Wed Oct 10 14:06:16 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Wed, 10 Oct 2007 16:06:16 +0200
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
Message-ID: <5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>

2007/10/10, ajb at ebi.ac.uk <ajb at ebi.ac.uk>:
> There was already some RPM
> info around but no .deb stuff. The info will also be in the books
> which Jon mentioned recently.
>

Hi,
there is an EMBOSS 5.0 package in Debian Sid.

You just have to add something like this:
"""
If you are a Debian/Ubuntu user, you can install EMBOSS with the command:
>>> sudo aptitude install emboss
"""

Actually, this would work only for Debian Sid, but I believe the
package will also be included in Ubuntu 7.10 and in Debian Etch
shortly.


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com


From charles-listes-emboss at plessy.org  Wed Oct 10 14:55:35 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 10 Oct 2007 23:55:35 +0900
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
Message-ID: <20071010145535.GK990@kunpuu.plessy.org>

On Wed, Oct 10, 2007 at 04:06:16PM +0200, Giovanni Marco Dall'Olio wrote:
> hi,
> there is an emboss 5.0 package in debian sid.
> 
> Actually, this would work only for Debian Sid, but I believe the
> package will be included also in Ubuntu 7/10 and in debian etch in the
> short time.

Dear Giovanni,

Because Debian Etch is the stable version, it does not receive new
packages unless they fix security issues or grave bugs. The emboss
package for Debian will never be part of Etch nor its updates.

However, some Debian developers provide a separate repository to which
only official developers upload recent packages recompiled for Etch. The
site is called backports.org.

If you or another reader is interested, we can prepare such a backport
for Etch.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan


From david at compbio.dundee.ac.uk  Wed Oct 10 14:30:24 2007
From: david at compbio.dundee.ac.uk (David Martin)
Date: Wed, 10 Oct 2007 15:30:24 +0100
Subject: [EMBOSS] Updating the Quick Guide
In-Reply-To: <5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
Message-ID: <C332A090.2C3D0%david@compbio.dundee.ac.uk>

On 10/10/07 14:23, "Giovanni Marco Dall'Olio" <dalloliogm at gmail.com> wrote:

> You should update the guide on how to install emboss.
> 
> In particular, explain how to use the .deb and .rpm packages, since a
> lot of people still try to install emboss by compiling it, and it is a
> pain.
> 
> 
> 2007/10/9, David Martin <david at compbio.dundee.ac.uk>:
>> Prompted by charles' request yesterday I am in the process of updating the
>> EMBOSS quick guide. it was last touched about 8 years ago so comments and
>> suggestions on what is new, and what should be dropped would be much
>> appreciated.
>> 
>> ..d


The aim of the Quick Guide is to provide a quick reference, on a single
sheet of A4 (two sides), to the common programs and command line arguments
that are used with EMBOSS. I found it very useful when teaching as an aide
memoire for myself and the students.

Explaining how to install EMBOSS on each architecture is NOT the aim - for
that read the admin guide, the maintenance of which Alan and others have
taken off my hands. I will however reference the admin guide for
installation info.

If you haven't seen the quick guide, a somewhat dated PDF is available in
emboss/docs/manuals/emboss_qg.pdf

regards

..d
  

From Veronique.Martin at jouy.inra.fr  Thu Oct 11 07:39:44 2007
From: Veronique.Martin at jouy.inra.fr (Veronique.Martin at jouy.inra.fr)
Date: Thu, 11 Oct 2007 09:39:44 +0200 (CEST)
Subject: [EMBOSS] prosextract option?
Message-ID: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>


Hi,

I want to run prosextract, but I would like to build the PROSITE motif
files in a directory of my choice. At the moment the only possibility is
this path: emboss/share/EMBOSS/data/PROSITE
Is it possible to have an option for choosing the output directory?

I tried using the .embossrc file, but for this database (PROSITE) alone
the file is not taken into account: prosextract used the
emboss/share/EMBOSS/emboss.default file.

Regards,

VM

-------------------------------------------------
Véronique MARTIN
INRA - Unité Mathématique, Informatique et Génome
78352 Jouy-en Josas cedex
tel.: 01 34 65 29 74
-------------------------------------------------

From dalloliogm at gmail.com  Thu Oct 11 08:36:04 2007
From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio)
Date: Thu, 11 Oct 2007 10:36:04 +0200
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <20071010145535.GK990@kunpuu.plessy.org>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
	<20071010145535.GK990@kunpuu.plessy.org>
Message-ID: <5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>

2007/10/10, Charles Plessy <charles-listes-emboss at plessy.org>:
>
> Because Debian Etch is the stable version, it does not receive new
> packages unless they fix security issues or grave bugs. The emboss
> package for Debian will never be part of Etch nor its updates.
>

Really?
I didn't know emboss had grave bugs.

Are you saying they can't be fixed?
I can't find many references to bugs in emboss, but maybe you are
referring to bugs like this:
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=427439 ?


> However, some Debian developpers provides a separate repository in which
> only official developers upload recent packages recompiled for Etch. The
> site is called backports.org.
>
> If you or another reader is interested, we can prepare such a backport
> for Etch.

Thank you very much: I think many people are interested, especially
in the Ubuntu user community.
EMBOSS is seen as an educational package for learning bioinformatics, so
it would be better if people could install it easily by themselves
instead of asking a system manager.
Maybe you can just add the link to Debian backports in the help page.


> Have a nice day,
>

and to you, too!

> --
> Charles Plessy
> Debian-Med packaging team
> Wako, Saitama, Japan


-- 
-----------------------------------------------------------

My Blog on Bioinformatics (italian): http://dalloliogm.wordpress.com


From charles-listes-emboss at plessy.org  Thu Oct 11 09:06:13 2007
From: charles-listes-emboss at plessy.org (charles-listes-emboss at plessy.org)
Date: Thu, 11 Oct 2007 18:06:13 +0900
Subject: [EMBOSS] possibility of packages for Debian Etch.
In-Reply-To: <5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>
References: <C3316359.2C38C%david@compbio.dundee.ac.uk>
	<5aa3b3570710100623v42107a31we6af4cab8d1bdb80@mail.gmail.com>
	<42689.81.98.241.17.1192023566.squirrel@webmail.ebi.ac.uk>
	<5aa3b3570710100706s7ab0a28tebbd30523733826@mail.gmail.com>
	<20071010145535.GK990@kunpuu.plessy.org>
	<5aa3b3570710110136y2c32b6e8v614e13cbfd12de44@mail.gmail.com>
Message-ID: <20071011090613.GA31072@kunpuu.plessy.org>

On Thu, Oct 11, 2007 at 10:36:04AM +0200, Giovanni Marco Dall'Olio wrote:
> 2007/10/10, Charles Plessy <charles-listes-emboss at plessy.org>:
> >
> > Because Debian Etch is the stable version, it does not receive new
> > packages unless they fix security issues or grave bugs. The emboss
> > package for Debian will never be part of Etch nor its updates.
> >
> 
> Really?
> I didn't know emboss had grave bugs.

Dear Giovanni,

I have been unclear. The reason why EMBOSS is not in Debian Etch is
that its Debian package was not ready when Etch was released.
Furthermore, it is the policy of Debian to only accept changes related
to security or grave bugs. Therefore, Debian Etch will never contain the
Debian packages we prepared for EMBOSS.

I will announce on this list when the package becomes available through
backports.org.


> Are you saying they can't be fixed?
> I can't find many references to bugs in emboss, but maybe you are
> referring to bugs like this:
> - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=427439 ?

Yes, the current package still has some quality issues. However, all the
ones reported so far are solved in our SVN repository. I hope that I can
update the Debian package of EMBOSS in Debian Sid soon.

http://svn.debian.org/wsvn/pkg-emboss/emboss/trunk/debian/changelog?op=file&rev=0&sc=0

(Exploring this repository a bit gives a glimpse of what we
have in the pipeline...).


> Thank you very much: I think many people are interested, expecially
> from the Ubuntu users community.

By the way, if you ask someone from MOTU Science, I think that it is
possible to fast-track the emboss packages into Ubuntu...


>  Maybe you can just add the link to debian backports in the help page.

The new packages.debian.org website advertises the backports. See for
example the page for OpenOffice.org: http://packages.debian.org/openoffice.org


Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team.
Wako, Saitama, Japan


From Laurence.Amilhat at toulouse.inra.fr  Thu Oct 11 09:44:40 2007
From: Laurence.Amilhat at toulouse.inra.fr (Laurence Amilhat)
Date: Thu, 11 Oct 2007 11:44:40 +0200
Subject: [EMBOSS] plcore.c error when compiling
Message-ID: <470DF088.4020303@toulouse.inra.fr>

Dear Emboss users,


I am trying to install EMBOSS on Linux Ubuntu 7.04 Feisty Fawn.
I downloaded the following tar.gz: EMBOSS-5.0.0.tar.gz
<ftp://emboss.open-bio.org/pub/EMBOSS/EMBOSS-5.0.0.tar.gz>

I ran ./configure (I have the graphics libs z, png and gd),
but when I launch make, I get the following messages.
Does anyone have an idea why? Did I miss a lib or something?

Thank you for your help,

Best regards,

Laurence


plcore.c: In function 'int text2fci(const char*, unsigned char*, 
unsigned char*)':
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c:459: error: initializer-string for array of chars is too long
plcore.c: In function 'void difilt(PLINT*, PLINT*, PLINT, PLINT*, 
PLINT*, PLINT*, PLINT*)':
plcore.c:887: warning: converting to 'int' from 'PLFLT'
plcore.c:888: warning: converting to 'int' from 'PLFLT'
plcore.c:897: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:899: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:909: warning: converting to 'int' from 'PLFLT'
plcore.c:910: warning: converting to 'int' from 'PLFLT'
plcore.c:919: warning: converting to 'int' from 'PLFLT'
plcore.c:920: warning: converting to 'int' from 'PLFLT'
plcore.c: In function 'void sdifilt(short int*, short int*, PLINT, 
PLINT*, PLINT*, PLINT*, PLINT*)':
plcore.c:946: warning: converting to 'short int' from 'PLFLT'
plcore.c:947: warning: converting to 'short int' from 'PLFLT'
plcore.c:955: warning: converting to 'short int' from 'PLFLT'
plcore.c:956: warning: converting to 'short int' from 'PLFLT'
plcore.c:966: warning: converting to 'short int' from 'PLFLT'
plcore.c:967: warning: converting to 'short int' from 'PLFLT'
plcore.c:976: warning: converting to 'short int' from 'PLFLT'
plcore.c:977: warning: converting to 'short int' from 'PLFLT'
plcore.c: In function 'void pldid2pc(PLFLT*, PLFLT*, PLFLT*, PLFLT*)':
plcore.c:1079: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1080: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c:1081: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1082: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c: In function 'void pldip2dc(PLFLT*, PLFLT*, PLFLT*, PLFLT*)':
plcore.c:1125: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1126: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c:1127: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcx(PLINT)'
plcore.c:1128: warning: passing 'PLFLT' for argument 1 to 'PLFLT 
plP_pcdcy(PLINT)'
plcore.c: In function 'void calc_didev()':
plcore.c:1345: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1346: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1347: warning: converting to 'PLINT' from 'PLFLT'
plcore.c:1348: warning: converting to 'PLINT' from 'PLFLT'
plcore.c: In function 'void plP_setpxl(PLFLT, PLFLT)':
plcore.c:3264: warning: converting to 'PLINT' from 'double'
plcore.c:3265: warning: converting to 'PLINT' from 'double'
make[2]: *** [plcore.lo] Erreur 1
make[2]: quittant le répertoire « /tmp/EMBOSS-5.0.0/plplot »
make[1]: *** [all-recursive] Erreur 1
make[1]: quittant le répertoire « /tmp/EMBOSS-5.0.0/plplot »
make: *** [all-recursive] Erreur 1
Exit 2

-- 
====================================================================
= Laurence Amilhat    INRA Toulouse 31326 Castanet-Tolosan     	   = 
= Tel: 33 5 61 28 53 34   Email: laurence.amilhat at toulouse.inra.fr =
====================================================================


From jison at ebi.ac.uk  Thu Oct 11 12:16:19 2007
From: jison at ebi.ac.uk (Jon Ison)
Date: Thu, 11 Oct 2007 13:16:19 +0100 (BST)
Subject: [EMBOSS] prosextract option?
In-Reply-To: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>
References: <Pine.SOC.4.64.0710110905350.6049@diamant.jouy.inra.fr>
Message-ID: <48865.84.92.187.247.1192104979.squirrel@webmail.ebi.ac.uk>

Hi Veronique

prosextract is indeed hard-coded to write to the EMBOSS data directory
(defined by the EMBOSS environment variable EMBOSS_DATA).

You could always copy the file to your current working directory or into
a directory called ".embossdata" in either your home or current working
directory and the file could still be read by EMBOSS.
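
As a sketch of the workaround above: once prosextract has written its
output under the installation's data directory, the files can be copied
into a per-user ".embossdata" directory. The stage_emboss_data function
and the example path are hypothetical, and keeping the source directory
layout (for example a PROSITE/ subdirectory) is an assumption; only the
".embossdata" directory name comes from this message.

```shell
#!/bin/sh
# Hypothetical helper: copy data files from a source directory into a
# ".embossdata" directory under the given base (home or working directory).
stage_emboss_data() {
    src=$1
    base=$2
    mkdir -p "$base/.embossdata" || return 1
    # -R copies subdirectories (such as PROSITE/) as well as plain files.
    cp -R "$src"/. "$base/.embossdata"/
}

# Example with an assumed source path:
#   stage_emboss_data emboss/share/EMBOSS/data "$HOME"
```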

If that doesn't help an option to write to any specified directory could
easily be added - please advise.

Cheers

Jon


>
> Hi,
>
> I want to run prosextract, but I would like build prosite motif in
> directory of my choice. Now the only possibility is in this path :
> emboss/share/EMBOSS/data/PROSITE
> Is it possbile to have got an option for choosing the output directory?
>
> I had tried by using the .embossrc file but only for this database
> (prosite) this file is not considered, prosextract used  the
> emboss/share/EMBOSS/emboss.default file.
>
> Regards,
>
> VM
>
> -------------------------------------------------
> Véronique MARTIN
> INRA - Unité Mathématique, Informatique et Génome
> 78352 Jouy-en Josas cedex
> tel.: 01 34 65 29 74
> -------------------------------------------------


From pmr at ebi.ac.uk  Wed Oct 24 08:07:21 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 24 Oct 2007 09:07:21 +0100
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <20071008233828.GA32069@kunpuu.plessy.org>
References: <20071008233828.GA32069@kunpuu.plessy.org>
Message-ID: <471EFD39.5060202@ebi.ac.uk>

Charles Plessy wrote:
> If I use degapseq on a file, it prompts me for the name of the output, but if
> the data comes from stdin, degapseq crashes. It does not happen if the name of
> the output is given.

> chouca?~?$ cat toto | degapseq stdin
> Removes gap characters from sequences
> output sequence(s) [xenopus-1a.fasta]: 
>    EMBOSS An error in ajmess.c at line 1662:
> END-OF-FILE reading from user

This is because you are reading from stdin, but then degapseq tries to 
read the output filename from stdin.

You do need to specify the output filename, or use -auto to accept the 
default (or -filter to use stdout and to read from stdin).

With -auto and -filter the program will no longer be using stdin for 
user replies.
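
To make the mechanism concrete, here is a small sketch that does not
need EMBOSS at all. The prompting_tool function is a hypothetical
stand-in for any program that reads its data from stdin and then
prompts on the same stdin; it is not how degapseq is implemented. The
commented commands at the end use only the qualifiers named above.

```shell
#!/bin/sh
# Hypothetical stand-in for an interactive program: it reads its data
# from stdin, then prompts for an output name on that same stdin.
prompting_tool() {
    cat >/dev/null                      # consumes all of the piped data
    printf 'output sequence(s): ' >&2
    if read -r reply; then
        echo "reply=$reply"
    else
        # The pipe is already exhausted, so the prompt gets end-of-file.
        echo "END-OF-FILE reading from user"
    fi
}

result=$(printf '>seq1\nAC--GT\n' | prompting_tool 2>/dev/null)
echo "$result"

# With EMBOSS installed, the fixes named above would look like:
#   cat toto | degapseq -filter        # read stdin, write stdout
#   cat toto | degapseq stdin -auto    # accept the default output name
```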

Hmmm ... maybe we could catch these cases ... tricky though as really it 
is an explicit search for "stdin" as an input file/sequence. I could 
invent examples where we would guess wrongly.

Hope that helps,

Peter


From charles-listes-emboss at plessy.org  Wed Oct 24 14:37:06 2007
From: charles-listes-emboss at plessy.org (Charles Plessy)
Date: Wed, 24 Oct 2007 23:37:06 +0900
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <471EFD39.5060202@ebi.ac.uk>
References: <20071008233828.GA32069@kunpuu.plessy.org>
	<471EFD39.5060202@ebi.ac.uk>
Message-ID: <20071024143706.GB24491@kunpuu.plessy.org>

On Wed, Oct 24, 2007 at 09:07:21AM +0100, Peter Rice wrote:
> 
> You do need to specify the output filename, or use -auto to accept the 
> default (or -filter to use stdout and to read from stdin).
> 
> With -auto and -filter the program will no longer be using stdin for 
> user replies.

Oh, I completely overlooked the fact that the EMBOSS programs can take
their user replies from stdin. Maybe the most straightforward way to
inform users of this mistake would be to change the error message to
something like: "Error: could not open file '...............'", in which
the name of the file would be truncated to the end of the line. The user
would quickly understand if the file name is something like AGTCCAGGTA...

Have a nice day,

-- 
Charles Plessy
Wako, Saitama, Japan


From pmr at ebi.ac.uk  Wed Oct 24 16:53:27 2007
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 24 Oct 2007 17:53:27 +0100
Subject: [EMBOSS] Bug in degapseq ?
In-Reply-To: <20071024143706.GB24491@kunpuu.plessy.org>
References: <20071008233828.GA32069@kunpuu.plessy.org>	<471EFD39.5060202@ebi.ac.uk>
	<20071024143706.GB24491@kunpuu.plessy.org>
Message-ID: <471F7887.5050004@ebi.ac.uk>

Charles Plessy wrote:

> Oh, I completely overlooked the fact that the emboss programs can take
> their user replies from stdin. Maybe then the most straightforward to
> inform users from this mistake would be to change the error message to
> something like : "Error: could not open file '...............', in which
> the name of the file would be truncated to the end of the line. The user
> would quickly understand if the file name is someting like AGTCCAGGTA...

Or perhaps they would not quickly understand ... because it took me a 
few runs before I realised that was the problem :-)

I think we can keep track of stdin being opened in EMBOSS and refuse to 
prompt for input.
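
A rough sketch of that idea, written in shell rather than in the EMBOSS
C libraries: remember that stdin was claimed as a data source and refuse
to prompt afterwards. The names stdin_in_use, open_input and prompt_user
are hypothetical and do not correspond to any EMBOSS function.

```shell
#!/bin/sh
# Hypothetical flag: set once stdin has been claimed as a data source.
stdin_in_use=0

# Record the use of stdin when it is named as an input.
open_input() {
    if [ "$1" = "stdin" ]; then
        stdin_in_use=1
    fi
}

# Prompt for a reply unless stdin is already carrying data.
prompt_user() {
    if [ "$stdin_in_use" -eq 1 ]; then
        echo "cannot prompt for '$1': stdin is in use as input" >&2
        return 1
    fi
    printf '%s: ' "$1" >&2
    read -r reply
}

open_input stdin
if prompt_user 'output sequence(s)' 2>/dev/null; then
    status="prompted"
else
    status="refused"
fi
echo "$status"
```

Failing early with an explicit message avoids the less obvious
end-of-file error reported at the start of this thread.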

regards,

Peter


From staffa at niehs.nih.gov  Wed Oct 24 17:21:37 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 24 Oct 2007 13:21:37 -0400
Subject: [EMBOSS] GUI interfaces
Message-ID: <C344F761.674B%staffa@niehs.nih.gov>

Friends
    We are preparing for the day GCG goes away by seriously pushing EMBOSS
with our users.
This page
http://emboss.sourceforge.net/interfaces/
lists 15 GUIs.
Apparently ColiMate is not an existing GUI to EMBOSS
but a development tool.
Please tell me:
Which of the 15 GUIs listed are complete and available?
Which do you think is best?

Thank you
 
Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: Roy W. Reter (reter at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From andrespinzon at gmail.com  Wed Oct 24 18:11:18 2007
From: andrespinzon at gmail.com (Andres Pinzon)
Date: Wed, 24 Oct 2007 13:11:18 -0500
Subject: [EMBOSS] GUI interfaces
In-Reply-To: <C344F761.674B%staffa@niehs.nih.gov>
References: <C344F761.674B%staffa@niehs.nih.gov>
Message-ID: <8968fc7e0710241111odff847dge2d0d16889c16e32@mail.gmail.com>

In my experience, wEMBOSS [1] and EMBOSS-Explorer are really easy to
configure and provide different user experiences that complement each
other.

[1] http://bioinf.ibun.unal.edu.co/wEMBOSS/

Regards,


On 10/24/07, Staffa, Nick (NIH/NIEHS) <staffa at niehs.nih.gov> wrote:
> Friends
>     We are preparing for if ever GCG goes away by seriously pushing EMBOSS
> with our users.
> This page
> http://emboss.sourceforge.net/interfaces/
> lists 15 GUIs.
> apparently ColiMate is not an existing GUI to EMBOSS,
> but a developement tool.
> Please tell me:
> Which of the 15 GUIs listed are complete and available?
> Which do you think is best?
>
> Thank you
>
> Nick Staffa
> Telephone: 919-316-4569  (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: Roy W. Reter (reter at niehs.nih.gov)
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina


-- 
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768


From golharam at umdnj.edu  Wed Oct 24 17:58:08 2007
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 24 Oct 2007 13:58:08 -0400
Subject: [EMBOSS] GUI interfaces
In-Reply-To: <C344F761.674B%staffa@niehs.nih.gov>
References: <C344F761.674B%staffa@niehs.nih.gov>
Message-ID: <471F87B0.8030308@umdnj.edu>

Hi Nick,

We (UMDNJ) migrated off of GCG several years ago.  We found most of our 
users prefer the command-line interface for shell scripting or a web 
interface for GUI access from their own computers.

We use EMBOSS-Explorer for the web interface.  It's (much) cleaner and
faster than SeqWeb ever was and doesn't rely on the server storing user
data.  We no longer have to back up user data, because it is stored by
the user rather than on a server storage system.  There are no issues with
user account management (usernames/passwords) with this system either.
With GCG, we would have at least 1 or 2 user issues per month.  Since 
the switch, I can honestly say our user issues are maybe 1 or 2 per year.

If you have any questions about this, feel free to email me,

Ryan

----------------
Ryan Golhar, PhD
golharam at umdnj.edu
Computational Biologist
Informatics Institute at UMDNJ




From kann.vearasilp at mu.edu  Thu Oct 25 19:07:37 2007
From: kann.vearasilp at mu.edu (Kann Vearasilp)
Date: Thu, 25 Oct 2007 14:07:37 -0500
Subject: [EMBOSS] Cannot open division file
Message-ID: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>

Hello everyone,

I have just finished indexing a GenBank database for my lab using the
dbiflat command. I set up an emboss.default file based on the
emboss.default.template provided. I used "seqret" to test the system,
and it seems that EMBOSS cannot find the division file.

I can see from the archive that a similar problem occurred with the
test database provided with EMBOSS
(http://emboss.open-bio.org/pipermail/emboss/2005-November/002323.html).
However, I am pretty sure that I pointed the path to my database
correctly. Here is my configuration.

The system is Mac OS 10.4

1. EMBOSS was installed from Fink at /sw/share/EMBOSS

2. All databases were installed in /lab/data/databases/genbank/*.seq

3. Index files are in /lab/data/indices/genbank/??? Here is an
example of one of the index directories from my lab.

xxx at yyy/lab/data/indices/genbank/mam:
acnum.hit     des.trg       keyword.hit   seqvn.hit     taxon.trg
acnum.trg     division.lkp  keyword.trg   seqvn.trg
des.hit       entrynam.idx  mam.dbiflat   taxon.hit

4. Here is an excerpt from my emboss.default file:

# Set location of acd files that describe each program
SET emboss_acdroot /sw/share/EMBOSS/acd


# Set location of Genbank flatfiles in protein
SET  emboss_database_dir /lab/data/databases

# Set location of Genbank flatfiles indices in protein
set emboss_index_dir /lab/data/indices

# Set a log file that user can append their records and EMBOSS
# automatically write log information
SET emboss_logfile /sw/share/EMBOSS/log/log

# Set Paper size of disc page and is required by the 'dbx' indexing
# program and 'method: "emblcd" emboss'
# Recommended value is 2048
SET PAGESIZE 2048

# Set Caches size required for 'dbx' indexing and 'method emboss'.
# It is a page size number to cache. Recommended value is 200
SET CACHESIZE 200

# Set parameter for flat file indices that we have created in
# /lab/data/indices/genbank
.
.
.
.
.
DB gbmam [
# required parameters
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
# optional parameters
    fields: "sv des key org"
    release: "161.0"
    comment: "Genbank database for mam sequences"
    indexdir: "\$emboss_index_dir/genbank/mam"
]

5. I ran this seqret command to test the system, but it throws an
error, as you can see:

xxx at yyy~:seqret gbmam:BC102801
Reads and writes (returns) sequences
Warning: Cannot open division file '<null>' for database 'gbmam'
Warning: seqCdQry failed
Error: Unable to read sequence 'gbmam:BC102801'
Died: seqret terminated: Bad value for '-sequence' and no prompt

6. I also ran the seqret command in debug mode, and this is the log
it produced.

Debug file seqret.dbg buffered:No
ajAcdInitP pgm 'seqret' package ''
ajFileNewIn '/sw/share/EMBOSS/acd/seqret.acd'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/seqret.acd
closing file '/sw/share/EMBOSS/acd/seqret.acd'
ajFileNewIn '/sw/share/EMBOSS/acd/codes.english'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/codes.english
closing file '/sw/share/EMBOSS/acd/codes.english'
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajFileNewIn '/sw/share/EMBOSS/acd/knowntypes.standard'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/knowntypes.standard
closing file '/sw/share/EMBOSS/acd/knowntypes.standard'
Set acdprotein value '$(sequence.protein)'
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed

It seems that EMBOSS cannot find the division file. I still
don't know what the problem is. Do you have any recommendations?

Thank you so much in advance for any help!

Kann


From ajb at ebi.ac.uk  Thu Oct 25 20:22:18 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 25 Oct 2007 21:22:18 +0100 (BST)
Subject: [EMBOSS] Cannot open division file
In-Reply-To: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
References: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
Message-ID: <33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>

Dear Kann,

One major problem is your DB entry:

DB gbmam [
# required parameters
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
# optional parameters
    fields: "sv des key org"
    release: "161.0"
    comment: "Genbank database for mam sequences"
    indexdir: "\$emboss_index_dir/genbank/mam"
]

You should remove the two backslash characters before the '$'
characters. I believe they mistakenly appeared in some documentation
in the past (possibly as a result of some automatic formatting).
It'd be useful if you'd email me off-list and tell me which documentation
contained the error (if my guess is correct).
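
For anyone who copied the same pattern into several DB entries, the fix
can be sketched in a few lines of Python. This is only an illustration
and not part of EMBOSS: the function name strip_escaped_dollars and the
regular expression are assumptions made up for this example, and the
sample text is a shortened copy of the entry above.

```python
import re

# Hypothetical helper pattern: a backslash directly before "$name".
ESCAPED_VAR = re.compile(r"\\(\$[A-Za-z_][A-Za-z0-9_]*)")


def strip_escaped_dollars(config_text: str) -> str:
    """Return config_text with every '\\$name' rewritten as '$name'."""
    return ESCAPED_VAR.sub(r"\1", config_text)


# Shortened copy of the DB entry quoted in this message.
sample = r'''DB gbmam [
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
    indexdir: "\$emboss_index_dir/genbank/mam"
]'''

fixed = strip_escaped_dollars(sample)
print(fixed)
```

The printed entry has dir and indexdir starting with a plain '$', which
is the form the variable references need.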


Alan




From kann.vearasilp at mu.edu  Thu Oct 25 22:06:01 2007
From: kann.vearasilp at mu.edu (Kann Vearasilp)
Date: Thu, 25 Oct 2007 17:06:01 -0500
Subject: [EMBOSS] Cannot open division file
In-Reply-To: <33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>
References: <80455327-9F8B-49EC-801F-3A5DFDE09DD3@mu.edu>
	<33572.81.98.241.17.1193343738.squirrel@webmail.ebi.ac.uk>
Message-ID: <FBE50134-81F9-4A02-8655-2A0904A5D3D9@mu.edu>

Hello Alan,

Thank you so much for the fast response! It seems that these backslashes
caused all my problems. Once I removed them, the program worked
flawlessly. :)

Kann

PS. I can find the document and will email you once I know the version
of this EMBOSS tutorial.



From kertib at linuxlap.hu  Tue Oct 30 10:25:36 2007
From: kertib at linuxlap.hu (kerti Balázs Gábor)
Date: Tue, 30 Oct 2007 11:25:36 +0100
Subject: [EMBOSS] make error
Message-ID: <1193739936.5962.28.camel@genotech>

Hello,

I have a problem building EMBOSS. "configure" ran well, with no
errors or missing components, but "make" exits with the messages in
the attached make.err file.

How can I solve the problem?

Thank you!

Balazs Kerti
Szent Istvan University,
Institute of Genetics and Biotechnology
HUN-2103 Godollo, Pater Karoly u. 1.
-------------- next part --------------
Making all in plplot
make[1]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot'
Making all in lib
make[2]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot/lib'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot/lib'
make[2]: Entering directory `/usr/src/EMBOSS-5.0.0/plplot'
/bin/bash ../libtool --tag=CC   --mode=compile gcc -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"EMBOSS\" -DVERSION=\"5.0.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DX_DISPLAY_MISSING=1 -DHAVE_DIRENT_H=1 -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DGETPGRP_VOID=1 -DHAVE_STRFTIME=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_VPRINTF=1 -DHAVE_MEMMOVE=1 -DHAVE_LIBM=1 -I.  -I./ -I/usr/include/gd -DPREFIX=\"/usr/local\" -DBUILD_DIR=\".\" -DDRV_DIR=\".\" -DEMBOSS_TOP=\"/usr/src/EMBOSS-5.0.0\"  -DAJ_LinuxLF -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -DLENDIAN -DNO_AUTH  -O2 -MT xwin.lo -MD -MP -MF .deps/xwin.Tpo -c -o xwin.lo xwin.c
 gcc -DPACKAGE_NAME=\"\" -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"EMBOSS\" -DVERSION=\"5.0.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DX_DISPLAY_MISSING=1 -DHAVE_DIRENT_H=1 -DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1 -DGETPGRP_VOID=1 -DHAVE_STRFTIME=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_VPRINTF=1 -DHAVE_MEMMOVE=1 -DHAVE_LIBM=1 -I. -I./ -I/usr/include/gd -DPREFIX=\"/usr/local\" -DBUILD_DIR=\".\" -DDRV_DIR=\".\" -DEMBOSS_TOP=\"/usr/src/EMBOSS-5.0.0\" -DAJ_LinuxLF -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DLENDIAN -DNO_AUTH -O2 -MT xwin.lo -MD -MP -MF .deps/xwin.Tpo -c xwin.c  -fPIC -DPIC -o .libs/xwin.o
make[2]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot'
make[1]: Leaving directory `/usr/src/EMBOSS-5.0.0/plplot'

From jerome.laroche at bioinfo.ulaval.ca  Wed Oct 31 20:46:50 2007
From: jerome.laroche at bioinfo.ulaval.ca (Jérôme Laroche)
Date: Wed, 31 Oct 2007 16:46:50 -0400
Subject: [EMBOSS] dbxflat and size of index files
Message-ID: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>

Hello,

I use dbxflat to index UniProt (sprot and trembl) flat files, whose
sizes are 1.2 G for sprot and 11 G for trembl. The resulting index
files are amazingly huge: 11 G. Is this normal?

Another example with GenBank flat files: the division gbsts has a
size of 3.3 G. Indexing with dbxflat gives 6.8 G of index files, but
indexing with dbiflat gives only 199 M of index files. I know it's not
necessary to index GenBank flat files with dbxflat because each
individual file is no bigger than 300 M. I did this just for the
demonstration.

Apart from this, everything is working very well.

Thank you in advance.


Jérôme Laroche

Centre de bioinformatique et de biologie computationnelle
Université Laval


From ajb at ebi.ac.uk  Wed Oct 31 22:07:24 2007
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 31 Oct 2007 22:07:24 -0000 (GMT)
Subject: [EMBOSS] dbxflat and size of index files
In-Reply-To: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>
References: <FCDE3349-B423-4DF7-B68A-C496E0AB0BB6@bioinfo.ulaval.ca>
Message-ID: <33217.81.98.241.17.1193868444.squirrel@webmail.ebi.ac.uk>

Hello Jérôme,

Yes, it is normal. It is a combination of three things: first, the
index is a tree structure; second, the tree isn't tightly packed; and
third, 64-bit pointers are used throughout. The first allows
on-the-fly updating of the index, the second is for speed of
construction/updating and the third is obvious. Another
consideration is that, in some cases, the indexes are trees-of-trees
to allow duplicate codes to be indexed (e.g. keywords).
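
To put the gbsts figures from the original message in proportion, the
arithmetic can be sketched as follows. The sizes are the ones reported
in this thread; treating "G" as 1024 "M" is an assumption made only for
this example, and the variable names are illustrative.

```python
# Sizes reported in this thread for the GenBank gbsts division.
# Assumption: 1 G = 1024 M (the messages do not say which unit is meant).
M_PER_G = 1024

flatfile_m = 3.3 * M_PER_G   # gbsts flat files
dbxflat_m = 6.8 * M_PER_G    # index files written by dbxflat
dbiflat_m = 199.0            # index files written by dbiflat

dbx_ratio = dbxflat_m / flatfile_m    # dbxflat index size relative to data
dbi_ratio = dbiflat_m / flatfile_m    # dbiflat index size relative to data
dbx_vs_dbi = dbxflat_m / dbiflat_m    # dbxflat index relative to dbiflat index

print(f"dbxflat index / data:  {dbx_ratio:.2f}")
print(f"dbiflat index / data:  {dbi_ratio:.3f}")
print(f"dbxflat / dbiflat:     {dbx_vs_dbi:.1f}")
```

On these numbers the dbxflat index is roughly twice the size of the
data it indexes, while the dbiflat index is a few percent of it.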

Coincidentally, I'm on the lookout for new indexing algorithms at the
moment, so if you have a favourite one, we're always open
to suggestions.

Alan

