From david.martin at biotek.uio.no Thu Feb 1 03:45:53 2001
From: david.martin at biotek.uio.no (David Martin)
Date: Thu, 1 Feb 2001 09:45:53 +0100
Subject: Updated Admin guide to 1.9.1
Message-ID:

An updated version of the admin guide is now available.

Main changes:

Explanation of the use of EMBOSS_DATA in specifying the base directory in
which data files are installed by the third party database processing
programs rebaseextract, prosextract, printsextract and tfextract.

Acknowledgement of the fact that I cannot persuade EMBOSS configure to look
for include files in non-standard locations (eg with EMNU so I had to
compile by hand).

Placeholders for GUI installation.

Fix to the style file

New/updated files:

emboss.sty
admin.tex
admin.ps
admin.pdf

At ftp://ftp.no.embnet.org/pub/EMBOSS-extras

Comments, suggestions, bugfixes etc. to me.

..d

---------------------------------------------------------------------
* Dr. David Martin                 Biotechnology Centre of Oslo    *
* Node Manager                     Gaustadalleen 21                *
* The Norwegian EMBNet Node        P.O. box 1125 Blindern          *
* tel +47 22 84 05 35              N-0317 Oslo                     *
* fax +47 22 84 05 01              Norway                          *
---------------------------------------------------------------------

From flavio at area.ba.cnr.it Wed Feb 7 05:13:09 2001
From: flavio at area.ba.cnr.it (Vito Flavio Licciulli)
Date: Wed, 07 Feb 2001 11:13:09 +0100 (MET)
Subject: EMNU 1.0.3 segmentation fault
Message-ID: <4.1.20010207103600.00bc9d70@pop.ba.cnr.it>

Hi to all,

I have a problem in EMNU rel. 1.0.3. We have a Compaq Tru64 Unix (ex
Digital Unix, OSF1) server.

Emnu ends in a "segmentation fault" when the length of a program
description is longer than the terminal width. For example, in an 80
character terminal, when you choose the "RESTRICTION ENZYME" group menu
("recode" has a description line longer than 80 char).

I found the problem in the emnu.c source. Line 1320

buffer = (char *) malloc(ajStrLen(gl->doc+1));

has to be changed to

buffer = (char *) malloc(ajStrLen(gl->doc)+1);
                                      ...^^^^...
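The effect of the misplaced "+1" can be reproduced with plain C strings.
What follows is only a sketch under stated assumptions, not the emnu.c
code: strlen() stands in for ajStrLen(), a char pointer stands in for
gl->doc, and buggy_size(), fixed_size() and copy_doc() are hypothetical
names. With a char pointer the misplaced "+1" skips the first character,
so the buffer comes out two bytes short; with an AJAX string object the
same expression would presumably be pointer arithmetic on the object
pointer itself, which is no better.

```c
#include <stdlib.h>
#include <string.h>

/* Size the buggy expression would request for a plain C string:
 * strlen(doc + 1) starts counting at the second character and leaves
 * no room for the terminating '\0'. Only valid for non-empty doc. */
size_t buggy_size(const char *doc)
{
    return strlen(doc + 1);
}

/* Size the corrected expression requests: every character plus '\0'. */
size_t fixed_size(const char *doc)
{
    return strlen(doc) + 1;
}

/* Hypothetical helper: copy a description into a correctly sized buffer.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *copy_doc(const char *doc)
{
    char *buffer = malloc(fixed_size(doc));

    if (buffer != NULL)
        strcpy(buffer, doc);
    return buffer;
}
```

For the six-character string "recode", fixed_size() asks for 7 bytes and
buggy_size() for only 5, so a later copy of the whole description would
overrun the buffer.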
Flavio Licciulli

|=====================================================================|
| Flavio Licciulli                 E-mail: flavio at area.ba.cnr.it    |
| System & Network Administrator   WWW: http://www.ba.cnr.it/~flavio   |
| Italian EMBNET Node                                                  |
| C.N.R. - Area di Ricerca         Phone (39) 080-5482176/30/80        |
| Via Amendola 166/5 BARI ITALY    Fax   (39) 080-5484467              |
|=====================================================================|
|                                                                      |
|        L'intelligenza certe volte ci serve per fare                  |
|        impudentemente delle sciocchezze...                           |
|                                        La Rochefoucauld              |
'====================================================================='

From david.martin at biotek.uio.no Wed Feb 7 07:29:42 2001
From: david.martin at biotek.uio.no (David Martin)
Date: Wed, 7 Feb 2001 13:29:42 +0100
Subject: Multiple architectures, same datafiles.
Message-ID:

I want to be able to use the same source tree but to compile for multiple
architectures. I also want to use the same data files so I only need to
update in one place and all the architectures work.

If I set --libdir and --bindir then everything should be OK?

make clean and rm config.cache between compilations.

Thoughts?

..d

---------------------------------------------------------------------
* Dr. David Martin                 Biotechnology Centre of Oslo    *
* Node Manager                     Gaustadalleen 21                *
* The Norwegian EMBNet Node        P.O. box 1125 Blindern          *
* tel +47 22 84 05 35              N-0317 Oslo                     *
* fax +47 22 84 05 01              Norway                          *
---------------------------------------------------------------------
I will be leaving the Norwegian EMBnet node on 23rd February.
All work related mail should be addressed to admin at embnet.uio.no where
my successor, Rune Groven will deal with it.
All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence
it will be automatically forwarded to me.
Spam should continue to be sent to /dev/null

From cupton at uvic.ca Wed Feb 7 13:00:42 2001
From: cupton at uvic.ca (Chris Upton)
Date: Wed, 7 Feb 2001 10:00:42 -0800
Subject: GUIs for EMBOSS
Message-ID:

Hi,

Where can I find info on GUIs for the EMBOSS programs.
I know there was some.... but I can't seem to find it again!!!!!!!

Cheers,
Chris
--
Chris Upton PhD
Associate Professor
Biochemistry and Microbiology    Phone 250-721-6507
University of Victoria
PO Box 3055 Victoria             Fax 250-721-8855
BC V8W 3P6, Canada
http://web.uvic.ca/biochem/bioc/Faculty/upton/upton.html

From david.martin at biotek.uio.no Fri Feb 9 09:15:33 2001
From: david.martin at biotek.uio.no (David Martin)
Date: Fri, 9 Feb 2001 15:15:33 +0100
Subject: Feature inheritance with sequences
Message-ID:

One of the great benefits of some of the graphical sequence manipulation
tools is that when one chops out a section of a sequence the features
travel with it.
What is the position with regard to such things in EMBOSS?

ie if pasteseq is used to splice sequence a into sequence b, will the
feature table travel with it and be updated as well?

..d

---------------------------------------------------------------------
* Dr. David Martin                 Biotechnology Centre of Oslo    *
* Node Manager                     Gaustadalleen 21                *
* The Norwegian EMBNet Node        P.O. box 1125 Blindern          *
* tel +47 22 84 05 35              N-0317 Oslo                     *
* fax +47 22 84 05 01              Norway                          *
---------------------------------------------------------------------
I will be leaving the Norwegian EMBnet node on 23rd February.
All work related mail should be addressed to admin at embnet.uio.no where
my successor, Rune Groven will deal with it.
All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence
it will be automatically forwarded to me.
Spam should continue to be sent to /dev/null

From bauer at genprofile.com Fri Feb 9 09:32:37 2001
From: bauer at genprofile.com (David Bauer)
Date: Fri, 09 Feb 2001 15:32:37 +0100
Subject: Feature inheritance with sequences
References:
Message-ID: <3A83FF85.610F70CD@genprofile.com>

David Martin wrote:

> ie if pasteseq is used to splice sequence a into sequence b, will the
> feature table travel with it and be updated as well?

This would be wonderful!

David.

--
Dr. David Bauer
GenProfile AG, Max-Delbrueck-Center, Erwin-Negelein-Haus
Robert-Roessle-Str. 10, D-13125 Berlin, Germany
bauer at genprofile.com, Tel:49-30-94892165, FAX:49-30-94892151

From Peter.Rice at uk.lionbioscience.com Fri Feb 9 09:58:29 2001
From: Peter.Rice at uk.lionbioscience.com (Peter Rice)
Date: Fri, 09 Feb 2001 14:58:29 +0000
Subject: Feature inheritance with sequences
References:
Message-ID: <3A840595.DF08258@lionbio.co.uk>

David Martin wrote:
>
> One of the great benefits of some of the graphical sequence manipulation
> tools is that when one chops out a section of a sequence the features
> travel with it.
> What is the position with regard to such things in EMBOSS?
>
> ie if pasteseq is used to splice sequence a into sequence b, will the
> feature table travel with it and be updated as well?

An interesting concept.

In principle, yes. Features in EMBOSS follow the GFF model, so they are
stored as groups of start and end positions. Assuming that they are not one
of the more exotic kinds of start and end positions, it is easy to 'insert'
one feature table into another. Deletion is also possible, although ends of
features within the deleted region rules out making the process completely
reversible.

I have seen it all before - in the EBI's NEWFEATURES editor when mutating
the 'replace' qualifiers.
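The 'insert' case described above comes down to coordinate arithmetic on
start and end positions. A minimal sketch follows, assuming simple 1-based
inclusive positions and none of the more exotic location types; the
Feature struct and both function names are hypothetical and are not part
of the EMBOSS feature API.

```c
#include <stddef.h>

/* Hypothetical feature record: 1-based, inclusive start and end. */
typedef struct {
    long start;
    long end;
} Feature;

/* Adjust the host sequence's features after `len` residues have been
 * inserted immediately after host position `pos` (pos 0 = very start). */
void shift_host_features(Feature *feats, size_t n, long pos, long len)
{
    for (size_t i = 0; i < n; i++) {
        if (feats[i].start > pos)
            feats[i].start += len;  /* begins after the insertion point */
        if (feats[i].end > pos)
            feats[i].end += len;    /* ends after the insertion point   */
    }
}

/* Translate the inserted sequence's own features into host coordinates. */
void offset_inserted_features(Feature *feats, size_t n, long pos)
{
    for (size_t i = 0; i < n; i++) {
        feats[i].start += pos;
        feats[i].end += pos;
    }
}
```

Inserting 100 residues after host position 27 leaves a host feature at
10..20 untouched, moves 30..40 to 130..140 and stretches 25..35 to
25..135, while a feature at 1..50 of the inserted sequence lands at
28..77. A feature that spans the insertion point is stretched rather than
split here, which is one of several possible choices.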
--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From david.martin at biotek.uio.no Fri Feb 9 10:14:29 2001
From: david.martin at biotek.uio.no (David Martin)
Date: Fri, 9 Feb 2001 16:14:29 +0100
Subject: Feature inheritance with sequences
In-Reply-To: <3A840595.DF08258@lionbio.co.uk>
Message-ID:

On Fri, 9 Feb 2001, Peter Rice wrote:

> David Martin wrote:
> >
> > One of the great benefits of some of the graphical sequence manipulation
> > tools is that when one chops out a section of a sequence the features
> > travel with it.
> > What is the position with regard to such things in EMBOSS?
> >
> > ie if pasteseq is used to splice sequence a into sequence b, will the
> > feature table travel with it and be updated as well?
>
> An interesting concept.
>
> In principle, yes. Features in EMBOSS follow the GFF model, so they are stored
> as groups of start and end positions. Assuming that they are not one of the
> more exotic kinds of start and end positions, it is easy to 'insert' one
> feature table into another. Deletion is also possible, although ends of
> features within the deleted region rules out making the process completely
> reversible.
>
> I have seen it all before - in the EBI's NEWFEATURES editor when mutating the
> 'replace' qualifiers.

But does it actually work in practice? If one creates a new sequence from
a and b, do the features travel with it?

(I have seen very little documentation on using features.. anyone care to
do a tutorial online via Wiki/IRC at some point as to how features work in
EMBOSS?)

Will EMBOSS write out features as standard in EMBL/Genbank format files or
are we stuck in having extra GFF files floating about..

It would be good to have a tutorial on features.

..d

---------------------------------------------------------------------
* Dr. David Martin                 Biotechnology Centre of Oslo    *
* Node Manager                     Gaustadalleen 21                *
* The Norwegian EMBNet Node        P.O. box 1125 Blindern          *
* tel +47 22 84 05 35              N-0317 Oslo                     *
* fax +47 22 84 05 01              Norway                          *
---------------------------------------------------------------------
I will be leaving the Norwegian EMBnet node on 23rd February.
All work related mail should be addressed to admin at embnet.uio.no where
my successor, Rune Groven will deal with it.
All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence
it will be automatically forwarded to me.
Spam should continue to be sent to /dev/null

From Peter.Rice at uk.lionbioscience.com Fri Feb 9 10:20:42 2001
From: Peter.Rice at uk.lionbioscience.com (Peter Rice)
Date: Fri, 09 Feb 2001 15:20:42 +0000
Subject: Feature inheritance with sequences
References:
Message-ID: <3A840ACA.3F25E333@lionbio.co.uk>

David Martin wrote:
>
> On Fri, 9 Feb 2001, Peter Rice wrote:
>
> > In principle, yes. Features in EMBOSS follow the GFF model, so they are stored
> > as groups of start and end positions. Assuming that they are not one of the
> > more exotic kinds of start and end positions, it is easy to 'insert' one
> > feature table into another. Deletion is also possible, although ends of
> > features within the deleted region rules out making the process completely
> > reversible.
>
> But does it actually work in practice? If one creates a new sequence from
> a and b, do the features travel with it?

Depends on whether the sequences have features. At present, you can turn on
features for an input sequence, but most applications don't use it (wisely,
as I am rewriting it heavily).

> Will EMBOSS write out features as standard in EMBL/Genbank format files or
> are we stuck in having extra GFF files floating about..

They are stored internally as 'GFF with hints to EMBL feature locations'
but can be written out in any format - including formats that have not yet
been invented.
Some of those formats could even be read back in again. Others will be
standard forms of application output.

Yes, that does include EMBL/GenBank/Swissprot entries with feature tables.

> It would be good to have a tutorial on features.

I will need to write one for Alan anyway once the updates are done.

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From david.martin at biotek.uio.no Tue Feb 13 09:27:24 2001
From: david.martin at biotek.uio.no (David Martin)
Date: Tue, 13 Feb 2001 15:27:24 +0100
Subject: DBIFASTA on univec
Message-ID:

Univec has a nasty variant of genbank header lines that is currently not
parseable with DBIFASTA.

The syntax is..

>gnl|uv|X66730.1:1-2687-49 B.bronchiseptica plasmid pBBR1

I presume all that is needed is to add a new 'else if' statement in getExpr
to cope?

..d

---------------------------------------------------------------------
* Dr. David Martin                 Biotechnology Centre of Oslo    *
* Node Manager                     Gaustadalleen 21                *
* The Norwegian EMBNet Node        P.O. box 1125 Blindern          *
* tel +47 22 84 05 35              N-0317 Oslo                     *
* fax +47 22 84 05 01              Norway                          *
---------------------------------------------------------------------
I will be leaving the Norwegian EMBnet node on 23rd February.
All work related mail should be addressed to admin at embnet.uio.no where
my successor, Rune Groven will deal with it.
All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence
it will be automatically forwarded to me.
Spam should continue to be sent to /dev/null

From starksb at ebi.ac.uk Thu Feb 15 06:33:55 2001
From: starksb at ebi.ac.uk (David Starks-Browning)
Date: Thu, 15 Feb 2001 11:33:55 +0000
Subject: problem with configure and --with-pngdriver
Message-ID: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk>

Greetings,

I've just built EMBOSS-1.9.1 on SGI IRIX with gcc and would like to
report a problem I had.

We have PNG (and other supporting) libraries installed in /sw/arch.
I invoke configure with the following environment variables set:

    CPPFLAGS = -I/sw/arch/include
    LDFLAGS = -L/sw/arch/lib

Then any attempt by configure to find png-related features should
succeed, without specifying --with-pngdriver=/sw/arch.

Unfortunately, this fails on SGI IRIX, because of the following logic in
configure:

> # Check whether --with-pngdriver or --without-pngdriver was given.
> if test "${with_pngdriver+set}" = set; then
>   withval="$with_pngdriver"
>   if test "$withval" != no ; then
>     echo "$ac_t""yes" 1>&6
>     ALT_HOME="$withval"
>   else
>     echo "$ac_t""no" 1>&6
>   fi
> else
>   echo "$ac_t""yes" 1>&6
>   ALT_HOME=/usr/local
>   if test ! -f "${ALT_HOME}/include/zlib.h"
>   then
>     ALT_HOME=/usr
>   fi
> fi

(I reformatted this for readability.)

With this logic (and some additional code that follows it), if
with-pngdriver is not set, and zlib.h is not found in
/usr/local/include, then "-L/usr/lib" is added to the compile command
when testing for png components.

I think this is a mistake. I can't think of why one would explicitly
add -L/usr/lib (or *any* system library path, for that matter) to the
compile command line. The compiler should handle that path search
automatically. On IRIX, for example, this over-rides the compiler's
effort to use appropriate system library paths, and forces ld to use
o32-ABI libraries. On any modern SGI system, or with gcc, this is
likely to fail, because all the other objects are likely to be n32 or
n64.

I do not have much experience with autoconf, but I am willing to
attempt a rewrite of the CHECK_PNGDRIVER in aclocal.m4 if that would
be useful to others. Ian wrote this originally, but I understand he's
busy with other things now. (Is that correct?)

Thanks for your help.

David

-------------------------------------------------------------------
David Starks-Browning                  | starksb at ebi.ac.uk
EMBL Outstation --                     |
The European Bioinformatics Institute  |
Wellcome Trust Genome Campus           | tel: +44 (1223) 494 616
Hinxton, Cambridge, CB10 1SD, UK       | fax: +44 (1223) 494 468
-------------------------------------------------------------------

From starksb at ebi.ac.uk Thu Feb 15 08:37:36 2001
From: starksb at ebi.ac.uk (David Starks-Browning)
Date: Thu, 15 Feb 2001 13:37:36 +0000
Subject: more problems with configure and --with-pngdriver
In-Reply-To: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk>
References: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk>
Message-ID: <8409-Thu15Feb2001133736+0000-starksb@ebi.ac.uk>

Greetings,

Continuing with --with-pngdriver:

We went to some effort to build shared versions of libgd.so for use by
the perl module GD.pm. This is not the default, so perhaps few people
have done this.

When configure checks for png support, it tries to compile conftest.c,
which contains a call to gdImageCreateFromPng(), by linking against
-lgd. If this is successful, hoorah. If it fails, then configure
advises you that you must upgrade your gd library.

I find that compilation of conftest succeeds with libgd.so on Compaq,
but fails on Linux and Solaris, with unresolved references to jpeg and
Xpm routines. (We have IRIX too, but only libgd.a on that, and it
succeeds.)

If I compile manually, the Linux and Solaris compiles do succeed if I
use the static library libgd.a.

It could be that I built libgd.so correctly on Compaq but incorrectly
on Linux and Solaris. So I ask, does anyone else have this problem?
If it's not simply an error on my part, then perhaps configure needs
to be told how to deal with this case.

Again, if there is interest (i.e., it's not just me affected), then I
could attempt to fix this in the CHECK_PNGDRIVERS macro.

Regards,
David

-------------------------------------------------------------------
David Starks-Browning                  | starksb at ebi.ac.uk
EMBL Outstation --                     |
The European Bioinformatics Institute  |
Wellcome Trust Genome Campus           | tel: +44 (1223) 494 616
Hinxton, Cambridge, CB10 1SD, UK       | fax: +44 (1223) 494 468
-------------------------------------------------------------------

From heme at postmark.net Wed Feb 14 07:31:42 2001
From: heme at postmark.net (Per Johansson)
Date: Wed, 14 Feb 2001 12:31:42 +0000
Subject: EMBOSS:fuzznuc
Message-ID: <20010214123143.31759.qmail@venus.postmark.net>

Hi,

Is it possible to get fuzznuc (or some other pattern search program) to
read a file with several DNA patterns instead of just a single pattern?

Sofia

From ableasby at hgmp.mrc.ac.uk Sun Feb 18 09:45:52 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Sun, 18 Feb 2001 14:45:52 GMT
Subject: EMBOSS:fuzznuc
Message-ID: <200102181445.OAA00388@tin.hgmp.mrc.ac.uk>

Dear Sofia,

Getting multiple pattern searches into the fuzzies is high on the list of
things to do.

Alan

From ableasby at hgmp.mrc.ac.uk Sun Feb 18 11:07:32 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Sun, 18 Feb 2001 16:07:32 GMT
Subject: EMBOSS 1.10.0
Message-ID: <200102181607.QAA17711@bromine.hgmp.mrc.ac.uk>

EMBOSS 1.10.0

This release contains several new applications, some of which are still
under active development. We hope to provide some of the data files
referred to on our ftp server soon.

MARSCAN

Matrix/scaffold attachment regions (MARs/SARs) are genomic elements thought
to delineate the structural and functional organisation of the eukaryotic
genome. Originally, MARs and SARs were identified through their ability to
bind to the nuclear matrix or scaffold. Binding cannot be assigned to a
unique sequence element, but is dispersed over a region of several hundred
base pairs.
These elements are found flanking a gene or a small cluster of genes and
are located often in the vicinity of cis-regulatory sequences. This has led
to the suggestion that they contribute to higher order regulation of
transcription by defining boundaries of independently controlled chromatin
domains. There is indirect evidence to support this notion. In transgenic
experiments MARs/SARs dampen position effects by shielding the transgene
from the effects of the chromatin structure at the site of integration.
Furthermore, they may act as boundary elements for enhancers, restricting
their long range effect to only the promoters that are located in the same
chromatin domain.

marscan finds a bipartite sequence element that is unique for a large group
of eukaryotic MARs/SARs. This MAR/SAR recognition signature (MRS) comprises
two individual sequence elements that are <200 bp apart and may be aligned
on positioned nucleosomes in MARs. The MRS can be used to correctly predict
the position of MARs/SARs in plants and animals, based on genomic DNA
sequence information alone. Experimental evidence from the analysis of >300
kb of sequence data from several eukaryotic organisms show that wherever a
MRS is observed in the DNA sequence, the corresponding genomic fragment is
a biochemically identifiable SAR.

The MRS is a bipartite sequence element that consists of two individual
sequences of 8 (AATAAYAA) and 16 bp (AWWRTAANNWWGNNNC) within a 200 bp
distance from each other. One mismatch is allowed in the 16 bp pattern. The
patterns can occur on either strand of the DNA with respect to each other.

Not all SARs contain a MRS. Analysis of >300 kb of genomic sequence from a
variety of eukaryotic organisms shows that the MRS faithfully predicts 80%
of MARs and SARs, suggesting that at least one other type of MAR/SAR may
exist which does not contain a MRS.
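
The MRS definition above can be illustrated with a small pattern scan.
This is a sketch under stated assumptions and not marscan's own code: it
looks at the forward strand only, expects upper-case A/C/G/T input,
measures the distance between the start offsets of the two matches, and
all function names are hypothetical. The IUPAC codes used are W = A or T,
R = A or G, Y = C or T and N = any base.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if base b is allowed by the IUPAC pattern code c. */
int iupac_match(char c, char b)
{
    switch (c) {
    case 'N': return 1;
    case 'W': return b == 'A' || b == 'T';
    case 'R': return b == 'A' || b == 'G';
    case 'Y': return b == 'C' || b == 'T';
    default:  return c == b;
    }
}

/* Count mismatches of pat against seq at offset pos.
 * The caller guarantees pos + strlen(pat) <= strlen(seq). */
int mismatches_at(const char *seq, size_t pos, const char *pat)
{
    int miss = 0;

    for (size_t i = 0; pat[i] != '\0'; i++)
        if (!iupac_match(pat[i], seq[pos + i]))
            miss++;
    return miss;
}

/* First offset >= from where pat matches seq with at most max_miss
 * mismatches, or -1 if there is none. */
long find_pattern(const char *seq, const char *pat, size_t from,
                  int max_miss)
{
    size_t n = strlen(seq), m = strlen(pat);

    for (size_t pos = from; pos + m <= n; pos++)
        if (mismatches_at(seq, pos, pat) <= max_miss)
            return (long)pos;
    return -1;
}

/* Return 1 if seq holds an exact 8 bp match and a 16 bp match (one
 * mismatch allowed) whose start offsets differ by at most max_dist. */
int has_mrs(const char *seq, long max_dist)
{
    const char *pat8 = "AATAAYAA";
    const char *pat16 = "AWWRTAANNWWGNNNC";
    long p8 = find_pattern(seq, pat8, 0, 0);

    while (p8 >= 0) {
        long p16 = find_pattern(seq, pat16, 0, 1);

        while (p16 >= 0) {
            long d = p16 > p8 ? p16 - p8 : p8 - p16;

            if (d <= max_dist)
                return 1;
            p16 = find_pattern(seq, pat16, (size_t)p16 + 1, 1);
        }
        p8 = find_pattern(seq, pat8, (size_t)p8 + 1, 0);
    }
    return 0;
}
```

Covering both strands, as the description requires, would mean running
the same scan over the reverse complement as well; that step is left out
to keep the sketch short.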

SCOPE

scope parses the scop classification file available at
http://scop.mrc-lmb.cam.ac.uk/scop/search.cgi?dir=lin and writes the scop
classification to an embl-like format file. This file (Escop.dat) should be
placed in the emboss/data directory.

NRSCOPE

nrscope parses the embl-like format scop classification file generated by
the EMBOSS application scope, and writes in the same format a file of
non-redundant domains. The format of these files is explained in the scope
documentation. The current version of nrscope removes redundancy at the
level of the scop family, i.e. entries belonging to the same family will be
non-redundant.

DOMAINER

domainer parses an embl-like format scop classification file generated by
the EMBOSS applications scope or nrscope, and clean protein coordinate
files generated by the coorde application (not currently in emboss, email
Jon Ison jison at hgmp.mrc.ac.uk) and writes, for each domain in the scop
classification file, clean domain coordinate files in embl-like and pdb
formats. Each of these files contains coordinates for a single scop domain.

STAMPS (under development)

stamps parses an embl-like format scop classification file generated by the
EMBOSS applications scope or nrscope, and calls stamp to generate
structural alignments for each SCOP family. It is still under active
development. You have to "make stamp" in the applications directory to
create "stamps".

Developers Notes

1. Most C datatypes have changed in the libraries. This is a prelude to
   getting true 64 bit operation. Notably ints are now "ajint"s and longs
   are now "ajlong"s. An ajint can be equal in size to an ajlong depending
   on the hardware; however, an ajlong should be used whenever a 64 bit int
   might be used.

2. The function ajFmtScanS has been added. This can be regarded as the
   EMBOSS version of the C function sscanf and operates similarly. It has
   several extensions, particularly %S is used for dynamically allocated
   string objects (AjPStr).
   This function makes reading data files considerably easier and many
   applications will be rewritten to use it rather than having to rely on
   tokenisation.

As usual I've probably forgotten to mention some things and my colleagues
will no doubt correct any oversights.

Alan

From oddmund.nordgard at biokjemi.uio.no Mon Feb 19 06:07:14 2001
From: oddmund.nordgard at biokjemi.uio.no (=?ISO-8859-1?Q?Oddmund_Nordg=E5rd?=)
Date: Mon, 19 Feb 2001 12:07:14 +0100 (CET)
Subject: Prettyplot bug?
In-Reply-To:
Message-ID:

Hello!

The last box in a prettyplot with boxes lacks the right line.
EMBOSS 1.9.1

Of less importance: The margin between the first letters on each line and
the left box-line is smaller than the other margins.

Oddmund

--
*******************************************
Oddmund Nordgård
Ph.D. student
Institute of Biochemistry
Box 1041 Blindern
0316 OSLO
NORWAY
Phone: 22 85 66 99
Fax: 22 85 44 43
Email: oddmundn at biokjemi.uio.no

Private:
Kalbakkv. 21
0953 OSLO
Phone: 22 25 23 93
********************************************
Powered by Linux! http://www.linuxnorge.com

From luetzel at imbim.uu.se Wed Feb 21 05:21:23 2001
From: luetzel at imbim.uu.se (Martin Luetzelberger)
Date: Wed, 21 Feb 2001 11:21:23 +0100 (MET)
Subject: REBASE bug?
Message-ID:

Hi all,

I'm in trouble with EMBOSS 1.10.0 running on SuSE Linux 7.0. Compilation
and installation of the programs was fine -- no problems at all. However,
all of the programs for restriction mapping like "redata", "restrict" and
"remap" crash with a segmentation fault.

I've set EMBOSS_DATA to /usr/local/share/EMBOSS/data, the folder which
includes the REBASE directory. I've extracted REBASE files using
"rebaseextract" from "withrefm" (rebase 101). The files embossre.enz,
embossre.ref and embossre.sup were placed properly into the REBASE folder
by the "rebaseextract" program.

Has anyone had the same experience? Any suggestions?

Martin Luetzelberger, Ph.D
Dept. of Medical Biochemistry and Microbiology
The Biomedical Center (BMC), BOX 582, Husargatan 3,
University of Uppsala
S-751 23 Uppsala, Sweden
Phone : +46(0)18 471-4587 or -4557
Fax   : +46(0)18 517533
Mobile: +46(0)73 0388442
E-Mail: Martin.Luetzelberger at imbim.uu.se

From zno6 at cdc.gov Wed Feb 21 16:54:18 2001
From: zno6 at cdc.gov (Sammons, Scott)
Date: Wed, 21 Feb 2001 16:54:18 -0500
Subject: Analyzing Large Sequences
Message-ID: <1537895B1173D1118D9300A0246212B607A83B55@mcdc-atl-4.cdc.gov>

Greetings:

I am using EMBOSS 1.9.1 on a Solaris 2.8 machine. I would like to analyze
large viruses, ~185 kb. I seem to be having some odd errors running the
getorf program on sequences of this size. No error messages, but getorf is
not finding orfs that I know are there.

What is the maximum size sequence that EMBOSS can handle? I looked through
the documentation but could not find info. Is the limit defined as an
integer (32767)?

Thanks,

Scott Sammons
Centers for Disease Control and Prevention
1600 Clifton Rd, NE MS G36
Atlanta, GA 30333
404-639-3560 (Voice)
404-639-1331 (FAX)
ssammons at cdc.gov

From sgmd at genetik.fu-berlin.de Thu Feb 22 04:52:48 2001
From: sgmd at genetik.fu-berlin.de (Thomas Siegmund)
Date: Thu, 22 Feb 2001 10:52:48 +0100
Subject: Analyzing Large Sequences
Message-ID: <01022210524800.10896@scarlet>

Hi Scott,

I had similar problems when I tried to translate the whole Drosophila
genome (quite a few of BACs of 300 kb or so). For me the "wimklein" program
from the SEALS package (NCBI) did the job perfectly.

With best regards

Thomas Siegmund

--
Freie Universität Berlin
Institut für Genetik
Arnimallee 7
14195 Berlin
Germany
Tel: +49 30 838 54868

From ableasby at hgmp.mrc.ac.uk Thu Feb 22 05:58:19 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Thu, 22 Feb 2001 10:58:19 GMT
Subject: Analyzing Large Sequences
Message-ID: <200102221058.KAA16969@bromine.hgmp.mrc.ac.uk>

Thomas writes in reply:

>I had similar problems when I tried to translate the whole Drosophila genome
>(quite a few of BACs of 300 kb or so).

Could I just make a plea that any perceived bugs are emailed to
emboss-bug at embnet.org . This has the advantages that:

a) All authors will be informed of any problems
b) It might help the person with the problem
c) It might prevent others from having the same problem.

Please, also report the version (if known), the platform you are using and
whenever possible test data and what you expect to happen in order to help
us reproduce any problem.

Thanks

Alan

PS: The size limit is platform dependent but should be at least 2^31 and
we are working towards making it 2^63 on platforms that can accept it.

From ableasby at hgmp.mrc.ac.uk Mon Feb 26 11:48:12 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Mon, 26 Feb 2001 16:48:12 GMT
Subject: EMBOSS 1.10.1
Message-ID: <200102261648.QAA20498@tin.hgmp.mrc.ac.uk>

This is a bugfix release for marscan, getorf and garnier.

It also contains a preliminary version of HMOMENT. This is still being
fine-tuned in collaboration with the site that requested such a program in
EMBOSS. Documentation will follow. I mention it just in case anyone asks!

The cleaned-up version of PDB mentioned in the 1.10.0 release announcement
is now available at:

ftp.uk.embnet.org
pub/databases/cpdb

The CPDBSCOP files are in:

ftp.uk.embnet.org
pub/databases/cpdbscop

These relate to the SCOPE/NRSCOPE/DOMAINER/(STAMPS) programs. More software
is being developed that uses these files.
Uncompressed cpdb is 6.7Gb and cpdbscop is 3.5Gb. They are supplied as
gzipped individual entries.

My colleague (Jon Ison) can answer any related questions on this list.

Alan

From bauer at genprofile.com Wed Feb 28 05:51:01 2001
From: bauer at genprofile.com (David Bauer)
Date: Wed, 28 Feb 2001 11:51:01 +0100
Subject: cirdna, restrict
Message-ID: <3A9CD815.72A2FC70@genprofile.com>

Hi,

I wanted to display a result from restrict of a 5 kb Plasmid with cirdna.
But it is not possible to put more than 8 Tick marks in the datafile. Up to
8 Ticks everything is OK. But with adding the 9th Tick the size of the
plotted labels is scaled down to something which is readable only under a
microscope. Lindna does not have this effect even if I take the complete
output with all single-cutters which has 49 recognition sites.

Is there really no alternative to the plplot library?

Thanks,

David.

--
Dr. David Bauer
GenProfile AG, Max-Delbrueck-Center, Erwin-Negelein-Haus
Robert-Roessle-Str. 10, D-13125 Berlin, Germany
bauer at genprofile.com, Tel:49-30-94892165, FAX:49-30-94892151

From ableasby at hgmp.mrc.ac.uk Wed Feb 28 06:05:50 2001
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 28 Feb 2001 11:05:50 GMT
Subject: cirdna, restrict
Message-ID: <200102281105.LAA29267@bromine.hgmp.mrc.ac.uk>

David:

>Is there really no alternative to the plplot library?

None that's around at the moment that I can see and is LGPL'd.
We are interviewing for two positions early next month though
and the job of one of the posts will be to write our own
graphics library. Its not that hard. I'm as surprised as you
that there's nothing else suitably licenced out there. If
anyone can suggest one/donate one we'd be delighted.

Alan

From johann at egenetics.com Wed Feb 28 07:04:53 2001
From: johann at egenetics.com (Johann Visagie)
Date: Wed, 28 Feb 2001 14:04:53 +0200
Subject: cirdna, restrict
In-Reply-To: <200102281105.LAA29267@bromine.hgmp.mrc.ac.uk>; from ableasby@hgmp.mrc.ac.uk on Wed, Feb 28, 2001 at 11:05:50AM +0000
References: <200102281105.LAA29267@bromine.hgmp.mrc.ac.uk>
Message-ID: <20010228140453.C65275@fling.sanbi.ac.za>

ableasby at hgmp.mrc.ac.uk on 2001-02-28 (Wed) at 11:05:50 +0000:
>
> >Is there really no alternative to the plplot library?
>
> None that's around at the moment that I can see and is LGPL'd.
> We are interviewing for two positions early next month though
> and the job of one of the posts will be to write our own
> graphics library. Its not that hard. I'm as surprised as you
> that there's nothing else suitably licenced out there. If
> anyone can suggest one/donate one we'd be delighted.

A cursory search reveals more plotting tools and libraries than you can
shake a stick at. Having no experience with any of them, I'm at a loss to
suggest any particular one. I'm also not familiar enough with EMBOSS'
plotting requirements to make a feature-by-feature comparison and see which
would suffice. And finally, I don't know which ones you may already have
tried and discarded. :-)

-- Johann

From David.Lapointe at umassmed.edu Wed Feb 28 09:18:06 2001
From: David.Lapointe at umassmed.edu (Lapointe, David)
Date: Wed, 28 Feb 2001 09:18:06 -0500
Subject: cirdna, restrict
Message-ID:

For what it is worth, I have had problems converting EMBOSS ps output to
PDF ( with Adobe Distiller). Prettyplot, for example, produces a great
color postscript file, but when it is converted to PDF it is all messed up
( Fonts WAY TOO LARGE). I was intending to convert the ps output to PDF in
the background for web viewing and easier printing. I guess I'd like to see
an option for PDF output in the list of graphical formats.
David > -----Original Message----- > From: ableasby at hgmp.mrc.ac.uk [mailto:ableasby at hgmp.mrc.ac.uk] > Sent: Wednesday, February 28, 2001 6:06 AM > To: emboss at hgmp.mrc.ac.uk > Subject: Re: cirdna, restrict > > > > David: > > >Is there really no alternative to the plplot library? > > None that's around at the moment that I can see and is LGPL'd. > We are interviewing for two positions early next month though > and the job of one of the posts will be to write our own > graphics library. Its not that hard. I'm as surprised as you > that there's nothing else suitably licenced out there. If > anyone can suggest one/donate one we'd be delighted. > > Alan > > From bauer at genprofile.com Wed Feb 28 09:58:31 2001 From: bauer at genprofile.com (David Bauer) Date: Wed, 28 Feb 2001 15:58:31 +0100 Subject: cirdna, restrict References: Message-ID: <3A9D1217.ADC1EC7B@genprofile.com> I'm not sure if the "fonts" in plplot postscript output are really fonts. I looked at the output from cirdna and there are only plotting commands in the file. Maybe Adobe Distiller has a problem with this. It just sees ploted lines and no fonts. What about trying to use the hpgl output format as starting point for further conversions ? (Don't know if Distiller can read this.). David.
From cupton at uvic.ca Wed Feb 7 18:00:42 2001 From: cupton at uvic.ca (Chris Upton) Date: Wed, 7 Feb 2001 10:00:42 -0800 Subject: GUIs for EMBOSS Message-ID: Hi, Where can I find info on GUIs for the EMBOSS programs. I know there was some.... but I can't seem to find it again!!!!!!!
Cheers, Chris -- Chris Upton PhD Associate Professor Biochemistry and Microbiology Phone 250-721-6507 University of Victoria PO Box 3055 Victoria Fax 250-721-8855 BC V8W 3P6, Canada http://web.uvic.ca/biochem/bioc/Faculty/upton/upton.html From david.martin at biotek.uio.no Fri Feb 9 14:15:33 2001 From: david.martin at biotek.uio.no (David Martin) Date: Fri, 9 Feb 2001 15:15:33 +0100 Subject: Feature inheritance with sequences Message-ID: One of the great benefits of some of the graphical sequence manipulation tools is that when one chops out a section of a sequence the features travel with it. What is the position with regard to such things in EMBOSS? i.e. if pasteseq is used to splice sequence a into sequence b, will the feature table travel with it and be updated as well? ..d --------------------------------------------------------------------- * Dr. David Martin Biotechnology Centre of Oslo * * Node Manager Gaustadalleen 21 * * The Norwegian EMBNet Node P.O. box 1125 Blindern * * tel +47 22 84 05 35 N-0317 Oslo * * fax +47 22 84 05 01 Norway * --------------------------------------------------------------------- I will be leaving the Norwegian EMBnet node on 23rd February. All work related mail should be addressed to admin at embnet.uio.no where my successor, Rune Groven will deal with it. All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence it will be automatically forwarded to me. Spam should continue to be sent to /dev/null From bauer at genprofile.com Fri Feb 9 14:32:37 2001 From: bauer at genprofile.com (David Bauer) Date: Fri, 09 Feb 2001 15:32:37 +0100 Subject: Feature inheritance with sequences References: Message-ID: <3A83FF85.610F70CD@genprofile.com> David Martin wrote: > ie if pasteseq is used to splice sequence a into sequence b, will the > feature table travel with it and be updated as well? This would be wonderful! David. -- Dr. David Bauer GenProfile AG, Max-Delbrueck-Center, Erwin-Negelein-Haus Robert-Roessle-Str.
10, D-13125 Berlin, Germany bauer at genprofile.com, Tel:49-30-94892165, FAX:49-30-94892151 From Peter.Rice at uk.lionbioscience.com Fri Feb 9 14:58:29 2001 From: Peter.Rice at uk.lionbioscience.com (Peter Rice) Date: Fri, 09 Feb 2001 14:58:29 +0000 Subject: Feature inheritance with sequences References: Message-ID: <3A840595.DF08258@lionbio.co.uk> David Martin wrote: > > One of the great benefits of some of the graphical sequence manipulation > tools is that when one chops out a section of a sequence the features > travel with it. > What is the position with regard to such things in EMBOSS? > > ie if pasteseq is used to splice sequence a into sequence b, will the > feature table travel with it and be updated as well? An interesting concept. In principle, yes. Features in EMBOSS follow the GFF model, so they are stored as groups of start and end positions. Assuming that they are not one of the more exotic kinds of start and end positions, it is easy to 'insert' one feature table into another. Deletion is also possible, although ends of features within the deleted region rules out making the process completely reversible. I have seen it all before - in the EBI's NEWFEATURES editor when mutating the 'replace' qualifiers. -- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From david.martin at biotek.uio.no Fri Feb 9 15:14:29 2001 From: david.martin at biotek.uio.no (David Martin) Date: Fri, 9 Feb 2001 16:14:29 +0100 Subject: Feature inheritance with sequences In-Reply-To: <3A840595.DF08258@lionbio.co.uk> Message-ID: On Fri, 9 Feb 2001, Peter Rice wrote: > David Martin wrote: > > > > One of the great benefits of some of the graphical sequence manipulation > > tools is that when one chops out a section of a sequence the features > > travel with it. > > What is the position with regard to such things in EMBOSS? 
> > > > ie if pasteseq is used to splice sequence a into sequence b, will the > > feature table travel with it and be updated as well? > > An interesting concept. > > In principle, yes. Features in EMBOSS follow the GFF model, so they are stored > as groups of start and end positions. Assuming that they are not one of the > more exotic kinds of start and end positions, it is easy to 'insert' one > feature table into another. Deletion is also possible, although ends of > features within the deleted region rules out making the process completely > reversible. > > I have seen it all before - in the EBI's NEWFEATURES editor when mutating the > 'replace' qualifiers. But does it actually work in practice? If one creates a new sequence from a and b, do the features travel with it? (I have seen very little documentation on using features.. anyone care to do a tutorial online via Wiki/IRC at some point as to how features work in EMBOSS?) Will EMBOSS write out features as standard in EMBL/Genbank format files or are we stuck in having extra GFF files floating about.. It would be good to have a tutorial on features. ..d --------------------------------------------------------------------- * Dr. David Martin Biotechnology Centre of Oslo * * Node Manager Gaustadalleen 21 * * The Norwegian EMBNet Node P.O. box 1125 Blindern * * tel +47 22 84 05 35 N-0317 Oslo * * fax +47 22 84 05 01 Norway * --------------------------------------------------------------------- I will be leaving the Norwegian EMBnet node on 23rd February. All work related mail should be addressed to admin at embnet.uio.no where my successor, Rune Groven will deal with it. All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence it will be automatically forwarded to me. 
Spam should continue to be sent to /dev/null From Peter.Rice at uk.lionbioscience.com Fri Feb 9 15:20:42 2001 From: Peter.Rice at uk.lionbioscience.com (Peter Rice) Date: Fri, 09 Feb 2001 15:20:42 +0000 Subject: Feature inheritance with sequences References: Message-ID: <3A840ACA.3F25E333@lionbio.co.uk> David Martin wrote: > > On Fri, 9 Feb 2001, Peter Rice wrote: > > > In principle, yes. Features in EMBOSS follow the GFF model, so they are stored > > as groups of start and end positions. Assuming that they are not one of the > > more exotic kinds of start and end positions, it is easy to 'insert' one > > feature table into another. Deletion is also possible, although ends of > > features within the deleted region rules out making the process completely > > reversible. > > But does it actually work in practice? If one creates a new sequence from > a and b, do the features travel with it? Depends on whether the sequences have features. At present, you can turn on features for an input sequence, but most applications don't use it (wisely, as I am rewriting it heavily). > Will EMBOSS write out features as standard in EMBL/Genbank format files or > are we stuck in having extra GFF files floating about.. They are stored internally as 'GFF with hints to EMBL feature locations' but can be written out in any format - including formats that have not yet been invented. Some of those formats could even be read back in again. Others will be standard forms of application output. Yes, that does include EMBL/GenBank/Swissprot entries with feature tables. > It would be good to have a tutorial on features. I will need to write one for Alan anyway once the updates are done. 
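To make the coordinate bookkeeping discussed in this thread concrete, here is a minimal Python sketch of what "inserting" one feature table into another could look like when features are plain start/end pairs. It is an illustration only, not how pasteseq or the EMBOSS feature code works: the Feature type, the paste_features function, and the choice to stretch a host feature that spans the insertion point are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature record: a name plus 1-based inclusive positions."""
    name: str
    start: int
    end: int


def paste_features(host: List[Feature], insert: List[Feature],
                   pos: int, insert_len: int) -> List[Feature]:
    """Merge two feature tables when a sequence of length insert_len is
    pasted into the host sequence immediately after host position pos."""
    merged = []
    for f in host:
        if f.end <= pos:
            # Entirely upstream of the insertion point: unchanged.
            merged.append(f)
        elif f.start > pos:
            # Entirely downstream: shift by the length of the insert.
            merged.append(f._replace(start=f.start + insert_len,
                                     end=f.end + insert_len))
        else:
            # Spans the insertion point: stretch it (an assumption; a real
            # tool might instead split or flag such a feature).
            merged.append(f._replace(end=f.end + insert_len))
    for f in insert:
        # Inserted features move into host coordinates.
        merged.append(f._replace(start=f.start + pos, end=f.end + pos))
    return sorted(merged, key=lambda f: f.start)


if __name__ == "__main__":
    host = [Feature("promoter", 1, 50), Feature("CDS", 100, 400)]
    tag = [Feature("tag", 1, 30)]
    for feature in paste_features(host, tag, pos=60, insert_len=30):
        print(feature)
```

Deletion would be the mirror image, except that a feature whose start or end falls inside the deleted region loses that position, which is why the process cannot be made fully reversible.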
-- ------------------------------------------------ Peter Rice, LION Bioscience Ltd, Cambridge, UK peter.rice at uk.lionbioscience.com +44 1223 224723 From david.martin at biotek.uio.no Tue Feb 13 14:27:24 2001 From: david.martin at biotek.uio.no (David Martin) Date: Tue, 13 Feb 2001 15:27:24 +0100 Subject: DBIFASTA on univec Message-ID: Univec has a nasty variant of GenBank header lines that is currently not parseable with DBIFASTA. The syntax is: >gnl|uv|X66730.1:1-2687-49 B.bronchiseptica plasmid pBBR1 I presume all that is needed is to add a new 'else if' statement in getExpr to cope? ..d --------------------------------------------------------------------- * Dr. David Martin Biotechnology Centre of Oslo * * Node Manager Gaustadalleen 21 * * The Norwegian EMBNet Node P.O. box 1125 Blindern * * tel +47 22 84 05 35 N-0317 Oslo * * fax +47 22 84 05 01 Norway * --------------------------------------------------------------------- I will be leaving the Norwegian EMBnet node on 23rd February. All work related mail should be addressed to admin at embnet.uio.no where my successor, Rune Groven will deal with it. All personal email should be sent to dmartin at hgmp.mrc.ac.uk from whence it will be automatically forwarded to me. Spam should continue to be sent to /dev/null From starksb at ebi.ac.uk Thu Feb 15 11:33:55 2001 From: starksb at ebi.ac.uk (David Starks-Browning) Date: Thu, 15 Feb 2001 11:33:55 +0000 Subject: problem with configure and --with-pngdriver Message-ID: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk> Greetings, I've just built EMBOSS-1.9.1 on SGI IRIX with gcc and would like to report a problem I had. We have PNG (and other supporting) libraries installed in /sw/arch. I invoke configure with the following environment variables set: CPPFLAGS = -I/sw/arch/include LDFLAGS = -L/sw/arch/lib Then any attempt by configure to find png-related features should succeed, without specifying --with-pngdriver=/sw/arch.
Unfortunately, this fails on SGI IRIX, because of the following logic in configure:

> # Check whether --with-pngdriver or --without-pngdriver was given.
> if test "${with_pngdriver+set}" = set; then
>   withval="$with_pngdriver"
>   if test "$withval" != no ; then
>     echo "$ac_t""yes" 1>&6
>     ALT_HOME="$withval"
>   else
>     echo "$ac_t""no" 1>&6
>   fi
> else
>   echo "$ac_t""yes" 1>&6
>   ALT_HOME=/usr/local
>   if test ! -f "${ALT_HOME}/include/zlib.h"
>   then
>     ALT_HOME=/usr
>   fi
> fi

(I reformatted this for readability.)

With this logic (and some additional code that follows it), if with-pngdriver is not set, and zlib.h is not found in /usr/local/include, then "-L/usr/lib" is added to the compile command when testing for png components.

I think this is a mistake. I can't think of why one would explicitly add -L/usr/lib (or *any* system library path, for that matter) to the compile command line. The compiler should handle that path search automatically. On IRIX, for example, this overrides the compiler's effort to use appropriate system library paths, and forces ld to use o32-ABI libraries. On any modern SGI system, or with gcc, this is likely to fail, because all the other objects are likely to be n32 or n64.

I do not have much experience with autoconf, but I am willing to attempt a rewrite of the CHECK_PNGDRIVER in aclocal.m4 if that would be useful to others. Ian wrote this originally, but I understand he's busy with other things now. (Is that correct?)

Thanks for your help.
David ------------------------------------------------------------------- David Starks-Browning | starksb at ebi.ac.uk EMBL Outstation -- | The European Bioinformatics Institute | Wellcome Trust Genome Campus | tel: +44 (1223) 494 616 Hinxton, Cambridge, CB10 1SD, UK | fax: +44 (1223) 494 468 ------------------------------------------------------------------- From starksb at ebi.ac.uk Thu Feb 15 13:37:36 2001 From: starksb at ebi.ac.uk (David Starks-Browning) Date: Thu, 15 Feb 2001 13:37:36 +0000 Subject: more problems with configure and --with-pngdriver In-Reply-To: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk> References: <4793-Thu15Feb2001113355+0000-starksb@ebi.ac.uk> Message-ID: <8409-Thu15Feb2001133736+0000-starksb@ebi.ac.uk> Greetings, Continuing with --with-pngdriver: We went to some effort to build shared versions of libgd.so for use by the perl module GD.pm. This is not the default, so perhaps few people have done this. When configure checks for png support, it tries to compile conftest.c, which contains a call to gdImageCreateFromPng(), by linking against -lgd. If this is successful, hoorah. If it fails, then configure advises you that you must upgrade your gd library. I find that compilation of conftest succeeds with libgd.so on Compaq, but fails on Linux and Solaris, with unresolved references to jpeg and Xpm routines. (We have IRIX too, but only libgd.a on that, and it succeeds.) If I compile manually, the Linux and Solaris compiles do succeed if I use the static library libgd.a. It could be that I built libgd.so correctly on Compaq but incorrectly on Linux and Solaris. So I ask, does anyone else have this problem? If it's not simply an error on my part, then perhaps configure needs to be told how to deal with this case. Again, if there is interest (i.e., it's not just me affected), then I could attempt to fix this in the CHECK_PNGDRIVERS macro. 
Regards, David ------------------------------------------------------------------- David Starks-Browning | starksb at ebi.ac.uk EMBL Outstation -- | The European Bioinformatics Institute | Wellcome Trust Genome Campus | tel: +44 (1223) 494 616 Hinxton, Cambridge, CB10 1SD, UK | fax: +44 (1223) 494 468 ------------------------------------------------------------------- From heme at postmark.net Wed Feb 14 12:31:42 2001 From: heme at postmark.net (Per Johansson) Date: Wed, 14 Feb 2001 12:31:42 +0000 Subject: EMBOSS:fuzznuc Message-ID: <20010214123143.31759.qmail@venus.postmark.net> Hi, Is it possible to get fuzznuc (or some other pattern search program) to read a file with several DNA patterns instead of just a single pattern? Sofia From ableasby at hgmp.mrc.ac.uk Sun Feb 18 14:45:52 2001 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Sun, 18 Feb 2001 14:45:52 GMT Subject: EMBOSS:fuzznuc Message-ID: <200102181445.OAA00388@tin.hgmp.mrc.ac.uk> Dear Sofia, Getting multiple pattern searches into the fuzzies is high on the list of things to do. Alan From ableasby at hgmp.mrc.ac.uk Sun Feb 18 16:07:32 2001 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Sun, 18 Feb 2001 16:07:32 GMT Subject: EMBOSS 1.10.0 Message-ID: <200102181607.QAA17711@bromine.hgmp.mrc.ac.uk> EMBOSS 1.10.0 This release contains several new applications, some which are still under active development. We hope to provide some of the data files referred to on our ftp server soon. MARSCAN Matrix/scaffold attachment regions (MARs/SARs) are genomic elements thought to delineate the structural and functional organisation of the eukaryotic genome. Originally, MARs and SARs were identified through their ability to bind to the nuclear matrix or scaffold. Binding cannot be assigned to a unique sequence element, but is dispersed over a region of several hundred base pairs. 
These elements are found flanking a gene or a small cluster of genes and are located often in the vicinity of cis-regulatory sequences. This has led to the suggestion that they contribute to higher order regulation of transcription by defining boundaries of independently controlled chromatin domains. There is indirect evidence to support this notion. In transgenic experiments MARs/SARs dampen position effects by shielding the transgene from the effects of the chromatin structure at the site of integration. Furthermore, they may act as boundary elements for enhancers, restricting their long range effect to only the promoters that are located in the same chromatin domain. marscan finds a bipartite sequence element that is unique for a large group of eukaryotic MARs/SARs. This MAR/SAR recognition signature (MRS) comprises two individual sequence elements that are <200 bp apart and may be aligned on positioned nucleosomes in MARs. The MRS can be used to correctly predict the position of MARs/SARs in plants and animals, based on genomic DNA sequence information alone. Experimental evidence from the analysis of >300 kb of sequence data from several eukaryotic organisms show that wherever a MRS is observed in the DNA sequence, the corresponding genomic fragment is a biochemically identifiable SAR. The MRS is a bipartite sequence element that consists of two individual sequences of 8 (AATAAYAA) and 16 bp (AWWRTAANNWWGNNNC) within a 200 bp distance from each other. One mismatch is allowed in the 16 bp pattern. The patterns can occur on either strand of the DNA with respect to each other. Not all SARs contain a MRS. Analysis of >300 kb of genomic sequence from a variety of eukaryotic organisms shows that the MRS faithfully predicts 80% of MARs and SARs, suggesting that at least one other type of MAR/SAR may exist which does not contain a MRS. 
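The MRS rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (two IUPAC patterns, one mismatch allowed in the 16 bp pattern, either strand, within 200 bp), not marscan's actual implementation. The function names are made up for the example, and measuring the 200 bp distance between the start positions of the two hits is an assumption, since the description does not say how the distance is measured.

```python
# IUPAC codes that appear in the two MRS patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "W": "AT", "R": "AG", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "Y": "R", "R": "Y", "W": "W", "N": "N"}

PATTERN_8 = "AATAAYAA"
PATTERN_16 = "AWWRTAANNWWGNNNC"


def revcomp(pattern):
    """Reverse complement of an IUPAC pattern."""
    return "".join(COMPLEMENT[c] for c in reversed(pattern))


def hits(seq, pattern, max_mismatch):
    """0-based start positions where the pattern, or its reverse
    complement, matches seq with at most max_mismatch mismatches."""
    found = set()
    for pat in (pattern, revcomp(pattern)):
        for i in range(len(seq) - len(pat) + 1):
            mismatches = sum(seq[i + j] not in IUPAC[p]
                             for j, p in enumerate(pat))
            if mismatches <= max_mismatch:
                found.add(i)
    return sorted(found)


def find_mrs(seq, max_gap=200):
    """Pairs (start of 8 bp hit, start of 16 bp hit) whose start
    positions lie within max_gap bases of each other (assumed metric)."""
    seq = seq.upper()
    hits_8 = hits(seq, PATTERN_8, 0)    # exact match required
    hits_16 = hits(seq, PATTERN_16, 1)  # one mismatch allowed
    return [(a, b) for a in hits_8 for b in hits_16
            if abs(a - b) <= max_gap]


if __name__ == "__main__":
    demo = "AATAACAA" + "G" * 50 + "ATTATAACCATGCCCC"
    print(find_mrs(demo))
```

The brute-force scan is quadratic in the worst case, which is fine for a demonstration but not for whole genomes.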
SCOPE scope parses the scop classification file available at http://scop.mrc-lmb.cam.ac.uk/scop/search.cgi?dir=lin and writes the scop classification to an embl-like format file. This file (Escop.dat) should be placed in the emboss/data directory. NRSCOPE nrscope parses the embl-like format scop classification file generated by the EMBOSS application scope, and writes in the same format a file of non-redundant domains. The format of these files is explained in the scope documentation. The current version of nrscope removes redundancy at the level of the scop family, i.e. entries belonging to the same family will be non-redundant. DOMAINER domainer parses an embl-like format scop classification file generated by the EMBOSS applications scope or nrscope, and clean protein coordinate files generated by the coorde application (not currently in emboss, email Jon Ison jison at hgmp.mrc.ac.uk) and writes, for each domain in the scop classification file, clean domain coordinate files in embl-like and pdb formats . Each of these files contains coordinates for a single scop domain. STAMPS (under development) stamps parses an embl-like format scop classification file generated by the EMBOSS applications scope or nrscope, and calls stamp to generate structural alignments for each SCOP family. It is still under active development. You have to "make stamp" in the applications directory to create "stamps". Developers Notes 1. Most C datatypes have changed in the libraries. This is a prelude to getting true 64 bit operation. Notably ints are now "ajint"s and longs are now "ajlong"s. An ajint can be equal in size to an ajlong depending on the hardware; however, an ajlong should be used whenever a 64 bit int might be used. 2. The function ajFmtScanS has been added. This can be regarded as the EMBOSS version of the C function sscanf and operates similarly. It has several extensions, particularly %S is used for dynamically allocated string objects (AjPStr). 
This function makes reading data files considerably easier and many applications will be rewritten to use it rather than having to rely on tokenisation. As usual I've probably forgotten to mention some things and my colleagues will no doubt correct any oversights. Alan From oddmund.nordgard at biokjemi.uio.no Mon Feb 19 11:07:14 2001 From: oddmund.nordgard at biokjemi.uio.no (Oddmund Nordgård) Date: Mon, 19 Feb 2001 12:07:14 +0100 (CET) Subject: Prettyplot bug? In-Reply-To: Message-ID: Hello! The last box in a prettyplot with boxes lacks the right line. EMBOSS 1.9.1 Of less importance: The margin between the first letters on each line and the left box-line is smaller than the other margins. Oddmund -- ******************************************* Oddmund Nordgård Ph.D. student Institute of Biochemistry Box 1041 Blindern 0316 OSLO NORWAY Phone: 22 85 66 99 Fax: 22 85 44 43 Email: oddmundn at biokjemi.uio.no Private: Kalbakkv. 21 0953 OSLO Phone: 22 25 23 93 ******************************************** Powered by Linux! http://www.linuxnorge.com From luetzel at imbim.uu.se Wed Feb 21 10:21:23 2001 From: luetzel at imbim.uu.se (Martin Luetzelberger) Date: Wed, 21 Feb 2001 11:21:23 +0100 (MET) Subject: REBASE bug? Message-ID: Hi all, I'm having trouble with EMBOSS 1.10.0 running on SuSE Linux 7.0. Compilation and installation of the programs was fine -- no problems at all. However, all of the programs for restriction mapping like "redata", "restrict" and "remap" crash with a segmentation fault. I've set EMBOSS_DATA to /usr/local/share/EMBOSS/data, the folder which includes the REBASE directory. I've extracted REBASE files using "rebaseextract" from "withrefm" (rebase 101). The files embossre.enz, embossre.ref and embossre.sup were placed properly into the REBASE folder by the "rebaseextract" program. Has anyone had the same experience? Any suggestions? Martin Luetzelberger, Ph.D Dept.
of Medical Biochemistry and Microbiology The Biomedical Center (BMC), BOX 582, Husargatan 3, University of Uppsala S-751 23 Uppsala, Sweden Phone : +46(0)18 471-4587 or -4557 Fax : +46(0)18 517533 Mobile: +46(0)73 0388442 E-Mail: Martin.Luetzelberger at imbim.uu.se From zno6 at cdc.gov Wed Feb 21 21:54:18 2001 From: zno6 at cdc.gov (Sammons, Scott) Date: Wed, 21 Feb 2001 16:54:18 -0500 Subject: Analyzing Large Sequences Message-ID: <1537895B1173D1118D9300A0246212B607A83B55@mcdc-atl-4.cdc.gov> Greetings: I am using EMBOSS 1.9.1 on a Solaris 2.8 machine. I would like to analyze large viruses, ~185 kb. I seem to be having some odd errors running the getorf program on sequences of this size. No error messages, but getorf is not finding ORFs that I know are there. What is the maximum size sequence that EMBOSS can handle? I looked through the documentation but could not find info. Is the limit defined as an integer (32767)? Thanks, Scott Sammons Centers for Disease Control and Prevention 1600 Clifton Rd, NE MS G36 Atlanta, GA 30333 404-639-3560 (Voice) 404-639-1331 (FAX) ssammons at cdc.gov From sgmd at genetik.fu-berlin.de Thu Feb 22 09:52:48 2001 From: sgmd at genetik.fu-berlin.de (Thomas Siegmund) Date: Thu, 22 Feb 2001 10:52:48 +0100 Subject: Analyzing Large Sequences Message-ID: <01022210524800.10896@scarlet> Hi Scott, I had similar problems when I tried to translate the whole Drosophila genome (quite a few of BACs of 300 kb or so). For me the "wimklein" program from the SEALS package (NCBI) did the job perfectly.
With best regards Thomas Siegmund -- Freie Universität Berlin Institut für Genetik Arnimallee 7 14195 Berlin Germany Tel: +49 30 838 54868 From ableasby at hgmp.mrc.ac.uk Thu Feb 22 10:58:19 2001 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Thu, 22 Feb 2001 10:58:19 GMT Subject: Analyzing Large Sequences Message-ID: <200102221058.KAA16969@bromine.hgmp.mrc.ac.uk> Thomas writes in reply: >I had similar problems when I tried to translate the whole Drosophila genome >(quite a few of BACs of 300 kb or so). Could I just make a plea that any perceived bugs are emailed to emboss-bug at embnet.org. This has the advantages that: a) All authors will be informed of any problems b) It might help the person with the problem c) It might prevent others from having the same problem. Please also report the version (if known), the platform you are using and whenever possible test data and what you expect to happen in order to help us reproduce any problem. Thanks Alan PS: The size limit is platform dependent but should be at least 2^31 and we are working towards making it 2^63 on platforms that can accept it. From ableasby at hgmp.mrc.ac.uk Mon Feb 26 16:48:12 2001 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Mon, 26 Feb 2001 16:48:12 GMT Subject: EMBOSS 1.10.1 Message-ID: <200102261648.QAA20498@tin.hgmp.mrc.ac.uk> This is a bugfix release for marscan, getorf and garnier. It also contains a preliminary version of HMOMENT. This is still being fine-tuned in collaboration with the site that requested such a program in EMBOSS. Documentation will follow. I mention it just in case anyone asks! The cleaned-up version of PDB mentioned in the 1.10.0 release announcement is now available at: ftp.uk.embnet.org pub/databases/cpdb The CPDBSCOP files are in: ftp.uk.embnet.org pub/databases/cpdbscop These relate to the SCOPE/NRSCOPE/DOMAINER/(STAMPS) programs. More software is being developed that uses these files.
Uncompressed cpdb is 6.7Gb and cpdbscop is 3.5Gb. They are supplied as gzipped individual entries. My colleague (Jon Ison) can answer any related questions on this list. Alan From bauer at genprofile.com Wed Feb 28 10:51:01 2001 From: bauer at genprofile.com (David Bauer) Date: Wed, 28 Feb 2001 11:51:01 +0100 Subject: cirdna, restrict Message-ID: <3A9CD815.72A2FC70@genprofile.com> Hi, I wanted to display a result from restrict of a 5 kb Plasmid with cirdna. But it is not possible to put more than 8 Tick marks in the datafile. Up to 8 Ticks everything is OK. But with adding the 9th Tick the size of the plotted labels is scaled down to something which is readable only under a microscope. Lindna does not have this effect even if I take the complete output with all single-cutters which has 49 recognition sites. Is there really no alternative to the plplot library? Thanks, David. -- Dr. David Bauer GenProfile AG, Max-Delbrueck-Center, Erwin-Negelein-Haus Robert-Roessle-Str. 10, D-13125 Berlin, Germany bauer at genprofile.com, Tel:49-30-94892165, FAX:49-30-94892151 From ableasby at hgmp.mrc.ac.uk Wed Feb 28 11:05:50 2001 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Wed, 28 Feb 2001 11:05:50 GMT Subject: cirdna, restrict Message-ID: <200102281105.LAA29267@bromine.hgmp.mrc.ac.uk> David: >Is there really no alternative to the plplot library? None that's around at the moment that I can see and is LGPL'd. We are interviewing for two positions early next month though and the job of one of the posts will be to write our own graphics library. Its not that hard. I'm as surprised as you that there's nothing else suitably licenced out there. If anyone can suggest one/donate one we'd be delighted.
Alan