From gbottu at ben.vub.ac.be  Mon Jul 10 11:16:35 2006
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Mon, 10 Jul 2006 17:16:35 +0200
Subject: [emboss-dev] Question about sequence input-output - Checked by AntiVir DEMO version
Message-ID: <20060710151635.GA29420@bigben.ulb.ac.be>

Dear developers,

I just installed the patched versions of ajseqread.c and ajseqwrite.c in
order to prepare for the new release of EMBL. I took the occasion to take
a look at the supported formats for input and output and was a little
surprised:

- For input, the names "embl" and "em" support both the old and the new
format. For writing, "embl", "em" and "emblold" write in old format and
"emblnew" writes in new format. Is that not a little bit confusing?

- Worse is the following: for reading, "ncbi" and "fasta" read NCBI fastA
format, while "pearson" reads standard fastA format. For writing, "ncbi"
writes NCBI fastA format and "pearson" and "fasta" write standard fastA
format. I remember that a colleague of mine had trouble because of this.
Should "fasta" not consistently have the same meaning, and then preferably
be synonymous with "pearson"?

- I also noted that jackkinfernon, nexusnon and meganon are only
supported for writing, not for reading, even though they are mentioned as
input formats in
http://emboss.sourceforge.net/docs/themes/SequenceFormats.html

Comments?

Guy Bottu, BEN

From pmr at ebi.ac.uk  Mon Jul 10 13:24:36 2006
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 10 Jul 2006 18:24:36 +0100
Subject: [emboss-dev] Question about sequence input-output - Checked by AntiVir DEMO version
In-Reply-To: <20060710151635.GA29420@bigben.ulb.ac.be>
References: <20060710151635.GA29420@bigben.ulb.ac.be>
Message-ID: <44B28D54.7020102@ebi.ac.uk>

Dear Guy,

Guy Bottu wrote:
> Dear developers,
>
> I just installed the patched versions of ajseqread.c and ajseqwrite.c in
> order to prepare for the new release of EMBL. I took the occasion to take
> a look at the supported formats for input and output and was a little
> surprised:

We are very close to the release (Saturday 15th), so this is the current
status and it is unlikely to change.

> - For input, the names "embl" and "em" support both the old and the new
> format. For writing, "embl", "em" and "emblold" write in old format and
> "emblnew" writes in new format. Is that not a little bit confusing?

We decided it would be more confusing to change "embl" output. There is a
new "emblnew" output for anyone who wants the new format. If it is popular,
we may make it the default for "embl" output in a future release.

> - Worse is the following: for reading, "ncbi" and "fasta" read NCBI fastA
> format, while "pearson" reads standard fastA format. For writing, "ncbi"
> writes NCBI fastA format and "pearson" and "fasta" write standard fastA
> format. I remember that a colleague of mine had trouble because of this.
> Should "fasta" not consistently have the same meaning, and then preferably
> be synonymous with "pearson"?

"pearson" is only there for a few cases where we have to preserve the exact
ID. "fasta" interprets the ID and can handle the strange NCBI style files.

We need separate output format names because the IDs are so different.
There are a huge number of possible variants of "fasta" format. We
certainly need to write "ncbi" for some users and "fasta" for others.

> - I also noted that jackkinfernon, nexusnon and meganon are only
> supported for writing, not for reading, even though they are mentioned as
> input formats in
> http://emboss.sourceforge.net/docs/themes/SequenceFormats.html

They should not be listed as input formats. If anyone does need them we can
add them in future (so far, nobody has asked for them!)

Hope that helps,

Peter

From henrikki.almusa at helsinki.fi  Mon Jul 10 14:09:13 2006
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Mon, 10 Jul 2006 21:09:13 +0300
Subject: [emboss-dev] makenucseq, error with completely random sequence?
Message-ID: <44B297C9.306@helsinki.fi>

Hello,

I went through the code that I submitted and wondered if there is in fact
an error in it :). When creating a completely random sequence we use the
following string to make the random sequence:

    char seqCharNucPure[] = "ACGTUacgtu";

This is then converted into a list, and a random element is taken from it
for each position of the sequence. However, later this is done:

    ajStrExchangeSetCC(&seqstr,"u","t")

which replaces all the occurrences of 'u' with 't'. Doesn't this mean that
't' is now overrepresented?

I used seqCharNucPure there hoping that at some point that structure might
be available without copying and pasting the code. However, should the
program be changed to use a slightly different version of the string (which
could then, in that later situation, be added to ajseqtype.c)?

    char seqCharNucDnaPure[] = "ACGTacgt";

So something like this (untested)?

--- makenucseq.c	2006-07-10 05:49:55.000000000 +0000
+++ makenucseq.c.fix	2006-07-10 11:24:53.000000000 +0000
@@ -125,7 +125,6 @@
     ajStrFmtLower(&seqstr);
     seq = ajSeqNew();
-    ajStrExchangeSetCC(&seqstr,"u","t");
     if (extra < 0)
         ajStrCutStart(&seqstr,extra);
     ajSeqAssignSeqS(seq, seqstr);
@@ -184,12 +183,12 @@
     int i;
     int max;
     char *chars;
-    char seqCharNucPure[] = "ACGTUacgtu";
-    int seqCharNucPureLength = 10;
+    char seqCharNucDnaPure[] = "ACGTacgt";
+    int seqCharNucDnaPureLength = 8;
     AjPStr tmp;
-    chars = seqCharNucPure;
-    max = seqCharNucPureLength;
+    chars = seqCharNucDnaPure;
+    max = seqCharNucDnaPureLength;
     for (i = 0; i < max; i++) {

Regards,
-- 
Henrikki Almusa

From ajb at ebi.ac.uk  Sat Jul 15 05:42:53 2006
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Sat, 15 Jul 2006 10:42:53 +0100 (BST)
Subject: [emboss-dev] EMBOSS 4.0.0 released
Message-ID: <42406.81.98.244.247.1152956573.squirrel@webmail.ebi.ac.uk>

EMBOSS-4.0.0.tar.gz is now available.

It can be downloaded from the directory:
ftp://emboss.open-bio.org/pub/EMBOSS/
or via anonymous ftp to emboss.open-bio.org in the pub/EMBOSS directory.

As usual, more complete information is in the ChangeLog file:
here are some highlights.

- new prompt style generated from 'knowntypes'
- new -help format provides more information
- new sequence access method 'dbfetch' uses the EBI's REST services
- new sequence access method 'mrs' uses CMBI's "Maarten's Retrieval System."
- new program backtranambig
- new program makenucseq
- new program makeprotseq
- 'embl' format will read both the old EMBL format and the new one.
  A new output format 'emblnew' can be used to write the new format.
- new 'swissnew' database format
- lists of prosite patterns can now be used by fuzznuc, fuzzpro & fuzztran.
  Pattern lists can be specified using the @filename syntax.
  New options added specifically for pattern lists.
  Pattern lists have a new ACD definition type.
- lists of regular expressions can now be used by dreg & preg
- Use of GFF for proteins now allowed
- prophet now uses an 'align' output type
- iep allows the specification of the number of uncharged lysines
  and intra-chain disulphide bridges
- splitter/union allow nucleotide features to be preserved
- digest has a ragging capability
- coderet writes any permutation of cds, mrna & protein to separate files
- mincount option added to wordcount
- biosed modified to allow the specification of sequence mutation position
- wossname can now search for phrases
- new sequence type 'gapstopprotein'
- sequence reading from website URLs now defaults to HTTP 1.1
- new 'keywords' attribute in ACD files
- many minor additions, bugfixes and placeholders for future capabilities

EMBASSY packages

VIENNA has been added. This is a port of the Vienna RNA package by
Ivo Hofacker. It is to be regarded as an alpha test. We are investigating
the incorporation of the Vienna sequence format into the main libraries.
This would lead to simplification of the interface for future releases.

HMMER: this package is now a wrapper written around HMMER 2.3.2.
You must therefore install HMMER 2.3.2 from the http://hmmer.wustl.edu/
site and add it to your PATH.

MEMENEW: this package is now a wrapper. You must therefore install
MEME/MAST from the http://meme.sdsc.edu/meme/meme-download.html
site and add it to your PATH.

MYEMBOSS: This package enables developers to write their own applications
using the standard EMBOSS distribution.

PHYLIPNEW: This is now out of EMBOSS beta testing. It is PHYLIP version 3.6b.

Microsoft Windows

An alpha test version of EMBOSS for Microsoft Windows is available from
the 'windows' directory of the ftp server above. This port was done
using Andre Blavier's EMBOSSWIN package as a starting point and we
thank him for his work. The EMBOSS programs can be run from a DOS Command
window. There is currently no GUI though we hope that some may spring up
from the community.
If, when trying to run applications, you get "DLL missing" errors then
you will need to install the vcredist_x86.exe package for Visual C++ 2005
from the Microsoft web site. This small executable does not install a
compiler, only the required runtime DLLs.

Developers

Developers should note that we are in the process of standardising library
function names. Old function names will still work but will print out a
'deprecated' message when compiled using the GCC compiler. We therefore
recommend use of this compiler for developers as an aid to updating
their source code.

Happy St Swithun's (Swithin's) Day.

Alan
EBI
15th July 2006

From javierluiso at gmail.com  Sat Jul 22 21:33:22 2006
From: javierluiso at gmail.com (Javier Luiso)
Date: Sat, 22 Jul 2006 22:33:22 -0300
Subject: [emboss-dev] New application - dottie
Message-ID: <61d930160607221833pa0d622fna5166bbdac1724c8@mail.gmail.com>

Hi, I read the table of "suggested new applications for EMBOSS"
(http://emboss.sourceforge.net/apps/proposed.html)
and I'm interested in the app. dottie.
I work in the Computer Graphics and Visualization area, not in biology, so I
need more specific information about the features dottie must include.

Thanks.

From javierluiso at gmail.com  Mon Jul 24 09:37:41 2006
From: javierluiso at gmail.com (Javier Luiso)
Date: Mon, 24 Jul 2006 10:37:41 -0300
Subject: [emboss-dev] New application - dottie
In-Reply-To: <61d930160607221833pa0d622fna5166bbdac1724c8@mail.gmail.com>
References: <61d930160607221833pa0d622fna5166bbdac1724c8@mail.gmail.com>
Message-ID: <61d930160607240637kae5d056q55e23e0f983bd8fe@mail.gmail.com>

Hi, I read the table of "suggested new applications for EMBOSS"
(http://emboss.sourceforge.net/apps/proposed.html)
and I'm interested in the app. dottie.
I work in the Computer Graphics and Visualization area, not in biology, so I
need more specific information about the features dottie must include.

Thanks.

From cupton at uvic.ca  Mon Jul 24 13:16:42 2006
From: cupton at uvic.ca (Chris Upton)
Date: Mon, 24 Jul 2006 10:16:42 -0700
Subject: [emboss-dev] New application - dottie
In-Reply-To: <61d930160607240637kae5d056q55e23e0f983bd8fe@mail.gmail.com>
References: <61d930160607221833pa0d622fna5166bbdac1724c8@mail.gmail.com>
 <61d930160607240637kae5d056q55e23e0f983bd8fe@mail.gmail.com>
Message-ID:

Hi,

We have built a Java interface to Dotter called JDOTTER (see links).
This allows use of a menu for file input, connection to a database of
precalculated plots (if required), display of annotations, and
multiplatform use. It keeps the zoom functions of Dotter.

http://athena.bioc.uvic.ca/workbench.php?tool=jdotter&db=

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=14734323

Let me know if you want more info.

Chris Upton

On Jul 24, 2006, at 6:37 AM, Javier Luiso wrote:

> Hi, I read the table of "suggested new applications for EMBOSS"
> (http://emboss.sourceforge.net/apps/proposed.html)
> and I'm interested in the app. dottie.
> I work in the Computer Graphics and Visualization area, not in biology,
> so I
> need more specific information about the features dottie must include.
>
> Thanks.
>

Chris Upton Ph.D.
Associate Professor
Biochemistry and Microbiology      Tel. 250-721-6507
University of Victoria             Fax 250-721-8855
P.O. Box 3055 STN CSC
Victoria, BC V8W 3P6
Canada

web.uvic.ca/~cupton
www.virology.ca
www.sarsresearch.ca