From Scott.Geib at ARS.USDA.GOV Wed Apr 3 22:40:37 2013
From: Scott.Geib at ARS.USDA.GOV (Geib, Scott)
Date: Thu, 4 Apr 2013 02:40:37 +0000
Subject: [EMBOSS] featreport / featcopy outputting to genbank
Message-ID: <0D54878997A4B9478F03938D61DB51D416AF40@001FSN2MPN1-016.001f.mgd2.msft.net>

I am trying to convert a gff3 file to genbank format using featreport or featcopy. I have many molecules in my gff3 file (contigs from an assembly), and I was wondering how to designate them as different loci so that the flat file starts a new locus for each and then displays the features associated with that locus. Right now everything is written to a single locus in the genbank output file.

Just as an example, I am writing my gff files as follows:

Contig1 . databank_entry 1 1000 . . . organism="X";strain="Y"
Contig1 . gene 171 497 . + . gene="sod"
Contig1 . mRNA 171 497 . + . .
Contig1 . exon 171 497 . + . .
Contig1 . CDS 171 497 . + . gene="sod"
Contig2 . databank_entry 1 1000 . . . organism="X";strain="Y"
Contig2 . gene 3 323 . + . gene="ba"
Contig2 . mRNA 3 323 . + . .
Contig2 . exon 3 323 . + . .
Contig2 . CDS 3 323 . + . gene="ba"

And this is getting returned to me as a single locus.

Scott

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.

From bernd.web at gmail.com Tue Apr 9 11:12:45 2013
From: bernd.web at gmail.com (Bernd W)
Date: Tue, 9 Apr 2013 17:12:45 +0200
Subject: [EMBOSS] fuzznuc repetition issue
Message-ID:

Hi,

I tried a repetition with fuzznuc in a pattern. It seems that when I start a range with 0, e.g. (0,1), the pattern is not found when it is located at the end of the sequence and the count of the character is 0.
This only occurs when there are more matches possible. The following example shows this. It contains the pattern twice, once with 4 mismatches and once exactly.

>test
ACTACTACATACATACACATATACACATGAGGTTTTAGGGGATGACGTAAGGGGGNNNNNGAGGAAGGAGGGGATGACGT

fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc -pattern 'GAGGAAGGAGGGGATGACGT'

results in the expected output:

Start  End  Strand  Pattern                       Mismatch  Sequence
   29   48       +  pattern:GAGGAAGGAGGGGATGACGT         4  GAGGTTTTAGGGGATGACGT
   61   80       +  pattern:GAGGAAGGAGGGGATGACGT         .  GAGGAAGGAGGGGATGACGT

However,

fuzznuc -pmismatch 4 -sequence test.fa -outfile test.fuzznuc -pattern 'GAGGAAGGAGGGGATGACGTn(0,3)'

only finds the first pattern at position 29, with 0, 1, 2 and 3 repeats of the any-nucleotide match (so 4 matches in total), but not the one at 61-80.

Now, if I request 0 mismatches (-pmismatch 0), then this last pattern is reported (from 61 to 80). When requesting, e.g., 3 mismatches, no hit is found. The first has 4 mismatches, but now the last one, with 0 mismatches, is not reported either. It only seems to be reported when I ask for 0 mismatches. However, when allowing 4 mismatches I'd expect 5 hits in total (4 starting at 29 with 4 mismatches, and one starting at 61).

This occurred in EMBOSS 6.3.1 and 6.5.7. Is this a wrong expectation, or is something not going entirely right?

Kind regards,
Bernd

From aysegulunal at mu.edu.tr Thu Apr 11 08:01:42 2013
From: aysegulunal at mu.edu.tr (Aysegul UNAL)
Date: Thu, 11 Apr 2013 15:01:42 +0300
Subject: [EMBOSS] about Emboss backtranseq tool codon usage table
Message-ID: <00bb01ce36ac$54ebd6e0$fec384a0$@mu.edu.tr>

Hi,

I use your EMBOSS backtranseq tool to translate my protein sequence into a nucleotide sequence. There, you can select the codon usage of any organism. My organism is Bacillus subtilis, but there are two different selections, Bacillus subtilis and Bacillus subtilis (high). I would like to know the difference between Bacillus subtilis and Bacillus subtilis (high),
because these two different selections give different nucleotide sequences.

Thanks in advance.

Assist.Prof.Dr. Aysegul UNAL
Mugla Sitki Kocman University, Faculty of Science, Molecular Biology and Genetics Department
Phone: 0252 211 5412

From markbudde at gmail.com Mon Apr 29 15:51:12 2013
From: markbudde at gmail.com (Mark Budde)
Date: Mon, 29 Apr 2013 12:51:12 -0700
Subject: [EMBOSS] eprimer3 troubles
Message-ID:

Hi,
I have spent too many hours trying to get eprimer3 to work on my Mac, and am out of ideas.

I installed primer3 2.3.5 on Mac. I can run it successfully from the command line.

When I try to run eprimer3 I get errors.

From the emboss README:
...
primer3 from the Whitehead institute must be installed in one or two versions. EMBOSS application eprimer3 launches primer3_core and expects this to be version 1.x of the program. EMBOSS application eprimer32 launches primer32_core and expects this to be a renamed primer3_core from version 2.x of the program. primer3's configuration files also need to be installed. The original application looks by default in /opt/primer3_config
...

$ eprimer32 -sequence=AAAA

Pick PCR primers and hybridization oligos
Died: eprimer32 uses external program 'primer32_core' which is not in the PATH or defined as EMBOSS_PRIMER32_CORE
Part of the 'primer3' package, version 2.2.3, available from the Whitehead Institute. See: http://primer3.sourceforge.net/ The primer3_core application must be renamed to primer32_core
...

This suggested to me that emboss expects primer32_core to be named primer32_core, not primer3_core as indicated above.

Renaming the program to primer32_core, or typing eprimer3 into the command line, produces the same error:
...
$ eprimer3 -sequence=AAAA

Pick PCR primers and hybridization oligos
Error: Failed to open filename 'AAAA'
Error: Unable to read sequence 'AAAA'
Died: eprimer3 terminated: Bad value for '-sequence' and no prompt
....

I get this error regardless of what I put as the sequence, or even if I input a file.

Does anyone have suggestions or potential diagnostics?

Thanks
Mark

From ricepeterm at yahoo.co.uk Tue Apr 30 04:38:26 2013
From: ricepeterm at yahoo.co.uk (Peter Rice)
Date: Tue, 30 Apr 2013 09:38:26 +0100
Subject: [EMBOSS] eprimer3 troubles
In-Reply-To:
References:
Message-ID: <517F8302.9090202@yahoo.co.uk>

Hi Mark,

On 29/04/2013 20:51, Mark Budde wrote:
> Hi,
> I have spent too many hours trying to get eprimer3 to work on my Mac, and
> am out of ideas.

Easily fixed ... and you've come to the right place to ask.

However, one proviso ... EMBOSS eprimer32 works with primer3 version 2.2.3. We will support the latest primer3 2.3.x version with the next EMBOSS release.

Providing wrappers to external programs becomes difficult when they change their options. We were waiting for the primer3 2.3 series to stabilise before deciding whether to update eprimer32 or to add a new eprimer323 (which may be what we will need to do to support the 2.3.x series, with yet another new name for the binary).

> I installed primer3 2.3.5 on Mac. I can run it successfully with command
> line.
>
> When I try and run eprimer3 I get errors.
>
> $ eprimer32 -sequence=AAAA
>
> Pick PCR primers and hybridization oligos
> Died: eprimer32 uses external program 'primer32_core' which is not in the
> PATH or defined as EMBOSS_PRIMER32_CORE
> Part of the 'primer3' package, version 2.2.3, available from the
> Whitehead Institute. See: http://primer3.sourceforge.net/ The
> primer3_core application must be renamed to primer32_core
> ...
>
> This suggested to me that emboss expects primer32_core to be named
> primer32_core, not primer3_core as indicated above.

You misread the instructions.
It should indeed be renamed (or copied) from the primer3_core that is built when you install primer3 to primer32_core (so EMBOSS knows it is running primer3 version 2.x).

> renaming the program to primer32_core, or typing eprimer3 into the command
> line produce the same error:
> ...
> $ eprimer3 -sequence=AAAA
>
> Pick PCR primers and hybridization oligos
> Error: Failed to open filename 'AAAA'
> Error: Unable to read sequence 'AAAA'
> Died: eprimer3 terminated: Bad value for '-sequence' and no prompt
> ....
>
> I get this error regardless of what I put as the sequence, or even if I
> input a file.

The message is correct. AAAA is taken as the name of a file. You can also use a database name and identifier such as embl:x13776.

However, EMBOSS does also accept the sequence itself (if it is not too long for your system or shell's command line) with the prefix asis:: so you can use

$ eprimer3 -sequence=asis::AAAA

or simply

$ eprimer3 asis::AAAA

You should also give an output file name, as such a sequence has no name. You can name an "asis" sequence with -sid=myseq on the command line.

> Does anyone have suggestions or potential diagnostics?

For diagnostics, any EMBOSS program can be run with -debug on the command line. This produces a file programname.dbg with a trace of messages written as it runs.

In this case I am fairly confident we know the problem.

regards,

Peter Rice
EMBOSS Team
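
For readers following the eprimer3 thread, the two fixes described in the reply above (copying primer3_core to primer32_core, and passing a literal sequence with the asis:: prefix) can be sketched as a small dry-run shell script. This is only an illustration under stated assumptions: the PRIMER3_DIR location, the helper function names, and the output file name myseq.eprimer32 are hypothetical and not taken from the thread; the -sequence, -sid and -debug options and the asis:: prefix are as quoted in the messages, and -outfile is used here for the output file name. The script prints the commands rather than running them, so it does not need EMBOSS or primer3 to be installed.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Copy the printed lines into a terminal (or drop the printing) to run for real.

# Assumption: directory that holds the primer3_core binary (hypothetical default).
PRIMER3_DIR="${PRIMER3_DIR:-/usr/local/bin}"

# Build an "asis" sequence reference, e.g. AAAA -> asis::AAAA
asis_usa() {
    printf 'asis::%s' "$1"
}

# Step 1: copy primer3_core to the name that eprimer32 looks for.
# Note: per the thread, eprimer32 targets primer3 2.2.3, not 2.3.x.
copy_cmd() {
    printf 'cp %s/primer3_core %s/primer32_core' "$PRIMER3_DIR" "$PRIMER3_DIR"
}

# Step 2: run eprimer32 on a literal sequence ($1), giving the sequence a
# name ($2), an explicit output file, and a debug trace file.
eprimer_cmd() {
    printf 'eprimer32 -sequence=%s -sid=%s -outfile=%s.eprimer32 -debug' \
        "$(asis_usa "$1")" "$2" "$2"
}

copy_cmd; echo
eprimer_cmd AAAA myseq; echo
```

Copying rather than renaming leaves the original primer3_core in place. The four-base sequence is kept only because it is the one used in the thread.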