From maoj at helix.nih.gov Tue Sep 4 11:30:33 2012 From: maoj at helix.nih.gov (Jean Mao) Date: Tue, 4 Sep 2012 11:30:33 -0400 Subject: [EMBOSS] Error in Emboss Explorer after updated EMBOSS to 6.5.7 Message-ID: <50461E99.3040805@helix.nih.gov> Hi, I know this list is for EMBOSS, not Emboss Explorer. However, I am not sure where to look for help with Emboss Explorer anymore, so I am trying my luck here. I would really appreciate it if someone could point me in the right direction. Recently I updated EMBOSS from 6.3.1 to 6.5.7. Emboss Explorer was functional before I switched the link. Once the link was switched to point to 6.5.7, the main menu panel on the left of the webpage became completely messed up. Instead of showing categories and the apps under each category, it now shows only the categories, with no apps that one can click and load. The most current Emboss Explorer I can find is 2.2.0, which has been out since 2006. Regards, Jean From forrest.bao at gmail.com Mon Sep 10 14:11:57 2012 From: forrest.bao at gmail.com (Forrest Sheng Bao) Date: Mon, 10 Sep 2012 13:11:57 -0500 Subject: [EMBOSS] any APIs for vectorstrip or EMBOSS Message-ID: Hi all, I am writing a program in which vectorstrip is used to remove vectors. I wanna be able to call the functions in vectorstrip from my program. Does vectorstrip or EMBOSS have a set of APIs that I can use? Cheers, Forrest From paul.tanger at colostate.edu Thu Sep 13 16:24:47 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 14:24:47 -0600 Subject: [EMBOSS] specify primer3 directory? Message-ID: Hi, I have a local install of emboss and I'm trying to get eprimer3 to work, but I'm getting this error: "Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified" primer3 is installed, but not in the default location (which I think is /opt/primer3_config ?) . How do I specify where primer3 is installed? Or is the cause of this error something else? 
googled for an answer for a while, but couldn't find one. Thanks! From paul.tanger at colostate.edu Thu Sep 13 17:58:47 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 15:58:47 -0600 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Message-ID: Thanks, maybe this is the problem but the solution you suggest doesn't seem to work because that is a primer3 qualifier not an eprimer3 qualifier. I get this error: [paultanger at bspmgenomics bin]$ ./eprimer3 ~/QTL-project/30scaffolds_affyMAI_CG.fsa ~/QTL-project/test2 --default_version=1 Died: Unknown qualifier --default_version=1 On Thu, Sep 13, 2012 at 3:49 PM, Scott Markel wrote: > Paul, > > You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. > > You can revert to the old defaults using this command-line argument. > > --default_version=1 > > Scott > > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications and Communications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > -----Original Message----- > From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger > Sent: Thursday, 13 September 13 2012 1:25 PM > To: emboss at lists.open-bio.org > Subject: [EMBOSS] specify primer3 directory? > > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > From Scott.Markel at accelrys.com Thu Sep 13 18:01:55 2012 From: Scott.Markel at accelrys.com (Scott Markel) Date: Thu, 13 Sep 2012 15:01:55 -0700 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Message-ID: <5ACBA19439E77B43A06F4CAB897EC977068A51519E@EXCH1-COLO.accelrys.net> Paul, Yup, sorry about that. I should have stopped my answer after the paragraph about the primer3 change. 
Scott -----Original Message----- From: ptanger at rams.colostate.edu [mailto:ptanger at rams.colostate.edu] On Behalf Of Paul Tanger Sent: Thursday, 13 September 13 2012 2:59 PM To: Scott Markel Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] specify primer3 directory? Thanks, maybe this is the problem but the solution you suggest doesn't seem to work because that is a primer3 qualifier not an eprimer3 qualifier. I get this error: [paultanger at bspmgenomics bin]$ ./eprimer3 ~/QTL-project/30scaffolds_affyMAI_CG.fsa ~/QTL-project/test2 --default_version=1 Died: Unknown qualifier --default_version=1 On Thu, Sep 13, 2012 at 3:49 PM, Scott Markel wrote: > Paul, > > You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. > > You can revert to the old defaults using this command-line argument. > > --default_version=1 > > Scott > > Scott Markel, Ph.D. > Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications and Communications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > -----Original Message----- > From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger > Sent: Thursday, 13 September 13 2012 1:25 PM > To: emboss at lists.open-bio.org > Subject: [EMBOSS] specify primer3 directory? 
> > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > From Scott.Markel at accelrys.com Thu Sep 13 17:49:54 2012 From: Scott.Markel at accelrys.com (Scott Markel) Date: Thu, 13 Sep 2012 14:49:54 -0700 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: Message-ID: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Paul, You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. You can revert to the old defaults using this command-line argument. --default_version=1 Scott Scott Markel, Ph.D. Principal Bioinformatics Architect email: smarkel at accelrys.com Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 San Diego, CA 92121 fax: +1 858 799 5222 USA web: http://www.accelrys.com http://www.linkedin.com/in/smarkel Secretary, Board of Directors: 
International Society for Computational Biology Chair: ISCB Publications and Communications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics -----Original Message----- From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger Sent: Thursday, 13 September 13 2012 1:25 PM To: emboss at lists.open-bio.org Subject: [EMBOSS] specify primer3 directory? Hi, I have a local install of emboss and I'm trying to get eprimer3 to work, but I'm getting this error: "Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified" primer3 is installed, but not in the default location (which I think is /opt/primer3_config ?) . How do I specify where primer3 is installed? Or is the cause of this error something else? googled for an answer for a while, but couldn't find one. Thanks! _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss From paul.tanger at colostate.edu Thu Sep 13 19:03:27 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 17:03:27 -0600 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: <73B93C239077ED49B5302828A2076E351838AC2F@PHSX10MB15.partners.org> References: <73B93C239077ED49B5302828A2076E351838AC2F@PHSX10MB15.partners.org> Message-ID: That worked - Thanks so much! On Thu, Sep 13, 2012 at 4:37 PM, Drummond, Iain A. wrote: > I ran into (I think) similar problems running primer3 with Emboss eprimer3. > > It turned out that the most recent version of primer3 (2.x.x) will not run > with the current Emboss eprimer3; you have to downgrade primer3 to 1.1.4 to > get it to work. > > The Emboss folks know this I think. > > -Iain > > > ------- > Iain Drummond, Ph.D. 
> Associate Professor > Nephrology Division, Massachusetts General Hospital, > Department of Genetics, Harvard Medical School and > Program in Developmental and Regenerative Biology, Harvard Medical School > > Address for mailing: > > Nephrology Division, MGH > 149 13th Street, Rm 149-8000 > Charlestown MA 02129 > > 617 726 5647 (office) > 617 724 9693 (lab) > 617 726 5669 (fax) > > idrummon at receptor.mgh.harvard.edu > idrummond at partners.org > > HTTP://danio.mgh.harvard.edu > > > On 9/13/12 4:24 PM, "Paul Tanger" wrote: > >> Hi, >> I have a local install of emboss and I'm trying to get eprimer3 to >> work, but I'm getting this error: >> >> "Error: thermodynamic approach chosen, but path to thermodynamic >> parameters not specified" >> >> primer3 is installed, but not in the default location (which I think >> is /opt/primer3_config ?) . >> How do I specify where primer3 is installed? Or is the cause of this >> error something else? >> >> googled for an answer for a while, but couldn't find one. >> >> Thanks! >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > > > The information in this e-mail is intended only for the person to whom it is > addressed. If you believe this e-mail was sent to you in error and the e-mail > contains patient information, please contact the Partners Compliance HelpLine at > http://www.partners.org/complianceline . If the e-mail was sent to you in error > but does not contain patient information, please contact the sender and properly > dispose of the e-mail. > From ppetrov at mail.student.oulu.fi Fri Sep 14 04:06:47 2012 From: ppetrov at mail.student.oulu.fi (Petar Petrov) Date: Fri, 14 Sep 2012 11:06:47 +0300 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: Message-ID: <20120914110647.sacof6lkgscookc0@webmail.oulu.fi> Hi Paul, which version of primer3 do you use? What operating system? 
How did you install primer3 and EMBOSS? If it is primer3 version 2.* I think you should use the eprimer32 wrapper instead and it expects primer3_core to be called primer32_core. To make primer3 aware of another location of the primer3_config directory: it seems that two files need to be modified before compiling: thal_main.c and primer3_boulder_main.c (in the subfolder src). Hope this helps. regards, Petar Quoting Paul Tanger : > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From daniel.rozenbaum at USPTO.GOV Fri Sep 14 09:08:36 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 14 Sep 2012 09:08:36 -0400 Subject: [EMBOSS] ajSeqxrefNewDbS Message-ID: Hi, What is the meaning of the messages "ajSeqxrefNewDbS" that occasionally appear in the output of EMBOSS utilities? They're even mentioned in the documentation pages, e.g. http://emboss.sourceforge.net/apps/release/6.4/emboss/apps/oddcomp.html (the last line in the quote below):

% oddcomp
Identify proteins with specified sequence word composition
Input protein sequence(s): tsw:*
Program compseq output file: oddcomp.comp
Window size to consider (e.g. 30 aa) [30]:
Output file [12s1_arath.oddcomp]: out.odd
ajSeqxrefNewDbS '1-I' 'FT025'

Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. 
Alexandria VA 22314 From daniel.rozenbaum at USPTO.GOV Fri Sep 14 08:56:14 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 14 Sep 2012 08:56:14 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format Message-ID: Hello Peter and everyone, I was wondering if I could revive the discussion about the support of IG format if possible. I'm helping deploy EMBOSS at the US Patent and Trademark Office, where this format, in its multi-line sequence annotation form, is used extensively. Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS:

% makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname ig1

% cat ig1.ig
;, 10 bases
EMBOSS_001
hcsptpstas1
;, 10 bases
EMBOSS_002
rdgwcvmtrm1
;, 10 bases
EMBOSS_003
fgtifgdgid1

% entret -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout
;, 10 bases
EMBOSS_001
hcsptpstas1
;, 10 bases

In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria VA 22314 ------------------------- On 15/08/2012 17:57, Daniel Rozenbaum wrote: > Dear list, > > (Peter, many thanks for your prompt reply to my previous inquiry!) > > We need to deal with extensive databases in Intelligenetics format with multiple lines in annotation of each record. 
It appears however that EMBOSS concatenates all annotation lines into a single line when building its internal representation of the sequence description:
>
> % cat /tmp/IGSEQ.ig
> ; Annotation line 1
> ; Annotation line 2
> ; Annotation line 3
> IGSEQ
> ACGCATCGCATCAGACTACGC1
>
> % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig -osdirectory /tmp
>
> % cat /tmp/IGSEQ.emboss_ig2ig.ig
> ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases
> IGSEQ
> ACGCATCGCATCAGACTACGC1
>
> Are there any plans to support multi-line annotation in this format?

Interesting thought. We will take a look. It will need some care to maintain compatibility with other formats that have single (FASTA) or multiple (swissprot) descriptions. Which package is using this IG format? regards, Peter Rice EMBOSS Team From paul.tanger at colostate.edu Fri Sep 14 15:08:25 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Fri, 14 Sep 2012 13:08:25 -0600 Subject: [EMBOSS] specify primer3 directory? Message-ID: This works as well. I didn't look at eprimer32 because for some reason I thought the "32" was a 32 bit version or something.. Turns out eprimer32 is built to work with 2.x versions of primer3. As Petar suggested, for a non default install of primer3 you need to modify some files before everything will work. 
To provide more detail in case others are interested, in thal_main.c you need to modify lines 306-307:

    } else if ((stat("/opt/primer3_config", &st) == 0) && S_ISDIR(st.st_mode)) {
      tmp_ret = get_thermodynamic_values("/opt/primer3_config/", &o);

and in primer3_boulder_main.c you need to modify lines 517-521:

    } else if ((stat("/opt/primer3_config", &st) == 0) && S_ISDIR(st.st_mode)) {
      thermodynamic_params_path = (char*) malloc(strlen("/opt/primer3_config/") * sizeof(char) + 1);
      if (NULL == thermodynamic_params_path) exit (-2); /* Out of memory */
      strcpy(thermodynamic_params_path, "/opt/primer3_config/");

These changes apply to unix - I saw different places in those files that would need to be changed if you were doing this in windows. Has anyone got emboss and eprimer32 working and installed in galaxy? Or a similar primer design tool in galaxy? Thanks for the help everyone. Message: 6 Date: Fri, 14 Sep 2012 11:06:47 +0300 From: Petar Petrov Subject: Re: [EMBOSS] specify primer3 directory? To: emboss at lists.open-bio.org Message-ID: <20120914110647.sacof6lkgscookc0 at webmail.oulu.fi> Content-Type: text/plain; charset=windows-1251; DelSp="Yes"; format="flowed" Hi Paul, which version of primer3 do you use? What operating system? How did you install primer3 and EMBOSS? If it is primer3 version 2.* I think you should use eprimer32 wrapper instead and it expects primer3_core to be called primer32_core. To make primer3 aware of another location of the primer3_config directory: it seems that two files need to be modified before compiling: thal_main.c and primer3_boulder_main.c (in the subfolder src). Hope this helps. 
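[Editor's sketch] Scott's earlier note in this thread names the primer3 input tag PRIMER_THERMODYNAMIC_PARAMETERS_PATH. When primer3_core (2.x) is driven directly with a Boulder-IO record, that tag may be a way to point at a non-default primer3_config directory without recompiling. The Python sketch below only builds such a record as text. The function name, the example path and the example sequence are hypothetical, and nothing in the thread establishes that eprimer3 or eprimer32 pass this tag through, so it should not be read as a fix for the wrappers.

```python
def boulder_record(sequence_id: str, template: str, params_path: str) -> str:
    """Return one Boulder-IO record as text, including the thermodynamic path tag.

    Assumption: the path is given with a trailing slash, mirroring the
    "/opt/primer3_config/" string in the primer3 source excerpt above.
    """
    if not params_path.endswith("/"):
        params_path += "/"
    tags = [
        ("SEQUENCE_ID", sequence_id),
        ("SEQUENCE_TEMPLATE", template),
        ("PRIMER_THERMODYNAMIC_PARAMETERS_PATH", params_path),
    ]
    lines = [f"{key}={value}" for key, value in tags]
    lines.append("=")  # a lone "=" terminates a Boulder-IO record
    return "\n".join(lines) + "\n"


# Hypothetical usage: write the record to a file and feed it to primer3_core.
# print(boulder_record("example", "ACGTACGT", "/home/user/primer3/primer3_config"))
```

The record would typically be piped to primer3_core on standard input; the remaining design tags are left out here on purpose.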
regards, Petar From daniel.rozenbaum at USPTO.GOV Mon Sep 17 22:00:58 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Mon, 17 Sep 2012 22:00:58 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> Greetings again, If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 On Sep 14, 2012, at 9:36 AM, "Rozenbaum, Daniel (Biocceleration Inc)" wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG format if possible. I'm helping deploy EMBOSS at the US Patent and Trademark Office, where this format, in its multi-line sequence annotation form, is used extensively. > > Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS: > > % makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname ig1 > > % cat ig1.ig > ;, 10 bases > EMBOSS_001 > hcsptpstas1 > ;, 10 bases > EMBOSS_002 > rdgwcvmtrm1 > ;, 10 bases > EMBOSS_003 > fgtifgdgid1 > > > % entret -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout > ;, 10 bases > EMBOSS_001 > hcsptpstas1 > ;, 10 bases > > In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. > > Many thanks, > Daniel > -- > Daniel Rozenbaum > Biocceleration, Inc. > OCIO/ Office of Application Engineering & Development/ Patent System Division > 600 Dulany St. > Alexandria VA 22314 > > ------------------------- > On 15/08/2012 17:57, Daniel Rozenbaum wrote: >> Dear list, >> >> (Peter, many thanks for your prompt reply to my previous inquiry!) 
>> >> We need to deal with extensive databases in Intelligenetics format with multiple lines in annotation of each record. It appears however that EMBOSS concatenates all annotation lines into a single line when building its internal representation of the sequence description: >> >> % cat /tmp/IGSEQ.ig >> ; Annotation line 1 >> ; Annotation line 2 >> ; Annotation line 3 >> IGSEQ >> ACGCATCGCATCAGACTACGC1 >> >> >> % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig -osdirectory /tmp >> >> >> % cat /tmp/IGSEQ.emboss_ig2ig.ig >> ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases >> IGSEQ >> ACGCATCGCATCAGACTACGC1 >> >> Are there any plans to support multi-line annotation in this format? > > Interesting thought. We will take a look. It will need some care to > maintain compatibility with other formats that have single (FASTA) or > multiple (swissprot) descriptions. > > Which package is using this IG format? > > regards, > > Peter Rice > EMBOSS Team > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From p.j.a.cock at googlemail.com Tue Sep 18 04:25:11 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 18 Sep 2012 09:25:11 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: On Fri, Sep 14, 2012 at 1:56 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG > format if possible. I'm helping deploy EMBOSS at the US Patent and > Trademark Office, where this format, in its multi-line sequence annotation > form, is used extensively. Hi Daniel, That is interesting to know - I work on Biopython, which has support for reading and indexing the Intelligenetics "ig" format. 
I'd been under the impression that this was a defunct/unused file format (and therefore never bothered to implement support for writing it in Biopython). Does the US Patent and Trademark Office provide datasets to the public in this format? Thanks, Peter C. From daniel.rozenbaum at USPTO.GOV Tue Sep 18 07:42:53 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Tue, 18 Sep 2012 07:42:53 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: , Message-ID: Hi Peter, I don't believe the USPTO provides datasets to the public in the IG format. With best regards, Daniel ________________________________________ From: Peter Cock [p.j.a.cock at googlemail.com] Sent: Tuesday, September 18, 2012 4:25 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format On Fri, Sep 14, 2012 at 1:56 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG > format if possible. I'm helping deploy EMBOSS at the US Patent and > Trademark Office, where this format, in its multi-line sequence annotation > form, is used extensively. Hi Daniel, That is interesting to know - I work on Biopython, which has support for reading and indexing the Intelligenetics "ig" format. I'd been under the impression that this was a defunct/unused file format (and therefore never bothered to implement support for writing it in Biopython). Does the US Patent and Trademark Office provide datasets to the public in this format? Thanks, Peter C. 
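[Editor's sketch] For readers following the multi-line annotation question, here is a minimal Python reader for the IG layout shown in Daniel's examples: lines starting with ";" are annotation, the next line is the sequence name, and the sequence ends with "1" (linear) or "2" (circular). It keeps every annotation line separate instead of joining them. This is an illustration only, not the EMBOSS or Biopython parser; `IgRecord` and `parse_ig` are hypothetical names.

```python
from typing import Iterable, Iterator, List, NamedTuple


class IgRecord(NamedTuple):
    name: str
    comments: List[str]   # one entry per ';' line, order preserved
    sequence: str
    circular: bool


def parse_ig(lines: Iterable[str]) -> Iterator[IgRecord]:
    """Yield IG records, keeping multi-line annotation as separate strings."""
    comments: List[str] = []
    name = None
    parts: List[str] = []
    for raw in lines:
        line = raw.strip()          # also drops form feeds between records
        if not line:
            continue                # empty lines between records are ignored
        if line.startswith(";"):
            if name is not None:    # previous record had no terminator
                yield IgRecord(name, comments, "".join(parts), False)
                comments, name, parts = [], None, []
            comments.append(line[1:].strip())
        elif name is None:
            name = line             # first non-comment line is the name
        elif line[-1] in "12":      # terminator: 1 = linear, 2 = circular
            parts.append(line[:-1])
            yield IgRecord(name, comments, "".join(parts), line[-1] == "2")
            comments, name, parts = [], None, []
        else:
            parts.append(line)
    if name is not None:            # trailing record without terminator
        yield IgRecord(name, comments, "".join(parts), False)
```

Applied to the /tmp/IGSEQ.ig example quoted in this thread, the three annotation lines come back as three separate strings.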
From p.j.a.cock at googlemail.com Tue Sep 18 08:20:09 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 18 Sep 2012 13:20:09 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 12:42 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi Peter, > > I don't believe the USPTO provides datasets to the public in the IG format. > > With best regards, > Daniel OK, thanks. Peter From ricepeterm at yahoo.co.uk Wed Sep 19 06:06:53 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:06:53 +0100 Subject: [EMBOSS] any APIs for vectorstrip or EMBOSS In-Reply-To: References: Message-ID: <5059993D.1030604@yahoo.co.uk> Dear Forrest, > I am writing a program in which vectorstrip is used to remove vectors. I > wanna be able to call the functions in vectorstrip from my program. Does > vectorstrip or EMBOSS have a set of APIs that I can use? You can call any EMBOSS application using the command line by adding any options you need (some are required, for example the input sequence or file; others have default values) and adding "-auto" to default any other options. You can also add the command line option "-filter" to read the first input from standard input, and to write the first output to standard output, so you can then write the input directly to the EMBOSS program and read the output directly from the program. If you allow the program to write output to a file, we recommend that you specify the output file name (for example "-outfile vec.out -outseq vec.seq") so your program knows which file to open. 
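[Editor's sketch] The command-line approach described above could be wrapped from Python roughly as follows. The helper names are hypothetical, the input file name is invented, and the option set shown for vectorstrip is deliberately incomplete; the required qualifiers should be taken from the vectorstrip documentation. Only "-auto", "-outfile" and "-outseq" come from the message itself.

```python
import subprocess
from typing import Dict, List, Optional


def build_emboss_command(app: str, options: Dict[str, Optional[str]]) -> List[str]:
    """Build an argument list for an EMBOSS application, ending with -auto.

    A value of None produces a bare switch (for example {"filter": None}).
    """
    cmd = [app]
    for name, value in options.items():
        cmd.append(f"-{name}")
        if value is not None:
            cmd.append(str(value))
    cmd.append("-auto")  # take defaults for everything not given, no prompting
    return cmd


def run_emboss(app: str, options: Dict[str, Optional[str]]) -> subprocess.CompletedProcess:
    """Run the application and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(
        build_emboss_command(app, options),
        check=True,
        capture_output=True,
        text=True,
    )


# Hypothetical usage (requires EMBOSS on PATH and a real input file):
# run_emboss("vectorstrip", {"sequence": "reads.fasta",
#                            "outfile": "vec.out", "outseq": "vec.seq"})
```

Naming the output files explicitly, as recommended in the message, means the calling program can open vec.out and vec.seq afterwards without guessing default file names.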
Hope this helps Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 06:48:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:48:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> Message-ID: <5059A2EF.7020504@yahoo.co.uk> Dear Daniel, On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Greetings again, > > If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? Very easy, a 1-day job including testing and documentation. Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. USPTO or Biocceleration) regards, Peter Rice EMBOSS Team From p.j.a.cock at googlemail.com Wed Sep 19 06:58:31 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 19 Sep 2012 11:58:31 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059A2EF.7020504@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: On Wed, Sep 19, 2012 at 11:48 AM, Peter Rice wrote: > Dear Daniel, > > > On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: >> >> Greetings again, >> >> If I may, another question on the issue of IG format: how difficult would >> it be to support database indexing for this format? > > > Very easy, a 1-day job including testing and documentation. > > Could you please make some example data available, and indicate which fields > could be indexed (including any information in formatted descriptions or in > naming conventions), and suggest a format name (e.g. USPTO or > Biocceleration) Does it need a new format name? 
EMBOSS already defines "ig" and "igstrict" - do the USPTO files diverge from these? Peter C. P.S. Biopython also uses the format name "ig", based on the current EMBOSS terminology. From ricepeterm at yahoo.co.uk Wed Sep 19 06:56:14 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:56:14 +0100 Subject: [EMBOSS] ajSeqxrefNewDbS In-Reply-To: References: Message-ID: <5059A4CE.6020501@yahoo.co.uk> On 14/09/2012 14:08, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi, > > What is the meaning of the messages "ajSeqxrefNewDbS" that occasionally appear in the output of EMBOSS utilities? They're even mentioned in the documentation pages, e.g. http://emboss.sourceforge.net/apps/release/6.4/emboss/apps/oddcomp.html (the last line in the quote below): > > ajSeqxrefNewDbS '1-I' 'FT025' They can be safely ignored. The test example outputs are automatically added to the documentation and this one slipped through unnoticed among some other changes in 6.4. They are removed in the latest EMBOSS release, 6.5. regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 07:32:45 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 12:32:45 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: <5059AD5D.9020503@yahoo.co.uk> Dear Daniel, On 14/09/2012 13:56, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS: > > In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. Well spotted. The input buffer is not reset in Ig formats so the next line was included in the entret output. I will fix it in the next patch for the latest release (6.5). Let me know if you also need a patch for 6.4. 
regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 07:42:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 12:42:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <5059AF97.6020206@yahoo.co.uk> On 19/09/2012 11:58, Peter Cock wrote: > Does it need a new format name? EMBOSS already defines "ig" and > "igstrict" - do the USPTO files diverge from these? The format name is needed as an option to dbxflat -idformat so we can select a specific parser for any additional fields. For example, in dbxfasta -idformat has 7 names for 'fasta' format. regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 09:49:14 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 09:49:14 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059A2EF.7020504@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: Dear Peter, This is most wonderful news that's going to make a bunch of users really happy! I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. As for the name, how about something like "iguspto"? Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? With gratitude, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. 
Alexandria, VA 22314 -----Original Message----- From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] Sent: Wednesday, September 19, 2012 6:48 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format Dear Daniel, On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Greetings again, > > If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? Very easy, a 1-day job including testing and documentation. Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. USPTO or Biocceleration) regards, Peter Rice EMBOSS Team -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ig_uspto_sample.txt URL: From daniel.rozenbaum at USPTO.GOV Wed Sep 19 10:45:35 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 10:45:35 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <9D8AD7D9-FD9E-4825-9D01-7F451EB805C2@USPTO.GOV> A quick addition to the information on this format: while the example I sent has the records separated by a couple of new lines and a form feed (^L , 0x0c), in the most general case the first line of the next record (a line that starts with a semicolon) could appear immediately after the last sequence data line of the previous record. Empty lines between records are ignored. On Sep 19, 2012, at 10:09 AM, "Rozenbaum, Daniel (Biocceleration Inc)" wrote: > Dear Peter, > > This is most wonderful news that's going to make a bunch of users really happy! 
> > I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. > > As for the name, how about something like "iguspto"? > > Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? > > With gratitude, > Daniel > > -- > Daniel Rozenbaum > Biocceleration, Inc. > OCIO/ Office of Application Engineering & Development/ Patent System Division > 600 Dulany St. > Alexandria, VA 22314 > > -----Original Message----- > From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] > Sent: Wednesday, September 19, 2012 6:48 AM > To: Rozenbaum, Daniel (Biocceleration Inc) > Cc: emboss at lists.open-bio.org > Subject: Re: [EMBOSS] Support for multi-line annotation in ig format > > Dear Daniel, > > On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: >> Greetings again, >> >> If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? > > Very easy, a 1-day job including testing and documentation. > > Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. 
> USPTO or Biocceleration) > > regards, > > Peter Rice > EMBOSS Team > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From ricepeterm at yahoo.co.uk Wed Sep 19 11:14:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 16:14:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <5059E147.7000807@yahoo.co.uk> Dear Daniel, On 19/09/2012 14:49, Rozenbaum, Daniel (Biocceleration Inc) wrote: > I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. Thanks I'll take a look. We usually index an "access number" in addition to the identifier. Is there some significance in the parts of the id naming that could be used as an accession or a sequence version? > As for the name, how about something like "iguspto"? Thanks. I may just use USPTO but it's not important. > Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? Yes ... it is a fairly straightforward extension to dbxflat so I could send you a copy but for general release I would prefer to distribute it only from 6.5 onwards. 
regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 11:23:59 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 11:23:59 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059E147.7000807@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> Message-ID: Dear Peter, At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? Thank you, Daniel On Sep 19, 2012, at 11:14 AM, "Peter Rice" wrote: > Dear Daniel, > > On 19/09/2012 14:49, Rozenbaum, Daniel (Biocceleration Inc) wrote: > >> I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. > > Thanks I'll take a look. > > We usually index an "access number" in addition to the identifier. Is > there some significance in the parts of the id naming that could be used > as an accession or a sequence version? > >> As for the name, how about something like "iguspto"? > > Thanks. I may just use USPTO but it's not important. > >> Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? > > Yes ... it is a fairly straightforward extension to dbxflat so I could > send you a copy but for general release I would prefer to distribute it > only from 6.5 onwards. 
> > regards, > > Peter Rice > EMBOSS Team > From ricepeterm at yahoo.co.uk Wed Sep 19 11:34:32 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 16:34:32 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> Message-ID: <5059E608.2040600@yahoo.co.uk> On 19/09/2012 16:23, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Dear Peter, > > At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? No, it is not a problem. You only need to tell the database definition it has no accession (but perhaps the patent number could be used as an accession) regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 12:12:58 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 12:12:58 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059E608.2040600@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> <5059E608.2040600@yahoo.co.uk> Message-ID: Right - unfortunately all the other fields, while appearing well structured and nicely formatted in the example I sent, may or may not be present (or present but poorly formatted due to legacy issues) in the general case. And the patent number may not be present in the data representing patent applications that are still pending review. 
Many thanks, Daniel -----Original Message----- From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] Sent: Wednesday, September 19, 2012 11:35 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format On 19/09/2012 16:23, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Dear Peter, > > At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? No, it is not a problem. You only need to tell the database definition it has no accession (but perhaps the patent number could be used as an accession) regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 12:27:37 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 17:27:37 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> <5059E608.2040600@yahoo.co.uk> Message-ID: <5059F279.6080405@yahoo.co.uk> On 19/09/2012 17:12, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Right - unfortunately all the other fields, while appearing well structured and nicely formatted in the example I sent, may or may not be present (or present but poorly formatted due to legacy issues) in the general case. And the patent number may not be present in the data representing patent applications that are still pending review. Thanks. That at least keeps things simple! 
regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Thu Sep 20 11:30:07 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Thu, 20 Sep 2012 11:30:07 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss Message-ID: Hi everyone, I'm trying to figure out whether this issue I'm dealing with is a bug in Jemboss or a bug in my understanding :-) . I have EMBOSS 6.4.0 with Jemboss 1.5 installed in client-server mode. The server is a Linux box, and Jemboss is started via Java Web Start on a Windows box. Then I follow this sequence of steps: 1. Run "makeprotseq" interactively in Jemboss to generate a few random sequences. This works fine. When the "Saved Results" window opens, I use "File --> Save to Local File" to save the result as "C:\TEMP\sequences.fasta" 2. Open "seqret" in Jemboss, and use the "Browse files" option to select "C:\TEMP\sequences.fasta" as input. 3. Run seqret interactively. The Saved Results window that opens contains the following error messages: Error: Failed to open filename 'C' Error: Unable to read sequence 'C:\TEMP\sequences.fasta' Died: seqret terminated: Bad value for '-sequence' with -auto defined Looking in the subdirectory created for this job on the server side, it does contain a file "C__TEMP__sequences.fasta" with the correct contents. But the ".desc" file in that directory reads the following: EMBOSS run details Application: seqret -nofeature -sequence C:\TEMP\sequences.fasta -nofirstonly -auto C__TEMP_sequences.fasta Started at Thu Sep 20 11_15_22 EDT 2012 Input files: /usr/local/emboss/results/username/seqret_Thu_Sep_20_11_15_22_EDT_2012_1234/C__TEMP_sequences.fasta It appears therefore that the command line, instead of using the path to the server-side copy of the input file, still uses the path to the file on the client. Does this make sense? Any advice would be greatly appreciated! With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. 
OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 From uludag at ebi.ac.uk Thu Sep 20 18:00:18 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Thu, 20 Sep 2012 23:00:18 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: Message-ID: <505B91F2.7060709@ebi.ac.uk> Hi Daniel, When we were implementing the array representation of command lines we mistakenly added the input sequence file names to the command line array prepared on the client side. As you showed in your example, these inputs are added to the command line on the server side using their final file names. It looks like the following change fixes the problem with my 6.4.0 test server and a client built from the current CVS code base. Index: org/emboss/jemboss/gui/form/BuildJembossForm.java =================================================================== RCS file: /home/repository/emboss/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java,v retrieving revision 1.113 diff -u -r1.113 BuildJembossForm.java --- org/emboss/jemboss/gui/form/BuildJembossForm.java 29 Jun 2011 14:12:48 -0000 1.113 +++ org/emboss/jemboss/gui/form/BuildJembossForm.java 20 Sep 2012 21:28:14 -0000 @@ -1113,12 +1113,11 @@ fn = fn.trim(); - optionsA.add("-" + val); - optionsA.add(fn); - if(withSoap) options = filesForSoap(fn,options,val,filesToMove); else { + optionsA.add("-" + val); + optionsA.add(fn); fn = addQuote(fn); options = options.concat(" -" + val + " " + fn); } Can you please try applying the above change to your 6.4.0 installation? In 6.4.0 the deleted optionsA.add() lines are lines 1116 and 1117. Since Jemboss has a relatively complex installation mechanism, it might be easier to apply this change to a freshly extracted tarball and make a new installation. Make sure Java Web Start doesn't use the cached version of the Jemboss client but uses the updated version. Regards, Mahmut > 2.
Open "seqret" in Jemboss, and use the "Browse files" option to select "C:\TEMP\sequences.fasta" as input. > > 3. Run seqret interactively. The Saved Results window that opens contains the following error messages: > Error: Failed to open filename 'C' > Error: Unable to read sequence 'C:\TEMP\sequences.fasta' > Died: seqret terminated: Bad value for '-sequence' with -auto defined > Looking in the subdirectory created for this job on the server side, it does contain a file "C__TEMP__sequences.fasta" with the correct contents. But the ".desc" file in that directory reads the following: > EMBOSS run details > > Application: seqret > -nofeature -sequence C:\TEMP\sequences.fasta -nofirstonly -auto C__TEMP_sequences.fasta > Started at Thu Sep 20 11_15_22 EDT 2012 > > Input files: > /usr/local/emboss/results/username/seqret_Thu_Sep_20_11_15_22_EDT_2012_1234/C__TEMP_sequences.fasta > It appears therefore that the command line, instead of using the path to the server-side copy of the input file, still uses the path to the file on the client. > From daniel.rozenbaum at USPTO.GOV Thu Sep 20 23:06:41 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Thu, 20 Sep 2012 23:06:41 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <505B91F2.7060709@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> Message-ID: Hi Mahmut, I've applied the patch you suggested and it seems to have fixed the problem with supplying local sequence files. However, now it appears that supplying a remote file as input no longer works, and neither does using list files (either local or remote). When I try to use a local list file, the file does make it to server side, but the command line appearing in .desc doesn't have the '@' before the file name, and that seems to be the reason for the job failure. 
When I try to use a remote file, the generated command line doesn't contain a reference to the input file at all, and the resultant error message reads "Error: Unable to read sequence '' ". Can you reproduce these, or did I mess something up during my attempt to apply the patch and reinstall? Many thanks in advance, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Thursday, September 20, 2012 6:00 PM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, When we were implementing the array representation of command lines we mistakenly added the input sequence file names to the command line array prepared on the client side. As you showed in your example these inputs are added to the command line on the server side using their final file names. It looks following changes fixes the problem in my 6.4.0 test server and current CVS code base client. Index: org/emboss/jemboss/gui/form/BuildJembossForm.java =================================================================== RCS file: /home/repository/emboss/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java,v retrieving revision 1.113 diff -u -r1.113 BuildJembossForm.java --- org/emboss/jemboss/gui/form/BuildJembossForm.java 29 Jun 2011 14:12:48 -0000 1.113 +++ org/emboss/jemboss/gui/form/BuildJembossForm.java 20 Sep 2012 21:28:14 -0000 @@ -1113,12 +1113,11 @@ fn = fn.trim(); - optionsA.add("-" + val); - optionsA.add(fn); - if(withSoap) options = filesForSoap(fn,options,val,filesToMove); else { + optionsA.add("-" + val); + optionsA.add(fn); fn = addQuote(fn); options = options.concat(" -" + val + " " + fn); } Can you please try applying the above change to your 6.4.0 installation. In 6.4.0 deleted optionsA.add() lines are the lines 1116 and 1117. 
Since jemboss has relatively complex installation mechanism, it might be easier if you apply this change to a freshly extracted tar ball and make a new installation. Make sure jawa web start doesn't use the cached version of the jemboss client but uses the updated version. Regards, Mahmut > 2. Open "seqret" in Jemboss, and use the "Browse files" option to select "C:\TEMP\sequences.fasta" as input. > > 3. Run seqret interactively. The Saved Results window that opens contains the following error messages: > Error: Failed to open filename 'C' > Error: Unable to read sequence 'C:\TEMP\sequences.fasta' > Died: seqret terminated: Bad value for '-sequence' with -auto defined > Looking in the subdirectory created for this job on the server side, it does contain a file "C__TEMP__sequences.fasta" with the correct contents. But the ".desc" file in that directory reads the following: > EMBOSS run details > > Application: seqret > -nofeature -sequence C:\TEMP\sequences.fasta -nofirstonly -auto C__TEMP_sequences.fasta > Started at Thu Sep 20 11_15_22 EDT 2012 > > Input files: > /usr/local/emboss/results/username/seqret_Thu_Sep_20_11_15_22_EDT_2012_1234/C__TEMP_sequences.fasta > It appears therefore that the command line, instead of using the path to the server-side copy of the input file, still uses the path to the file on the client. > From uludag at ebi.ac.uk Fri Sep 21 07:07:05 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Fri, 21 Sep 2012 12:07:05 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> Message-ID: <505C4A59.2050207@ebi.ac.uk> Hi Daniel, > supplying a remote file as input no longer works, and neither > does using list files (either local or remote). > > When I try to use a local list file, the file does make it to server > side, but the command line appearing in .desc doesn't have the '@' > before the file name, and that seems to be the reason for the job > failure. 
> > When I try to use a remote file, the generated command line doesn't > contain a reference to the input file at all, and the resultant error > message reads "Error: Unable to read sequence '' ". > > Can you reproduce these, or did I mess something up during my attempt > to apply the patch and reinstall? It was my mistake, apologies. When I first looked at the problem I noticed that possible changes might include or interfere with the handling of list files and remote files, but on my last look I forgot this point. I have now tried to fix it, but this required some changes on the server side as well. Can you download the latest versions of the following two files and test again? http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java?view=log http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/server/JembossServer.java?view=log (These two files have not changed much recently; as far as I can see they are happy to work as part of 6.4.0.) Regards, Mahmut From daniel.rozenbaum at USPTO.GOV Fri Sep 21 08:36:50 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 08:36:50 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <505C4A59.2050207@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: Hi Mahmut, No apologies necessary, and your prompt response is sincerely appreciated! I have rebuilt Jemboss.jar with these two new versions and everything seems to be working as expected. Many thanks, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Friday, September 21, 2012 7:07 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, > supplying a remote file as input no longer works, and neither > does using list files (either local or remote).
> > When I try to use a local list file, the file does make it to server > side, but the command line appearing in .desc doesn't have the '@' > before the file name, and that seems to be the reason for the job > failure. > > When I try to use a remote file, the generated command line doesn't > contain a reference to the input file at all, and the resultant error > message reads "Error: Unable to read sequence '' ". > > Can you reproduce these, or did I mess something up during my attempt > to apply the patch and reinstall? It was my mistake, apologies. When I initially looked to the problem I noticed possible changes may include/interfere-with handling of list-files and remote-files but in my last look I just forgot this point. I now have tried fixing it but this has required some changes on the server side as well. Can you download latest versions of the following two files and test again. http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java?view=log http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/server/JembossServer.java?view=log (These 2 files has not changed much recently, as far as i can see they are happy to work as part of 6.4.0) Regards, Mahmut From uludag at ebi.ac.uk Fri Sep 21 08:47:32 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Fri, 21 Sep 2012 13:47:32 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: <505C61E4.1070001@ebi.ac.uk> Hi Daniel, Thanks for the update. Good to hear that changes working as expected. 
Mahmut From daniel.rozenbaum at USPTO.GOV Fri Sep 21 10:13:39 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 10:13:39 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <505C4A59.2050207@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: Hi Mahmut, I just noticed that the most recent changes might have introduced the following glitch: server-side directories for Jemboss jobs now appear to be created under a path that is the concatenation of the paths "results.home" and "embossBin" in jemboss.properties. In other words, if results.home is /path/to/emboss_results/ and embossBin is /usr/local/emboss/6.4.0/bin/ , I'm now getting Jemboss job directories created under /path/to/emboss_results/usr/local/emboss/6.4.0/bin/ . Could you please let me know if you're able to reproduce this? Many thanks, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Friday, September 21, 2012 7:07 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, > supplying a remote file as input no longer works, and neither > does using list files (either local or remote). > > When I try to use a local list file, the file does make it to server > side, but the command line appearing in .desc doesn't have the '@' > before the file name, and that seems to be the reason for the job > failure. > > When I try to use a remote file, the generated command line doesn't > contain a reference to the input file at all, and the resultant error > message reads "Error: Unable to read sequence '' ". > > Can you reproduce these, or did I mess something up during my attempt > to apply the patch and reinstall? It was my mistake, apologies.
When I initially looked to the problem I noticed possible changes may include/interfere-with handling of list-files and remote-files but in my last look I just forgot this point. I now have tried fixing it but this has required some changes on the server side as well. Can you download latest versions of the following two files and test again. http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java?view=log http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/server/JembossServer.java?view=log (These 2 files has not changed much recently, as far as i can see they are happy to work as part of 6.4.0) Regards, Mahmut From uludag at ebi.ac.uk Fri Sep 21 10:17:50 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Fri, 21 Sep 2012 15:17:50 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: <505C770E.6000705@ebi.ac.uk> Hi Daniel, > server side directories for Jemboss jobs on the > server now appear to be created under the path that is a > concatenation of paths "results.home" and "embossBin" in > jemboss.properties. I just checked that I have the same problem in my Jemboss jobs folder. this should be easy to fix. I will return soon. Mahmut From uludag at ebi.ac.uk Fri Sep 21 10:58:58 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Fri, 21 Sep 2012 15:58:58 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: <505C80B2.1080800@ebi.ac.uk> Hi Daniel, > server side directories for Jemboss jobs on the > server now appear to be created under the path that is a > concatenation of paths "results.home" and "embossBin" I have checked in a fix for this problem in JembossServer class. In Soaplab and jdispatcher projects we don't hide the full path of the program executed. 
While working on the previous problem I thought we could do the same in Jemboss. Although I was not quite sure about it, I made that change anyway. Obviously I didn't make it in the correct way. I have now undone it, but we can add this feature properly if it is desirable. I have also checked in a fix in the BuildJembossForm class, as the recent fix also broke input through the copy/paste form. Mahmut From daniel.rozenbaum at USPTO.GOV Fri Sep 21 22:26:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 22:26:32 -0400 Subject: [EMBOSS] Jemboss refusing to work with files containing character ^L (form feed, 0x0c) Message-ID: Hi all, It appears that Jemboss objects to working with files containing the form feed character (ASCII code 0x0c, displayed in some text editors as "^L"). Jemboss refuses to open such files when I double-click on them in File Manager, and it also refuses to, for example, copy them from Remote to Local in File Manager. I'm attaching an example of a file that reproduces this problem in my environment. Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: text_with_linefeed.txt URL: From daniel.rozenbaum at USPTO.GOV Fri Sep 21 22:55:43 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 22:55:43 -0400 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS Message-ID: Greetings, We have run into the following problem: if the Jemboss results directory (results.home in jemboss.properties) is a directory on NFS, Jemboss is unable to delete job directories, either from Saved Results or from File Manager (in the case of the latter I first delete the files in the directory from the File Manager - no problem there). Here are the steps that reproduced the problem in our environment: 1. Run, say, "makeprotseq" in Jemboss. The resultant job directory contains files makeseq.fasta, .desc, and .finished. 2. Open File Manager, browse to the job directory in the Remote pane, and delete the file "makeseq.fasta". The file disappears, but when checking the contents of the directory on the server, it turns out that even though "makeseq.fasta" is no longer there, there's a file ".nfs0000000". Running "lsof" on the file reveals that it is being kept open by a "java" process corresponding to the Tomcat server. 3. Attempt to delete the directory from Jemboss File Manager. The directory disappears in File Manager view, but a check on the server side reveals that it's still there. Inside the directory, files .desc and .finished are gone but the file .nfs000000... is still there. After a "refresh" in Jemboss File Manager the directory reappears. It was our understanding that the appearance of these ".nfs00000..." files after the attempt to delete the result files was the standard behavior of NFS when there's an attempt to delete an open file.
People smarter than me were then able to resolve the problem by adding "in.close()" calls after the "while" loops in the following methods in JembossServer.java: show_acd, show_saved_result, list_saved_results, loadFilesContent (in two places), and update_result_status. We would really appreciate it if you could review these changes and confirm whether they're indeed correct and necessary. With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 From daniel.rozenbaum at USPTO.GOV Sun Sep 23 00:48:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 00:48:32 -0400 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results Message-ID: Hi, We currently have EMBOSS built and installed without PNG/GD support. When we run "pepwindow" and select "Jemboss Graphics" as the graph format, the file pepwindow1.dat is, from what we're able to tell, correctly created, and Jemboss displays the graph in the Saved Results window, from which it is possible to save the graph to a local PNG image file. However, subsequent attempts to view the saved "pepwindow" results fail: when the result is double-clicked in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat", but the window is blank. The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pane of File Manager. Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St.
Alexandria, VA 22314 From daniel.rozenbaum at USPTO.GOV Sun Sep 23 00:17:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 00:17:32 -0400 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: References: Message-ID: My apologies - the part of the previous email where I described the modifications we made that appear to have helped resolve this problem was completely incorrect. I'm attaching a version of JembossServer.java (originally from the 6.4.0 distribution) with our modifications preceded with comments starting with "// Biocceleration: " . To reiterate, these modifications were mostly guesswork - trying to identify possible places where an explicit close() could have helped resolve the issue. ________________________________________ From: emboss-bounces at lists.open-bio.org [emboss-bounces at lists.open-bio.org] On Behalf Of Rozenbaum, Daniel (Biocceleration Inc) Sent: Friday, September 21, 2012 10:55 PM To: emboss at lists.open-bio.org Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS Greetings, We had run into the following problem: if Jemboss results directory (results.home in jemboss.properties) is a directory on NFS, Jemboss is unable to delete job directories, either from Saved Results or from File Manager (in the case of the latter I first delete the files in the directory from the File Manager - no problem there). Here're the steps that reproduced the problem in our environment: 1. Run say "makeprotseq" in Jemboss. The resultant job directory contains files makeseq.fasta, .desc, and .finished. 2. Open File Manager, browse to the job directory in the Remote pane, and delete the file "makeseq.fasta". The file disappears, but when checking the contents of the directory on the server, it turns out that even though "makeseq.fasta" is no longer there, there's a file ".nfs0000000". 
Applying "lsof" on the file reveals that the it is being kept open by a "java" process corresponding to the Tomcat server. 3. Attempt to delete the directory from Jemboss File Manager. The directory disappears in File Manager view, but a check on the server side reveals that it's still there. Inside the directory, files .desc and .finished are gone but the file .nfs000000... is still there. After a "refresh" in Jemboss File Manager the directory reappears. It was our understanding that the appearance of these ".nfs00000..." files after the attempt to delete the result files was the standard behavior of NFS when there's an attempt to delete an open file. People smarter than me then were able to resolve the problem by adding calls "in.close()" after the "while" loops in the following methods in JembossServer.java show_acd show_saved_result list_saved_results loadFilesContent (in two places) update_result_status We would really appreciate if you could please review these changes and confirm whether they're indeed correct and necessary. With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss -------------- next part -------------- A non-text attachment was scrubbed... Name: JembossServer.java_added_close_attempts Type: application/octet-stream Size: 31747 bytes Desc: JembossServer.java_added_close_attempts URL: From uludag at ebi.ac.uk Sun Sep 23 04:50:23 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Sun, 23 Sep 2012 09:50:23 +0100 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: References: Message-ID: <505ECD4F.2020202@ebi.ac.uk> Hi Daniel, Thanks for the patch. I will apply your changes to the latest CVS version of the file. 
If I am not misreading your email, the problem may have persisted because we have more unclosed streams in other Jemboss classes. The ones in the JembossFileServer get/put_file methods would be the main reason for the .nfs files you reported. I remember seeing this problem a long time ago but didn't manage to find its cause; I think I probably blamed NFS and didn't spend time checking whether anything might be wrong in Jemboss. Mahmut > the part of the previous email where I described the modifications we made that appear to have helped resolve this problem was completely incorrect. I'm attaching a version of JembossServer.java (originally from the 6.4.0 distribution) with our modifications preceded with comments starting with "// Biocceleration: " . To reiterate, these modifications were mostly guesswork - trying to identify possible places where an explicit close() could have helped resolve the issue. From uludag at ebi.ac.uk Sun Sep 23 09:33:36 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Sun, 23 Sep 2012 14:33:36 +0100 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: References: Message-ID: <505F0FB0.7030208@ebi.ac.uk> Hi Daniel, I was able to produce .nfs files by following the steps you described. Closing the open streams in the JembossServer class looks like it fixes the problem. I don't understand why you thought your modifications were completely incorrect. I checked in your patch to CVS together with a fix to allow the Jemboss file manager to delete remote folders when they have hidden files like .desc and .finished.
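As a minimal sketch of the close-after-read pattern discussed in this thread (this is not the code in JembossServer.java; the class name ReadAndClose and the method readAll are hypothetical), a reader that is opened for a "while" loop can be wrapped in try-with-resources so that it is closed even if the loop throws. On NFS, closing the stream is what should allow a later delete to remove the file rather than leave a ".nfs..." placeholder behind.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadAndClose {
    // Hypothetical helper: reads a whole text file and always closes the reader.
    static String readAll(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources calls in.close() when the block exits,
        // whether the loop finishes normally or throws.
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Illustrative file name only; any small text file would do.
        Path tmp = Files.createTempFile("makeseq", ".fasta");
        Files.write(tmp, ">EMBOSS_001\nACGT\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(readAll(tmp));
        // The reader is already closed, so no handle keeps the file open here.
        Files.delete(tmp);
    }
}
```

On a Java release without try-with-resources, the same effect needs an explicit in.close() in a finally block after the loop.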
Mahmut From uludag at ebi.ac.uk Sun Sep 23 11:44:03 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Sun, 23 Sep 2012 16:44:03 +0100 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results In-Reply-To: References: Message-ID: <505F2E43.7020700@ebi.ac.uk> Hi Daniel, > subsequent attempts to view the saved "pepwindow" results fail: when double-clicked on in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat" in the tab , but the window is blank. I'm not able to reproduce it with J/Emboss 6.5 client. I will later check with J/Emboss 6.4. > The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pan of File Manager. I have tried updating the ShowResultSet class in package org.emboss.jemboss.gui to fix this problem. It looks working for me. > Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? No. After above modification I'm able to see the images displayed. I will next try to understand the string-encoding related problem you posted. Mahmut From daniel.rozenbaum at USPTO.GOV Sun Sep 23 13:52:35 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 13:52:35 -0400 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results In-Reply-To: <505F2E43.7020700@ebi.ac.uk> References: , <505F2E43.7020700@ebi.ac.uk> Message-ID: Hi Mahmut, Fantastic, thanks! After rebuilding with the most recent ShowResultSet.java, double-clicking on a pepwindow saved result, or on pepwindow1.dat in File Manager now displays the graph as expected. 
With best regards, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Sunday, September 23, 2012 11:44 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Jemboss: accessing saved pepwindow results Hi Daniel, > subsequent attempts to view the saved "pepwindow" results fail: when double-clicked on in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat" in the tab , but the window is blank. I'm not able to reproduce it with J/Emboss 6.5 client. I will later check with J/Emboss 6.4. > The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pan of File Manager. I have tried updating the ShowResultSet class in package org.emboss.jemboss.gui to fix this problem. It looks working for me. > Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? No. After above modification I'm able to see the images displayed. I will next try to understand the string-encoding related problem you posted. Mahmut From daniel.rozenbaum at USPTO.GOV Sun Sep 23 13:56:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 13:56:32 -0400 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: <505F0FB0.7030208@ebi.ac.uk> References: , <505F0FB0.7030208@ebi.ac.uk> Message-ID: Hi Mahmut, Sorry about that confusion - the file I attached to my previous email in this thread did contain the correct fixes. After rebuilding with the latest version of JembossServer.java, everything appears to be working properly. 
-Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Sunday, September 23, 2012 9:33 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS Hi Daniel, I was able to produce .nfs files by following the steps you described. Closing the open streams in JembossServer class looks fixes the problem. I don't understand why you thought your modifications were completely incorrect? I checked in you patch to CVS together with a fix to allow jemboss file manager to delete remote folders when they have hidden files like .desc and .finished. Mahmut From daniel.rozenbaum at USPTO.GOV Mon Sep 24 12:13:56 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Mon, 24 Sep 2012 12:13:56 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <505C80B2.1080800@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk> Message-ID: Hi Mahmut, After re-building with the latest versions of gui/form/BuildJembossForm.java (1.116) and server/JembossServer.java (1.47) I'm running into a problem with adding ":sequence_name" to local files. It is reproducible in my environment using the following steps: 1. Run makeprotseq to generate say 10 sequences 2. Call up seqret and drag-and-drop the file "makeseq.fasta" generated at the previous step from the File Manager window to the "Sequence Filename" field of seqret. 3. Append ":EMBOSS_003". So the full string in the "Sequence Filename" field looks like /path/emboss/results/username/makeprotseq_Mon_Sep_24_12_04_32_EDT_2012_51429/makeseq.fasta:EMBOSS_003 4. Execute seqret. Everything works as expected. 5. Drag-and-drop the file "makeseq.fasta" to a local folder 6. Drag-and-drop the file from the Local pane to seqret input, and add the same sequence id. 
So the full string in the "Sequence Filename" field in my case (a Windows client) looks something like H:\tmp\makeseq.fasta:EMBOSS_003 7. When seqret is run, the following error message appears: Error: Failed to open filename 'H' Error: Unable to read sequence 'H:\tmp\makeseq.fasta:EMBOSS_003' Died: seqret terminated: Bad value for '-sequence' with -auto defined Would you be able to look into this? Many thanks, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Friday, September 21, 2012 10:58 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, > server side directories for Jemboss jobs on the > server now appear to be created under the path that is a > concatenation of paths "results.home" and "embossBin" I have checked in a fix for this problem in JembossServer class. In Soaplab and jdispatcher projects we don't hide the full path of the program executed. While working on the previous problem I thought we can do the same in Jemboss. Although I was not quite sure with it I just made that change. Obviously I didn't made it in the correct way. I now have undone it but we can add this feature properly if it is desirable. I have also checked in a fix in BuildJembossForm class as the recent fix did also broke the inputs through copy/paste form. Mahmut From daniel.rozenbaum at USPTO.GOV Mon Sep 24 16:07:22 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Mon, 24 Sep 2012 16:07:22 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>, Message-ID: Mahmut, please disregard my previous email - I must have made a mistake in my rebuild. After rebuilding more carefully again, the problem seems to have disappeared. My apologies for the confusion. 
________________________________________ From: Rozenbaum, Daniel (Biocceleration Inc) Sent: Monday, September 24, 2012 12:13 PM To: Mahmut Uludag Cc: emboss at lists.open-bio.org Subject: RE: [EMBOSS] Handling of local file input in Jemboss Hi Mahmut, After re-building with the latest versions of gui/form/BuildJembossForm.java (1.116) and server/JembossServer.java (1.47) I'm running into a problem with adding ":sequence_name" to local files. It is reproducible in my environment using the following steps: 1. Run makeprotseq to generate say 10 sequences 2. Call up seqret and drag-and-drop the file "makeseq.fasta" generated at the previous step from the File Manager window to the "Sequence Filename" field of seqret. 3. Append ":EMBOSS_003". So the full string in the "Sequence Filename" field looks like /path/emboss/results/username/makeprotseq_Mon_Sep_24_12_04_32_EDT_2012_51429/makeseq.fasta:EMBOSS_003 4. Execute seqret. Everything works as expected. 5. Drag-and-drop the file "makeseq.fasta" to a local folder 6. Drag-and-drop the file from the Local pane to seqret input, and add the same sequence id. So the full string in the "Sequence Filename" field in my case (a Windows client) looks something like H:\tmp\makeseq.fasta:EMBOSS_003 7. When seqret is run, the following error message appears: Error: Failed to open filename 'H' Error: Unable to read sequence 'H:\tmp\makeseq.fasta:EMBOSS_003' Died: seqret terminated: Bad value for '-sequence' with -auto defined Would you be able to look into this? 
Many thanks, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Friday, September 21, 2012 10:58 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, > server side directories for Jemboss jobs on the > server now appear to be created under the path that is a > concatenation of paths "results.home" and "embossBin" I have checked in a fix for this problem in JembossServer class. In Soaplab and jdispatcher projects we don't hide the full path of the program executed. While working on the previous problem I thought we can do the same in Jemboss. Although I was not quite sure with it I just made that change. Obviously I didn't made it in the correct way. I now have undone it but we can add this feature properly if it is desirable. I have also checked in a fix in BuildJembossForm class as the recent fix did also broke the inputs through copy/paste form. Mahmut From uludag at ebi.ac.uk Mon Sep 24 16:17:03 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Mon, 24 Sep 2012 21:17:03 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>, Message-ID: <5060BFBF.7090608@ebi.ac.uk> > please disregard my previous email - I must have made a mistake in my rebuild. After rebuilding more carefully again, the problem seems to have disappeared. My apologies for the confusion. Confused again. I was able to reproduce the problem following the steps you described and just checked in a fix in BuildJembossForm class. Have you tried updating from CVS before your last rebuild? Since your email is about 4 minutes later than I checked in above change to CVS. 
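The error above ("Failed to open filename 'H'") is what one would expect if the input string is split at the first colon, so that the Windows drive letter is taken as the file name. The following is only a sketch of one way to avoid that, not the fix that was checked in to BuildJembossForm; the class LocalUsaSplit and its split method are hypothetical, and it handles only the simple "file:entry" form rather than the full USA syntax.

```java
public class LocalUsaSplit {
    /**
     * Splits "file:entry" into {file, entry}. The entry is "" when no
     * ":entry" suffix is present. A leading Windows drive prefix such as
     * "H:\" or "H:/" is not treated as a separator.
     */
    static String[] split(String usa) {
        int start = 0;
        if (usa.length() >= 3
                && Character.isLetter(usa.charAt(0))
                && usa.charAt(1) == ':'
                && (usa.charAt(2) == '\\' || usa.charAt(2) == '/')) {
            start = 2; // begin searching after the drive colon
        }
        int colon = usa.indexOf(':', start);
        if (colon < 0) {
            return new String[] { usa, "" };
        }
        return new String[] { usa.substring(0, colon), usa.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] win = split("H:\\tmp\\makeseq.fasta:EMBOSS_003");
        System.out.println(win[0] + " | " + win[1]);
        String[] unix = split("/path/makeseq.fasta:EMBOSS_003");
        System.out.println(unix[0] + " | " + unix[1]);
    }
}
```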
Mahmut From daniel.rozenbaum at USPTO.GOV Mon Sep 24 22:18:34 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Mon, 24 Sep 2012 22:18:34 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <5060BFBF.7090608@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>, , <5060BFBF.7090608@ebi.ac.uk> Message-ID: Hi Mahmut, Thank you for being able to find the kernel of reason in my continued attempts to confuse this discussion :-) I rebuilt everything again with today's fix in BuildJembossForm, and everything seems to be working fine. With best regards, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Monday, September 24, 2012 4:17 PM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss > please disregard my previous email - I must have made a mistake in my rebuild. After rebuilding more carefully again, the problem seems to have disappeared. My apologies for the confusion. Confused again. I was able to reproduce the problem following the steps you described and just checked in a fix in BuildJembossForm class. Have you tried updating from CVS before your last rebuild? Since your email is about 4 minutes later than I checked in above change to CVS. 
Mahmut From daniel.rozenbaum at USPTO.GOV Tue Sep 25 07:48:35 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Tue, 25 Sep 2012 07:48:35 -0400 Subject: [EMBOSS] protein three-to-one and one-to-three Message-ID: Hi, Our users are interested in having access to "three-to-one" and "one-to-three" amino acid representation conversion utilities similar to these: http://bioinformatics.org/sms2/three_to_one.html http://bioinformatics.org/sms2/one_to_three.html From what I've been able to tell, the latter is achievable with "showpep -three" (even though the users would have preferred a horizontal representation of the three-letter codes), and the only relevant library function currently available in EMBOSS is the embPropCharToThree() ; is this correct? Just wanted to make sure that I'm not missing anything if we decide to take a shot at developing such utilities within the EMBOSS framework. With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St.
Alexandria, VA 22314 From ricepeterm at yahoo.co.uk Tue Sep 25 08:14:40 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Tue, 25 Sep 2012 13:14:40 +0100 Subject: [EMBOSS] protein three-to-one and one-to-three In-Reply-To: References: Message-ID: <5061A030.5010809@yahoo.co.uk> Dear Daniel, On 25/09/2012 12:48, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi, > > Our users are interested in having access to "three-to-one" and "one-to-three" amino acid representation conversion utilities similar to these: > > http://bioinformatics.org/sms2/three_to_one.html > http://bioinformatics.org/sms2/one_to_three.html > > From what I've been able to tell, the latter is achievable with "showpep -three" (even though the users would have preferred a horizontal representation of the three-letter codes), and the only relevant library function currently available in EMBOSS is the embPropCharToThree() ; is this correct? Much discussed in the early days, when we decided to offer 3-to-1 for those who found the letters hard to read, but not the reverse direction, because a 3-letter protein sequence can also be a valid 3x longer 1-letter protein sequence. > Just wanted to make sure that I'm not missing anything if we decided > to take a shot at developing such utilities within the EMBOSS > framework. Easy enough to develop, but I would suggest using some output format that makes it clear it is not a standard sequence format - otherwise EMBOSS (e.g. seqret) will try to read it as some other format and claim success. If your 3-letter output is unreadable then you can safely try implementing it as an input format.
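To illustrate the ambiguity mentioned above, here is a small sketch (not EMBOSS code; the class ThreeLetterCodes and its methods are hypothetical, and only the 20 standard amino acids are covered). It converts one-letter codes to concatenated three-letter codes and then shows that the result, once upper-cased, still passes as a plausible one-letter protein sequence, which is why a reader could silently accept it as the wrong format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ThreeLetterCodes {
    private static final Map<Character, String> ONE_TO_THREE = new LinkedHashMap<>();
    static {
        String[][] table = {
            {"A", "Ala"}, {"R", "Arg"}, {"N", "Asn"}, {"D", "Asp"}, {"C", "Cys"},
            {"Q", "Gln"}, {"E", "Glu"}, {"G", "Gly"}, {"H", "His"}, {"I", "Ile"},
            {"L", "Leu"}, {"K", "Lys"}, {"M", "Met"}, {"F", "Phe"}, {"P", "Pro"},
            {"S", "Ser"}, {"T", "Thr"}, {"W", "Trp"}, {"Y", "Tyr"}, {"V", "Val"}
        };
        for (String[] row : table) {
            ONE_TO_THREE.put(row[0].charAt(0), row[1]);
        }
    }

    // Converts one-letter codes to concatenated three-letter codes.
    static String oneToThree(String seq) {
        StringBuilder sb = new StringBuilder();
        for (char c : seq.toCharArray()) {
            String three = ONE_TO_THREE.get(Character.toUpperCase(c));
            if (three == null) {
                throw new IllegalArgumentException("Unknown residue: " + c);
            }
            sb.append(three);
        }
        return sb.toString();
    }

    // True if every character is one of the 20 standard one-letter codes.
    static boolean isValidOneLetter(String seq) {
        for (char c : seq.toCharArray()) {
            if (!ONE_TO_THREE.containsKey(Character.toUpperCase(c))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String three = oneToThree("AG");
        System.out.println(three);
        // The three-letter text is itself a valid one-letter sequence.
        System.out.println(isValidOneLetter(three));
    }
}
```

A distinct output format, for example one that separates the codes with a delimiter that is not a residue letter, would avoid this kind of misreading.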
regards, Peter Rice From daniel.rozenbaum at USPTO.GOV Tue Sep 25 17:04:38 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Tue, 25 Sep 2012 17:04:38 -0400 Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg Message-ID: Hi everyone, It looks like it isn't possible to specify a GCG findpatterns style pattern "(GSG){1,10}" ("GCG" repeating 1 to 10 times) in fuzznuc or fuzzpro, is it? Is dreg/preg the appropriate alternative here? Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 From ricepeterm at yahoo.co.uk Wed Sep 26 05:54:54 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 26 Sep 2012 10:54:54 +0100 Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg In-Reply-To: References: Message-ID: <5062D0EE.5060907@yahoo.co.uk> Dear Daniel, On 25/09/2012 22:04, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi everyone, > > It looks like it isn't possible to specify a GCG findpatterns style pattern "(GSG){1,10}" ("GCG" repeating 1 to 10 times) in fuzznuc or fuzzpro, is it? It is not possible, brackets are not allowed in fuzz* patterns. The repeats are for single residues or bases, usually used for unknowns to give a variable gap between known residues or bases (e.g. for protein active sites). > Is dreg/preg the appropriate alternative here? Yes. This example is one of the reasons we wrote them. Hope this helps, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 26 07:57:28 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 26 Sep 2012 07:57:28 -0400 Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg In-Reply-To: <5062D0EE.5060907@yahoo.co.uk> References: , <5062D0EE.5060907@yahoo.co.uk> Message-ID: Dear Peter, Great, thanks! 
One quick follow-up question: is it possible to request dreg/preg to report just those sequences where a match is found? Fuzznuc/pro seem to be working like that, but I haven't been able to figure out how to achieve that with dreg/preg. With best regards, Daniel ________________________________________ From: Peter Rice [ricepeterm at yahoo.co.uk] Sent: Wednesday, September 26, 2012 5:54 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg Dear Daniel, On 25/09/2012 22:04, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi everyone, > > It looks like it isn't possible to specify a GCG findpatterns style pattern "(GSG){1,10}" ("GCG" repeating 1 to 10 times) in fuzznuc or fuzzpro, is it? It is not possible, brackets are not allowed in fuzz* patterns. The repeats are for single residues or bases, usually used for unknowns to give a variable gap between known residues or bases (e.g. for protein active sites). > Is dreg/preg the appropriate alternative here? Yes. This example is one of the reasons we wrote them. Hope this helps, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 26 08:40:23 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 26 Sep 2012 13:40:23 +0100 Subject: [EMBOSS] Fwd: Re: Repetition pattern in fuzznuc/fuzzpro or dreg/preg In-Reply-To: <5062F70C.5090601@yahoo.co.uk> References: <5062F70C.5090601@yahoo.co.uk> Message-ID: <5062F7B7.1080506@yahoo.co.uk> On 26/09/2012 12:57, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Dear Peter, > > Great, thanks! One quick follow-up question: is it possible to request dreg/preg to report just those sequences where a match is found? Fuzznuc/pro seem to be working like that, but I haven't been able to figure out how to achieve that with dreg/preg. When any program writes a report, one option is to ask for it in -rformat listfile which reports the USAs of the matches. 
Good point though about the difference between outputs, something nobody had pointed out before. The fuzz programs came first so they should be the 'standard' and for the next release we will modify dreg and preg to only report sequences where a match is found. regards, Peter Rice EMBOSS Team From maoj at helix.nih.gov Tue Sep 4 15:30:33 2012 From: maoj at helix.nih.gov (Jean Mao) Date: Tue, 4 Sep 2012 11:30:33 -0400 Subject: [EMBOSS] Error in Emboss Explorer after updated EMBOSS to 6.5.7 Message-ID: <50461E99.3040805@helix.nih.gov> Hi, I know this list is for EMBOSS, not emboss explorer. However, I am not sure where to look for help with emboss explorer anymore so I am trying my luck here. Really appreciate if someone can point me to the right direction. Recently I updated emboss to 6.5.7 from 6.3.1. Emboss Explorer was functional before I switched the link. Once the link is switched to point to 6.5.7, the main manu panel on the left of the webpage is comletely messed up. In stead of showing categories and apps under each category, now there is only categories showing and no apps that one can click and load. The most current emboss explorer I can find is 2.2.0 and was out since 2006. Regards, Jean From forrest.bao at gmail.com Mon Sep 10 18:11:57 2012 From: forrest.bao at gmail.com (Forrest Sheng Bao) Date: Mon, 10 Sep 2012 13:11:57 -0500 Subject: [EMBOSS] any APIs for vectorstrip or EMBOSS Message-ID: Hi all, I am writing a program in which vectorstrip is used to remove vectors. I wanna be able to call the functions in vectorstrips from my program. Does vectorstrips or EMBOSS have a set of APIs that I can use? Cheers, Forrest From paul.tanger at colostate.edu Thu Sep 13 20:24:47 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 14:24:47 -0600 Subject: [EMBOSS] specify primer3 directory? 
Message-ID: Hi, I have a local install of emboss and I'm trying to get eprimer3 to work, but I'm getting this error: "Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified" primer3 is installed, but not in the default location (which I think is /opt/primer3_config ?) . How do I specify where primer3 is installed? Or is the cause of this error something else? googled for an answer for a while, but couldn't find one. Thanks! From paul.tanger at colostate.edu Thu Sep 13 21:58:47 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 15:58:47 -0600 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Message-ID: Thanks, maybe this is the problem but the solution you suggest doesn't seem to work because that is a primer3 qualifier not an eprimer3 qualifier. I get this error: [paultanger at bspmgenomics bin]$ ./eprimer3 ~/QTL-project/30scaffolds_affyMAI_CG.fsa ~/QTL-project/test2 --default_version=1 Died: Unknown qualifier --default_version=1 On Thu, Sep 13, 2012 at 3:49 PM, Scott Markel wrote: > Paul, > > You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. > > You can revert to the old defaults using this command-line argument. > > --default_version=1 > > Scott > > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications and Communications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > -----Original Message----- > From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger > Sent: Thursday, 13 September 13 2012 1:25 PM > To: emboss at lists.open-bio.org > Subject: [EMBOSS] specify primer3 directory? > > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > From Scott.Markel at accelrys.com Thu Sep 13 22:01:55 2012 From: Scott.Markel at accelrys.com (Scott Markel) Date: Thu, 13 Sep 2012 15:01:55 -0700 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Message-ID: <5ACBA19439E77B43A06F4CAB897EC977068A51519E@EXCH1-COLO.accelrys.net> Paul, Yup, sorry about that. I should have stopped my answer after the paragraph about the primer3 change. 
Scott -----Original Message----- From: ptanger at rams.colostate.edu [mailto:ptanger at rams.colostate.edu] On Behalf Of Paul Tanger Sent: Thursday, 13 September 13 2012 2:59 PM To: Scott Markel Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] specify primer3 directory? Thanks, maybe this is the problem but the solution you suggest doesn't seem to work because that is a primer3 qualifier not an eprimer3 qualifier. I get this error: [paultanger at bspmgenomics bin]$ ./eprimer3 ~/QTL-project/30scaffolds_affyMAI_CG.fsa ~/QTL-project/test2 --default_version=1 Died: Unknown qualifier --default_version=1 On Thu, Sep 13, 2012 at 3:49 PM, Scott Markel wrote: > Paul, > > You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. > > You can revert to the old defaults using this command-line argument. > > --default_version=1 > > Scott > > Scott Markel, Ph.D. > Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications and Communications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > -----Original Message----- > From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger > Sent: Thursday, 13 September 13 2012 1:25 PM > To: emboss at lists.open-bio.org > Subject: [EMBOSS] specify primer3 directory? 
> > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > From Scott.Markel at accelrys.com Thu Sep 13 21:49:54 2012 From: Scott.Markel at accelrys.com (Scott Markel) Date: Thu, 13 Sep 2012 14:49:54 -0700 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: Message-ID: <5ACBA19439E77B43A06F4CAB897EC977068A515199@EXCH1-COLO.accelrys.net> Paul, You might want to have a look at section 5 of the primer3 documentation ("CHANGES FROM VERSION 2.2.3"). They changed the default for PRIMER_THERMODYNAMIC_ALIGNMENT from 0 to 1. If this is left at 1, then you also need to supply a value for PRIMER_THERMODYNAMIC_PARAMETERS_PATH. You can revert to the old defaults using this command-line argument. --default_version=1 Scott Scott Markel, Ph.D. Principal Bioinformatics Architect? email:? smarkel at accelrys.com Accelrys (Pipeline Pilot R&D)?????? mobile: +1 858 205 3653 10188 Telesis Court, Suite 100????? voice:? +1 858 799 5603 San Diego, CA 92121???????????????? fax:??? +1 858 799 5222 USA???????????????????????????????? web:??? http://www.accelrys.com http://www.linkedin.com/in/smarkel Secretary, Board of Directors: ??? 
International Society for Computational Biology Chair: ISCB Publications and Communications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics -----Original Message----- From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Paul Tanger Sent: Thursday, 13 September 13 2012 1:25 PM To: emboss at lists.open-bio.org Subject: [EMBOSS] specify primer3 directory? Hi, I have a local install of emboss and I'm trying to get eprimer3 to work, but I'm getting this error: "Error: thermodynamic approach chosen, but path to thermodynamic parameters not specified" primer3 is installed, but not in the default location (which I think is /opt/primer3_config ?) . How do I specify where primer3 is installed? Or is the cause of this error something else? googled for an answer for a while, but couldn't find one. Thanks! _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss From paul.tanger at colostate.edu Thu Sep 13 23:03:27 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Thu, 13 Sep 2012 17:03:27 -0600 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: <73B93C239077ED49B5302828A2076E351838AC2F@PHSX10MB15.partners.org> References: <73B93C239077ED49B5302828A2076E351838AC2F@PHSX10MB15.partners.org> Message-ID: That worked - Thanks so much! On Thu, Sep 13, 2012 at 4:37 PM, Drummond, Iain A. wrote: > I ran into (I think) similar problems running primer3 with Emboss eprimer3. > > It turned out that the most recent version of primer3 (2.x.x) will not run > with the current Emboss eprimer3; you have to downgrade primer3 to 1.1.4 to > get it to work. > > The Emboss folks know this I think. > > -Iain > > > ------- > Iain Drummond, Ph.D. 
> Associate Professor > Nephrology Division, Massachusetts General Hospital, > Department of Genetics, Harvard Medical School and > Program in Developmental and Regenerative Biology, Harvard Medical School > > Address for mailing: > > Nephrology Division, MGH > 149 13th Street, Rm 149-8000 > Charlestown MA 02129 > > 617 726 5647 (office) > 617 724 9693 (lab) > 617 726 5669 (fax) > > idrummon at receptor.mgh.harvard.edu > idrummond at partners.org > > HTTP://danio.mgh.harvard.edu > > > On 9/13/12 4:24 PM, "Paul Tanger" wrote: > >> Hi, >> I have a local install of emboss and I'm trying to get eprimer3 to >> work, but I'm getting this error: >> >> "Error: thermodynamic approach chosen, but path to thermodynamic >> parameters not specified" >> >> primer3 is installed, but not in the default location (which I think >> is /opt/primer3_config ?) . >> How do I specify where primer3 is installed? Or is the cause of this >> error something else? >> >> googled for an answer for a while, but couldn't find one. >> >> Thanks! >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > > > The information in this e-mail is intended only for the person to whom it is > addressed. If you believe this e-mail was sent to you in error and the e-mail > contains patient information, please contact the Partners Compliance HelpLine at > http://www.partners.org/complianceline . If the e-mail was sent to you in error > but does not contain patient information, please contact the sender and properly > dispose of the e-mail. > From ppetrov at mail.student.oulu.fi Fri Sep 14 08:06:47 2012 From: ppetrov at mail.student.oulu.fi (Petar Petrov) Date: Fri, 14 Sep 2012 11:06:47 +0300 Subject: [EMBOSS] specify primer3 directory? In-Reply-To: References: Message-ID: <20120914110647.sacof6lkgscookc0@webmail.oulu.fi> Hi Paul, which version of primer3 do you use? What operating system? 
How did you install primer3 and EMBOSS? If it is primer3 version 2.* I think you should use the eprimer32 wrapper instead and it expects primer3_core to be called primer32_core. To make primer3 aware of another location of the primer3_config directory: it seems that two files need to be modified before compiling: thal_main.c and primer3_boulder_main.c (in the subfolder src). Hope this helps. regards, Petar Quoting Paul Tanger : > Hi, > I have a local install of emboss and I'm trying to get eprimer3 to > work, but I'm getting this error: > > "Error: thermodynamic approach chosen, but path to thermodynamic > parameters not specified" > > primer3 is installed, but not in the default location (which I think > is /opt/primer3_config ?) . > How do I specify where primer3 is installed? Or is the cause of this > error something else? > > googled for an answer for a while, but couldn't find one. > > Thanks! > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From daniel.rozenbaum at USPTO.GOV Fri Sep 14 13:08:36 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 14 Sep 2012 09:08:36 -0400 Subject: [EMBOSS] ajSeqxrefNewDbS Message-ID: Hi, What is the meaning of the messages "ajSeqxrefNewDbS" that occasionally appear in the output of EMBOSS utilities? They're even mentioned in the documentation pages, e.g. http://emboss.sourceforge.net/apps/release/6.4/emboss/apps/oddcomp.html (the last line in the quote below): % oddcomp Identify proteins with specified sequence word composition Input protein sequence(s): tsw:* Program compseq output file: oddcomp.comp Window size to consider (e.g. 30 aa) [30]: Output file [12s1_arath.oddcomp]: out.odd ajSeqxrefNewDbS '1-I' 'FT025' Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St.
Alexandria VA 22314 From daniel.rozenbaum at USPTO.GOV Fri Sep 14 12:56:14 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 14 Sep 2012 08:56:14 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format Message-ID: Hello Peter and everyone, I was wondering if I could revive the discussion about the support of IG format if possible. I'm helping deploy EMBOSS at the US Patent and Trademark Office, where this format, in its multi-line sequence annotation form, is used extensively. Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS: % makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname ig1 % cat ig1.ig ;, 10 bases EMBOSS_001 hcsptpstas1 ;, 10 bases EMBOSS_002 rdgwcvmtrm1 ;, 10 bases EMBOSS_003 fgtifgdgid1 % entret -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout ;, 10 bases EMBOSS_001 hcsptpstas1 ;, 10 bases In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria VA 22314 ------------------------- On 15/08/2012 17:57, Daniel Rozenbaum wrote: > Dear list, > > (Peter, many thanks for your prompt reply to my previous inquiry!) > > We need to deal with extensive databases in Intelligenetics format with multiple lines in annotation of each record. 
It appears however that EMBOSS concatenates all annotation lines into a single line when building its internal representation of the sequence description: > > % cat /tmp/IGSEQ.ig > ; Annotation line 1 > ; Annotation line 2 > ; Annotation line 3 > IGSEQ > ACGCATCGCATCAGACTACGC1 > > > % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig -osdirectory /tmp > > > % cat /tmp/IGSEQ.emboss_ig2ig.ig > ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases > IGSEQ > ACGCATCGCATCAGACTACGC1 > > Are there any plans to support multi-line annotation in this format? Interesting thought. We will take a look. It will need some care to maintain compatibility with other formats that have single (FASTA) or multiple (swissprot) descriptions. Which package is using this IG format? regards, Peter Rice EMBOSS Team From paul.tanger at colostate.edu Fri Sep 14 19:08:25 2012 From: paul.tanger at colostate.edu (Paul Tanger) Date: Fri, 14 Sep 2012 13:08:25 -0600 Subject: [EMBOSS] specify primer3 directory? Message-ID: This works as well. I didn't look at eprimer32 because for some reason I thought the "32" was a 32 bit version or something.. Turns out eprimer32 is built to work with 2.x versions of primer3. As Petar suggested, for a non default install of primer3 you need to modify some files before everything will work. 
To provide more detail in case others are interested, in thal_main.c you need to modify lines 306-307:

    } else if ((stat("/opt/primer3_config", &st) == 0) && S_ISDIR(st.st_mode)) {
      tmp_ret = get_thermodynamic_values("/opt/primer3_config/", &o);

and in primer3_boulder_main.c you need to modify lines 517-521:

    } else if ((stat("/opt/primer3_config", &st) == 0) && S_ISDIR(st.st_mode)) {
      thermodynamic_params_path = (char*) malloc(strlen("/opt/primer3_config/") * sizeof(char) + 1);
      if (NULL == thermodynamic_params_path) exit (-2); /* Out of memory */
      strcpy(thermodynamic_params_path, "/opt/primer3_config/");

These changes apply to unix - I saw different places in those files that would need to be changed if you were doing this in windows. Has anyone got emboss and eprimer32 working and installed in galaxy? Or a similar primer design tool in galaxy? Thanks for the help everyone. Message: 6 Date: Fri, 14 Sep 2012 11:06:47 +0300 From: Petar Petrov Subject: Re: [EMBOSS] specify primer3 directory? To: emboss at lists.open-bio.org Message-ID: <20120914110647.sacof6lkgscookc0 at webmail.oulu.fi> Content-Type: text/plain; charset=windows-1251; DelSp="Yes"; format="flowed" Hi Paul, which version of primer3 do you use? What operating system? How did you install primer3 and EMBOSS? If it is primer3 version 2.* I think you should use eprimer32 wrapper instead and it expects primer3_core to be called primer32_core. To make primer3 aware of another location of the primer3_config directory: it seems that two files need to be modified before compiling: thal_main.c and primer3_boulder_main.c (in the subfolder src). Hope this helps.
regards, Petar From daniel.rozenbaum at USPTO.GOV Tue Sep 18 02:00:58 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Mon, 17 Sep 2012 22:00:58 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> Greetings again, If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 On Sep 14, 2012, at 9:36 AM, "Rozenbaum, Daniel (Biocceleration Inc)" wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG format if possible. I'm helping deploy EMBOSS at the US Patent and Trademark Office, where this format, in its multi-line sequence annotation form, is used extensively. > > Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS: > > % makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname ig1 > > % cat ig1.ig > ;, 10 bases > EMBOSS_001 > hcsptpstas1 > ;, 10 bases > EMBOSS_002 > rdgwcvmtrm1 > ;, 10 bases > EMBOSS_003 > fgtifgdgid1 > > > % entret -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout > ;, 10 bases > EMBOSS_001 > hcsptpstas1 > ;, 10 bases > > In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. > > Many thanks, > Daniel > -- > Daniel Rozenbaum > Biocceleration, Inc. > OCIO/ Office of Application Engineering & Development/ Patent System Division > 600 Dulany St. > Alexandria VA 22314 > > ------------------------- > On 15/08/2012 17:57, Daniel Rozenbaum wrote: >> Dear list, >> >> (Peter, many thanks for your prompt reply to my previous inquiry!) 
>> >> We need to deal with extensive databases in Intelligenetics format with multiple lines in annotation of each record. It appears however that EMBOSS concatenates all annotation lines into a single line when building its internal representation of the sequence description: >> >> % cat /tmp/IGSEQ.ig >> ; Annotation line 1 >> ; Annotation line 2 >> ; Annotation line 3 >> IGSEQ >> ACGCATCGCATCAGACTACGC1 >> >> >> % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig -osdirectory /tmp >> >> >> % cat /tmp/IGSEQ.emboss_ig2ig.ig >> ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases >> IGSEQ >> ACGCATCGCATCAGACTACGC1 >> >> Are there any plans to support multi-line annotation in this format? > > Interesting thought. We will take a look. It will need some care to > maintain compatibility with other formats that have single (FASTA) or > multiple (swissprot) descriptions. > > Which package is using this IG format? > > regards, > > Peter Rice > EMBOSS Team > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From p.j.a.cock at googlemail.com Tue Sep 18 08:25:11 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 18 Sep 2012 09:25:11 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: On Fri, Sep 14, 2012 at 1:56 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG > format if possible. I'm helping deploy EMBOSS at the US Patent and > Trademark Office, where this format, in its multi-line sequence annotation > form, is used extensively. Hi Daniel, That is interesting to know - I work on Biopython, which has support for reading and indexing the Intelligenetics "ig" format. 
I'd been under the impression that this was a defunct/unused file format (and therefore never bothered to implement support for writing it in Biopython). Does the US Patent and Trademark Office provide datasets to the public in this format? Thanks, Peter C. From daniel.rozenbaum at USPTO.GOV Tue Sep 18 11:42:53 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Tue, 18 Sep 2012 07:42:53 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: , Message-ID: Hi Peter, I don't believe the USPTO provides datasets to the public in the IG format. With best regards, Daniel ________________________________________ From: Peter Cock [p.j.a.cock at googlemail.com] Sent: Tuesday, September 18, 2012 4:25 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format On Fri, Sep 14, 2012 at 1:56 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > I was wondering if I could revive the discussion about the support of IG > format if possible. I'm helping deploy EMBOSS at the US Patent and > Trademark Office, where this format, in its multi-line sequence annotation > form, is used extensively. Hi Daniel, That is interesting to know - I work on Biopython, which has support for reading and indexing the Intelligenetics "ig" format. I'd been under the impression that this was a defunct/unused file format (and therefore never bothered to implement support for writing it in Biopython). Does the US Patent and Trademark Office provide datasets to the public in this format? Thanks, Peter C. 
From p.j.a.cock at googlemail.com Tue Sep 18 12:20:09 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 18 Sep 2012 13:20:09 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 12:42 PM, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi Peter, > > I don't believe the USPTO provides datasets to the public in the IG format. > > With best regards, > Daniel OK, thanks. Peter From ricepeterm at yahoo.co.uk Wed Sep 19 10:06:53 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:06:53 +0100 Subject: [EMBOSS] any APIs for vectorstrip or EMBOSS In-Reply-To: References: Message-ID: <5059993D.1030604@yahoo.co.uk> Dear Forrest, > I am writing a program in which vectorstrip is used to remove vectors. I > wanna be able to call the functions in vectorstrips from my program. Does > vectorstrips or EMBOSS have a set of APIs that I can use? You can call any EMBOSS application using the command line by adding any options you need (some are required, for example the input sequence or file, others have default values) and adding "-auto" to default any other options. You can also add the command line option "-filter" to read the first input from standard input, and to write the first output to standard output, so you can then write the input directly to the EMBOSS program and read the output directly from the program. If you allow the program to write output to a file, we recommend that you specify the output file name (for example "-outfile vec.out -outseq vec.seq") so your program knows which file to open.
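For readers who want to drive an EMBOSS program from their own code along the lines described above, here is a minimal Python sketch. It only assembles the command line and hands it to the standard subprocess module. The helper names (build_emboss_command, run_emboss), the input file name reads.fasta and the "-sequence" qualifier are illustrative assumptions, not something taken from this thread; only "-auto", "-filter", "-outfile" and "-outseq" are mentioned in the message above. Check the qualifiers against the documentation of the application you call.

```python
import subprocess


def build_emboss_command(app, qualifiers, auto=True, use_filter=False):
    """Return an argv list for an EMBOSS application.

    qualifiers maps qualifier names (without the leading '-') to values,
    e.g. {"outfile": "vec.out"} becomes ["-outfile", "vec.out"].
    """
    argv = [app]
    for name, value in qualifiers.items():
        argv.extend(["-" + name, str(value)])
    if use_filter:
        # Read the first input from stdin, write the first output to stdout.
        argv.append("-filter")
    if auto:
        # Accept defaults for every option not given explicitly.
        argv.append("-auto")
    return argv


def run_emboss(argv, stdin_text=None):
    """Run the command and return its standard output as text.

    Raises subprocess.CalledProcessError if the program exits non-zero.
    """
    result = subprocess.run(
        argv, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names; "-sequence" is assumed to be the input qualifier.
    cmd = build_emboss_command(
        "vectorstrip",
        {"sequence": "reads.fasta", "outfile": "vec.out", "outseq": "vec.seq"},
    )
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names. Naming the output files explicitly, as recommended above, means the calling program knows which files to open afterwards.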
Hope this helps Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 10:48:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:48:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> Message-ID: <5059A2EF.7020504@yahoo.co.uk> Dear Daniel, On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Greetings again, > > If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? Very easy, a 1-day job including testing and documentation. Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. USPTO or Biocceleration) regards, Peter Rice EMBOSS Team From p.j.a.cock at googlemail.com Wed Sep 19 10:58:31 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 19 Sep 2012 11:58:31 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059A2EF.7020504@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: On Wed, Sep 19, 2012 at 11:48 AM, Peter Rice wrote: > Dear Daniel, > > > On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: >> >> Greetings again, >> >> If I may, another question on the issue of IG format: how difficult would >> it be to support database indexing for this format? > > > Very easy, a 1-day job including testing and documentation. > > Could you please make some example data available, and indicate which fields > could be indexed (including any information in formatted descriptions or in > naming conventions), and suggest a format name (e.g. USPTO or > Biocceleration) Does it need a new format name? 
EMBOSS already defines "ig" and "igstrict" - do the USPTO files diverge from these? Peter C. P.S. Biopython also uses the format name "ig", based on the current EMBOSS terminology. From ricepeterm at yahoo.co.uk Wed Sep 19 10:56:14 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 11:56:14 +0100 Subject: [EMBOSS] ajSeqxrefNewDbS In-Reply-To: References: Message-ID: <5059A4CE.6020501@yahoo.co.uk> On 14/09/2012 14:08, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hi, > > What is the meaning of the messages "ajSeqrefNewDBS" that occasionally appear in the output of EMBOSS utilities? They're even mentioned in the documentation pages, e.g. http://emboss.sourceforge.net/apps/release/6.4/emboss/apps/oddcomp.html (the last line in the quote below): > > ajSeqxrefNewDbS '1-I' 'FT025' They can be safely ignored. The test example outputs are automatically added to the documentation and this one slipped through unnoticed among some other changes in 6.4. They are removed in the latest EMBOSS release 6.5 regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 11:32:45 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 12:32:45 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: Message-ID: <5059AD5D.9020503@yahoo.co.uk> Dear Daniel, On 14/09/2012 13:56, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Hello Peter and everyone, > > Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS: > > In the entret result above the first annotation line of the subsequent record is returned as part of the requested record. Well spotted. The input buffer is not reset in Ig formats so the next line was included in the entret output. I will fix it in the next patch for the latest release (6.5). Let me know if you also need a patch for 6.4. 
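To make the record boundary in this discussion concrete, here is a small Python sketch of an IG reader. It is not the EMBOSS parser and not the fix mentioned above. It assumes the layout seen in the examples posted to this thread: one or more annotation lines starting with ";", then a name line, then sequence lines, with the next ";" line starting a new record. Treating a trailing "1" or "2" as a terminator that is not part of the sequence is an assumption based on those examples. The names IgRecord and read_ig_records are made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class IgRecord(NamedTuple):
    comments: List[str]  # annotation lines, one entry per ';' line
    name: str
    sequence: str


def read_ig_records(lines: Iterable[str]) -> List[IgRecord]:
    """Split IG-style text into records, keeping annotation lines separate."""
    records: List[IgRecord] = []
    comments: List[str] = []
    name = None
    chunks: List[str] = []

    def flush() -> None:
        sequence = "".join(chunks)
        # Assumed terminator: trailing '1' (or '2') is not sequence data.
        if sequence[-1:] in ("1", "2"):
            sequence = sequence[:-1]
        records.append(IgRecord(list(comments), name, sequence))

    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines between records are skipped
        if line.startswith(";"):
            if name is not None:
                # A ';' line after sequence data belongs to the NEXT record.
                flush()
                comments, name, chunks = [], None, []
            comments.append(line[1:].strip())
        elif name is None:
            name = line
        else:
            chunks.append(line)
    if name is not None:
        flush()
    return records
```

With the ig1.ig example from earlier in the thread, the second ";, 10 bases" line ends up in the comments of EMBOSS_002 rather than at the end of EMBOSS_001, and the three annotation lines of the IGSEQ example stay as three separate entries.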
regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 11:42:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 12:42:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <5059AF97.6020206@yahoo.co.uk> On 19/09/2012 11:58, Peter Cock wrote: > Does it need a new format name? EMBOSS already defines "ig" and > "igstrict" - do the USPTO files diverge from these? The format name is needed as an option to dbxflat -idformat so we can select a specific parser for any additional fields. For example, in dbxfasta -idformat has 7 names for 'fasta' format. regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 13:49:14 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 09:49:14 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059A2EF.7020504@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: Dear Peter, This is most wonderful news that's going to make a bunch of users really happy! I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. As for the name, how about something like "iguspto"? Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? With gratitude, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. 
Alexandria, VA 22314 -----Original Message----- From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] Sent: Wednesday, September 19, 2012 6:48 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format Dear Daniel, On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Greetings again, > > If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? Very easy, a 1-day job including testing and documentation. Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. USPTO or Biocceleration) regards, Peter Rice EMBOSS Team -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ig_uspto_sample.txt URL: From daniel.rozenbaum at USPTO.GOV Wed Sep 19 14:45:35 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 10:45:35 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <9D8AD7D9-FD9E-4825-9D01-7F451EB805C2@USPTO.GOV> A quick addition to the information on this format: while the example I sent has the records separated by a couple of new lines and a form feed (^L , 0x0c), in the most general case the first line of the next record (a line that starts with a semicolon) could appear immediately after the last sequence data line of the previous record. Empty lines between records are ignored. On Sep 19, 2012, at 10:09 AM, "Rozenbaum, Daniel (Biocceleration Inc)" wrote: > Dear Peter, > > This is most wonderful news that's going to make a bunch of users really happy! 
> > I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. > > As for the name, how about something like "iguspto"? > > Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? > > With gratitude, > Daniel > > -- > Daniel Rozenbaum > Biocceleration, Inc. > OCIO/ Office of Application Engineering & Development/ Patent System Division > 600 Dulany St. > Alexandria, VA 22314 > > -----Original Message----- > From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] > Sent: Wednesday, September 19, 2012 6:48 AM > To: Rozenbaum, Daniel (Biocceleration Inc) > Cc: emboss at lists.open-bio.org > Subject: Re: [EMBOSS] Support for multi-line annotation in ig format > > Dear Daniel, > > On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote: >> Greetings again, >> >> If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format? > > Very easy, a 1-day job including testing and documentation. > > Could you please make some example data available, and indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. 
> USPTO or Biocceleration) > > regards, > > Peter Rice > EMBOSS Team > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From ricepeterm at yahoo.co.uk Wed Sep 19 15:14:15 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 16:14:15 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> Message-ID: <5059E147.7000807@yahoo.co.uk> Dear Daniel, On 19/09/2012 14:49, Rozenbaum, Daniel (Biocceleration Inc) wrote: > I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. Thanks I'll take a look. We usually index an "access number" in addition to the identifier. Is there some significance in the parts of the id naming that could be used as an accession or a sequence version? > As for the name, how about something like "iguspto"? Thanks. I may just use USPTO but it's not important. > Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? Yes ... it is a fairly straightforward extension to dbxflat so I could send you a copy but for general release I would prefer to distribute it only from 6.5 onwards. 
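As an illustration of what indexing on the identifier alone could look like, here is a minimal Python sketch that maps each sequence name to the byte offset where its record starts. This is not how dbxflat works internally; it is only a sketch of the idea, and the function name index_ig_by_name is made up. It assumes every record begins with at least one ";" line followed by the name line, as in the examples posted to this thread.

```python
import io


def index_ig_by_name(handle):
    """Map each record name to the byte offset of its first line.

    handle must be a binary, seekable file-like object.
    """
    index = {}
    record_start = None
    in_sequence = False
    while True:
        offset = handle.tell()
        raw = handle.readline()
        if not raw:
            break
        line = raw.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(b";"):
            if record_start is None or in_sequence:
                # First annotation line of a new record.
                record_start = offset
                in_sequence = False
        elif not in_sequence:
            if record_start is None:
                record_start = offset  # file starts without annotation
            index[line.decode("ascii")] = record_start
            in_sequence = True
        # Otherwise the line is sequence data and needs no bookkeeping.
    return index


if __name__ == "__main__":
    data = (
        b";, 10 bases\nEMBOSS_001\nhcsptpstas1\n"
        b";, 10 bases\nEMBOSS_002\nrdgwcvmtrm1\n"
    )
    handle = io.BytesIO(data)
    offsets = index_ig_by_name(handle)
    handle.seek(offsets["EMBOSS_002"])
    print(handle.readline())  # first annotation line of the second record
```

Seeking to the stored offset and reading up to the next record start retrieves one entry without scanning the whole file. An accession or version field would simply be a second dictionary built in the same pass.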
regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 15:23:59 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 11:23:59 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059E147.7000807@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> Message-ID: Dear Peter, At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? Thank you, Daniel On Sep 19, 2012, at 11:14 AM, "Peter Rice" wrote: > Dear Daniel, > > On 19/09/2012 14:49, Rozenbaum, Daniel (Biocceleration Inc) wrote: > >> I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc). While the annotation fields appear structured, that part of the information is not reliable. > > Thanks I'll take a look. > > We usually index an "access number" in addition to the identifier. Is > there some significance in the parts of the id naming that could be used > as an accession or a sequence version? > >> As for the name, how about something like "iguspto"? > > Thanks. I may just use USPTO but it's not important. > >> Lastly, do you think the patch with this change would be made available for EMBOSS 6.4? > > Yes ... it is a fairly straightforward extension to dbxflat so I could > send you a copy but for general release I would prefer to distribute it > only from 6.5 onwards. 
> > regards, > > Peter Rice > EMBOSS Team > From ricepeterm at yahoo.co.uk Wed Sep 19 15:34:32 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 16:34:32 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> Message-ID: <5059E608.2040600@yahoo.co.uk> On 19/09/2012 16:23, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Dear Peter, > > At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? No, it is not a problem. You only need to tell the database definition it has no accession (but perhaps the patent number could be used as an accession) regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Wed Sep 19 16:12:58 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Wed, 19 Sep 2012 12:12:58 -0400 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: <5059E608.2040600@yahoo.co.uk> References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> <5059E608.2040600@yahoo.co.uk> Message-ID: Right - unfortunately all the other fields, while appearing well structured and nicely formatted in the example I sent, may or may not be present (or present but poorly formatted due to legacy issues) in the general case. And the patent number may not be present in the data representing patent applications that are still pending review. 
Many thanks, Daniel -----Original Message----- From: Peter Rice [mailto:ricepeterm at yahoo.co.uk] Sent: Wednesday, September 19, 2012 11:35 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Support for multi-line annotation in ig format On 19/09/2012 16:23, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Dear Peter, > > At least within the context of USPTO the sequence identifier is the only consistently present piece of information that uniquely identifies the sequence. Does the absence of an accession number field make the task of adding support for this in EMBOSS more complex? No, it is not a problem. You only need to tell the database definition it has no accession (but perhaps the patent number could be used as an accession) regards, Peter Rice EMBOSS Team From ricepeterm at yahoo.co.uk Wed Sep 19 16:27:37 2012 From: ricepeterm at yahoo.co.uk (Peter Rice) Date: Wed, 19 Sep 2012 17:27:37 +0100 Subject: [EMBOSS] Support for multi-line annotation in ig format In-Reply-To: References: <288765C7-5A75-47B6-9D97-239732C0EA6A@USPTO.GOV> <5059A2EF.7020504@yahoo.co.uk> <5059E147.7000807@yahoo.co.uk> <5059E608.2040600@yahoo.co.uk> Message-ID: <5059F279.6080405@yahoo.co.uk> On 19/09/2012 17:12, Rozenbaum, Daniel (Biocceleration Inc) wrote: > Right - unfortunately all the other fields, while appearing well structured and nicely formatted in the example I sent, may or may not be present (or present but poorly formatted due to legacy issues) in the general case. And the patent number may not be present in the data representing patent applications that are still pending review. Thanks. That at least keeps things simple! 
regards, Peter Rice EMBOSS Team From daniel.rozenbaum at USPTO.GOV Thu Sep 20 15:30:07 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Thu, 20 Sep 2012 11:30:07 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss Message-ID: Hi everyone, I'm trying to figure out whether this issue I'm dealing with is a bug in Jemboss or a bug in my understanding :-) . I have EMBOSS 6.4.0 with Jemboss 1.5 installed in client-server mode. The server is a Linux box, and Jemboss is started via Java Web Start on a Windows box. Then I follow this sequence of steps: 1. Run "makeprotseq" interactively in Jemboss to generate a few random sequences. This works fine. When the "Saved Results" window opens, I use "File --> Save to Local File" to save the result as "C:\TEMP\sequences.fasta" 2. Open "seqret" in Jemboss, and use the "Browse files" option to select "C:\TEMP\sequences.fasta" as input. 3. Run seqret interactively. The Saved Results window that opens contains the following error messages: Error: Failed to open filename 'C' Error: Unable to read sequence 'C:\TEMP\sequences.fasta' Died: seqret terminated: Bad value for '-sequence' with -auto defined Looking in the subdirectory created for this job on the server side, it does contain a file "C__TEMP__sequences.fasta" with the correct contents. But the ".desc" file in that directory reads the following: EMBOSS run details Application: seqret -nofeature -sequence C:\TEMP\sequences.fasta -nofirstonly -auto C__TEMP_sequences.fasta Started at Thu Sep 20 11_15_22 EDT 2012 Input files: /usr/local/emboss/results/username/seqret_Thu_Sep_20_11_15_22_EDT_2012_1234/C__TEMP_sequences.fasta It appears therefore that the command line, instead of using the path to the server-side copy of the input file, still uses the path to the file on the client. Does this make sense? Any advice would be greatly appreciated! With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. 
OCIO/ Office of Application Engineering & Development/ Patent System Division
600 Dulany St.
Alexandria, VA 22314

From uludag at ebi.ac.uk Thu Sep 20 22:00:18 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Thu, 20 Sep 2012 23:00:18 +0100
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References:
Message-ID: <505B91F2.7060709@ebi.ac.uk>

Hi Daniel,

When we were implementing the array representation of command lines, we mistakenly added the input sequence file names to the command-line array prepared on the client side. As you showed in your example, these inputs are added to the command line on the server side using their final file names.

The following change appears to fix the problem on my 6.4.0 test server with a client built from the current CVS code base.

Index: org/emboss/jemboss/gui/form/BuildJembossForm.java
===================================================================
RCS file: /home/repository/emboss/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java,v
retrieving revision 1.113
diff -u -r1.113 BuildJembossForm.java
--- org/emboss/jemboss/gui/form/BuildJembossForm.java 29 Jun 2011 14:12:48 -0000 1.113
+++ org/emboss/jemboss/gui/form/BuildJembossForm.java 20 Sep 2012 21:28:14 -0000
@@ -1113,12 +1113,11 @@
 fn = fn.trim();
-optionsA.add("-" + val);
-optionsA.add(fn);
-
 if(withSoap)
   options = filesForSoap(fn,options,val,filesToMove);
 else
 {
+  optionsA.add("-" + val);
+  optionsA.add(fn);
   fn = addQuote(fn);
   options = options.concat(" -" + val + " " + fn);
 }

Can you please try applying the above change to your 6.4.0 installation? In 6.4.0 the deleted optionsA.add() lines are lines 1116 and 1117. Since Jemboss has a relatively complex installation mechanism, it might be easier to apply this change to a freshly extracted tarball and make a new installation. Make sure Java Web Start uses the updated Jemboss client rather than a cached version.

Regards,
Mahmut

> 2. Open "seqret" in Jemboss, and use the "Browse files" option to select "C:\TEMP\sequences.fasta" as input.
>
> 3. Run seqret interactively. The Saved Results window that opens contains the following error messages:
> Error: Failed to open filename 'C'
> Error: Unable to read sequence 'C:\TEMP\sequences.fasta'
> Died: seqret terminated: Bad value for '-sequence' with -auto defined
> Looking in the subdirectory created for this job on the server side, it does contain a file "C__TEMP__sequences.fasta" with the correct contents. But the ".desc" file in that directory reads the following:
> EMBOSS run details
>
> Application: seqret
> -nofeature -sequence C:\TEMP\sequences.fasta -nofirstonly -auto C__TEMP_sequences.fasta
> Started at Thu Sep 20 11_15_22 EDT 2012
>
> Input files:
> /usr/local/emboss/results/username/seqret_Thu_Sep_20_11_15_22_EDT_2012_1234/C__TEMP_sequences.fasta
> It appears therefore that the command line, instead of using the path to the server-side copy of the input file, still uses the path to the file on the client.
>

From daniel.rozenbaum at USPTO.GOV Fri Sep 21 03:06:41 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Thu, 20 Sep 2012 23:06:41 -0400
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To: <505B91F2.7060709@ebi.ac.uk>
References: , <505B91F2.7060709@ebi.ac.uk>
Message-ID:

Hi Mahmut,

I've applied the patch you suggested and it seems to have fixed the problem with supplying local sequence files. However, it now appears that supplying a remote file as input no longer works, and neither does using list files (either local or remote).

When I try to use a local list file, the file does make it to the server side, but the command line appearing in .desc doesn't have the '@' before the file name, and that seems to be the reason for the job failure.

When I try to use a remote file, the generated command line doesn't contain a reference to the input file at all, and the resultant error message reads "Error: Unable to read sequence '' ".

Can you reproduce these, or did I mess something up during my attempt to apply the patch and reinstall?

Many thanks in advance,
Daniel

From uludag at ebi.ac.uk Fri Sep 21 11:07:05 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Fri, 21 Sep 2012 12:07:05 +0100
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References: , <505B91F2.7060709@ebi.ac.uk>
Message-ID: <505C4A59.2050207@ebi.ac.uk>

Hi Daniel,

> supplying a remote file as input no longer works, and neither
> does using list files (either local or remote).
>
> When I try to use a local list file, the file does make it to server
> side, but the command line appearing in .desc doesn't have the '@'
> before the file name, and that seems to be the reason for the job
> failure.
> > When I try to use a remote file, the generated command line doesn't > contain a reference to the input file at all, and the resultant error > message reads "Error: Unable to read sequence '' ". > > Can you reproduce these, or did I mess something up during my attempt > to apply the patch and reinstall? It was my mistake, apologies. When I initially looked to the problem I noticed possible changes may include/interfere-with handling of list-files and remote-files but in my last look I just forgot this point. I now have tried fixing it but this has required some changes on the server side as well. Can you download latest versions of the following two files and test again. http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java?view=log http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/server/JembossServer.java?view=log (These 2 files has not changed much recently, as far as i can see they are happy to work as part of 6.4.0) Regards, Mahmut From daniel.rozenbaum at USPTO.GOV Fri Sep 21 12:36:50 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 08:36:50 -0400 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: <505C4A59.2050207@ebi.ac.uk> References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: Hi Mahmut, No apologies necessary, and your prompt response is sincerely appreciated! I have rebuilt Jemboss.jar with these two new versions and everything seems to be working as expected. Many thanks, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Friday, September 21, 2012 7:07 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Handling of local file input in Jemboss Hi Daniel, > supplying a remote file as input no longer works, and neither > does using list files (either local or remote). 
> > When I try to use a local list file, the file does make it to server > side, but the command line appearing in .desc doesn't have the '@' > before the file name, and that seems to be the reason for the job > failure. > > When I try to use a remote file, the generated command line doesn't > contain a reference to the input file at all, and the resultant error > message reads "Error: Unable to read sequence '' ". > > Can you reproduce these, or did I mess something up during my attempt > to apply the patch and reinstall? It was my mistake, apologies. When I initially looked to the problem I noticed possible changes may include/interfere-with handling of list-files and remote-files but in my last look I just forgot this point. I now have tried fixing it but this has required some changes on the server side as well. Can you download latest versions of the following two files and test again. http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/gui/form/BuildJembossForm.java?view=log http://code.open-bio.org/emboss/emboss/jemboss/org/emboss/jemboss/server/JembossServer.java?view=log (These 2 files has not changed much recently, as far as i can see they are happy to work as part of 6.4.0) Regards, Mahmut From uludag at ebi.ac.uk Fri Sep 21 12:47:32 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Fri, 21 Sep 2012 13:47:32 +0100 Subject: [EMBOSS] Handling of local file input in Jemboss In-Reply-To: References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> Message-ID: <505C61E4.1070001@ebi.ac.uk> Hi Daniel, Thanks for the update. Good to hear that changes working as expected. 
Mahmut

From daniel.rozenbaum at USPTO.GOV Fri Sep 21 14:13:39 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Fri, 21 Sep 2012 10:13:39 -0400
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To: <505C4A59.2050207@ebi.ac.uk>
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk>
Message-ID:

Hi Mahmut,

I just noticed that the most recent changes might have introduced the following glitch: server-side directories for Jemboss jobs now appear to be created under a path that is the concatenation of the "results.home" and "embossBin" paths in jemboss.properties. In other words, if results.home is /path/to/emboss_results/ and embossBin is /usr/local/emboss/6.4.0/bin/ , I'm now getting Jemboss job directories created under /path/to/emboss_results/usr/local/emboss/6.4.0/bin/ .

Could you please let me know if you're able to reproduce this?

Many thanks,
Daniel

From uludag at ebi.ac.uk Fri Sep 21 14:17:50 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Fri, 21 Sep 2012 15:17:50 +0100
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk>
Message-ID: <505C770E.6000705@ebi.ac.uk>

Hi Daniel,

> server side directories for Jemboss jobs on the
> server now appear to be created under the path that is a
> concatenation of paths "results.home" and "embossBin" in
> jemboss.properties.

I just checked, and I have the same problem in my Jemboss jobs folder. This should be easy to fix. I will return soon.

Mahmut

From uludag at ebi.ac.uk Fri Sep 21 14:58:58 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Fri, 21 Sep 2012 15:58:58 +0100
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk>
Message-ID: <505C80B2.1080800@ebi.ac.uk>

Hi Daniel,

> server side directories for Jemboss jobs on the
> server now appear to be created under the path that is a
> concatenation of paths "results.home" and "embossBin"

I have checked in a fix for this problem in the JembossServer class. In the Soaplab and jdispatcher projects we don't hide the full path of the program executed.
While working on the previous problem I thought we can do the same in Jemboss. Although I was not quite sure with it I just made that change. Obviously I didn't made it in the correct way. I now have undone it but we can add this feature properly if it is desirable. I have also checked in a fix in BuildJembossForm class as the recent fix did also broke the inputs through copy/paste form. Mahmut From daniel.rozenbaum at USPTO.GOV Sat Sep 22 02:26:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 22:26:32 -0400 Subject: [EMBOSS] Jemboss refusing to work with files containing character ^L (form feed, 0x0c) Message-ID: Hi all, It appears that Jemboss objects to working with files containing the form feed character (ASCII code 0x0c, displayed in some text editors as "^L"). Jemboss refuses to open such files when I double-click on them in File Manager, and it also refuses to, for example, copy from Remote to Local in File Manager. I'm attaching an example of a file that reproduces this problem in my environment. Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: text_with_linefeed.txt URL: From daniel.rozenbaum at USPTO.GOV Sat Sep 22 02:55:43 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Fri, 21 Sep 2012 22:55:43 -0400 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS Message-ID: Greetings, We had run into the following problem: if Jemboss results directory (results.home in jemboss.properties) is a directory on NFS, Jemboss is unable to delete job directories, either from Saved Results or from File Manager (in the case of the latter I first delete the files in the directory from the File Manager - no problem there). Here're the steps that reproduced the problem in our environment: 1. Run say "makeprotseq" in Jemboss. The resultant job directory contains files makeseq.fasta, .desc, and .finished. 2. Open File Manager, browse to the job directory in the Remote pane, and delete the file "makeseq.fasta". The file disappears, but when checking the contents of the directory on the server, it turns out that even though "makeseq.fasta" is no longer there, there's a file ".nfs0000000". Applying "lsof" on the file reveals that the it is being kept open by a "java" process corresponding to the Tomcat server. 3. Attempt to delete the directory from Jemboss File Manager. The directory disappears in File Manager view, but a check on the server side reveals that it's still there. Inside the directory, files .desc and .finished are gone but the file .nfs000000... is still there. After a "refresh" in Jemboss File Manager the directory reappears. It was our understanding that the appearance of these ".nfs00000..." files after the attempt to delete the result files was the standard behavior of NFS when there's an attempt to delete an open file. 
People smarter than me then were able to resolve the problem by adding calls "in.close()" after the "while" loops in the following methods in JembossServer.java show_acd show_saved_result list_saved_results loadFilesContent (in two places) update_result_status We would really appreciate if you could please review these changes and confirm whether they're indeed correct and necessary. With best regards, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. Alexandria, VA 22314 From daniel.rozenbaum at USPTO.GOV Sun Sep 23 04:48:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 00:48:32 -0400 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results Message-ID: Hi, We currently have EMBOSS built and installed without PNG/GD support. When we run "pepwindow" and select "Jemboss Graphics" as the graph format, the file pepwindow1.dat is, from what we're able to tell, correctly created, and Jemboss displays the graph in the Saved Results window, from which it is possible to save the graph to a local PNG image file. However, subsequent attempts to view the saved "pepwindow" results fail: when double-clicked on in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat" in the tab , but the window is blank. The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pan of File Manager. Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? Many thanks, Daniel -- Daniel Rozenbaum Biocceleration, Inc. OCIO/ Office of Application Engineering & Development/ Patent System Division 600 Dulany St. 
Alexandria, VA 22314

From daniel.rozenbaum at USPTO.GOV Sun Sep 23 04:17:32 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Sun, 23 Sep 2012 00:17:32 -0400
Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS
In-Reply-To:
References:
Message-ID:

My apologies - the part of the previous email where I described the modifications we made that appear to have helped resolve this problem was completely incorrect. I'm attaching a version of JembossServer.java (originally from the 6.4.0 distribution) with our modifications preceded by comments starting with "// Biocceleration: " . To reiterate, these modifications were mostly guesswork - trying to identify possible places where an explicit close() could have helped resolve the issue.

_______________________________________________
EMBOSS mailing list
EMBOSS at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
-------------- next part --------------
A non-text attachment was scrubbed...
Name: JembossServer.java_added_close_attempts
Type: application/octet-stream
Size: 31747 bytes
Desc: JembossServer.java_added_close_attempts
URL:

From uludag at ebi.ac.uk Sun Sep 23 08:50:23 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Sun, 23 Sep 2012 09:50:23 +0100
Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS
In-Reply-To:
References:
Message-ID: <505ECD4F.2020202@ebi.ac.uk>

Hi Daniel,

Thanks for the patch. I will apply your changes to the latest CVS version of the file.
If I don't misread your email, problem may have stayed because we have more streams not closed in other jemboss classes. The ones in JembossFileServer get/put_file methods would be the main reason for the .nfs files you reported. I remember seeing this problem long time ago but didn't manage to find out its reason, i think i probably blamed nfs and didn't spend time to check whether we might have anything wrong in jemboss. Mahmut > the part of the previous email where I described the modifications we made that appear to have helped resolve this problem was completely incorrect. I'm attaching a version of JembossServer.java (originally from the 6.4.0 distribution) with our modifications preceded with comments starting with "// Biocceleration: " . To reiterate, these modifications were mostly guesswork - trying to identify possible places where an explicit close() could have helped resolve the issue. From uludag at ebi.ac.uk Sun Sep 23 13:33:36 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Sun, 23 Sep 2012 14:33:36 +0100 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: References: Message-ID: <505F0FB0.7030208@ebi.ac.uk> Hi Daniel, I was able to produce .nfs files by following the steps you described. Closing the open streams in JembossServer class looks fixes the problem. I don't understand why you thought your modifications were completely incorrect? I checked in you patch to CVS together with a fix to allow jemboss file manager to delete remote folders when they have hidden files like .desc and .finished. 
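
The fix discussed in this thread comes down to closing every reader once its read loop finishes, so that no open handle keeps a deleted file alive as a ".nfs..." placeholder. A minimal sketch of that pattern follows. The class and method names (ReadAndCloseSketch, readAll) are hypothetical, and this is not the actual JembossServer code; it only illustrates a read loop whose reader is closed even when reading fails.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper for illustration; not part of Jemboss.
final class ReadAndCloseSketch {

    // Reads a whole text file line by line and always closes the reader,
    // whether the loop completes normally or throws.
    static String readAll(File file) throws IOException {
        StringBuilder text = new StringBuilder();
        BufferedReader in = null;
        try {
            in = new BufferedReader(new FileReader(file));
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        } finally {
            if (in != null) {
                in.close(); // releases the file handle held by this process
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass the path of any readable text file.
        if (args.length == 1) {
            System.out.print(readAll(new File(args[0])));
        }
    }
}
```

The try/finally form is used here because it compiles on older Java versions; on Java 7 and later, try-with-resources expresses the same thing more compactly.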
Mahmut From uludag at ebi.ac.uk Sun Sep 23 15:44:03 2012 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Sun, 23 Sep 2012 16:44:03 +0100 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results In-Reply-To: References: Message-ID: <505F2E43.7020700@ebi.ac.uk> Hi Daniel, > subsequent attempts to view the saved "pepwindow" results fail: when double-clicked on in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat" in the tab , but the window is blank. I'm not able to reproduce it with J/Emboss 6.5 client. I will later check with J/Emboss 6.4. > The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pan of File Manager. I have tried updating the ShowResultSet class in package org.emboss.jemboss.gui to fix this problem. It looks working for me. > Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? No. After above modification I'm able to see the images displayed. I will next try to understand the string-encoding related problem you posted. Mahmut From daniel.rozenbaum at USPTO.GOV Sun Sep 23 17:52:35 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 13:52:35 -0400 Subject: [EMBOSS] Jemboss: accessing saved pepwindow results In-Reply-To: <505F2E43.7020700@ebi.ac.uk> References: , <505F2E43.7020700@ebi.ac.uk> Message-ID: Hi Mahmut, Fantastic, thanks! After rebuilding with the most recent ShowResultSet.java, double-clicking on a pepwindow saved result, or on pepwindow1.dat in File Manager now displays the graph as expected. 
With best regards, Daniel ________________________________________ From: Mahmut Uludag [uludag at ebi.ac.uk] Sent: Sunday, September 23, 2012 11:44 AM To: Rozenbaum, Daniel (Biocceleration Inc) Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] Jemboss: accessing saved pepwindow results Hi Daniel, > subsequent attempts to view the saved "pepwindow" results fail: when double-clicked on in the "Saved results list on server" window, a "Saved Results" window opens with a single tab named "pepwindow1.dat" in the tab , but the window is blank. I'm not able to reproduce it with J/Emboss 6.5 client. I will later check with J/Emboss 6.4. > The same thing happens when double-clicking on "pepwindow1.dat" in the Remote pan of File Manager. I have tried updating the ShowResultSet class in package org.emboss.jemboss.gui to fix this problem. It looks working for me. > Were we incorrect in expecting that calling up "pepwindow1.dat" would either display the image using Jemboss Graphics again or show the file contents as text? No. After above modification I'm able to see the images displayed. I will next try to understand the string-encoding related problem you posted. Mahmut From daniel.rozenbaum at USPTO.GOV Sun Sep 23 17:56:32 2012 From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc)) Date: Sun, 23 Sep 2012 13:56:32 -0400 Subject: [EMBOSS] Jemboss unable to delete job directories if results.home is on NFS In-Reply-To: <505F0FB0.7030208@ebi.ac.uk> References: , <505F0FB0.7030208@ebi.ac.uk> Message-ID: Hi Mahmut, Sorry about that confusion - the file I attached to my previous email in this thread did contain the correct fixes. After rebuilding with the latest version of JembossServer.java, everything appears to be working properly. 
-Daniel

From daniel.rozenbaum at USPTO.GOV Mon Sep 24 16:13:56 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Mon, 24 Sep 2012 12:13:56 -0400
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To: <505C80B2.1080800@ebi.ac.uk>
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>
Message-ID:

Hi Mahmut,

After rebuilding with the latest versions of gui/form/BuildJembossForm.java (1.116) and server/JembossServer.java (1.47), I'm running into a problem with appending ":sequence_name" to local files. It is reproducible in my environment using the following steps:

1. Run makeprotseq to generate, say, 10 sequences.
2. Call up seqret and drag-and-drop the file "makeseq.fasta" generated in the previous step from the File Manager window to the "Sequence Filename" field of seqret.
3. Append ":EMBOSS_003", so that the full string in the "Sequence Filename" field looks like
   /path/emboss/results/username/makeprotseq_Mon_Sep_24_12_04_32_EDT_2012_51429/makeseq.fasta:EMBOSS_003
4. Execute seqret. Everything works as expected.
5. Drag-and-drop the file "makeseq.fasta" to a local folder.
6. Drag-and-drop the file from the Local pane to the seqret input, and add the same sequence id, so that the full string in the "Sequence Filename" field in my case (a Windows client) looks something like
   H:\tmp\makeseq.fasta:EMBOSS_003
7. When seqret is run, the following error message appears:
   Error: Failed to open filename 'H'
   Error: Unable to read sequence 'H:\tmp\makeseq.fasta:EMBOSS_003'
   Died: seqret terminated: Bad value for '-sequence' with -auto defined

Would you be able to look into this?

Many thanks,
Daniel

From daniel.rozenbaum at USPTO.GOV Mon Sep 24 20:07:22 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Mon, 24 Sep 2012 16:07:22 -0400
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>,
Message-ID:

Mahmut, please disregard my previous email - I must have made a mistake in my rebuild. After rebuilding again more carefully, the problem seems to have disappeared. My apologies for the confusion.
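
The "Failed to open filename 'H'" error reported in this thread is consistent with the colon after a Windows drive letter being read as the separator that normally precedes a sequence name. As an illustration only, the sketch below shows one way a client could split such an input into the file path and the optional sequence id before the file is transferred. The class and method names (LocalInputSketch, splitFileAndEntry) are hypothetical, this is not the change that was checked in to BuildJembossForm, and the sketch assumes the input is just "path" or "path:sequence_id" with no format prefix.

```java
// Hypothetical helper for illustration; not part of Jemboss.
final class LocalInputSketch {

    // Returns {filePath, sequenceId}; sequenceId is "" when none is given.
    static String[] splitFileAndEntry(String input) {
        String s = input.trim();

        // Skip a leading "X:\" or "X:/" drive prefix so that its colon
        // is not mistaken for the path/sequence-id separator.
        int from = 0;
        if (s.length() >= 3
                && Character.isLetter(s.charAt(0))
                && s.charAt(1) == ':'
                && (s.charAt(2) == '\\' || s.charAt(2) == '/')) {
            from = 2;
        }

        int sep = s.indexOf(':', from);
        if (sep < 0) {
            return new String[] { s, "" };
        }
        return new String[] { s.substring(0, sep), s.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitFileAndEntry("H:\\tmp\\makeseq.fasta:EMBOSS_003");
        System.out.println("file = " + parts[0]);
        System.out.println("id   = " + parts[1]);
    }
}
```

With the two parts separated, the file could be copied to the server under its server-side name and the sequence id appended to that name afterwards, which is the form that already worked for the remote file in step 3.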
From uludag at ebi.ac.uk Mon Sep 24 20:17:03 2012
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Mon, 24 Sep 2012 21:17:03 +0100
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To:
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>,
Message-ID: <5060BFBF.7090608@ebi.ac.uk>

> please disregard my previous email - I must have made a mistake in my rebuild. After rebuilding more carefully again, the problem seems to have disappeared. My apologies for the confusion.

Confused again. I was able to reproduce the problem following the steps you described and just checked in a fix in the BuildJembossForm class. Did you update from CVS before your last rebuild? I ask because your email arrived about 4 minutes after I checked the above change in to CVS.
Mahmut

From daniel.rozenbaum at USPTO.GOV Tue Sep 25 02:18:34 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Mon, 24 Sep 2012 22:18:34 -0400
Subject: [EMBOSS] Handling of local file input in Jemboss
In-Reply-To: <5060BFBF.7090608@ebi.ac.uk>
References: , <505B91F2.7060709@ebi.ac.uk> , <505C4A59.2050207@ebi.ac.uk> , <505C80B2.1080800@ebi.ac.uk>, , <5060BFBF.7090608@ebi.ac.uk>
Message-ID:

Hi Mahmut,

Thank you for being able to find the kernel of reason in my continued attempts to confuse this discussion :-) I rebuilt everything again with today's fix in BuildJembossForm, and everything seems to be working fine.

With best regards,
Daniel

From daniel.rozenbaum at USPTO.GOV Tue Sep 25 11:48:35 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Tue, 25 Sep 2012 07:48:35 -0400
Subject: [EMBOSS] protein three-to-one and one-to-three
Message-ID:

Hi,

Our users are interested in having access to "three-to-one" and "one-to-three" amino acid representation conversion utilities similar to these:

http://bioinformatics.org/sms2/three_to_one.html
http://bioinformatics.org/sms2/one_to_three.html

From what I've been able to tell, the latter is achievable with "showpep -three" (even though the users would have preferred a horizontal representation of the three-letter codes), and the only relevant library function currently available in EMBOSS is embPropCharToThree(); is this correct?

Just wanted to make sure that I'm not missing anything if we decided to take a shot at developing such utilities within the EMBOSS framework.

With best regards,
Daniel

--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/ Office of Application Engineering & Development/ Patent System Division
600 Dulany St.
Alexandria, VA 22314

From ricepeterm at yahoo.co.uk Tue Sep 25 12:14:40 2012
From: ricepeterm at yahoo.co.uk (Peter Rice)
Date: Tue, 25 Sep 2012 13:14:40 +0100
Subject: [EMBOSS] protein three-to-one and one-to-three
In-Reply-To:
References:
Message-ID: <5061A030.5010809@yahoo.co.uk>

Dear Daniel,

On 25/09/2012 12:48, Rozenbaum, Daniel (Biocceleration Inc) wrote:
> Hi,
>
> Our users are interested in having access to "three-to-one" and "one-to-three" amino acid representation conversion utilities similar to these:
>
> http://bioinformatics.org/sms2/three_to_one.html
> http://bioinformatics.org/sms2/one_to_three.html
>
> From what I've been able to tell, the latter is achievable with "showpep -three" (even though the users would have preferred a horizontal representation of the three-letter codes), and the only relevant library function currently available in EMBOSS is embPropCharToThree(); is this correct?

Much discussed in the early days, when we decided to offer 3-to-1 for those who found the letters hard to read but not the reverse direction, because a 3-letter protein sequence can also be a valid 3x longer 1-letter protein sequence.

> Just wanted to make sure that I'm not missing anything if we decided
> to take a shot at developing such utilities within the EMBOSS
> framework.

Easy enough to develop, but I would suggest using some output format that makes it clear it is not a standard sequence format - otherwise EMBOSS (e.g. seqret) will try to read it as some other format and claim success.

If your 3-letter output is unreadable then you can safely try implementing it as an input format.
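The ambiguity described above can be seen in a few lines of Python. This is only a sketch under stated assumptions: it is not EMBOSS code, the function names are hypothetical, and the table covers just the 20 standard amino acids (ambiguity codes such as B, Z and X, and the stop symbol, are left out, so they raise KeyError here).

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
# Reverse table, keyed by upper-cased three-letter code.
THREE_TO_ONE = {three.upper(): one for one, three in ONE_TO_THREE.items()}


def one_to_three(seq):
    """Convert one-letter residues to a run of three-letter codes."""
    return "".join(ONE_TO_THREE[residue] for residue in seq.upper())


def three_to_one(seq):
    """Convert a run of three-letter codes back to one-letter residues."""
    if len(seq) % 3:
        raise ValueError("length is not a multiple of three")
    codes = (seq[i:i + 3].upper() for i in range(0, len(seq), 3))
    return "".join(THREE_TO_ONE[code] for code in codes)


def looks_like_one_letter(seq):
    """True if every character is itself a valid one-letter residue code."""
    return all(ch in ONE_TO_THREE for ch in seq.upper())


if __name__ == "__main__":
    three = one_to_three("SG")
    print(three)                         # three-letter form of Ser, Gly
    print(three_to_one(three))           # back to one-letter form
    # The three-letter text is also a plausible six-residue one-letter
    # sequence (S, E, R, G, L, Y), which is the ambiguity in question.
    print(looks_like_one_letter(three))
```

Because a string such as "SerGly" passes the one-letter check, a reader that guesses formats has no way to tell the two interpretations apart, which is why a clearly distinct output format is suggested.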
regards,
Peter Rice

From daniel.rozenbaum at USPTO.GOV Tue Sep 25 21:04:38 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Tue, 25 Sep 2012 17:04:38 -0400
Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg
Message-ID:

Hi everyone,

It looks like it isn't possible to specify a GCG findpatterns style pattern "(GSG){1,10}" ("GCG" repeating 1 to 10 times) in fuzznuc or fuzzpro, is it?

Is dreg/preg the appropriate alternative here?

Many thanks,
Daniel

--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/ Office of Application Engineering & Development/ Patent System Division
600 Dulany St.
Alexandria, VA 22314

From ricepeterm at yahoo.co.uk Wed Sep 26 09:54:54 2012
From: ricepeterm at yahoo.co.uk (Peter Rice)
Date: Wed, 26 Sep 2012 10:54:54 +0100
Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg
In-Reply-To:
References:
Message-ID: <5062D0EE.5060907@yahoo.co.uk>

Dear Daniel,

On 25/09/2012 22:04, Rozenbaum, Daniel (Biocceleration Inc) wrote:
> Hi everyone,
>
> It looks like it isn't possible to specify a GCG findpatterns style pattern "(GSG){1,10}" ("GCG" repeating 1 to 10 times) in fuzznuc or fuzzpro, is it?

It is not possible, brackets are not allowed in fuzz* patterns. The repeats are for single residues or bases, usually used for unknowns to give a variable gap between known residues or bases (e.g. for protein active sites).

> Is dreg/preg the appropriate alternative here?

Yes. This example is one of the reasons we wrote them.

Hope this helps,

Peter Rice
EMBOSS Team

From daniel.rozenbaum at USPTO.GOV Wed Sep 26 11:57:28 2012
From: daniel.rozenbaum at USPTO.GOV (Rozenbaum, Daniel (Biocceleration Inc))
Date: Wed, 26 Sep 2012 07:57:28 -0400
Subject: [EMBOSS] Repetition pattern in fuzznuc/fuzzpro or dreg/preg
In-Reply-To: <5062D0EE.5060907@yahoo.co.uk>
References: , <5062D0EE.5060907@yahoo.co.uk>
Message-ID:

Dear Peter,

Great, thanks!
One quick follow-up question: is it possible to request dreg/preg to report just those sequences where a match is found? Fuzznuc/pro seem to be working like that, but I haven't been able to figure out how to achieve that with dreg/preg.

With best regards,
Daniel

From ricepeterm at yahoo.co.uk Wed Sep 26 12:40:23 2012
From: ricepeterm at yahoo.co.uk (Peter Rice)
Date: Wed, 26 Sep 2012 13:40:23 +0100
Subject: [EMBOSS] Fwd: Re: Repetition pattern in fuzznuc/fuzzpro or dreg/preg
In-Reply-To: <5062F70C.5090601@yahoo.co.uk>
References: <5062F70C.5090601@yahoo.co.uk>
Message-ID: <5062F7B7.1080506@yahoo.co.uk>

On 26/09/2012 12:57, Rozenbaum, Daniel (Biocceleration Inc) wrote:
> Dear Peter,
>
> Great, thanks! One quick follow-up question: is it possible to request dreg/preg to report just those sequences where a match is found? Fuzznuc/pro seem to be working like that, but I haven't been able to figure out how to achieve that with dreg/preg.

When any program writes a report, one option is to ask for it with -rformat listfile, which reports the USAs of the matches.
Good point though about the difference between outputs, something nobody had pointed out before. The fuzz programs came first so they should be the 'standard' and for the next release we will modify dreg and preg to only report sequences where a match is found.

regards,

Peter Rice
EMBOSS Team
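The two points in this thread, a repeated bracketed group and reporting only sequences that contain a match, can be sketched with Python's standard "re" module. This is an illustration only: "re" stands in for the regular-expression engine used by dreg and preg, the function names and sample sequences are made up, and the output is not in any EMBOSS report format.

```python
import re

# A bracketed group repeated 1 to 10 times; this is the kind of pattern
# that a regular expression can express but a fuzz* pattern cannot.
PATTERN = re.compile(r"(GCG){1,10}")


def find_repeats(seq):
    """Return (start, end, matched_text) for each hit, using 1-based positions."""
    return [
        (m.start() + 1, m.end(), m.group(0))
        for m in PATTERN.finditer(seq.upper())
    ]


def sequences_with_hits(sequences):
    """Keep only the sequences that have at least one match."""
    results = {}
    for name, seq in sequences.items():
        hits = find_repeats(seq)
        if hits:
            results[name] = hits
    return results


if __name__ == "__main__":
    # Made-up example sequences.
    sequences = {
        "seq1": "ATGCGGCGGCGTT",
        "seq2": "ATATATAT",
        "seq3": "ttgcgaa",
    }
    for name, hits in sequences_with_hits(sequences).items():
        print(name, hits)
```

Here seq2 is dropped from the output because it has no match, which mirrors the fuzznuc/fuzzpro behaviour described above.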