From s.newslists at gmail.com Sun Dec 4 05:25:56 2011
From: s.newslists at gmail.com (Stefan)
Date: Sun, 4 Dec 2011 11:25:56 +0100
Subject: [EMBOSS] Plasmid drawing
In-Reply-To: <4E159971.9070509@ebi.ac.uk>
References: <4E159971.9070509@ebi.ac.uk>
Message-ID:

2011/7/7 Peter Rice :
> Very close to release date next week, so hard to do anything immediately.

Hello to the list,

is there any news on plasmid documentation and in-silico cloning with EMBOSS? In the last month I was asked 4 times whether this is possible with EMBOSS. People like working with EMBOSS and want to do these things with this suite as well.

I would be happy to hear of any progress.

Thanks
Stefan

From pmr at ebi.ac.uk Sun Dec 4 15:22:12 2011
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 04 Dec 2011 20:22:12 +0000
Subject: [EMBOSS] Plasmid drawing
In-Reply-To:
References: <4E159971.9070509@ebi.ac.uk>
Message-ID: <4EDBD674.9050409@ebi.ac.uk>

On 04/12/2011 10:25, Stefan wrote:
> 2011/7/7 Peter Rice:
>> Very close to release date next week, so hard to do anything immediately.
>
> is there any news on plasmid documentation and in-silico cloning with
> EMBOSS? In the last month I was asked 4 times whether this is possible with
> EMBOSS. People like working with EMBOSS and want to do these things
> with this suite as well.

Can you give us some examples of what you would like to see? Examples always help us to design new applications.

We still only have cirdna, mainly because we are limited by the plplot graphics library (and would welcome suggestions of other graphics libraries we could try).

We have extended the capabilities by creating input files from some other applications.
regards, Peter Rice EMBOSS team From pmr at ebi.ac.uk Mon Dec 5 04:26:00 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Mon, 05 Dec 2011 09:26:00 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDC8BDA.30006@fmi.ch> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> Message-ID: <4EDC8E28.7090702@ebi.ac.uk> On 12/05/2011 09:16 AM, Hans-Rudolf Hotz wrote: > Thanks to the Galaxy framework, our lab scientist are using more and > more EMBOSS tools. Now, it would be very handy if there was an EMBOSS > drawing tool (even a very simple one) which takes an embl or genbank > file as input and creates something "colorful". By default each Key or > Qualifier is given a specific color and shape (eg 'arrow'). And you can > change them as options. Interesting suggestion. Can you send an example of what you would like to see? You can use your favourite drawing tool if you like, but it is OK if you draw it in crayon, scan it and send the image :-) regards, Peter Rice EMBOSS team From hrh at fmi.ch Mon Dec 5 04:16:10 2011 From: hrh at fmi.ch (Hans-Rudolf Hotz) Date: Mon, 5 Dec 2011 10:16:10 +0100 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDBD674.9050409@ebi.ac.uk> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> Message-ID: <4EDC8BDA.30006@fmi.ch> On 12/04/2011 09:22 PM, Peter Rice wrote: > On 04/12/2011 10:25, Stefan wrote: >> 2011/7/7 Peter Rice: >>> Very close to release date next week, so hard to do anything >>> immediately. >> >> are there any news in plasmid documentation and in-silico cloning with >> EMBOSS? In the last month I was asked 4 times if this is possible with >> EMBOSS. People like it to work with EMBOSS and want to do that things >> also with this suite. > > Can you give us some examples of what you would like to see? Examples > always help us to design new applicatons. 
>

Hi Peter and Stefan

Please allow me to jump in and tell you about our situation:

Our wet lab scientists use, depending on which lab/university they come from, a combination of old commercial products (which we can still use thanks to the perpetual licenses) and free/open source products like 'Serial Cloner', 'Ape', 'Gentle', etc.

I constantly 'preach' how vital it is to make sure you save each sequence/plasmid/etc. not only in the tool-specific format, but also in a text format like embl or genbank, where all the annotation is stored in the Feature Table.

Thanks to the Galaxy framework, our lab scientists are using more and more EMBOSS tools. Now, it would be very handy if there was an EMBOSS drawing tool (even a very simple one) which takes an embl or genbank file as input and creates something "colorful". By default each Key or Qualifier is given a specific color and shape (e.g. 'arrow'), and you can change them with options.

I hope this works as a use case?

Regards, Hans

Hans-Rudolf Hotz, PhD
Bioinformatics Support
Friedrich Miescher Institute for Biomedical Research
Maulbeerstrasse 66
4058 Basel/Switzerland

> We still only have cirdna, mainly because we are limited by the plplot
> graphics library (and would welcome suggestions of other graphics
> libraries we could try).
>
> We have extended the capabilities by creating input files from some
> other applications.
> > regards, > > Peter Rice > EMBOSS team > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From p.j.a.cock at googlemail.com Mon Dec 5 05:08:59 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Dec 2011 10:08:59 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDC8E28.7090702@ebi.ac.uk> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> <4EDC8E28.7090702@ebi.ac.uk> Message-ID: On Mon, Dec 5, 2011 at 9:26 AM, Peter Rice wrote: > On 12/05/2011 09:16 AM, Hans-Rudolf Hotz wrote: >> >> Thanks to the Galaxy framework, our lab scientist are using more and >> more EMBOSS tools. Now, it would be very handy if there was an EMBOSS >> drawing tool (even a very simple one) which takes an embl or genbank >> file as input and creates something "colorful". By default each Key or >> Qualifier is given a specific color and shape (eg 'arrow'). And you can >> change them as options. > > Interesting suggestion. Can you send an example of what you > would like to see? > > You can use your favourite drawing tool if you like, but it is OK if > you draw it in crayon, scan it and send the image :-) > I use Biopython's GenomeDiagram for this sort of thing, which internally uses a Python library called ReportLab which can produce PDF, SVG, PNG, etc. Sadly I doubt that would be suitable for EMBOSS which would really want a C library. GenomeDiagram has the notion of tracks - which can be useful for separating different types of features which would otherwise overlap (e.g. gene/mRNA/CDS) and other tricks. It also has some defaults about which feature qualifiers to use as the feature name (e.g. gene, locustag). I have considered writing a Galaxy tool to take an EMBL or GenBank file and produce a picture, but haven't got round to it yet. Anyway, there may be some useful ideas for layout here if nothing else. 
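A rough sketch of the two ideas mentioned above (one track per feature type, and a fallback order of qualifiers for the feature name). This is not GenomeDiagram's code or API; the function names, the qualifier order and the toy features are assumptions made up for illustration, using only the Python standard library:

```python
from collections import defaultdict

# Hypothetical fallback order for naming a feature; chosen for illustration.
LABEL_QUALIFIERS = ("gene", "locus_tag", "product")


def feature_label(qualifiers, default="?"):
    """Return the first available qualifier value to use as a display name."""
    for key in LABEL_QUALIFIERS:
        if key in qualifiers:
            return qualifiers[key]
    return default


def assign_tracks(features):
    """Group features by feature key so each type is drawn on its own track."""
    tracks = defaultdict(list)
    for key, start, end, qualifiers in features:
        tracks[key].append((start, end, feature_label(qualifiers)))
    return dict(tracks)


# Toy feature table: (key, start, end, qualifiers). Values are invented.
features = [
    ("gene", 100, 900, {"gene": "bla"}),
    ("CDS", 100, 900, {"gene": "bla", "product": "beta-lactamase"}),
    ("rep_origin", 1200, 1800, {}),
]
tracks = assign_tracks(features)
```

With this grouping the overlapping gene and CDS entries end up on separate tracks, which is the point made above; a drawing routine would then render each track at its own radius or row.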
Peter

From pmr at ebi.ac.uk Mon Dec 5 05:19:41 2011
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 05 Dec 2011 10:19:41 +0000
Subject: [EMBOSS] Plasmid drawing
In-Reply-To:
References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> <4EDC8E28.7090702@ebi.ac.uk>
Message-ID: <4EDC9ABD.2020605@ebi.ac.uk>

On 12/05/2011 10:08 AM, Peter Cock wrote:
> I use Biopython's GenomeDiagram for this sort of thing, which
> internally uses a Python library called ReportLab which can
> produce PDF, SVG, PNG, etc. Sadly I doubt that would be
> suitable for EMBOSS which would really want a C library.
>
> GenomeDiagram has the notion of tracks - which can be
> useful for separating different types of features which would
> otherwise overlap (e.g. gene/mRNA/CDS) and other tricks.
> It also has some defaults about which feature qualifiers
> to use as the feature name (e.g. gene, locustag).

The EMBOSS solution to tracks would probably use the Sequence Ontology to define the tracks, and use features below some specified tag (or set of tags) for each track.

We can apply the same tracks to showfeat (the text display of features).

Tracks also simplify the issues of colours for features.

lindna has code for rendering features with scaling and avoiding overlaps which we could also, I hope, reuse.

Maybe not for the next release (release date 15th January, but code freeze before Christmas), but we can provide applications for testing by anyone interested after the release.

regards,

Peter Rice
EMBOSS Team

From david.bauer at bayer.com Mon Dec 5 04:38:34 2011
From: david.bauer at bayer.com (david.bauer at bayer.com)
Date: Mon, 5 Dec 2011 10:38:34 +0100
Subject: [EMBOSS] Plasmid drawing
Message-ID:

Hi Peter,

I think Stefan means not only the drawing of plasmid maps but the creation of new constructs in the computer before doing it at the bench.
This is a field where there are currently only commercial packages available, like VectorNTI and Clone Manager (http://www.scied.com/pr_cmpro.htm) etc. Clone Manager has been around since the days of MS-DOS, but the prices have increased substantially. They used to advertise the software with the slogan: "It costs only as much as a cloning kit for the lab" - which was in the range of 300 USD.

The most important functions, which I think should be implemented first, are the cloning operations. So e.g.
- take plasmid1, open the multi cloning site with BamHI and EcoRI
- take plasmid2, cut out a fragment with BglII and EcoRI
- ligate the fragment from plasmid2 into plasmid1
  (BamHI and BglII have compatible ends which can be ligated, so an algorithm is needed to check this)
- draw a map of the newly created plasmid3

This is a rather simple example. There can be more complex procedures like Klenow fill-in (blunt end ligation of incompatible restriction sites) or the use of more than 2 fragments in one ligation.

I think most of the functions needed for this are already present in EMBOSS. It shouldn't be so complicated to create an application which uses functions from restrict, cutseq and pasteseq to accomplish the above-mentioned task. Although it's probably not so trivial to make this happen before the release next week ;-)

Cheers,
David.

Peter Rice
Sent by: emboss-bounces at lists.open-bio.org
04/12/2011 21:22

To
Stefan
Cc
emboss at lists.open-bio.org
Subject
Re: [EMBOSS] Plasmid drawing

On 04/12/2011 10:25, Stefan wrote:
> 2011/7/7 Peter Rice:
>> Very close to release date next week, so hard to do anything immediately.
>
> is there any news on plasmid documentation and in-silico cloning with
> EMBOSS? In the last month I was asked 4 times whether this is possible with
> EMBOSS. People like working with EMBOSS and want to do these things
> with this suite as well.

Can you give us some examples of what you would like to see? Examples always help us to design new applications.
We still only have cirdna, mainly because we are limited by the plplot graphics library (and would welcome suggestions of other graphics libraries we could try).

We have extended the capabilities by creating input files from some other applications.

regards,

Peter Rice
EMBOSS team

From s.newslists at gmail.com Mon Dec 5 05:44:34 2011
From: s.newslists at gmail.com (Stefan)
Date: Mon, 5 Dec 2011 11:44:34 +0100
Subject: [EMBOSS] Plasmid drawing
In-Reply-To:
References:
Message-ID:

> I think Stefan means not only the drawing of plasmid maps but the creation of new constructs in the computer before doing it at the bench.

What David wrote is exactly what I wanted to explain with "in-silico cloning". By plasmid documentation I meant being able to draw a plasmid with features and restriction sites.

Thanks

From pmr at ebi.ac.uk Mon Dec 5 08:23:23 2011
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 05 Dec 2011 13:23:23 +0000
Subject: [EMBOSS] Plasmid drawing
In-Reply-To:
References:
Message-ID: <4EDCC5CB.2090402@ebi.ac.uk>

On 05/12/2011 12:52, Hotz, Hans-Rudolf wrote:
> "cirdna" and "lindna" are nice and do their job most of the time. In my
> view, the problem is the input file format. Maybe a converter from
> embl/genbank to *.crip or *.linp would be a good start? Or adding those
> formats to the output options of seqret?

We do already support "draw" as a report output format. If we add it as a feature output format you can use featcopy (or seqret -feat) to create a cirdna or lindna input file.

We can then work on the feature format to add extra information by feature type.

We could also add an application to merge a set of feature inputs so you could combine genbank/embl files with restriction map output, then use the results as input.

I will see what I can do for the next release.
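For reference, the cloning steps David Bauer listed earlier in this thread (cut the vector with BamHI and EcoRI, cut the insert out with BglII and EcoRI, check that the ends are compatible, ligate) could be sketched roughly as follows. This is only an illustration under simplifying assumptions, not EMBOSS code: sequences are linear top-strand strings, each enzyme cuts exactly once, and the function names and toy sequences are invented.

```python
# Recognition site and top-strand cut offset for three common enzymes.
# All three are palindromic sites that leave a 4-base 5' overhang.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}


def overhang(enzyme):
    """Single-stranded overhang left by a palindromic 5'-overhang cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two sticky ends can be ligated if their overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def cut_position(seq, enzyme):
    """Top-strand cut position of the single site for `enzyme` in `seq`."""
    site, cut = ENZYMES[enzyme]
    if seq.count(site) != 1:
        raise ValueError(f"{enzyme} must cut exactly once in this sketch")
    return seq.index(site) + cut


def clone(vector, donor, vector_enzymes, donor_enzymes):
    """Replace the vector segment between two cuts with the donor fragment."""
    (v_left, v_right), (d_left, d_right) = vector_enzymes, donor_enzymes
    if not (compatible(v_left, d_left) and compatible(v_right, d_right)):
        raise ValueError("incompatible ends")
    v1, v2 = cut_position(vector, v_left), cut_position(vector, v_right)
    d1, d2 = cut_position(donor, d_left), cut_position(donor, d_right)
    if v1 >= v2 or d1 >= d2:
        raise ValueError("sites are in the wrong order for this sketch")
    return vector[:v1] + donor[d1:d2] + vector[v2:]


# Toy sequences, invented for the example.
plasmid1 = "TTTT" + "GGATCC" + "CCCC" + "GAATTC" + "TTTT"
plasmid2 = "AAAA" + "AGATCT" + "ACGTACGT" + "GAATTC" + "AAAA"
plasmid3 = clone(plasmid1, plasmid2, ("BamHI", "EcoRI"), ("BglII", "EcoRI"))
```

Note that the BamHI/BglII junction in the product is a hybrid site (GGATCT) that neither enzyme cuts again, while the EcoRI site is regenerated. A real tool would also have to handle circular sequences, both strands, multiple sites and blunt ends.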
Thanks for the very helpful suggestions Peter Rice EMBOSS Team From marko at cryst.bioc.cam.ac.uk Mon Dec 5 08:10:06 2011 From: marko at cryst.bioc.cam.ac.uk (Marko Hyvonen) Date: Mon, 5 Dec 2011 13:10:06 +0000 (GMT) Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: Message-ID: I would find this kind of in silico cloning great addition to EMBOSS. As an example of (a free?) drawing program for plasmids, I use PlasMapper (http://wishart.biology.ualberta.ca/PlasMapper/ , source code is at least available on the site). Here is an example I made from their own selection of plasmids (hopefully valid link for a while still): http://wishart.biology.ualberta.ca/PlasMapper/jsp/displayPlasmidMap.jsp?fileName=plasMap26_1323082022185.png&fileFormat=png I like the fact that it includes a number of predefined features in plasmids as built-ins, so that you only need to include the "unique" features in your own plasmids (and I am sure the set of built-ins could be expanded much further). Outputs svg too, which is brilliant for editing later on. Marko On Mon, 5 Dec 2011, david.bauer at bayer.com wrote: > Hi Peter, > > I think Stefan means not only the drawing of plasmid maps but the creation > of new constructs in the computer before doing it at the bench. > This is a field where there are currently only commercial packages > available like VectorNTI and Clone Manager ( > http://www.scied.com/pr_cmpro.htm) etc. > Clone manager is there since the times of MS-DOS but the prices have > increased substantially. They used to advertise the software with the > slogan: "It costs only as much as a cloning kit for the lab" - which was > in the range of 300,-USD. > > The most important functions, which I think should be implement first are > the cloning operations. > So e.g. 
> - take plasmid1, open the multi cloning site with BamHI and EcoRI > - take plasmid2, cut out a fragment with BglII and EcoRI > - ligate the fragment from plasmid2 into plasmid1 > (BamHI and BglII have compatible ends which can be ligated, so an > algorithm is needed to check this) > - draw a map of the newly created plasmid3 > This is a rather simple example. > There can be more complex procedures like Klenow fill in (blunt end > ligation of incompatible restriction sites) or the use of more than 2 > fragments in one ligation. > > I think most of the functions needed for this are already present in > EMBOSS. > It shouldn't be so complicated to create an application which uses > functions from restrict, cutseq, pasteseq to accomplish the above > mentioned task. > Although it's probably not so trivial to make this happen before the > release next week ;-) > > Cheers, > David. > > > > > Peter Rice > Gesendet von: emboss-bounces at lists.open-bio.org > 04/12/2011 21:22 > > An > Stefan > Kopie > emboss at lists.open-bio.org > Thema > Re: [EMBOSS] Plasmid drawing > > > > > > > On 04/12/2011 10:25, Stefan wrote: >> 2011/7/7 Peter Rice: >>> Very close to release date next week, so hard to do anything > immediately. >> >> are there any news in plasmid documentation and in-silico cloning with >> EMBOSS? In the last month I was asked 4 times if this is possible with >> EMBOSS. People like it to work with EMBOSS and want to do that things >> also with this suite. > > Can you give us some examples of what you would like to see? Examples > always help us to design new applicatons. > > We still only have cirdna, mainly because we are limited by the plplot > graphics library (and would welcome suggestions of other graphics > libraries we could try). > > We have extended the capabilities by creating input files from some > other applications. 
>
> regards,
>
> Peter Rice
> EMBOSS team
>

_____________________________________
Marko Hyvonen
Department of Biochemistry, University of Cambridge
marko at cryst.bioc.cam.ac.uk
http://www-cryst.bioc.cam.ac.uk/groups/hyvonen
tel: +44-(0)1223-766 044
mobile: +44-(0)7796-174 877
fax: +44-(0)1223-766 002
--------------------------------------

From hmenager at pasteur.fr Mon Dec 5 11:14:20 2011
From: hmenager at pasteur.fr (Hervé Ménager)
Date: Mon, 05 Dec 2011 17:14:20 +0100
Subject: [EMBOSS] ORF selection with EMBOSS
Message-ID: <4EDCEDDC.3020200@pasteur.fr>

Hi,
I have been using transeq and checktrans to find ORFs in DNA sequences. I have a question regarding the selection of the ORFs in case checktrans finds multiple ones in the result of transeq: I would like to select the longest one automatically (assuming it is the correct one). Is there any existing tool in EMBOSS (or alternatively in any Bio* library) that does this job?
Cheers,
Hervé

From mathog at caltech.edu Mon Dec 5 11:35:31 2011
From: mathog at caltech.edu (mathog)
Date: Mon, 05 Dec 2011 08:35:31 -0800
Subject: [EMBOSS] Plasmid drawing
In-Reply-To:
References: <4E159971.9070509@ebi.ac.uk>
Message-ID:

On Sun, 4 Dec 2011 11:25:56 +0100, Stefan wrote:
> are there any news in plasmid documentation and in-silico cloning
> with
> EMBOSS?

Pierre Lindenbaum wrote a nice little program called "cloneit" for figuring out cloning strategies. That is, specifically what series of reactions to use to insert a particular piece of DNA into a vector. Not sure if that is what you mean by "in-silico cloning" or not. There is a separate cgi version for web use.
Here is the paper:

http://www.ncbi.nlm.nih.gov/pubmed/9682060?dopt=Abstract

The original distribution site is long since off line, but the code says that it may be redistributed (and incorporated in noncommercial software, but see the exact wording for details.) Anyway, if anybody wants to have a look, I packed up our copies and put them here:

http://saf.bio.caltech.edu/pub/software/molbio/cloneit.tar.gz
http://saf.bio.caltech.edu/pub/software/molbio/cloneitcgi.tar.gz

This is not a graphics program, it is a reaction planning program.

As for the graphical output of plasmid diagrams, historically none of the drawing programs does exactly what the end users want. (These get closer and closer, but never quite cover all the bases). For that reason it is really important that the graphics driver be able to output to an object format that can then be imported into a drawing program and edited there. Modifying an image never turns out as nicely.

When going to object formats it is key that text remain text and not be converted into line segments or paths. Users get really frustrated when they can't edit labels or change fonts or font sizes. Locally we have a hacked up GCG driver that emits cgm format for the GCG drawing programs. Nowadays going to SVG would make more sense.

While it is sometimes possible to read PDF back into a drawing program like inkscape, text imported that way is very hit and miss, mostly miss. Often it comes through with each letter as a separate text object, which is no fun at all to work with. Other times it will come in with each letter separately kerned, and when that kerning is removed, the text jumps all over the place in unpredictable ways. Why would you remove the kerning? Because there is another bug/limitation in inkscape where kerned text is automatically converted to images when exporting to emf or wmf format, as when trying to move that image into a powerpoint document.
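To illustrate the "text must remain text" point, here is a minimal sketch that writes a circular map as SVG with every label emitted as a `<text>` element rather than as paths, so a drawing program can still edit it. It is not EMBOSS or cirdna code; the function name, layout constants and example features are assumptions, and only the Python standard library is used.

```python
import math
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SVG_NS = "http://www.w3.org/2000/svg"


def plasmid_svg(name, length, features, size=400):
    """Return an SVG string for a circular map; labels stay as <text> objects."""
    cx = cy = size / 2
    radius = size * 0.3
    parts = [
        f'<svg xmlns="{SVG_NS}" width="{size}" height="{size}">',
        f'<circle cx="{cx}" cy="{cy}" r="{radius}" fill="none" stroke="black"/>',
        f'<text x="{cx}" y="{cy}" text-anchor="middle">{escape(name)}</text>',
    ]
    for label, start, end in features:
        # Angle of the feature midpoint, clockwise from 12 o'clock.
        angle = 2 * math.pi * ((start + end) / 2) / length
        dx, dy = math.sin(angle), -math.cos(angle)
        x1, y1 = cx + radius * dx, cy + radius * dy
        x2, y2 = cx + (radius + 15) * dx, cy + (radius + 15) * dy
        tx, ty = cx + (radius + 30) * dx, cy + (radius + 30) * dy
        # A short tick from the circle towards the label.
        parts.append(
            f'<line x1="{x1:.1f}" y1="{y1:.1f}" x2="{x2:.1f}" y2="{y2:.1f}" '
            f'stroke="black"/>'
        )
        # The label itself: an editable text object, not an outline.
        parts.append(
            f'<text x="{tx:.1f}" y="{ty:.1f}" text-anchor="middle" '
            f'font-size="12">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Invented example data.
svg = plasmid_svg("pExample", 3000, [("bla", 100, 900), ("ori", 1200, 1800)])
root = ET.fromstring(svg)  # parses only if the SVG is well-formed XML
labels = [t.text for t in root.iter(f"{{{SVG_NS}}}text")]
```

All labels here are horizontal, which sidesteps the rotated-text problems between programs; whether that is acceptable for a crowded map is a separate design question.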
Moving rotated text between various programs is the most unreliable operation, as far as I can tell. For instance, I have never found a way to get text which runs at an angle other than 0 degrees from inkscape into powerpoint with that angle intact. The text comes through, but the angle is lost.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From Scott.Markel at accelrys.com Mon Dec 5 20:14:19 2011
From: Scott.Markel at accelrys.com (Scott Markel)
Date: Mon, 5 Dec 2011 17:14:19 -0800
Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro
Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net>

We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output.

Here are EMBOSS command lines for embossversion and acdtable.

> embossversion
Reports the current EMBOSS version number
6.4.0.2

> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html

And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing .

Lines 66-9:
"-sequence" associated seqall qualifiers

Lines 183-6:
"-pattern" associated pattern qualifiers

Lines 212-5:
"-outfile" associated report qualifiers

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys (Pipeline Pilot R&D)       mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com

http://www.linkedin.com/in/smarkel
Secretary, Board of Directors:
International Society for Computational Biology Chair: ISCB Publications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics From cjfields at illinois.edu Mon Dec 5 21:03:35 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 6 Dec 2011 02:03:35 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> Message-ID: Scott, is this something that needs to be addressed on the bioperl end? chris On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > > Here are EMBOSS command lines for embossversion and acdtable. > >> embossversion > Reports the current EMBOSS version number > 6.4.0.2 > >> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html > > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing . > > Lines 66-9: > > > "-sequence" associated seqall qualifiers > > Lines 183-6: > > > "-pattern" associated pattern qualifiers > > Lines 212-5: > > > "-outfile" associated report qualifiers > > Scott > > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com
> Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603
> San Diego, CA 92121 fax: +1 858 799 5222
> USA web: http://www.accelrys.com
>
> http://www.linkedin.com/in/smarkel
> Secretary, Board of Directors:
> International Society for Computational Biology
> Chair: ISCB Publications Committee
> Associate Editor: PLoS Computational Biology
> Editorial Board: Briefings in Bioinformatics
>

From pandey.gaurav at gmail.com Mon Dec 5 23:07:10 2011
From: pandey.gaurav at gmail.com (Gaurav Pandey)
Date: Mon, 5 Dec 2011 23:07:10 -0500
Subject: [EMBOSS] Research opportunities in systems biology at Mt. Sinai School of Medicine
Message-ID:

Please forward this announcement to potential applicants if it is not directly applicable to you. Apologies for cross-posting.

---------------------------

The Institute for Genomics and Multi-scale Biology (multiscale.mssm.edu) at the Mount Sinai School of Medicine (www.mssm.edu) in New York City is seeking applications from enthusiastic and capable researchers for positions at Ph.D., postdoc and staff level. The Institute is the hub of genomics research at Mount Sinai, collaborating with 13 other disease-oriented and core-technology-based institutes, and aims to translate genomics-based science into actionable medical knowledge. The Institute's faculty members have expertise in a wide range of areas such as machine learning, biostatistics, genetics, genomics, sequencing technology, data science, and high-performance computing (HPC).

We wish to recruit capable and motivated graduate students, post-doctoral associates, and research staff members.
Please look at the detailed announcement at the end of this email providing information on specific requirements for these positions. PhD and MD/PhD applicants should consult the requirements of the Graduate School at Mount Sinai ( http://www.mssm.edu/education/graduate-school) for application and potential admission. Others should send their requisite application material (see below) to multiscale.biology at mssm.edu. For more information, visit the Institute's website (multiscale.mssm.edu) and/or email multiscale.biology at mssm.edu. ------------------------------- The Multiscale Institute is actively recruiting graduate students, post-doctoral scholars and technical staff interested in working on challenging and important problems in computational systems biology, especially with a focus on translating this work to the medical domain. Prospective Students If you are a prospective MD ( http://www.mssm.edu/education/medical-education/programs/md-program), Ph.D. or MD/Ph.D. (http://www.mssm.edu/education/graduate-school) student, the Multiscale Institute provides a cutting-edge research environment, strong mentorship and comprehensive training in computational biology. Please check out the various degree programs within the Mount Sinai School of Medicine, and for Ph.D. students, the Genetics and Genomics Training Area (http://www.mssm.edu/education/graduate-school/degrees-and-programs/phd-program/multidisciplinary-training-areas/genetics-and-genomic-sciences)in particular, for more information. Alternatively, contact us directly at multiscale.biology at mssm.edu for information. The Multiscale Institute faculty members have backgrounds in Computer Science, Engineering, Statistics, Mathematics, Physics, Genetics, Molecular Biology, and related disciplines, and are looking for students from all of those fields. The essential requirement is an interest in and capability of thinking about real-world problems in quantitative terms. 
Additional desired (but not essential) characteristics will be (1) a background in biology, acquired through courses and/or research work and/or (2) experience with data-intensive real-world problems (from any domain) and/or (3) experience with programming.

If you are not yet applying to graduate school, we recommend that you check out SURP (http://www.mssm.edu/education/graduate-school/degrees-and-programs/summer-undergraduate-research-program), the Summer Undergraduate Research Program at Mount Sinai.

Prospective Post-doctoral Scholars

At the Multiscale Institute, you will have a chance to work with world class scientists and engineers tackling important and highly visible problems in computational biology, with an emphasis on translating such work into the medical domain. Some example targets of the Institute are identifying actionable biomarkers and therapeutics for complex human diseases.

Candidates should have a recent PhD and/or MD degree in a science, technology, engineering or medical field, and discipline and high motivation to pursue independent research in computational biology. Applicants are expected to have a solid background in programming and computational techniques, with a working knowledge of molecular biology and genetics being highly desirable. To apply, send a CV, a research statement and contact information for three referees to multiscale.biology at mssm.edu. More information about postdoctoral training at Mt. Sinai can be found at the Office of Postdoctoral Affairs (http://www.mssm.edu/education/postdoctoral-training).

Technical Staff

We are looking for engineers and scientists to help Institute researchers extract clinical insight from petabyte-scale data sets. The ideal candidate will be comfortable in an academic environment, and will bring energy and creativity to the Institute's work.
In particular, the applicant is expected to have substantial experience working with big data and its associated infrastructure, including data retrieval from a variety of data-stores, statistics with R/C++/Java etc., and visualization for the web and print. A biology background is desired, but not required; we are primarily looking for people with strong data analysis and software engineering skills. To apply, send your resume to multiscale.biology at mssm.edu -- Gaurav Pandey, Ph.D. Assistant Professor Department of Genetics and Genomic Sciences Mount Sinai School of Medicine, New York City http://www.mssm.edu/profiles/gaurav-pandey From pmr at ebi.ac.uk Tue Dec 6 03:52:33 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 06 Dec 2011 08:52:33 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> Message-ID: <4EDDD7D1.4030001@ebi.ac.uk> Dear Scott, On 06/12/2011 00:35, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing. Oops. Well spotted. Simple fix in ajax/acd/ajacd.c ... we have a new release due in January when our current funding ends, but will look to release a patch for you. From the version number, was this mEMBOSS you were using? 
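A quick way to locate mismatches like these before handing the file to a strict XML parser is a small stack-based check. The sketch below uses Python's standard html.parser; the class and function names are made up for illustration, the sample table is invented (it is not the actual acdtable output), and the list of void elements is deliberately short.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a deliberately short list).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class TagChecker(HTMLParser):
    """Track open tags on a stack and record closing tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> open and close nothing.
        pass

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        line = self.getpos()[0]
        if not self.stack:
            self.problems.append(f"line {line}: </{tag}> has no opening tag")
            return
        open_tag, open_line = self.stack.pop()
        if open_tag != tag:
            self.problems.append(
                f"line {line}: </{tag}> closes <{open_tag}> "
                f"opened on line {open_line}"
            )


def find_mismatches(html):
    """Return a list of human-readable tag mismatch messages."""
    checker = TagChecker()
    checker.feed(html)
    checker.close()
    problems = list(checker.problems)
    for tag, line in checker.stack:
        problems.append(f"line {line}: <{tag}> is never closed")
    return problems


# Invented sample with one mismatch: <th> closed by </td>.
sample = "<table>\n<tr>\n<th>qualifier</td>\n</tr>\n</table>"
issues = find_mismatches(sample)
```

The check is intentionally naive: after a mismatch it simply pops the offending opener and carries on, which is enough to point at the first bad line but can produce follow-on messages in badly broken input.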
regards,

Peter Rice
EMBOSS Team

From jison at ebi.ac.uk Tue Dec 6 06:00:15 2011
From: jison at ebi.ac.uk (Jon Ison)
Date: Tue, 6 Dec 2011 11:00:15 -0000 (UTC)
Subject: [EMBOSS] ORF selection with EMBOSS
In-Reply-To: <4EDCEDDC.3020200@pasteur.fr>
References: <4EDCEDDC.3020200@pasteur.fr>
Message-ID: <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk>

Hi Hervé

Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the longest ORF then that should be an easy enough option to add.

Cheers

Jon

> Hi,
> I have been using transeq and checktrans to find ORFs in DNA sequences.
> I have a question regarding the selection of the ORFs in case
> checktrans finds multiple ones in the result of transeq: I would like to
> select the longest one automatically (assuming it is the correct one).
> Is there any existing tool in EMBOSS (or alternatively in any Bio*
> library) that does this job?
> Cheers,
> Hervé
>

From hmenager at pasteur.fr Tue Dec 6 06:07:11 2011
From: hmenager at pasteur.fr (Hervé Ménager)
Date: Tue, 06 Dec 2011 12:07:11 +0100
Subject: [EMBOSS] ORF selection with EMBOSS
In-Reply-To: <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk>
References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk>
Message-ID: <4EDDF75F.8020400@pasteur.fr>

Hi Jon,

That's exactly what would make me happy. Assuming it is easier for you if I submit this request formally, where do I need to send it?

Hervé

On 12/06/2011 12:00 PM, Jon Ison wrote:
> Hi Hervé
>
> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the longest
> ORF then that should be an easy enough option to add.
> > Cheers > > Jon > > >> Hi, >> I have been using transeq and checktrans to find ORFs in DNA sequences. >> I have a question regarding the selection of the ORFs in case >> checktrans finds multiple ones in the result of transeq: I would like to >> select the longest one automatically (assuming it is the correct one). >> Is there any existing tool in EMBOSS (or alternatively in any Bio* >> library) that does this job? >> Cheers, >> Hervé >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > From jison at ebi.ac.uk Tue Dec 6 06:11:01 2011 From: jison at ebi.ac.uk (Jon Ison) Date: Tue, 6 Dec 2011 11:11:01 -0000 (UTC) Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <4EDDF75F.8020400@pasteur.fr> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> Message-ID: <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> A note of it is already on the SourceForge "Feature Requests" ... J:) > Hi Jon, > > That's exactly what would make me happy. Assuming it makes it easier for > you if I formulate officially this request, where do I need to send the > request? > > Hervé > > On 12/06/2011 12:00 PM, Jon Ison wrote: >> Hi Hervé >> >> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >> longest >> ORF then that should be an easy enough option to add. >> >> Cheers >> >> Jon >> >> >>> Hi, >>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>> I have a question regarding the selection of the ORFs in case >>> checktrans finds multiple ones in the result of transeq: I would like to >>> select the longest one automatically (assuming it is the correct one). >>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>> library) that does this job? >>> Cheers, >>> Hervé
>>> _______________________________________________ >>> EMBOSS mailing list >>> EMBOSS at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/emboss >>> >> >> > From andrespinzon at gmail.com Tue Dec 6 07:53:36 2011 From: andrespinzon at gmail.com (Andres Pinzon) Date: Tue, 6 Dec 2011 07:53:36 -0500 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> Message-ID: Herve, Have you tried using "getorf" and then "sizeseq"? I think it will work. Then you could get the first sequence from the output. Best, On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: > A note of it is already on the SourceForge "Feature Requests" ... > > J:) > > > >> Hi Jon, >> >> That's exactly what would make me happy. Assuming it makes it easier for >> you if I formulate officially this request, where do I need to send the >> request? >> >> Hervé >> >> On 12/06/2011 12:00 PM, Jon Ison wrote: >>> Hi Hervé >>> >>> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >>> longest >>> ORF then that should be an easy enough option to add. >>> >>> Cheers >>> >>> Jon >>> >>> >>>> Hi, >>>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>>> I have a question regarding the selection of the ORFs in case >>>> checktrans finds multiple ones in the result of transeq: I would like to >>>> select the longest one automatically (assuming it is the correct one). >>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>> library) that does this job? >>>> Cheers, >>>> Hervé
>>>> _______________________________________________ >>>> EMBOSS mailing list >>>> EMBOSS at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>> >>> >>> >> > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss -- Andrés Pinzón From hmenager at pasteur.fr Tue Dec 6 09:20:11 2011 From: hmenager at pasteur.fr (Hervé Ménager) Date: Tue, 06 Dec 2011 15:20:11 +0100 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> Message-ID: <4EDE249B.7080207@pasteur.fr> Hi Andres, From my point of view, it would do the job if I didn't have a _list_ of genes as an input: I hence need to input a list of genes and get as an output a list of the longest ORF computed for each gene. If I do this directly with getorf and sizeseq, my understanding is that sizeseq won't allow me to select the longest ORF for each input sequence. I can make the loop myself, but I won't unless it is not here somewhere in EMBOSS ;). Cheers, Hervé On 12/06/2011 01:53 PM, Andres Pinzon wrote: > Herve, > Have you tried using "getorf" and then "sizeseq"? > I think it will work. Then you could get the first sequence from the output. > > Best, > > On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: >> A note of it is already on the SourceForge "Feature Requests" ... >> >> J:) >> >> >> >>> Hi Jon, >>> >>> That's exactly what would make me happy. Assuming it makes it easier for >>> you if I formulate officially this request, where do I need to send the >>> request? >>> >>> Hervé >>> >>> On 12/06/2011 12:00 PM, Jon Ison wrote: >>>> Hi Hervé
>>>> >>>> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >>>> longest >>>> ORF then that should be an easy enough option to add. >>>> >>>> Cheers >>>> >>>> Jon >>>> >>>> >>>>> Hi, >>>>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>>>> I have a question regarding the selection of the ORFs in case >>>>> checktrans finds multiple ones in the result of transeq: I would like to >>>>> select the longest one automatically (assuming it is the correct one). >>>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>>> library) that does this job? >>>>> Cheers, >>>>> Hervé >>>>> _______________________________________________ >>>>> EMBOSS mailing list >>>>> EMBOSS at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>>> >>>> >>>> >>> >> >> >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss > > > From andrespinzon at gmail.com Tue Dec 6 09:19:39 2011 From: andrespinzon at gmail.com (Andres Pinzon) Date: Tue, 6 Dec 2011 09:19:39 -0500 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <4EDE249B.7080207@pasteur.fr> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> <4EDE249B.7080207@pasteur.fr> Message-ID: D'accord ;) Best On Tue, Dec 6, 2011 at 9:20 AM, Hervé Ménager wrote: > Hi Andres, > From my point of view, it would do the job if I didn't have a _list_ of > genes as an input: I hence need to input a list of genes and get as an > output a list of the longest ORF computed for each gene. If I do this > directly with getorf and sizeseq, my understanding is that sizeseq won't > allow me to select the longest ORF for each input sequence.
I can make the > loop myself, but I won't unless it is not here somewhere in EMBOSS ;). > Cheers, > Hervé > > > On 12/06/2011 01:53 PM, Andres Pinzon wrote: >> >> Herve, >> Have you tried using "getorf" and then "sizeseq"? >> I think it will work. Then you could get the first sequence from the >> output. >> >> Best, >> >> On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: >>> >>> A note of it is already on the SourceForge "Feature Requests" ... >>> >>> J:) >>> >>> >>> >>>> Hi Jon, >>>> >>>> That's exactly what would make me happy. Assuming it makes it easier for >>>> you if I formulate officially this request, where do I need to send the >>>> request? >>>> >>>> Hervé >>>> >>>> On 12/06/2011 12:00 PM, Jon Ison wrote: >>>>> >>>>> Hi Hervé >>>>> >>>>> Nothing in EMBOSS so far as I'm aware, but if all you want is for >>>>> checktrans to report the >>>>> longest >>>>> ORF then that should be an easy enough option to add. >>>>> >>>>> Cheers >>>>> >>>>> Jon >>>>> >>>>> >>>>>> Hi, >>>>>> I have been using transeq and checktrans to find ORFs in DNA >>>>>> sequences. >>>>>> I have a question regarding the selection of the ORFs in case >>>>>> checktrans finds multiple ones in the result of transeq: I would like >>>>>> to >>>>>> select the longest one automatically (assuming it is the correct one). >>>>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>>>> library) that does this job? >>>>>> Cheers, >>>>>> Hervé
>>>>>> _______________________________________________ >>>>>> EMBOSS mailing list >>>>>> EMBOSS at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>>>> >>>>> >>>>> >>>> >>> >>> >>> _______________________________________________ >>> EMBOSS mailing list >>> EMBOSS at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/emboss >> >> >> >> > -- Andrés Pinzón From Scott.Markel at accelrys.com Tue Dec 6 15:21:05 2011 From: Scott.Markel at accelrys.com (Scott Markel) Date: Tue, 6 Dec 2011 12:21:05 -0800 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <4EDDD7D1.4030001@ebi.ac.uk> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> <4EDDD7D1.4030001@ebi.ac.uk> Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA78386B@EXCH1-COLO.accelrys.net> Peter, Yes, what I generated for you was Windows-based, but we run EMBOSS on both Windows and Linux in Pipeline Pilot. Regarding a patch, our next release is later in 2012, so whatever is more convenient for you regarding timing is fine with us. Scott -----Original Message----- From: Peter Rice [mailto:pmr at ebi.ac.uk] Sent: Tuesday, 06 December 06 2011 12:53 AM To: Scott Markel Cc: emboss at lists.open-bio.org; Kristine Briedis Subject: Re: HTML tag mismatch in acdtable output for fuzzpro Dear Scott, On 06/12/2011 00:35, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing. Oops. Well spotted. Simple fix in ajax/acd/ajacd.c ... we have a new release due in January when our current funding ends, but will look to release a patch for you.
From the version number, was this mEMBOSS you were using? regards, Peter Rice EMBOSS Team From Scott.Markel at accelrys.com Tue Dec 6 15:21:18 2011 From: Scott.Markel at accelrys.com (Scott Markel) Date: Tue, 6 Dec 2011 12:21:18 -0800 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> Chris, I don't think so. We certainly made changes to our BioPerl copy to work around the EMBOSS bug, but I don't think BioPerl needs to incorporate what we did (writing a little subroutine to fix the tags). An EMBOSS fix takes care of our problem. Scott -----Original Message----- From: Fields, Christopher J [mailto:cjfields at illinois.edu] Sent: Monday, 05 December 05 2011 6:04 PM To: Scott Markel Cc: emboss at lists.open-bio.org; Kristine Briedis Subject: Re: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro Scott, is this something that needs to be addressed on the bioperl end? chris On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > > Here are EMBOSS command lines for embossversion and acdtable. > >> embossversion > Reports the current EMBOSS version number > 6.4.0.2 > >> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html > > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing . > > Lines 66-9: > > > "-sequence" associated seqall qualifiers > > Lines 183-6: > > > "-pattern" associated pattern qualifiers > > Lines 212-5: > > > "-outfile" associated report qualifiers > > Scott > > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From cjfields at illinois.edu Tue Dec 6 17:00:52 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 6 Dec 2011 22:00:52 +0000 Subject: [EMBOSS] problem with EMBOSS configuration '--without x' Message-ID: <68A7E0DA-660A-48CC-9646-4E3CD42C2367@illinois.edu> Not sure when the next version is due out, but I thought this is worth mentioning in case someone else runs into it. We had an odd issue with our local EMBOSS installation (6.4.0) on a RHEL VM which is likely a bug in the configuration step. We were installing for use mainly with Galaxy, and didn't need X11 configuration, so we configured as follows: ./configure --prefix=/opt/local/Bio/EMBOSS --without-x However, the installed binaries couldn't find the acd files or data; the only way to find them (as well as the data files) was to explicitly set EMBOSS_ACDROOT and EMBOSS_DATA. Oddly, installing libx11-devel and removing the '--without-x' flag during configuration worked just fine, no need for default env variables or .embossrc files. Anyone else run into this? chris Christopher Fields Senior Research Scientist National Center for Supercomputing Applications Institute for Genomic Biology University of Illinois Urbana-Champaign 1206 W. Gregory Dr. 
, MC-195 Urbana, IL 61801 From cjfields at illinois.edu Tue Dec 6 21:20:55 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 7 Dec 2011 02:20:55 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> Message-ID: <7A5DAABA-33BE-4D94-8224-2353D3C2B90F@illinois.edu> Yeah, caught Peter's response on that. Just wanted to make sure there isn't anything we need to do from our end :) chris On Dec 6, 2011, at 2:21 PM, Scott Markel wrote: > Chris, > > I don't think so. We certainly made changes to our BioPerl copy to work around the EMBOSS bug, but I don't think BioPerl needs to incorporate what we did (writing a little subroutine to fix the tags). An EMBOSS fix takes care of our problem. > > Scott > > > -----Original Message----- > From: Fields, Christopher J [mailto:cjfields at illinois.edu] > Sent: Monday, 05 December 05 2011 6:04 PM > To: Scott Markel > Cc: emboss at lists.open-bio.org; Kristine Briedis > Subject: Re: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro > > Scott, is this something that needs to be addressed on the bioperl end? > > chris > > On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > >> We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. >> >> Here are EMBOSS command lines for embossversion and acdtable. >> >>> embossversion >> Reports the current EMBOSS version number >> 6.4.0.2 >> >>> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html >> >> And here are the three sets of tag mismatches in the HTML. 
All involve an opening and a closing . >> >> Lines 66-9: >> >> >> "-sequence" associated seqall qualifiers >> >> Lines 183-6: >> >> >> "-pattern" associated pattern qualifiers >> >> Lines 212-5: >> >> >> "-outfile" associated report qualifiers >> >> Scott >> >> Scott Markel, Ph.D. >> Principal Bioinformatics Architect email: smarkel at accelrys.com >> Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 >> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 >> San Diego, CA 92121 fax: +1 858 799 5222 >> USA web: http://www.accelrys.com >> >> http://www.linkedin.com/in/smarkel >> Secretary, Board of Directors: >> International Society for Computational Biology >> Chair: ISCB Publications Committee >> Associate Editor: PLoS Computational Biology >> Editorial Board: Briefings in Bioinformatics >> >> >> >> >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss > > > From s.newslists at gmail.com Sun Dec 4 10:25:56 2011 From: s.newslists at gmail.com (Stefan) Date: Sun, 4 Dec 2011 11:25:56 +0100 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4E159971.9070509@ebi.ac.uk> References: <4E159971.9070509@ebi.ac.uk> Message-ID: 2011/7/7 Peter Rice : > Very close to release date next week, so hard to do anything immediately. Hello to the list, are there any news in plasmid documentation and in-silico cloning with EMBOSS? In the last month I was asked 4 times if this is possible with EMBOSS. People like it to work with EMBOSS and want to do that things also with this suite. I would be happy to hear any progress. 
Thanks Stefan From pmr at ebi.ac.uk Sun Dec 4 20:22:12 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Sun, 04 Dec 2011 20:22:12 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: <4E159971.9070509@ebi.ac.uk> Message-ID: <4EDBD674.9050409@ebi.ac.uk> On 04/12/2011 10:25, Stefan wrote: > 2011/7/7 Peter Rice: >> Very close to release date next week, so hard to do anything immediately. > > are there any news in plasmid documentation and in-silico cloning with > EMBOSS? In the last month I was asked 4 times if this is possible with > EMBOSS. People like it to work with EMBOSS and want to do that things > also with this suite. Can you give us some examples of what you would like to see? Examples always help us to design new applications. We still only have cirdna, mainly because we are limited by the plplot graphics library (and would welcome suggestions of other graphics libraries we could try). We have extended the capabilities by creating input files from some other applications. regards, Peter Rice EMBOSS team From pmr at ebi.ac.uk Mon Dec 5 09:26:00 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Mon, 05 Dec 2011 09:26:00 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDC8BDA.30006@fmi.ch> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> Message-ID: <4EDC8E28.7090702@ebi.ac.uk> On 12/05/2011 09:16 AM, Hans-Rudolf Hotz wrote: > Thanks to the Galaxy framework, our lab scientist are using more and > more EMBOSS tools. Now, it would be very handy if there was an EMBOSS > drawing tool (even a very simple one) which takes an embl or genbank > file as input and creates something "colorful". By default each Key or > Qualifier is given a specific color and shape (eg 'arrow'). And you can > change them as options. Interesting suggestion. Can you send an example of what you would like to see?
You can use your favourite drawing tool if you like, but it is OK if you draw it in crayon, scan it and send the image :-) regards, Peter Rice EMBOSS team From hrh at fmi.ch Mon Dec 5 09:16:10 2011 From: hrh at fmi.ch (Hans-Rudolf Hotz) Date: Mon, 5 Dec 2011 10:16:10 +0100 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDBD674.9050409@ebi.ac.uk> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> Message-ID: <4EDC8BDA.30006@fmi.ch> On 12/04/2011 09:22 PM, Peter Rice wrote: > On 04/12/2011 10:25, Stefan wrote: >> 2011/7/7 Peter Rice: >>> Very close to release date next week, so hard to do anything >>> immediately. >> >> are there any news in plasmid documentation and in-silico cloning with >> EMBOSS? In the last month I was asked 4 times if this is possible with >> EMBOSS. People like it to work with EMBOSS and want to do that things >> also with this suite. > > Can you give us some examples of what you would like to see? Examples > always help us to design new applications. > Hi Peter and Stefan Please allow me to jump in and tell you about our situation: Our wet lab scientists use, depending on which lab/university they are coming from, a combination of old commercial products (which we can still use thanks to the perpetual licenses) and free/open source products like 'Serial Cloner', 'Ape', 'Gentle', etc. I constantly 'preach' how vital it is to make sure you save each sequence/plasmid/etc. not only in the tool-specific format, but also in a text format like embl or genbank, where all the annotation is stored in the Feature Table. Thanks to the Galaxy framework, our lab scientists are using more and more EMBOSS tools. Now, it would be very handy if there was an EMBOSS drawing tool (even a very simple one) which takes an embl or genbank file as input and creates something "colorful". By default each Key or Qualifier is given a specific color and shape (eg 'arrow'). And you can change them as options. I hope this works as a use case?
Regards, Hans Hans-Rudolf Hotz, PhD Bioinformatics Support Friedrich Miescher Institute for Biomedical Research Maulbeerstrasse 66 4058 Basel/Switzerland > We still only have cirdna, mainly because we are limited by the plplot > graphics library (and would welcome suggestions of other graphics > libraries we could try). > > We have extended the capabilities by creating input files from some > other applications. > > regards, > > Peter Rice > EMBOSS team > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From p.j.a.cock at googlemail.com Mon Dec 5 10:08:59 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Dec 2011 10:08:59 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: <4EDC8E28.7090702@ebi.ac.uk> References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> <4EDC8E28.7090702@ebi.ac.uk> Message-ID: On Mon, Dec 5, 2011 at 9:26 AM, Peter Rice wrote: > On 12/05/2011 09:16 AM, Hans-Rudolf Hotz wrote: >> >> Thanks to the Galaxy framework, our lab scientist are using more and >> more EMBOSS tools. Now, it would be very handy if there was an EMBOSS >> drawing tool (even a very simple one) which takes an embl or genbank >> file as input and creates something "colorful". By default each Key or >> Qualifier is given a specific color and shape (eg 'arrow'). And you can >> change them as options. > > Interesting suggestion. Can you send an example of what you > would like to see? > > You can use your favourite drawing tool if you like, but it is OK if > you draw it in crayon, scan it and send the image :-) > I use Biopython's GenomeDiagram for this sort of thing, which internally uses a Python library called ReportLab which can produce PDF, SVG, PNG, etc. Sadly I doubt that would be suitable for EMBOSS which would really want a C library. 
GenomeDiagram has the notion of tracks - which can be useful for separating different types of features which would otherwise overlap (e.g. gene/mRNA/CDS) and other tricks. It also has some defaults about which feature qualifiers to use as the feature name (e.g. gene, locus_tag). I have considered writing a Galaxy tool to take an EMBL or GenBank file and produce a picture, but haven't got round to it yet. Anyway, there may be some useful ideas for layout here if nothing else. Peter From pmr at ebi.ac.uk Mon Dec 5 10:19:41 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Mon, 05 Dec 2011 10:19:41 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: <4E159971.9070509@ebi.ac.uk> <4EDBD674.9050409@ebi.ac.uk> <4EDC8BDA.30006@fmi.ch> <4EDC8E28.7090702@ebi.ac.uk> Message-ID: <4EDC9ABD.2020605@ebi.ac.uk> On 12/05/2011 10:08 AM, Peter Cock wrote: > I use Biopython's GenomeDiagram for this sort of thing, which > internally uses a Python library called ReportLab which can > produce PDF, SVG, PNG, etc. Sadly I doubt that would be > suitable for EMBOSS which would really want a C library. > > GenomeDiagram has the notion of tracks - which can be > useful for separating different types of features which would > otherwise overlap (e.g. gene/mRNA/CDS) and other tricks. > It also has some defaults about which feature qualifiers > to use as the feature name (e.g. gene, locus_tag). The EMBOSS solution to tracks would probably use the Sequence Ontology to define the tracks, and use features below some specified tag (or set of tags) for each track. We can apply the same tracks to showfeat (the text display of features). Tracks also simplify the issues of colours for features. lindna has code for rendering features with scaling and avoiding overlaps which we could also, I hope, reuse. Maybe not for the next release (release date 15th January, but code freeze before Christmas) but we can provide applications for testing by anyone interested after the release.
regards, Peter Rice EMBOSS Team From david.bauer at bayer.com Mon Dec 5 09:38:34 2011 From: david.bauer at bayer.com (david.bauer at bayer.com) Date: Mon, 5 Dec 2011 10:38:34 +0100 Subject: [EMBOSS] Plasmid drawing Message-ID: Hi Peter, I think Stefan means not only the drawing of plasmid maps but the creation of new constructs in the computer before doing it at the bench. This is a field where there are currently only commercial packages available like VectorNTI and Clone Manager ( http://www.scied.com/pr_cmpro.htm) etc. Clone Manager has been around since the times of MS-DOS but the prices have increased substantially. They used to advertise the software with the slogan: "It costs only as much as a cloning kit for the lab" - which was in the range of 300,-USD. The most important functions, which I think should be implemented first, are the cloning operations. So e.g. - take plasmid1, open the multi cloning site with BamHI and EcoRI - take plasmid2, cut out a fragment with BglII and EcoRI - ligate the fragment from plasmid2 into plasmid1 (BamHI and BglII have compatible ends which can be ligated, so an algorithm is needed to check this) - draw a map of the newly created plasmid3 This is a rather simple example. There can be more complex procedures like Klenow fill-in (blunt end ligation of incompatible restriction sites) or the use of more than 2 fragments in one ligation. I think most of the functions needed for this are already present in EMBOSS. It shouldn't be so complicated to create an application which uses functions from restrict, cutseq, pasteseq to accomplish the above mentioned task. Although it's probably not so trivial to make this happen before the release next week ;-) Cheers, David.
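
[Editorial sketch] The compatibility check David asks for (BamHI and BglII ends can be ligated, BamHI and EcoRI ends cannot) could look roughly like the following. This is only an illustration under stated assumptions, not EMBOSS code: the table ENZYMES and the helpers overhang, compatible and hybrid_junction are hypothetical names, only the three enzymes named in the message are covered, and the logic assumes palindromic sites cut symmetrically to leave 5' overhangs.

```python
# Hypothetical mini-table: recognition site and top-strand cut offset.
# BamHI G^GATCC, BglII A^GATCT, EcoRI G^AATTC (all leave 5' overhangs).
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "EcoRI": ("GAATTC", 1),
}


def overhang(enzyme):
    """Return the single-stranded overhang, read on the top strand.

    Assumes a palindromic site cut symmetrically, so the overhang is the
    stretch between the top-strand cut and its mirror position.
    """
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two sticky ends can anneal if their overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def hybrid_junction(enzyme_a, enzyme_b):
    """Sequence at the junction after ligating an A-cut end to a B-cut end."""
    if not compatible(enzyme_a, enzyme_b):
        raise ValueError(f"{enzyme_a} and {enzyme_b} ends are not compatible")
    site_a, cut_a = ENZYMES[enzyme_a]
    site_b, cut_b = ENZYMES[enzyme_b]
    # Left flank from enzyme A, shared overhang, right flank from enzyme B.
    return site_a[:cut_a] + overhang(enzyme_a) + site_b[len(site_b) - cut_b:]


if __name__ == "__main__":
    for pair in (("BamHI", "BglII"), ("BamHI", "EcoRI")):
        print(pair, "compatible:", compatible(*pair))
    print("BamHI/BglII junction:", hybrid_junction("BamHI", "BglII"))
```

A real application would read the cut positions from REBASE (as restrict does) and would also need to handle 3' overhangs, blunt ends and non-palindromic sites, none of which this sketch attempts.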
Peter Rice Sent by: emboss-bounces at lists.open-bio.org 04/12/2011 21:22 To Stefan Cc emboss at lists.open-bio.org Subject Re: [EMBOSS] Plasmid drawing On 04/12/2011 10:25, Stefan wrote: > 2011/7/7 Peter Rice: >> Very close to release date next week, so hard to do anything immediately. > > are there any news in plasmid documentation and in-silico cloning with > EMBOSS? In the last month I was asked 4 times if this is possible with > EMBOSS. People like it to work with EMBOSS and want to do that things > also with this suite. Can you give us some examples of what you would like to see? Examples always help us to design new applications. We still only have cirdna, mainly because we are limited by the plplot graphics library (and would welcome suggestions of other graphics libraries we could try). We have extended the capabilities by creating input files from some other applications. regards, Peter Rice EMBOSS team _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss From s.newslists at gmail.com Mon Dec 5 10:44:34 2011 From: s.newslists at gmail.com (Stefan) Date: Mon, 5 Dec 2011 11:44:34 +0100 Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: Message-ID: > I think Stefan means not only the drawing of plasmid maps but the creation of new constructs in the computer before doing it at the bench. What David wrote is exactly what I wanted to explain with "in-silico cloning". With plasmid documentation I meant that it is possible to draw a plasmid with features and restriction sites. Thanks From pmr at ebi.ac.uk Mon Dec 5 13:23:23 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Mon, 05 Dec 2011 13:23:23 +0000 Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: Message-ID: <4EDCC5CB.2090402@ebi.ac.uk> On 05/12/2011 12:52, Hotz, Hans-Rudolf wrote: > "cirdna" and "lindna" are nice and do their job most of the time.
In my > view, the problem is the input file format. Maybe a converter from > embl/genbank to *.crip or *.linp would be a good start? Or adding those > formats to the output options of seqret? We do already support "draw" as a report output format. If we add it as a feature output format you can use featcopy (or seqret -feat) to create a cirdna or lindna input file. We can then work on the feature format to add extra information by feature type. We could also add an application to merge a set of feature inputs so you could combine genbank/embl files with restriction map output, then use the results as input. I will see what I can do for the next release. Thanks for the very helpful suggestions Peter Rice EMBOSS Team From marko at cryst.bioc.cam.ac.uk Mon Dec 5 13:10:06 2011 From: marko at cryst.bioc.cam.ac.uk (Marko Hyvonen) Date: Mon, 5 Dec 2011 13:10:06 +0000 (GMT) Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: Message-ID: I would find this kind of in silico cloning a great addition to EMBOSS. As an example of (a free?) drawing program for plasmids, I use PlasMapper (http://wishart.biology.ualberta.ca/PlasMapper/ , source code is at least available on the site). Here is an example I made from their own selection of plasmids (hopefully valid link for a while still): http://wishart.biology.ualberta.ca/PlasMapper/jsp/displayPlasmidMap.jsp?fileName=plasMap26_1323082022185.png&fileFormat=png I like the fact that it includes a number of predefined features in plasmids as built-ins, so that you only need to include the "unique" features in your own plasmids (and I am sure the set of built-ins could be expanded much further). Outputs svg too, which is brilliant for editing later on. Marko On Mon, 5 Dec 2011, david.bauer at bayer.com wrote: > Hi Peter, > > I think Stefan means not only the drawing of plasmid maps but the creation > of new constructs in the computer before doing it at the bench.
> This is a field where there are currently only commercial packages > available like VectorNTI and Clone Manager ( > http://www.scied.com/pr_cmpro.htm) etc. > Clone manager is there since the times of MS-DOS but the prices have > increased substantially. They used to advertise the software with the > slogan: "It costs only as much as a cloning kit for the lab" - which was > in the range of 300,-USD. > > The most important functions, which I think should be implement first are > the cloning operations. > So e.g. > - take plasmid1, open the multi cloning site with BamHI and EcoRI > - take plasmid2, cut out a fragment with BglII and EcoRI > - ligate the fragment from plasmid2 into plasmid1 > (BamHI and BglII have compatible ends which can be ligated, so an > algorithm is needed to check this) > - draw a map of the newly created plasmid3 > This is a rather simple example. > There can be more complex procedures like Klenow fill in (blunt end > ligation of incompatible restriction sites) or the use of more than 2 > fragments in one ligation. > > I think most of the functions needed for this are already present in > EMBOSS. > It shouldn't be so complicated to create an application which uses > functions from restrict, cutseq, pasteseq to accomplish the above > mentioned task. > Although it's probably not so trivial to make this happen before the > release next week ;-) > > Cheers, > David. > > > > > Peter Rice > Gesendet von: emboss-bounces at lists.open-bio.org > 04/12/2011 21:22 > > An > Stefan > Kopie > emboss at lists.open-bio.org > Thema > Re: [EMBOSS] Plasmid drawing > > > > > > > On 04/12/2011 10:25, Stefan wrote: >> 2011/7/7 Peter Rice: >>> Very close to release date next week, so hard to do anything > immediately. >> >> are there any news in plasmid documentation and in-silico cloning with >> EMBOSS? In the last month I was asked 4 times if this is possible with >> EMBOSS. People like it to work with EMBOSS and want to do that things >> also with this suite. 
> > Can you give us some examples of what you would like to see? Examples > always help us to design new applications. > > We still only have cirdna, mainly because we are limited by the plplot > graphics library (and would welcome suggestions of other graphics > libraries we could try). > > We have extended the capabilities by creating input files from some > other applications. > > regards, > > Peter Rice > EMBOSS team > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > _____________________________________ Marko Hyvonen Department of Biochemistry, University of Cambridge marko at cryst.bioc.cam.ac.uk http://www-cryst.bioc.cam.ac.uk/groups/hyvonen tel: +44-(0)1223-766 044 mobile: +44-(0)7796-174 877 fax: +44-(0)1223-766 002 -------------------------------------- From hmenager at pasteur.fr Mon Dec 5 16:14:20 2011 From: hmenager at pasteur.fr (=?ISO-8859-1?Q?Herv=E9_M=E9nager?=) Date: Mon, 05 Dec 2011 17:14:20 +0100 Subject: [EMBOSS] ORF selection with EMBOSS Message-ID: <4EDCEDDC.3020200@pasteur.fr> Hi, I have been using transeq and checktrans to find ORFs in DNA sequences. I have a question regarding the selection of the ORFs in case checktrans finds multiple ones in the result of transeq: I would like to select the longest one automatically (assuming it is the correct one). Is there any existing tool in EMBOSS (or alternatively in any Bio* library) that does this job? Cheers, Hervé
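A minimal Python sketch of the selection asked about above, for the case where several input sequences are processed at once. It is not an EMBOSS tool and makes two assumptions: the ORFs are already in a FASTA file (for example protein output from an ORF finder), and each ORF ID has the form <parent>_<n>, which is the naming getorf appears to use. The function names read_fasta and longest_orf_per_parent and the example records are made up for illustration.

```python
from collections import OrderedDict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longest_orf_per_parent(records):
    """Map each parent sequence name to its longest (header, sequence) record.

    Assumption: the ORF ID is the first word of the header and looks like
    '<parent>_<n>'. Ties keep the first ORF seen.
    """
    best = OrderedDict()
    for header, seq in records:
        orf_id = header.split()[0]
        parent = orf_id.rsplit("_", 1)[0]
        if parent not in best or len(seq) > len(best[parent][1]):
            best[parent] = (header, seq)
    return best


# Hypothetical ORF records for two input genes.
example = """>geneA_1 [1 - 9]
MKT
>geneA_2 [20 - 40]
MKTLLVA
>geneB_1 [3 - 17]
MPQRS
""".splitlines()

for parent, (header, seq) in longest_orf_per_parent(read_fasta(example)).items():
    print(parent, header.split()[0], len(seq))
```

Grouping by the parent name is what avoids looping over the input genes one at a time; if the real ORF IDs follow a different pattern, only the rsplit line would need to change.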
From mathog at caltech.edu Mon Dec 5 16:35:31 2011 From: mathog at caltech.edu (mathog) Date: Mon, 05 Dec 2011 08:35:31 -0800 Subject: [EMBOSS] Plasmid drawing In-Reply-To: References: <4E159971.9070509@ebi.ac.uk> Message-ID: On Sun, 4 Dec 2011 11:25:56 +0100, Stefan wrote: > are there any news in plasmid documentation and in-silico cloning > with > EMBOSS? Pierre Lindenbaum wrote a nice little program called "cloneit" for figuring out cloning strategies. That is, specifically what series of reactions to use to insert a particular piece of DNA into a vector. Not sure if that is what you mean by "in-silico cloning" or not. There is a separate cgi version for web use. Here is the paper: http://www.ncbi.nlm.nih.gov/pubmed/9682060?dopt=Abstract The original distribution site is long since off line, but the code says that it may be redistributed (and incorporated in noncommercial software, but see the exact wording for details.) Anyway, if anybody wants to have a look, I packed up our copies and put them here: http://saf.bio.caltech.edu/pub/software/molbio/cloneit.tar.gz http://saf.bio.caltech.edu/pub/software/molbio/cloneitcgi.tar.gz This is not a graphics program, it is a reaction planning program. As for the graphical output of plasmid diagrams, historically none of the drawing programs does exactly what the end users want. (These get closer and closer, but never quite cover all the bases). For that reason it is really important that the graphics driver be able to output to an object format that can then be imported into a drawing program and edited there. Modifying an image never turns out as nicely. When going to object formats it is key that text remain text and not be converted into line segments or paths. Users get really frustrated when they can't edit labels or change fonts or font sizes. Locally we have a hacked up GCG driver that emits cgm format for the GCG drawing programs. Nowadays going to SVG would make more sense. 
While it is sometimes possible to read PDF back into a drawing program like inkscape, text imported that way is very hit and miss, mostly miss. Often it comes through with each letter as a separate text object, which is no fun at all to work with. Other times it will come in with each letter separately kerned, and when that kerning is removed, the text jumps all over the place in unpredictable ways. Why would you remove the kerning? Because there is another bug/limitation in inkscape where kerned text is automatically converted to images when exporting to emf or wmf format, as when trying to move that image into a powerpoint document. Moving rotated text between various programs is the most unreliable operation, as far as I can tell. For instance, I have never found a way to get text which runs at an angle other than 0 degrees from inskcape into powerpoint with that angle intact. The text comes through, but the angle is lost. Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From Scott.Markel at accelrys.com Tue Dec 6 01:14:19 2011 From: Scott.Markel at accelrys.com (Scott Markel) Date: Mon, 5 Dec 2011 17:14:19 -0800 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. Here are EMBOSS command lines for embossversion and acdtable. > embossversion Reports the current EMBOSS version number 6.4.0.2 > acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing . 
Lines 66-9: "-sequence" associated seqall qualifiers Lines 183-6: "-pattern" associated pattern qualifiers Lines 212-5: "-outfile" associated report qualifiers Scott Scott Markel, Ph.D. Principal Bioinformatics Architect email: smarkel at accelrys.com Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 San Diego, CA 92121 fax: +1 858 799 5222 USA web: http://www.accelrys.com http://www.linkedin.com/in/smarkel Secretary, Board of Directors: International Society for Computational Biology Chair: ISCB Publications Committee Associate Editor: PLoS Computational Biology Editorial Board: Briefings in Bioinformatics From cjfields at illinois.edu Tue Dec 6 02:03:35 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 6 Dec 2011 02:03:35 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> Message-ID: Scott, is this something that needs to be addressed on the bioperl end? chris On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > > Here are EMBOSS command lines for embossversion and acdtable. > >> embossversion > Reports the current EMBOSS version number > 6.4.0.2 > >> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html > > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing .
> > Lines 66-9: > > > "-sequence" associated seqall qualifiers > > Lines 183-6: > > > "-pattern" associated pattern qualifiers > > Lines 212-5: > > > "-outfile" associated report qualifiers > > Scott > > Scott Markel, Ph.D. > Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From pandey.gaurav at gmail.com Tue Dec 6 04:07:10 2011 From: pandey.gaurav at gmail.com (Gaurav Pandey) Date: Mon, 5 Dec 2011 23:07:10 -0500 Subject: [EMBOSS] Research opportunities in systems biology at Mt. Sinai School of Medicine Message-ID: Please forward this announcement to potential applicants if it is not directly applicable to you. Apologies for cross-posting. --------------------------- The Institute for Genomics and Multi-scale Biology (multiscale.mssm.edu) at the Mount Sinai School of Medicine (www.mssm.edu) in New York City is seeking applications from enthusiastic and capable researchers for positions at Ph.D., postdoc and staff level. The Institute is the hub of genomics research at Mount Sinai, collaborating with 13 other disease-oriented and core-technology-based institutes, and aims to translate genomics-based science into actionable medical knowledge. The Institute's faculty members have expertise in a wide range of areas such as machine learning, biostatistics, genetics, genomics, sequencing technology, data science, and high-performance computing (HPC).
We wish to recruit capable and motivated graduate students, post-doctoral associates, and research staff members. Please look at the detailed announcement at the end of this email providing information on specific requirements for these positions. PhD and MD/PhD applicants should consult the requirements of the Graduate School at Mount Sinai ( http://www.mssm.edu/education/graduate-school) for application and potential admission. Others should send their requisite application material (see below) to multiscale.biology at mssm.edu. For more information, visit the Institute's website (multiscale.mssm.edu) and/or email multiscale.biology at mssm.edu. ------------------------------- The Multiscale Institute is actively recruiting graduate students, post-doctoral scholars and technical staff interested in working on challenging and important problems in computational systems biology, especially with a focus on translating this work to the medical domain. Prospective Students If you are a prospective MD ( http://www.mssm.edu/education/medical-education/programs/md-program), Ph.D. or MD/Ph.D. (http://www.mssm.edu/education/graduate-school) student, the Multiscale Institute provides a cutting-edge research environment, strong mentorship and comprehensive training in computational biology. Please check out the various degree programs within the Mount Sinai School of Medicine, and for Ph.D. students, the Genetics and Genomics Training Area (http://www.mssm.edu/education/graduate-school/degrees-and-programs/phd-program/multidisciplinary-training-areas/genetics-and-genomic-sciences)in particular, for more information. Alternatively, contact us directly at multiscale.biology at mssm.edu for information. The Multiscale Institute faculty members have backgrounds in Computer Science, Engineering, Statistics, Mathematics, Physics, Genetics, Molecular Biology, and related disciplines, and are looking for students from all of those fields. 
The essential requirement is an interest in and capability of thinking about real-world problems in quantitative terms. Additional desired (but not essential) characteristics will be (1) a background in biology, acquired through courses and/or research work and/or (2) experience with data-intensive real-world problems (from any domain) and/or (3) experience with programming. If you are not yet applying to graduate school, we recommend that you check out SURP ( http://www.mssm.edu/education/graduate-school/degrees-and-programs/summer-undergraduate-research-program), the Summer Undergraduate Research Program at Mount Sinai. Prospective Post-doctoral Scholars At the Multiscale Institute, you will have a chance to work with world class scientists and engineers tackling important and highly visible problems in computational biology, with an emphasis on translating such work into the medical domain. Some example targets of the Institute are identifying actionable biomarkers and therapeutics for complex human diseases. Candidates should have a recent PhD and/or MD degree in a science, technology, engineering or medical field, and discipline and high motivation to pursue independent research in computational biology. Applicants are expected to have a solid background in programming and computational techniques, with a working knowledge of molecular biology and genetics being highly desirable. To apply, send a CV, a research statement and contact information for three referees to multiscale.biology at mssm.edu. More information about postdoctoral training at Mt. Sinai can be found at the Office of Postdoctoral Affairs ( http://www.mssm.edu/education/postdoctoral-training). Technical Staff We are looking for engineers and scientists to help Institute researchers extract clinical insight from petabyte-scale data sets. The ideal candidate will be comfortable in an academic environment, and will bring energy and creativity to the Institute's work.
In particular, the applicant is expected to have substantial experience working with big data and its associated infrastructure, including data retrieval from a variety of data-stores, statistics with R/C++/Java etc., and visualization for the web and print. A biology background is desired, but not required; we are primarily looking for people with strong data analysis and software engineering skills. To apply, send your resume to multiscale.biology at mssm.edu -- Gaurav Pandey, Ph.D. Assistant Professor Department of Genetics and Genomic Sciences Mount Sinai School of Medicine, New York City http://www.mssm.edu/profiles/gaurav-pandey From pmr at ebi.ac.uk Tue Dec 6 08:52:33 2011 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 06 Dec 2011 08:52:33 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> Message-ID: <4EDDD7D1.4030001@ebi.ac.uk> Dear Scott, On 06/12/2011 00:35, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing. Oops. Well spotted. Simple fix in ajax/acd/ajacd.c ... we have a new release due in January when our current funding ends, but will look to release a patch for you. From the version number, was this mEMBOSS you were using? 
regards, Peter Rice EMBOSS Team From jison at ebi.ac.uk Tue Dec 6 11:00:15 2011 From: jison at ebi.ac.uk (Jon Ison) Date: Tue, 6 Dec 2011 11:00:15 -0000 (UTC) Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <4EDCEDDC.3020200@pasteur.fr> References: <4EDCEDDC.3020200@pasteur.fr> Message-ID: <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> Hi Hervé Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the longest ORF then that should be an easy enough option to add. Cheers Jon > Hi, > I have been using transeq and checktrans to find ORFs in DNA sequences. > I have a question regarding the selection of the ORFs in case > checktrans finds multiple ones in the result of transeq: I would like to > select the longest one automatically (assuming it is the correct one). > Is there any existing tool in EMBOSS (or alternatively in any Bio* > library) that does this job? > Cheers, > Hervé > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From hmenager at pasteur.fr Tue Dec 6 11:07:11 2011 From: hmenager at pasteur.fr (=?ISO-8859-1?Q?Herv=E9_M=E9nager?=) Date: Tue, 06 Dec 2011 12:07:11 +0100 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> Message-ID: <4EDDF75F.8020400@pasteur.fr> Hi Jon, That's exactly what would make me happy. Assuming it makes it easier for you if I formulate officially this request, where do I need to send the request? Hervé On 12/06/2011 12:00 PM, Jon Ison wrote: > Hi Hervé > > Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the longest > ORF then that should be an easy enough option to add.
> > Cheers > > Jon > > >> Hi, >> I have been using transeq and checktrans to find ORFs in DNA sequences. >> I have a question regarding the selection of the ORFs in case >> checktrans finds multiple ones in the result of transeq: I would like to >> select the longest one automatically (assuming it is the correct one). >> Is there any existing tool in EMBOSS (or alternatively in any Bio* >> library) that does this job? >> Cheers, >> Hervé >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > From jison at ebi.ac.uk Tue Dec 6 11:11:01 2011 From: jison at ebi.ac.uk (Jon Ison) Date: Tue, 6 Dec 2011 11:11:01 -0000 (UTC) Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <4EDDF75F.8020400@pasteur.fr> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> Message-ID: <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> A note of it is already on the SourceForge "Feature Requests" ... J:) > Hi Jon, > > That's exactly what would make me happy. Assuming it makes it easier for > you if I formulate officially this request, where do I need to send the > request? > > Hervé > > On 12/06/2011 12:00 PM, Jon Ison wrote: >> Hi Hervé >> >> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >> longest >> ORF then that should be an easy enough option to add. >> >> Cheers >> >> Jon >> >> >>> Hi, >>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>> I have a question regarding the selection of the ORFs in case >>> checktrans finds multiple ones in the result of transeq: I would like to >>> select the longest one automatically (assuming it is the correct one). >>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>> library) that does this job? >>> Cheers, >>> Hervé
>>> _______________________________________________ >>> EMBOSS mailing list >>> EMBOSS at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/emboss >>> >> >> > From andrespinzon at gmail.com Tue Dec 6 12:53:36 2011 From: andrespinzon at gmail.com (Andres Pinzon) Date: Tue, 6 Dec 2011 07:53:36 -0500 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> Message-ID: Herve, Have you tried using "getorf" and then "sizeseq"? I think it will work. Then you could get the first sequence from the output. Best, On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: > A note of it is already on the SourceForge "Feature Requests" ... > > J:) > > > >> Hi Jon, >> >> That's exactly what would make me happy. Assuming it makes it easier for >> you if I formulate officially this request, where do I need to send the >> request? >> >> Hervé >> >> On 12/06/2011 12:00 PM, Jon Ison wrote: >>> Hi Hervé >>> >>> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >>> longest >>> ORF then that should be an easy enough option to add. >>> >>> Cheers >>> >>> Jon >>> >>> >>>> Hi, >>>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>>> I have a question regarding the selection of the ORFs in case >>>> checktrans finds multiple ones in the result of transeq: I would like to >>>> select the longest one automatically (assuming it is the correct one). >>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>> library) that does this job? >>>> Cheers, >>>> Hervé
>>>> _______________________________________________ >>>> EMBOSS mailing list >>>> EMBOSS at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>> >>> >>> >> > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss -- Andrés Pinzón From hmenager at pasteur.fr Tue Dec 6 14:20:11 2011 From: hmenager at pasteur.fr (=?UTF-8?B?SGVydsOpIE3DqW5hZ2Vy?=) Date: Tue, 06 Dec 2011 15:20:11 +0100 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> Message-ID: <4EDE249B.7080207@pasteur.fr> Hi Andres, From my point of view, it would do the job if I didn't have a _list_ of genes as an input: I hence need to input a list of genes and get as an output a list of the longest ORF computed for each gene. If I do this directly with getorf and sizeseq, my understanding is that sizeseq won't allow me to select the longest ORF for each input sequence. I can make the loop myself, but I won't unless it is not here somewhere in EMBOSS ;). Cheers, Hervé On 12/06/2011 01:53 PM, Andres Pinzon wrote: > Herve, > Have you tried using "getorf" and then "sizeseq"? > I think it will work. Then you could get the first sequence from the output. > > Best, > > On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: >> A note of it is already on the SourceForge "Feature Requests" ... >> >> J:) >> >> >> >>> Hi Jon, >>> >>> That's exactly what would make me happy. Assuming it makes it easier for >>> you if I formulate officially this request, where do I need to send the >>> request? >>> >>> Hervé >>> >>> On 12/06/2011 12:00 PM, Jon Ison wrote: >>>> Hi Hervé
>>>> >>>> Nothing in EMBOSS so far as I'm aware, but if all you want is for checktrans to report the >>>> longest >>>> ORF then that should be an easy enough option to add. >>>> >>>> Cheers >>>> >>>> Jon >>>> >>>> >>>>> Hi, >>>>> I have been using transeq and checktrans to find ORFs in DNA sequences. >>>>> I have a question regarding the selection of the ORFs in case >>>>> checktrans finds multiple ones in the result of transeq: I would like to >>>>> select the longest one automatically (assuming it is the correct one). >>>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>>> library) that does this job? >>>>> Cheers, >>>>> Hervé >>>>> _______________________________________________ >>>>> EMBOSS mailing list >>>>> EMBOSS at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>>> >>>> >>>> >>> >> >> >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss > > > From andrespinzon at gmail.com Tue Dec 6 14:19:39 2011 From: andrespinzon at gmail.com (Andres Pinzon) Date: Tue, 6 Dec 2011 09:19:39 -0500 Subject: [EMBOSS] ORF selection with EMBOSS In-Reply-To: <4EDE249B.7080207@pasteur.fr> References: <4EDCEDDC.3020200@pasteur.fr> <36253.172.22.100.208.1323169215.squirrel@webmail.ebi.ac.uk> <4EDDF75F.8020400@pasteur.fr> <48987.172.22.100.208.1323169861.squirrel@webmail.ebi.ac.uk> <4EDE249B.7080207@pasteur.fr> Message-ID: D'accord ;) Best On Tue, Dec 6, 2011 at 9:20 AM, Hervé Ménager wrote: > Hi Andres, > From my point of view, it would do the job if I didn't have a _list_ of > genes as an input: I hence need to input a list of genes and get as an > output a list of the longest ORF computed for each gene. If I do this > directly with getorf and sizeseq, my understanding is that sizeseq won't > allow me to select the longest ORF for each input sequence.
I can make the > loop myself, but I won't unless it is not here somewhere in EMBOSS ;). > Cheers, > Hervé > > > On 12/06/2011 01:53 PM, Andres Pinzon wrote: >> >> Herve, >> Have you tried using "getorf" and then "sizeseq"? >> I think it will work. Then you could get the first sequence from the >> output. >> >> Best, >> >> On Tue, Dec 6, 2011 at 6:11 AM, Jon Ison wrote: >>> >>> A note of it is already on the SourceForge "Feature Requests" ... >>> >>> J:) >>> >>> >>> >>>> Hi Jon, >>>> >>>> That's exactly what would make me happy. Assuming it makes it easier for >>>> you if I formulate officially this request, where do I need to send the >>>> request? >>>> >>>> Hervé >>>> >>>> On 12/06/2011 12:00 PM, Jon Ison wrote: >>>>> >>>>> Hi Hervé >>>>> >>>>> Nothing in EMBOSS so far as I'm aware, but if all you want is for >>>>> checktrans to report the >>>>> longest >>>>> ORF then that should be an easy enough option to add. >>>>> >>>>> Cheers >>>>> >>>>> Jon >>>>> >>>>> >>>>>> Hi, >>>>>> I have been using transeq and checktrans to find ORFs in DNA >>>>>> sequences. >>>>>> I have a question regarding the selection of the ORFs in case >>>>>> checktrans finds multiple ones in the result of transeq: I would like >>>>>> to >>>>>> select the longest one automatically (assuming it is the correct one). >>>>>> Is there any existing tool in EMBOSS (or alternatively in any Bio* >>>>>> library) that does this job? >>>>>> Cheers, >>>>>> Hervé
>>>>>> _______________________________________________ >>>>>> EMBOSS mailing list >>>>>> EMBOSS at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/emboss >>>>>> >>>>> >>>>> >>>> >>> >>> >>> _______________________________________________ >>> EMBOSS mailing list >>> EMBOSS at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/emboss >> >> >> >> > -- Andrés Pinzón From Scott.Markel at accelrys.com Tue Dec 6 20:21:05 2011 From: Scott.Markel at accelrys.com (Scott Markel) Date: Tue, 6 Dec 2011 12:21:05 -0800 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <4EDDD7D1.4030001@ebi.ac.uk> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783783@EXCH1-COLO.accelrys.net> <4EDDD7D1.4030001@ebi.ac.uk> Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA78386B@EXCH1-COLO.accelrys.net> Peter, Yes, what I generated for you was Windows-based, but we run EMBOSS on both Windows and Linux in Pipeline Pilot. Regarding a patch, our next release is later in 2012, so whatever is more convenient for you regarding timing is fine with us. Scott -----Original Message----- From: Peter Rice [mailto:pmr at ebi.ac.uk] Sent: Tuesday, 06 December 06 2011 12:53 AM To: Scott Markel Cc: emboss at lists.open-bio.org; Kristine Briedis Subject: Re: HTML tag mismatch in acdtable output for fuzzpro Dear Scott, On 06/12/2011 00:35, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing. Oops. Well spotted. Simple fix in ajax/acd/ajacd.c ... we have a new release due in January when our current funding ends, but will look to release a patch for you.
From the version number, was this mEMBOSS you were using? regards, Peter Rice EMBOSS Team From Scott.Markel at accelrys.com Tue Dec 6 20:21:18 2011 From: Scott.Markel at accelrys.com (Scott Markel) Date: Tue, 6 Dec 2011 12:21:18 -0800 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> Message-ID: <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> Chris, I don't think so. We certainly made changes to our BioPerl copy to work around the EMBOSS bug, but I don't think BioPerl needs to incorporate what we did (writing a little subroutine to fix the tags). An EMBOSS fix takes care of our problem. Scott -----Original Message----- From: Fields, Christopher J [mailto:cjfields at illinois.edu] Sent: Monday, 05 December 05 2011 6:04 PM To: Scott Markel Cc: emboss at lists.open-bio.org; Kristine Briedis Subject: Re: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro Scott, is this something that needs to be addressed on the bioperl end? chris On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. > > Here are EMBOSS command lines for embossversion and acdtable. > >> embossversion > Reports the current EMBOSS version number > 6.4.0.2 > >> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html > > And here are the three sets of tag mismatches in the HTML. All involve an opening and a closing . > > Lines 66-9: > > > "-sequence" associated seqall qualifiers > > Lines 183-6: > > > "-pattern" associated pattern qualifiers > > Lines 212-5: > > > "-outfile" associated report qualifiers > > Scott > > Scott Markel, Ph.D. 
> Principal Bioinformatics Architect email: smarkel at accelrys.com > Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 > 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 > San Diego, CA 92121 fax: +1 858 799 5222 > USA web: http://www.accelrys.com > > http://www.linkedin.com/in/smarkel > Secretary, Board of Directors: > International Society for Computational Biology > Chair: ISCB Publications Committee > Associate Editor: PLoS Computational Biology > Editorial Board: Briefings in Bioinformatics > > > > > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss From cjfields at illinois.edu Tue Dec 6 22:00:52 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 6 Dec 2011 22:00:52 +0000 Subject: [EMBOSS] problem with EMBOSS configuration '--without x' Message-ID: <68A7E0DA-660A-48CC-9646-4E3CD42C2367@illinois.edu> Not sure when the next version is due out, but I thought this is worth mentioning in case someone else runs into it. We had an odd issue with our local EMBOSS installation (6.4.0) on a RHEL VM which is likely a bug in the configuration step. We were installing for use mainly with Galaxy, and didn't need X11 configuration, so we configured as follows: ./configure --prefix=/opt/local/Bio/EMBOSS --without-x However, the installed binaries couldn't find the acd files or data; the only way to find them (as well as the data files) was to explicitly set EMBOSS_ACDROOT and EMBOSS_DATA. Oddly, installing libx11-devel and removing the '--without-x' flag during configuration worked just fine, no need for default env variables or .embossrc files. Anyone else run into this? chris Christopher Fields Senior Research Scientist National Center for Supercomputing Applications Institute for Genomic Biology University of Illinois Urbana-Champaign 1206 W. Gregory Dr. 
, MC-195 Urbana, IL 61801 From cjfields at illinois.edu Wed Dec 7 02:20:55 2011 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 7 Dec 2011 02:20:55 +0000 Subject: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro In-Reply-To: <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> References: <5ACBA19439E77B43A06F4CAB897EC97702FA783787@EXCH1-COLO.accelrys.net> <5ACBA19439E77B43A06F4CAB897EC97702FA78386C@EXCH1-COLO.accelrys.net> Message-ID: <7A5DAABA-33BE-4D94-8224-2353D3C2B90F@illinois.edu> Yeah, caught Peter's response on that. Just wanted to make sure there isn't anything we need to do from our end :) chris On Dec 6, 2011, at 2:21 PM, Scott Markel wrote: > Chris, > > I don't think so. We certainly made changes to our BioPerl copy to work around the EMBOSS bug, but I don't think BioPerl needs to incorporate what we did (writing a little subroutine to fix the tags). An EMBOSS fix takes care of our problem. > > Scott > > > -----Original Message----- > From: Fields, Christopher J [mailto:cjfields at illinois.edu] > Sent: Monday, 05 December 05 2011 6:04 PM > To: Scott Markel > Cc: emboss at lists.open-bio.org; Kristine Briedis > Subject: Re: [EMBOSS] HTML tag mismatch in acdtable output for fuzzpro > > Scott, is this something that needs to be addressed on the bioperl end? > > chris > > On Dec 5, 2011, at 7:14 PM, Scott Markel wrote: > >> We use BioPerl to build EMBOSS command lines in Pipeline Pilot. After updating BioPerl to 1.6.9 and EMBOSS to 6.4.0 we noticed a problem. There are HTML tag mismatches that BioPerl, via XML::Twig, can't handle and skips. In investigating a bit, it looks like there was a change in the acdtable output. >> >> Here are EMBOSS command lines for embossversion and acdtable. >> >>> embossversion >> Reports the current EMBOSS version number >> 6.4.0.2 >> >>> acdtable fuzzpro -help -verbose >& fuzzpro_6.4.0.html >> >> And here are the three sets of tag mismatches in the HTML. 
All involve an opening and a closing . >> >> Lines 66-9: >> >> >> "-sequence" associated seqall qualifiers >> >> Lines 183-6: >> >> >> "-pattern" associated pattern qualifiers >> >> Lines 212-5: >> >> >> "-outfile" associated report qualifiers >> >> Scott >> >> Scott Markel, Ph.D. >> Principal Bioinformatics Architect email: smarkel at accelrys.com >> Accelrys (Pipeline Pilot R&D) mobile: +1 858 205 3653 >> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603 >> San Diego, CA 92121 fax: +1 858 799 5222 >> USA web: http://www.accelrys.com >> >> http://www.linkedin.com/in/smarkel >> Secretary, Board of Directors: >> International Society for Computational Biology >> Chair: ISCB Publications Committee >> Associate Editor: PLoS Computational Biology >> Editorial Board: Briefings in Bioinformatics >> >> >> >> >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss > > >
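For anyone who wants to locate mismatches like the ones reported in this thread before a fixed EMBOSS release is available, here is a rough Python sketch using only the standard library. It is neither the BioPerl workaround mentioned above nor the EMBOSS fix. The actual tag names are not preserved in this archive, so the example input is invented, and find_tag_mismatches and TagChecker are hypothetical names.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag; ignored by the checker.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class TagChecker(HTMLParser):
    """Record closing tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []     # open tags as (name, line)
        self.problems = []  # (line, closing tag found, open tag expected)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        line = self.getpos()[0]
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
            return
        expected = self.stack[-1][0] if self.stack else None
        self.problems.append((line, tag, expected))
        if self.stack:
            # Treat the wrong close as closing the innermost open element,
            # so one mistake is reported once instead of cascading.
            self.stack.pop()


def find_tag_mismatches(html):
    """Return a list of (line, closing_tag, expected_tag) tuples."""
    checker = TagChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


# Invented input: a cell opened with one tag and closed with another.
sample = "<table>\n<tr><td>x</th></tr>\n</table>"
for line, found, expected in find_tag_mismatches(sample):
    print("line %d: found </%s>, expected </%s>" % (line, found, expected))
```

The reported line numbers are relative to the text passed in, so running this over the acdtable HTML output should point at the same kind of line ranges listed in the original report.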