From jbdundas at gmail.com Sat Aug 9 15:18:33 2008
From: jbdundas at gmail.com (jitesh dundas)
Date: Sun, 10 Aug 2008 00:48:33 +0530
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
Message-ID: <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>

Hi,

Based on the idea from Mr. Paul, I am trying to parallelize some of
the CPU-intensive applications, and I hope I can resolve some issues
here.
I am looking through the EMBOSS homepage for existing and past projects,
to get a feel for the projects that may fit this requirement.
Once done, we can plan and try to parallelize them. I have a few ideas
here which I will share with you soon for your review and feedback.

Ideas and input from anybody are most welcome.

Regards,
Jitesh Dundas
http://jiteshbdundas.blogspot.com
Mobile- +91-9860925706


---------- Forwarded message ----------
From: "Guermonprez, Paul"
Date: Wed, 30 Jul 2008 13:44:55 +0100
Subject: RE: [emboss-dev] Request For Work
To: jitesh dundas

Hello,

You may try to parallelize some of the most CPU-intensive apps?

Regards, Paul.


Paul Guermonprez - Intel
Sr. Software Engineer - Intel Software EMEA
email : paul.guermonprez at intel.com
phone : +33 1 58 87 72 41
mobile : +33 6 26 23 67 62


-----Original Message-----
From: emboss-dev-bounces at lists.open-bio.org
[mailto:emboss-dev-bounces at lists.open-bio.org] On Behalf Of jitesh
dundas
Sent: Wednesday, July 30, 2008 2:32 PM
To: emboss-dev at lists.open-bio.org
Subject: [emboss-dev] Request For Work

Dear All,

Greetings!

I am interested in contributing to this wonderful project. Can somebody
please suggest where to start?

*About Myself*

I want to pursue a doctoral program in bioinformatics next year, for
which I am preparing right now.

However, I am still looking for some guidance and work in this field to help
me get the experience needed.

I have an inclination towards the fields of Finance, Medicine and Computers
and constantly look for ways to use my knowledge and
capabilities to innovate and discover in them.

I have a passion for research work on complex and ambiguous
problems. I work relentlessly towards resolving them until they are
solved permanently with the best possible solution.
I am always ready to learn new things and to teach others too.

I can work independently as well as in teams, with minimum supervision.

I continuously try to improve my work by focusing on innovation and
scientific reasoning.

I am a hard-working, sincere and research-oriented professional with a
scientific bent of mind and a never-give-up attitude.

I hope you will give me an opportunity to explore the world of
research under you.

Yours sincerely,
Jitesh Dundas
http://jiteshbdundas.blogspot.com
Mobile- +91-9860925706
_______________________________________________
emboss-dev mailing list
emboss-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss-dev
---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris,
92196 Meudon Cedex, France
Registration Number: 302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

From pmr at ebi.ac.uk Sun Aug 10 06:38:14 2008
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 10 Aug 2008 11:38:14 +0100
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
    <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
Message-ID: <489EC516.7040105@ebi.ac.uk>

Dear Jitesh,

> Based on the idea from Mr. Paul, I am trying to parallelize some of
> the CPU-intensive applications, and I hope I can resolve some issues
> here.

Interesting. Which platform are you working on?

Are you planning to look only at the main application, or to parallelize
some of the library code too?

> I am looking through the EMBOSS homepage for existing and past projects,
> to get a feel for the projects that may fit this requirement.

We are working on updating the website. What information are you looking
for?

regards,

Peter Rice

From kpodesta at redbrick.dcu.ie Mon Aug 11 05:36:37 2008
From: kpodesta at redbrick.dcu.ie (Karl Podesta)
Date: Mon, 11 Aug 2008 10:36:37 +0100
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
    <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
Message-ID: <20080811093637.GA9624@minerva.redbrick.dcu.ie>

Hi Jitesh,

You might be interested in a paper I wrote a good few years ago now. I
did an exploratory performance comparison of all EMBOSS apps in the
suite (at the time) by just running them with larger sequence sizes,
and then wrote a shell script to parallelise them across multiple
cluster nodes on the basis of sequence only. Very basic, mostly
exploratory and indicative, but it might be of some help.

http://www.computing.dcu.ie/~kpodesta/papers/EMBOSSpaperLNCS.pdf

Regards,
Karl

From jbdundas at gmail.com Wed Aug 13 10:12:43 2008
From: jbdundas at gmail.com (jitesh dundas)
Date: Wed, 13 Aug 2008 19:42:43 +0530
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <489EC516.7040105@ebi.ac.uk>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
    <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
    <489EC516.7040105@ebi.ac.uk>
Message-ID: <326ea8620808130712m6acd991bj5d0bea5727b7c21a@mail.gmail.com>

Dear Peter,

Thank you for your reply. Please excuse the delay in replying; I
was out of town.

I am looking at working on this issue in two ways:

1) I wish to parallelize the phases of different software packages (if
they are in the development stage).
2) Next, if there is a connection or dependency between two or more
projects (or applications), then we can try to supply the output that is
needed based on the current status of the output-supplying application.

I think this should save us a lot of time in processing information or
requests, as we will not have to wait until the entire application has
completed.

I will need to know if any relationship has been identified between any of
the applications defined in the EMBOSS project. If relations are already
present between the applications, it will become easier to get a
handle to move the execution from one point to another.

Also, running applications in parallel will require a change in the way we
make our applications. We need to define a master relationship between all
the applications, so as to relate all the applications with each other.
I am looking at ways to create a master relationship and also at ways to
parallelize the execution of applications.

Any feedback on this topic is most welcome.

Regards,
Jitesh Dundas
http://jiteshbdundas.blogspot.com
Mobile- +91-9860925706

From pmr at ebi.ac.uk Wed Aug 13 10:21:58 2008
From: pmr at ebi.ac.uk (Peter Rice)
Date: Wed, 13 Aug 2008 15:21:58 +0100
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <326ea8620808130712m6acd991bj5d0bea5727b7c21a@mail.gmail.com>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
    <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
    <489EC516.7040105@ebi.ac.uk>
    <326ea8620808130712m6acd991bj5d0bea5727b7c21a@mail.gmail.com>
Message-ID: <48A2EE06.3050708@ebi.ac.uk>

Dear jitesh,

> Thank you for your reply. Please excuse the delay in replying; I
> was out of town.
> I am looking at working on this issue in two ways:
> 1) I wish to parallelize the phases of different software packages (if
> they are in the development stage).
> 2) Next, if there is a connection or dependency between two or more
> projects (or applications), then we can try to supply the output that is
> needed based on the current status of the output-supplying application.

Aha ... so you are looking at running several EMBOSS applications in
parallel? That is a very interesting issue for us.

> I will need to know if any relationship has been identified between any of
> the applications defined in the EMBOSS project. If relations are already
> present between the applications, it will become easier to get a
> handle to move the execution from one point to another.

The inputs and outputs of all EMBOSS applications are marked up in the
.acd files with a "knowntype" that identifies common outputs that could,
for example, be combined and visualised together, and also which outputs
could be used as inputs by other applications. For sequences, features,
alignments and reports this includes whether the type is nucleotide or
protein.

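As a rough illustration of how matching "knowntype" values could tell us which applications can feed which others, and which ones could then run side by side, here is a minimal Python sketch. Everything in it is an assumption made for the example: the application names (app_a to app_d), the knowntype strings and the dictionary layout are invented, nothing is read from real .acd files, and this is not how EMBOSS itself processes ACD.

```python
from graphlib import TopologicalSorter

# Hypothetical, hand-written summary of ACD markup: for each application,
# the knowntypes it reads and the knowntypes it writes. Real values would
# have to be extracted from the .acd files.
APPS = {
    "app_a": {"inputs": set(), "outputs": {"type one"}},
    "app_b": {"inputs": set(), "outputs": {"type two"}},
    "app_c": {"inputs": {"type one", "type two"}, "outputs": {"type three"}},
    "app_d": {"inputs": {"type three"}, "outputs": set()},
}


def build_dependencies(apps):
    """Map each application to the applications whose output it can read."""
    deps = {name: set() for name in apps}
    for consumer, c_info in apps.items():
        for producer, p_info in apps.items():
            # A shared knowntype means the producer's output fits the consumer.
            if producer != consumer and p_info["outputs"] & c_info["inputs"]:
                deps[consumer].add(producer)
    return deps


def parallel_stages(deps):
    """Group applications into stages; members of a stage are independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the relations are circular
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    dependencies = build_dependencies(APPS)
    for number, stage in enumerate(parallel_stages(dependencies), 1):
        print(number, stage)
```

Each stage lists applications that do not depend on one another, so in principle they could be started together; a later stage starts once the stages before it have finished. The standard-library graphlib module needs Python 3.9 or later.
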
> Also, running applications in parallel will require a change in the way we
> make our applications. We need to define a master relationship between all
> the applications, so as to relate all the applications with each other.

We are also looking at adding definitions for the algorithm used by an
application, and a standard way to represent the transformations of
inputs into outputs.

Any feedback on these issues would be very welcome.

We are also interested in looking at executing EMBOSS code in parallel,
if anyone is looking at that.

regards,

Peter Rice

From David.Lapointe at umassmed.edu Wed Aug 13 10:46:31 2008
From: David.Lapointe at umassmed.edu (Lapointe, David)
Date: Wed, 13 Aug 2008 10:46:31 -0400
Subject: [emboss-dev] Fwd: Request For Work
In-Reply-To: <48A2EE06.3050708@ebi.ac.uk>
References: <326ea8620807300515n521c4f1ekf9042983fbc136f5@mail.gmail.com>
    <326ea8620807300531o43c7d25emad126ac26f0c376a@mail.gmail.com>
    <9F3EABD6E3419B4C81F34EAABB4D40181BF50DD2@irsmsx501.ger.corp.intel.com>
    <326ea8620808091218t12a6f1e8xd5354e9b585dc8ff@mail.gmail.com>
    <489EC516.7040105@ebi.ac.uk>
    <326ea8620808130712m6acd991bj5d0bea5727b7c21a@mail.gmail.com>
    <48A2EE06.3050708@ebi.ac.uk>
Message-ID: <5ECA525B88314B48870E4AC72E3B9AF202469E86@EDUNIVMAIL05.ad.umassmed.edu>

We are running Rocks (4.3) on our cluster currently and the Bio roll has
EMBOSS installed (4.1.0). One peculiarity is that EMBOSS is installed on
every node locally, so updating databases (rebase, tfsites, etc.) must be
done on every node. Other than that, there could be some creative work
with distributed computing (distinct from MPI, which would also be
interesting). Having a mechanism to share the data would be a plus.

David
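
Karl's message earlier in this thread describes parallelising applications "on the basis of sequence only", that is, giving each worker its own share of the input sequences. A minimal Python sketch of that idea follows. It is not Karl's shell script, and it rests on several assumptions: the input is FASTA text, each record can be processed independently, worker threads on one machine stand in for cluster nodes, and the placeholder job gc_fraction stands in for running a real EMBOSS application on one chunk. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_fasta(text):
    """Split FASTA text into a list of (header, sequence) records."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def make_chunks(records, n_chunks):
    """Deal records round-robin into n_chunks lists (some may be empty)."""
    chunks = [[] for _ in range(n_chunks)]
    for index, record in enumerate(records):
        chunks[index % n_chunks].append(record)
    return chunks


def gc_fraction(chunk):
    """Placeholder job: stands in for running one application on one chunk."""
    results = {}
    for header, seq in chunk:
        seq = seq.upper()
        gc = seq.count("G") + seq.count("C")
        results[header] = gc / len(seq) if seq else 0.0
    return results


def run_parallel(text, n_workers=2, job=gc_fraction):
    """Run `job` on every chunk concurrently and merge the per-chunk results."""
    chunks = make_chunks(split_fasta(text), n_workers)
    merged = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for result in pool.map(job, chunks):
            merged.update(result)
    return merged


if __name__ == "__main__":
    example = ">s1\nGGCC\n>s2\nATAT\n>s3\nGA\nTC\n"
    print(run_parallel(example, n_workers=2))
```

Threads are enough when each job mostly waits on an external program; for CPU-bound work done in Python itself, a process pool or separate cluster jobs would be the more likely choice. The approach only applies to applications whose result for one sequence does not depend on the other sequences in the input.
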