[Biojava-dev] [GSoC] Project Proposal
Nirmal Fernando
nirmal070125 at gmail.com
Wed Apr 6 16:58:10 UTC 2011
Hi Peter,
On Wed, Apr 6, 2011 at 9:28 PM, Peter Troshin <p.v.troshin at dundee.ac.uk> wrote:
> Hi Nirmal,
>
> Thanks for improving your proposal.
> Yes, this seems useful although it may be a little out of scope for this
> project. I think that calculating some other useful property of the
> peptide/protein or nucleic acid would have been a better fit.
>
>
I see! What do you think about the following calculations:
- Calculate the volume of an amino acid sequence
- Calculate amino acid composition, e.g. for ACSGGS:
    - Alanine  A 16.67%
    - Cysteine C 16.67%
    - Glycine  G 33.33%
    - Serine   S 33.33%
- Calculate the atomic composition of a protein
- Sequence word count: count the number of occurrences of a word in a
  sequence, e.g.
    Sequence word count('GCTATAACGTATATATAT','TATA') = 3
- Count n-mers in a nucleotide sequence, e.g. for AAGT:
    - dimer counts: AA - 1, AG - 1, GT - 1, and all others 0
Would these be too simple?
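To make three of these ideas concrete, here is a rough sketch in plain Java. It is only an illustration under some assumptions: it works on plain strings rather than BioJava3 sequence classes, and the class and method names (SequenceStats, composition, wordCount, nmerCounts) are made up for this example. The word count is non-overlapping, scanning left to right, which is what gives 3 for the TATA example above; counting overlapping matches would give 4 for the same input. The n-mer count uses an overlapping sliding window, matching the dimer example.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of BioJava.
public class SequenceStats {

    // Percentage of each residue in the sequence, keyed by one-letter code.
    public static Map<Character, Double> composition(String seq) {
        Map<Character, Double> result = new TreeMap<>();
        for (char c : seq.toCharArray()) {
            result.merge(c, 1.0, Double::sum);   // raw counts first
        }
        for (Map.Entry<Character, Double> e : result.entrySet()) {
            e.setValue(100.0 * e.getValue() / seq.length());
        }
        return result;
    }

    // Non-overlapping occurrences of word in seq, scanning left to right.
    public static int wordCount(String seq, String word) {
        if (word.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = seq.indexOf(word);
        while (from >= 0) {
            count++;
            from = seq.indexOf(word, from + word.length());
        }
        return count;
    }

    // Overlapping n-mer counts (sliding window). Only n-mers that
    // actually occur are stored; all others are implicitly 0.
    public static Map<String, Integer> nmerCounts(String seq, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(composition("ACSGGS"));
        System.out.println(wordCount("GCTATAACGTATATATAT", "TATA")); // prints 3
        System.out.println(nmerCounts("AAGT", 2)); // prints {AA=1, AG=1, GT=1}
    }
}
```

A real implementation would probably validate the alphabet and decide explicitly between overlapping and non-overlapping word counts.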
Thanks.
> Regards,
> Peter
>
>
>
>
> On 06/04/2011 15:57, Nirmal Fernando wrote:
>
>> Hi,
>>
>> In addition to the functionalities provided in my proposal, I would like
>> to build a tool like http://gcua.schoedl.de/ which will be used to
>> display the codon quality in codon usage frequency values.
>>
>> It would be nice to get the feedback of the community regarding the
>> importance of a tool like this to BioJava3.
>>
>> Thanks.
>>
>>
>> On Tue, Apr 5, 2011 at 9:34 PM, Nirmal Fernando <nirmal070125 at gmail.com> wrote:
>>
>> Hi Peter,
>>
>> On Tue, Apr 5, 2011 at 9:18 PM, Peter Troshin
>> <p.v.troshin at dundee.ac.uk> wrote:
>> >
>> > Hi Nirmal,
>> >
>> > First of all, thanks for the proposal; it looks good.
>> > However, I think that one of the benefits of my project idea is
>> > that it lets you implement a few other methods that are of
>> > interest to you. It is a pity that you did not use this
>> > opportunity. I strongly encourage you to use your knowledge and to
>> > look at the other properties that you can implement for the
>> > benefit of the community. Otherwise it looks like you are not
>> > terribly interested in Bioinformatics!
>>
>> Sorry for the disappointment! This week is a bit busy for me, with a
>> few events at the University, which is why I didn't get much time
>> to look for other methods. But I'll try my best to research and
>> propose some other methods which will benefit the community.
>>
>> >
>> > Also, I think that the best method of learning BioJava is trying
>> > it. So I'd put in the project plan that you will write test cases
>> > to check out the parts of BioJava that you will be using. Apart
>> > from helping you learn it in depth, it will also help to ensure
>> > that the BioJava code behaves.
>>
>> Thanks for the tip! :)
>>
>> >
>> > Regards,
>> > Peter
>> >
>> > On 05/04/2011 14:40, Nirmal Fernando wrote:
>> >>
>> >> Hi All,
>> >>
>> >> I have prepared my GSoC proposal for BioJava [1]. I would highly
>> >> appreciate your valuable feedback.
>> >>
>> >> Thanks.
>> >>
>> >> [1]
>> >>
>> >>
>> >> Google Summer of Code 2011 - Project Proposal
>> >>
>> >> Organization
>> >>
>> >>
>> >>
>> >> *Open Bioinformatics Foundation- BioJava*
>> >>
>> >> Project
>> >>
>> >>
>> >>
>> >> *Calculation of Physicochemical Properties of Amino Acids*
>> >>
>> >> Student Name
>> >>
>> >>
>> >>
>> >> C. S. Nirmal J. Fernando.
>> >>
>> >> E-mail
>> >>
>> >>
>> >>
>> >> nirmal070125 at gmail.com
>>
>> >>
>> >> IM
>> >>
>> >>
>> >>
>> >> nirmal070125 (Google Talk)
>> >>
>> >> nirmalfdo (IRC – freenode.net)
>> >>
>> >> Address
>> >>
>> >>
>> >>
>> >> 47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.
>> >>
>> >> Mobile No.
>> >>
>> >>
>> >>
>> >> +94715779733
>> >>
>> >> *Why I am interested?*
>> >>
>> >>
>> >> I have recently finished a course module on Bioinformatics and
>> >> have a basic understanding of bioinformatics-related algorithms,
>> >> which made me interested in this area of computer science.
>> >>
>> >>
>> >> *Why I am well-suited?*
>> >>
>> >>
>> >> I participated in GSoC 2010 for the Apache Derby (RDBMS in Java)
>> >> project and finished the project successfully. My sound Java
>> >> knowledge, algorithmic knowledge of bioinformatics, and experience
>> >> with concurrent programming make me comfortable with, and a good
>> >> match for, this project.
>> >>
>> >>
>> >> “Nirmal joined the Apache Derby community as a *Google Summer
>> >> of Code* student for the summer of 2010. In this role, Nirmal
>> >> wrote a very useful tool called PlanExporter. This tool will help
>> >> users of the *Apache Derby* database understand and fix
>> >> performance issues in their data-rich applications. Nirmal fit
>> >> well into our open-source community, collaborating with other
>> >> engineers, proceeding incrementally, and seeking and taking advice
>> >> cheerfully. Nirmal's contributions to Apache Derby are highly
>> >> respected.”
>> >> - *Richard Hillegas*
>> >> <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>> >> /Senior Software Engineer, Sun Microsystems./
>> >>
>> >>
>> >> “Nirmal's work on the Derby PlanExporter tool as part of the
>> >> Google 2010 Summer of Code was clear, well-executed and
>> >> successful. Furthermore, every member of the Derby team that I've
>> >> spoken to has been pleased with Nirmal's contributions to the
>> >> community and we look forward to having Nirmal continuing to work
>> >> with Derby in the future.”
>> >> - *Bryan Pendleton*
>> >> <http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>> >> /Committer, Apache Derby./
>> >>
>> >>
>> >> *Programming Experiences and Skills*
>> >>
>> >> · Completed the “short coding exercise” (all three goals) given
>> >> by the mentor.
>> >>
>> >> · Final Year Project: SeMap is our four-member final year project,
>> >> which I lead. The objective is to develop a superior
>> >> framework for mapping English-language semantic dependency
>> >> relationships to sets of semantic frames, with reasonable accuracy
>> >> for complex sentences, together with an integrated artificial
>> >> intelligence component, based on statistical linguistics, to allow
>> >> automatic extensibility. We are working under OpenCog.org, a FOSS
>> >> foundation, under the supervision of Dr. Ben Goertzel.
>> >> Technologies: [Java, Drools]
>> >>
>> >> *Contributions to Open Source world*
>> >>
>> >>
>> >> · Implemented the PlanExporter tool, which allows Apache Derby users
>> >> to view and understand the query plan followed by the optimizer.
>> >> Technologies: [Java, XML, XSLT, HTML and CSS] (Google Summer of
>> >> Code – 2010 project)
>> >>
>> >> · Solved many issues in Apache Derby:
>> >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
>> >>
>> >> · Continuing to work on Apache Derby even after the Summer of Code.
>> >>
>> >>
>> >> *Project Rationale*
>> >>
>> >>
>> >> The calculation of simple physicochemical properties for
>> >> biopolymers is an important tool in the arsenal of the molecular
>> >> biologist. Theoretically calculated quantities like extinction
>> >> coefficients, isoelectric points, hydrophobicities and instability
>> >> indices are useful guides as to how a molecule behaves in an
>> >> experiment. Many tools for calculating these properties exist,
>> >> including widely used open-source implementations in EMBOSS and
>> >> BioPerl, but only some are currently available in BioJava3. The
>> >> aim of this project is to port, or produce new implementations of,
>> >> standard algorithms for a range of calculations within BioJava3.
>> >>
>> >> *Project Scope*
>> >>
>> >> The project will focus primarily on developing the following functionality:
>> >>
>> >> 1. Finding molecular weight of a sequence
>> >> 2. Finding extinction coefficient of a protein
>> >> 3. Finding instability index of a protein
>> >> 4. Finding aliphatic index of a protein
>> >> 5. Finding GRAVY (Grand Average of Hydropathy) value of a peptide
>> >> or a protein
>> >> 6. Finding isoelectric point of a sequence
>> >> 7. Finding number of amino acids in a protein (His, Met, Cys)
>> >>
>> >>
>> >> *Project Plan*
>> >>
>> >> *April 20 - May 10*
>> >>
>> >> * Read up on the BioJava3 design:
>> >> http://biojava.org/wiki/BioJava3_Design
>> >> * Read up on the BioJava3 data model:
>> >> http://www.biojava.org/wiki/BioJava3_Proposal
>> >> * Get an understanding of how each BioJava3 module works and
>> >> what functionality it provides.
>> >> * Find and read up on algorithms that provide the functionality
>> >> listed above.
>> >> * Identify which existing methods and tools in BioJava3 can be
>> >> reused.
>> >>
>> >> *May 11 - May 24*
>> >>
>> >> * Implement functions to calculate the molecular weight of a
>> >> sequence and the extinction coefficient of a protein, using
>> >> multiple threads where possible.
>> >> * Implement functional test cases using JUnit.
>> >> * Develop high-level documentation for end users.
>> >>
>> >> *May 24 - July 10*
>> >>
>> >> * Preparing for the mid-term evaluation of the project.
>> >>
>> >>
>> >> *July 12 - August 15*
>> >>
>> >> * Implement functions to calculate the following, using multiple
>> >> threads where possible:
>> >>
>> >> o Instability index of a protein
>> >> o Aliphatic index of a protein
>> >> o GRAVY (Grand Average of Hydropathy) value for a peptide or
>> >> a protein
>> >> o Isoelectric point of a sequence
>> >> o Number of amino acids in a protein (His, Met, Cys)
>> >>
>> >> * Implement functional test cases using JUnit.
>> >> * Update the high-level documentation for end users.
>> >>
>> >> *August 16 - August 22*
>> >>
>> >> * Wrap up the work and polish the code.
>> >> * Create the Javadoc API documentation.
>> >> * Prepare for the final evaluation.
>> >>
>> >> *August 26*
>> >>
>> >> * Final evaluation deadline.
>> >>
>> >> *Project Deliverables*
>> >>
>> >> · Java library with the functionality listed above.
>> >>
>> >> · Command-line executables.
>> >>
>> >> · Javadoc API documentation for the library.
>> >>
>> >> · Functional test cases.
>> >>
>> >> · High-level end-user documentation.
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Nirmal
>> >>
>> >> C.S.Nirmal J. Fernando
>> >> Department of Computer Science & Engineering,
>> >> Faculty of Engineering,
>> >> University of Moratuwa,
>> >> Sri Lanka.
>> >>
>> >> Blog: http://nirmalfdo.blogspot.com/
>> >>
>> >
>> >
>>
>>
>
--
Best Regards,
Nirmal
C.S.Nirmal J. Fernando
Department of Computer Science & Engineering,
Faculty of Engineering,
University of Moratuwa,
Sri Lanka.
Blog: http://nirmalfdo.blogspot.com/