[Biojava-dev] [GSoC] Project Proposal

Peter Troshin p.v.troshin at dundee.ac.uk
Wed Apr 6 18:00:29 UTC 2011


These are definitely in line with the proposal. I think that, despite 
their simplicity, they may be useful. However, you should not implement 
things just because you can; I'd suggest doing so only if you find that 
they are useful. So do you know what atomic composition might be useful for?
Maybe it is worth thinking about what plots you can generate, e.g. a 
hydropathy plot?
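
To illustrate the hydropathy plot idea, here is a minimal sketch in plain
Java. It is not BioJava API: the class and method names are made up for this
example. It averages Kyte-Doolittle hydropathy values over a sliding window,
and the resulting array is what one would plot against residue position.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
class HydropathyProfile {
    // Kyte-Doolittle hydropathy values, keyed by one-letter residue code.
    private static final Map<Character, Double> KD = new HashMap<>();
    static {
        String residues = "ARNDCQEGHILKMFPSTWYV";
        double[] values = {1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
                           3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2};
        for (int i = 0; i < values.length; i++) {
            KD.put(residues.charAt(i), values[i]);
        }
    }

    private static double score(char residue) {
        Double v = KD.get(Character.toUpperCase(residue));
        if (v == null) {
            throw new IllegalArgumentException("Unknown residue: " + residue);
        }
        return v;
    }

    // Returns one average per window position (seq.length() - window + 1 values).
    static double[] profile(String seq, int window) {
        if (window < 1 || window > seq.length()) {
            throw new IllegalArgumentException("Window must be between 1 and the sequence length");
        }
        double[] out = new double[seq.length() - window + 1];
        double sum = 0.0;
        for (int i = 0; i < seq.length(); i++) {
            sum += score(seq.charAt(i));               // residue entering the window
            if (i >= window) {
                sum -= score(seq.charAt(i - window));  // residue leaving the window
            }
            if (i >= window - 1) {
                out[i - window + 1] = sum / window;
            }
        }
        return out;
    }
}
```

For example, HydropathyProfile.profile("AILV", 2) gives approximately 3.15,
4.15 and 4.0. With the window set to the full sequence length, the single
value returned is the GRAVY score listed in the proposal.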

Regards,
Peter


On 06/04/2011 17:58, Nirmal Fernando wrote:
> Hi Peter,
>
> On Wed, Apr 6, 2011 at 9:28 PM, Peter Troshin
> <p.v.troshin at dundee.ac.uk> wrote:
>
>     Hi Nirmal,
>
>     Thanks for improving your proposal.
>     Yes, this seems useful although it may be a little out of scope
>     for this project. I think that calculating some other useful
>     property of the peptide/protein or nucleic acid would have been a
>     better fit.
>
>
> I see! What do you think about the following calculations:
>
>     * Calculate volume of an amino acid sequence
>     * Calculate amino acid composition, e.g. ACSGGS:
>           o Alanine A 16.67%
>           o Cysteine C 16.67%
>           o Glycine G 33.33%
>           o Serine S 33.33%
>     * Calculate atomic composition of a protein
>     * Sequence word count: count the number of occurrences of a word
>       in a sequence, e.g.
>           o Sequence word count('GCTATAACGTATATATAT','TATA') = 3
>             (non-overlapping matches)
>     * Count n-mers in a nucleotide sequence, e.g. AAGT:
>           o dimer counts: AA - 1, AG - 1, GT - 1 and all others 0
>
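To make the counting examples above concrete, here is a minimal sketch in
plain Java. The class and method names are hypothetical and nothing here is
BioJava API. Note that the word count depends on whether overlapping matches
are counted: for 'TATA' in 'GCTATAACGTATATATAT' there are 3 non-overlapping
matches but 4 overlapping ones.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of BioJava.
class SequenceCounts {
    // Percentage of each symbol in the sequence, e.g. ACSGGS -> G = 33.33...
    static Map<Character, Double> compositionPercent(String seq) {
        Map<Character, Double> pct = new TreeMap<>();
        for (char c : seq.toCharArray()) {
            pct.merge(c, 1.0, Double::sum);  // raw counts first
        }
        pct.replaceAll((symbol, count) -> 100.0 * count / seq.length());
        return pct;
    }

    // Number of occurrences of word in seq, with or without overlaps.
    static int countWord(String seq, String word, boolean overlapping) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("Word must not be empty");
        }
        int step = overlapping ? 1 : word.length();
        int count = 0;
        int from = seq.indexOf(word);
        while (from >= 0) {
            count++;
            from = seq.indexOf(word, from + step);
        }
        return count;
    }

    // Counts of every n-mer that occurs; n-mers absent from the map have count 0.
    static Map<String, Integer> countNmers(String seq, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + n <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }
}
```

SequenceCounts.countNmers("AAGT", 2) returns AA=1, AG=1, GT=1, which matches
the dimer example in the list.
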
> Would these be too simple?
>
> Thanks.
>
>     Regards,
>     Peter
>
>
>
>
>     On 06/04/2011 15:57, Nirmal Fernando wrote:
>
>         Hi,
>
>         In addition to the functionality described in my proposal, I
>         would like to build a tool like http://gcua.schoedl.de/ that
>         displays codon quality in terms of codon usage frequency
>         values.
>
>         It would be nice to get feedback from the community on how
>         important a tool like this would be for BioJava3.
>
>         Thanks.
>
>
>         On Tue, Apr 5, 2011 at 9:34 PM, Nirmal Fernando
>         <nirmal070125 at gmail.com> wrote:
>
>            Hi Peter,
>
>            On Tue, Apr 5, 2011 at 9:18 PM, Peter Troshin
>            <p.v.troshin at dundee.ac.uk> wrote:
>         >
>         > Hi Nirmal,
>         >
>         > First of all, thanks for the proposal; it looks good.
>         > However, I think that one of the benefits of my project idea is
>         > that it lets you implement a few other methods that are of
>         > interest to you. It is a pity that you did not use this
>         > opportunity. I strongly encourage you to use your knowledge and
>         > to look at the other properties that you can implement for the
>         > benefit of the community. Otherwise it looks like you are not
>         > terribly interested in Bioinformatics!
>
>            Sorry for the disappointment! This week is a bit busy for me,
>            with a few events at the University, which is why I didn't get
>            much time to look for other methods. But I'll try my best to
>            research and propose some other methods which will benefit the
>            community.
>
>         >
>         > Also, I think that the best method of learning BioJava is trying
>         > it. So I'd put in the project plan that you will write test cases
>         > to check out the parts of BioJava that you will be using. Apart
>         > from helping you learn it in depth, it will also help to ensure
>         > that the BioJava code behaves.
>
>            Thanks for the tip! :)
>
>         >
>         > Regards,
>         > Peter
>         >
>         > On 05/04/2011 14:40, Nirmal Fernando wrote:
>         >>
>         >> Hi All,
>         >>
>         >> I have prepared my GSoC proposal for BioJava [1]. I would highly
>         >> appreciate your feedback.
>         >>
>         >> Thanks.
>         >>
>         >> [1]
>         >>
>         >>
>         >>  Google Summer of Code 2011 - Project Proposal
>         >>
>         >> Organization
>         >>
>         >>
>         >>
>         >> *Open Bioinformatics Foundation- BioJava*
>         >>
>         >> Project
>         >>
>         >>
>         >>
>         >> *Calculation of Physicochemical Properties of Amino Acids*
>         >>
>         >> Student Name
>         >>
>         >>
>         >>
>         >> C. S. Nirmal J. Fernando.
>         >>
>         >> E-mail
>         >>
>         >>
>         >>
>         >> nirmal070125 at gmail.com
>
>         >>
>         >> IM
>         >>
>         >>
>         >>
>         >> nirmal070125 (Google Talk)
>         >>
>         >> nirmalfdo (IRC – freenode.net <http://freenode.net>)
>         >>
>         >> Address
>         >>
>         >>
>         >>
>         >> 47, Keels Housing Scheme, Pinwatte, Panadura, Sri Lanka.
>         >>
>         >> Mobile No.
>         >>
>         >>
>         >>
>         >> +94715779733
>         >>
>         >> *Why am I interested?*
>         >>
>         >> I have recently finished a course module on bioinformatics and
>         >> have a basic understanding of bioinformatics-related algorithms,
>         >> which made me interested in this area of computer science.
>         >>
>         >>
>         >> *Why am I well-suited?*
>         >>
>         >>
>         >> I participated in GSoC 2010 for the Apache Derby (RDBMS in Java)
>         >> project and completed it successfully. My sound Java knowledge,
>         >> algorithmic knowledge of bioinformatics, and experience with
>         >> concurrent programming make me a comfortable fit for this
>         >> project.
>         >>
>         >>
>         >> “Nirmal joined the Apache Derby community as a Google Summer of
>         >> Code student for the summer of 2010. In this role, Nirmal wrote a
>         >> very useful tool called PlanExporter. This tool will help users
>         >> of the Apache Derby database understand and fix performance
>         >> issues in their data-rich applications. Nirmal fit well into our
>         >> open-source community, collaborating with other engineers,
>         >> proceeding incrementally, and seeking and taking advice
>         >> cheerfully. Nirmal's contributions to Apache Derby are highly
>         >> respected.” - *Richard Hillegas*
>         >> <http://www.linkedin.com/profile/view?id=31609548&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>         >> Senior Software Engineer, Sun Microsystems.
>         >>
>         >>
>         >> “Nirmal's work on the Derby PlanExporter tool as part of the
>         >> Google 2010 Summer of Code was clear, well-executed and
>         >> successful. Furthermore, every member of the Derby team that I've
>         >> spoken to has been pleased with Nirmal's contributions to the
>         >> community and we look forward to having Nirmal continuing to work
>         >> with Derby in the future.” - *Bryan Pendleton*
>         >> <http://www.linkedin.com/profile/view?id=492758&noCreateProposal=true&goback=%2Enpe_*1_*1_*1_*1>,
>         >> Committer, Apache Derby.
>         >>
>         >>
>         >> *Programming Experiences and Skills*
>         >>
>         >> ·Completed the “short coding exercise” (all three goals) given
>         >> by the mentor.
>         >>
>         >> ·Final Year Project: SeMap is our four-member final year project,
>         >> which I lead. The objective is to develop a superior framework
>         >> for mapping English-language semantic dependency relationships to
>         >> sets of semantic frames with reasonable accuracy for complex
>         >> sentences, with an integrated statistical-linguistics-based
>         >> artificial intelligence component to allow automatic
>         >> extensibility. We are working under OpenCog.org, a FOSS
>         >> foundation, under the supervision of Dr. Ben Goertzel.
>         >> Technologies: [Java, Drools]
>         >>
>         >> *Contributions to the Open Source World*
>         >>
>         >> ·Implemented the PlanExporter tool, which allows Apache Derby
>         >> users to view and understand the query plan followed by the
>         >> optimizer. Technologies: [Java, XML, XSLT, HTML and CSS] (Google
>         >> Summer of Code 2010 project)
>         >>
>         >> ·Solved many issues in Apache Derby
>         >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
>         >>
>         >> ·Continuing to work on Apache Derby even after the Summer of
>         >> Code.
>         >>
>         >> *Project Rationale*
>         >>
>         >> The calculation of simple physicochemical properties for
>         >> biopolymers is an important tool in the arsenal of the molecular
>         >> biologist. Theoretically calculated quantities like extinction
>         >> coefficients, isoelectric points, hydrophobicities and
>         >> instability indices are useful guides as to how a molecule
>         >> behaves in an experiment. Many tools for calculating these
>         >> properties exist, including widely used open-source
>         >> implementations in EMBOSS and BioPerl, but only some are
>         >> currently available in BioJava3. The aim of this project is to
>         >> port or produce new implementations of standard algorithms for a
>         >> range of calculations within BioJava3.
>         >>
>         >> *Project Scope*
>         >>
>         >> The project will primarily focus on developing the following
>         >> functionality:
>         >>
>         >>   1. Finding molecular weight of a sequence
>         >>   2. Finding extinction coefficient of a protein
>         >>   3. Finding instability index of a protein
>         >>   4. Finding aliphatic index of a protein
>         >>   5. Finding GRAVY (Grand Average of Hydropathy) value of a
>         >>      peptide or a protein
>         >>   6. Finding isoelectric point of a sequence
>         >>   7. Finding the number of specific amino acids in a protein
>         >>      (His, Met, Cys)
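
Two of the items in this list are simple enough to sketch directly. The
following is a minimal illustration in plain Java, not the proposal's
implementation and not BioJava API; the class and method names are
hypothetical. It assumes the usual definition of the aliphatic index (Ikai,
1980), AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), where X is the
mole percent of each residue, and it counts residues by one-letter code.

```java
// Hypothetical helper, not part of BioJava.
class SimpleProteinIndices {
    // Number of times a residue (one-letter code, case-insensitive) occurs.
    static int countResidue(String seq, char residue) {
        char target = Character.toUpperCase(residue);
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (Character.toUpperCase(seq.charAt(i)) == target) {
                n++;
            }
        }
        return n;
    }

    // Mole percent of a residue: 100 * count / sequence length.
    private static double molePercent(String seq, char residue) {
        return 100.0 * countResidue(seq, residue) / seq.length();
    }

    // Aliphatic index, assuming the coefficients 2.9 (Val) and 3.9 (Ile, Leu).
    static double aliphaticIndex(String seq) {
        if (seq.isEmpty()) {
            throw new IllegalArgumentException("Sequence must not be empty");
        }
        return molePercent(seq, 'A')
                + 2.9 * molePercent(seq, 'V')
                + 3.9 * (molePercent(seq, 'I') + molePercent(seq, 'L'));
    }
}
```

For the toy sequence "AILV", each residue is 25 mole percent, so the index is
25 + 2.9 * 25 + 3.9 * 50 = 292.5. countResidue("MHCCK", 'C') returns 2.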
>         >>
>         >>
>         >> *Project Plan*
>         >>
>         >> *April 20 - May 10*
>         >>
>         >>    * Read on BioJava3 design
>         >>      http://biojava.org/wiki/BioJava3_Design
>         >>    * Read on BioJava3 data model
>         >>      http://www.biojava.org/wiki/BioJava3_Proposal
>         >>    * Get an understanding of how each BioJava3 module works and
>         >>      what functionality it provides.
>         >>    * Find and read up on algorithms that provide the
>         >>      above-mentioned functionality.
>         >>    * Identify where existing methods and tools in BioJava3 can
>         >>      be used.
>         >>
>         >> *May 11 - May 24*
>         >>
>         >>    * Implement functions to calculate the molecular weight of a
>         >>      sequence and the extinction coefficient of a protein, using
>         >>      multiple threads where possible.
>         >>    * Implement functional test cases using JUnit.
>         >>    * Develop high level documentation for end users.
>         >>
>         >> *May 24 - July 10*
>         >>
>         >>    * Preparing for the mid-term evaluation of the project.
>         >>
>         >>
>         >> *July 12 - August 15*
>         >>
>         >>    * Implement functions to calculate the following, using
>         >>      multiple threads where possible:
>         >>
>         >>          o Instability index of a protein
>         >>          o Aliphatic index of a protein
>         >>          o GRAVY (Grand Average of Hydropathy) value for a
>         >>            peptide or a protein
>         >>          o Isoelectric point of a sequence
>         >>          o Number of specific amino acids in a protein (His,
>         >>            Met, Cys)
>         >>
>         >>    * Implement functional test cases using JUnit.
>         >>    * Update the high level documentation for end users.
>         >>
>         >> *August 16 - August 22*
>         >>
>         >>    * Wrap up the work done and polish the code.
>         >>    * Create the Javadoc API documentation.
>         >>    * Prepare for the final evaluation.
>         >>
>         >> *August 26*
>         >>
>         >>    * Final evaluation deadline.
>         >>
>         >> *Project Deliverables*
>         >>
>         >> ·Java library with the above-mentioned functionality.
>         >>
>         >> ·Command line executables.
>         >>
>         >> ·Javadoc API documentation for the library.
>         >>
>         >> ·Functional test cases.
>         >>
>         >> ·High level end user documentation.
>         >>
>         >>
>         >> --
>         >> Best Regards,
>         >> Nirmal
>         >>
>         >> C.S.Nirmal J. Fernando
>         >> Department of Computer Science & Engineering,
>         >> Faculty of Engineering,
>         >> University of Moratuwa,
>         >> Sri Lanka.
>         >>
>         >> Blog: http://nirmalfdo.blogspot.com/
>         >>
>         >
>         >
>
>
>
>
>
>
>
>
>
>
>
>
>



