[Biopython-dev] GSoC genomic variant proposal

Lenna Peterson arklenna at gmail.com
Mon Apr 2 02:10:45 UTC 2012


Hi Brad,

Thank you so much for your suggestions. My initial evaluation of the
strengths of existing software has led me to strongly agree with your
recommendation to focus on the usability of the API.
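
As a rough illustration of the workflow described in the quoted message
below (filtering by zygosity, by metrics such as depth, and by
chromosomal region), here is a minimal sketch in plain Python. It is not
taken from the proposal or from any existing Biopython API: the names
Variant, parse_vcf, is_heterozygous, min_depth and in_region are
hypothetical, the three-record VCF is made-up example data, and only the
first sample column is read.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record type holding the fields used for filtering."""
    chrom: str
    pos: int
    ref: str
    alt: str
    info: Dict[str, str]
    genotype: str  # GT of the first sample, e.g. "0/1"


def parse_vcf(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per data line; header lines start with '#'."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # INFO is column 8: semicolon-separated key=value pairs or bare flags.
        info = {}
        for item in fields[7].split(";"):
            key, _, value = item.partition("=")
            info[key] = value
        # FORMAT is column 9 and the first sample is column 10;
        # GT is located by its position in FORMAT.
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        genotype = sample_values[format_keys.index("GT")]
        yield Variant(fields[0], int(fields[1]), fields[3], fields[4],
                      info, genotype)


def is_heterozygous(variant: Variant) -> bool:
    """True when the called alleles differ (phased or unphased)."""
    alleles = variant.genotype.replace("|", "/").split("/")
    return len(set(alleles)) > 1


def min_depth(variant: Variant, threshold: int) -> bool:
    """True when the INFO DP value is at least `threshold` (missing DP = 0)."""
    return int(variant.info.get("DP", "0")) >= threshold


def in_region(variant: Variant, chrom: str, start: int, end: int) -> bool:
    """True when the variant lies in chrom:start-end (1-based, inclusive)."""
    return variant.chrom == chrom and start <= variant.pos <= end


# Made-up example data, tab-separated as VCF requires.
EXAMPLE = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1
chr1\t100\t.\tA\tG\t50\tPASS\tDP=30;AF=0.5\tGT:DP\t0/1:30
chr1\t200\t.\tC\tT\t50\tPASS\tDP=8;AF=1.0\tGT:DP\t1/1:8
chr2\t150\t.\tG\tA\t50\tPASS\tDP=25;AF=0.5\tGT:DP\t0/1:25
"""

hits = [
    v for v in parse_vcf(EXAMPLE.splitlines())
    if is_heterozygous(v) and min_depth(v, 10) and in_region(v, "chr1", 1, 1000)
]
for v in hits:
    print(v.chrom, v.pos, v.ref, v.alt, v.genotype)
```

Keeping each filter as a small predicate on a record lets users combine
them freely in an ordinary list comprehension. For large files, region
subsetting would more plausibly be delegated to an external indexing
tool, as the quoted message suggests.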

I submit this draft of my proposal to the dev list for feedback:

https://docs.google.com/document/d/116FDQLtNnYWnm0kojad4YmQrM3cjOO8D2Vr82aW6xyA/edit


Lenna


On Sun, Apr 1, 2012 at 3:13 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>
> Lenna;
> Thanks for the introduction and glad to hear about your interest in the
> variant project. I'm looking forward to seeing your proposal.
>
> The workflow for the variant project involves a biologist querying a VCF
> or GVF file with variants from an experiment. They should be able to
> easily subset and filter by file components:
>
> - Variant type: Homozygous/Heterozygous variants
> - Metrics: depth, strand bias, allele frequency, etc.
> - Variants annotated in coding regions causing amino acid changes
>
> As well as rapid subsetting by chromosomal region.
>
> My suggestion would be to leverage external tools as much as possible to
> do file manipulation and focus on an API that lets users filter and
> extract information already contained in the INFO field.
>
> Hope this is helpful as a place to get started. We can provide
> additional feedback once you have your proposal ready. Thanks again,
> Brad
>
>> Hi all,
>>
>> I realize time is short, but I am still in the planning phase of my
>> GSoC proposal! I wanted to take a moment to formally introduce myself
>> to the dev list.
>>
>> I am affiliated with Purdue University, located in Indiana, USA and
>> best known for engineering (Neil Armstrong is a famous graduate). I
>> hold a bachelor of arts in biology from Mount Holyoke College in
>> Massachusetts. I have extensive wet lab experience with genetics; I'm
>> currently working in a lab genotyping mice (the research is intestinal
>> lipid metabolism). In August, I begin a PhD in interdisciplinary life
>> science at Purdue, and I anticipate that my research will fall
>> somewhere in the field of bioinformatics/computational biology. I hope
>> to use biopython extensively!
>>
>> In my spare time, other than programming, I enjoy ballroom dance,
>> science fiction novels, board games, and sailing.
>>
>> I've been programming for about 6 years and using python for 4; other
>> languages with which I'm familiar include Perl/CGI, HTML/CSS, PHP, SQL
>> (primarily MySQL and SQLite), and C++/C. I place a high value on
>> object oriented design and execution.
>>
>> I understand the basics of formal grammar and have some experience
>> with lex/flex as well as PLY (python lex/yacc). My work so far with
>> biopython has been on the CIF parsing module. One of my primary goals
>> for the genomic variants project would be to implement as much
>> polymorphism and abstraction as possible, for the benefit of both
>> users and future developers.
>>
>> I'm working on a proposal for the genomic variants project, and while
>> I understand the basics of molecular biology and genetics, I lack
>> firsthand experience with the type of workflow that would occur in the
>> context of genomic variants. If anyone can supply a few examples, it
>> would be greatly appreciated.
>>
>> I hope to have a proposal draft ready for feedback by Monday.
>>
>> Regards,
>>
>> Lenna Peterson
>> github.com/lennax
>> _______________________________________________
>> Biopython-dev mailing list
>> Biopython-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython-dev


