[Biopython-dev] [Biopython] Update: call for Google Summer of Code project ideas

Peter Cock p.j.a.cock at googlemail.com
Fri Mar 2 11:53:54 UTC 2012


On Fri, Mar 2, 2012 at 1:43 AM, Brad Chapman <chapmanb at 50mail.com> wrote:
>
> Peter and Eric;
>
>> > Variants
>> > --------
>> > Synthesizing the above, we have a GSoC project that looks like:
>> >
>> > - Help merge PyVCF into Biopython (w/ James's support -- I
>> >   don't mean to volunteer him for this in absentia)?
>> > - Write a GVF parser that emits the same object type as
>> >   PyVCF, potentially also using existing GFF code
>> > - Time permitting, look into blocked gzip support for VCF
>> >   (BCF), also looking at SAM/BAM for inspiration and
>> >   reusable code.
>>
>> Sounds interesting - who might be willing to mentor it?
>
> This is a great idea. Reece and I proposed a variant project last year,
> and Reece has already e-mailed me this year about trying again. He was
> planning on re-vamping the description on the GSoC page for 2012:
>
> http://biopython.org/wiki/Google_Summer_of_Code

Excellent - can you and/or Reece polish that wiki text today? We
don't need it to be perfect or that detailed at this stage, do we?

> so hopefully we can incorporate several aspects of this. From my
> experience I would prioritize BCF/Tabix files since you see a lot of
> those in practice.

Right. It sounds like my BGZF code (blocked gzip) should be
helpful for BCF as well.

> For GVF we could certainly leverage the GFF parser since it is GFF with
> variant keywords. Practically, I would love to settle on one format for
> this and VCF seems to have the most tool uptake so far.

That could go in as a potential aim too then.

>> >> SearchIO?
>> >> ---------
>
> +1 for this as well. Great ideas,
> Brad

I've started writing that up on the wiki page now.

Peter



