[GSoC] GSoC 2014 queries and inputs
Fields, Christopher J
cjfields at illinois.edu
Wed Mar 19 22:38:09 EDT 2014
You probably need to look this over first:
http://www.open-bio.org/wiki/Google_Summer_of_Code#Before_you_apply
http://www.open-bio.org/wiki/Google_Summer_of_Code#When_you_apply
Then you would go here:
http://www.google-melange.com/gsoc/homepage/google/gsoc2014
You should start on this ASAP, as the deadline is Friday at noon.
(BTW, if Eric and Raoul are reading this, great job on organization!)
chris
On Mar 19, 2014, at 11:23 AM, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:
Is there a template for the application proposal?
On 19 March 2014 19:56, Fields, Christopher J <cjfields at illinois.edu> wrote:
On Mar 19, 2014, at 8:28 AM, Artem Tarasov <lomereiter at gmail.com> wrote:
On Tue, Mar 18, 2014 at 11:44 PM, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:
What's the difference between SAM and VCF?
SAM: mapping software aligns reads against the reference genome (and its reverse complement) and writes information about the best alignment of each read to a SAM/BAM file (which strand it aligned to, how it differs from the reference, and so on).
VCF: the records describe positions on the reference genome rather than reads, and each record says whether there is variability at that position. VCF files are produced from SAM files by considering the reads overlapping each position: if a statistically significant number of reads have a base different from the reference (or an insertion/deletion), this is probably a true mutation, which might have biological significance as well.
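To make the step from reads to variant positions concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how any real variant caller works: reads are given as hypothetical (start, sequence) pairs that are already aligned without gaps, indels are ignored, and a plain depth and fraction threshold stands in for a real statistical test. The function names, the thresholds, and the "chr1" label are made up for illustration.

```python
from collections import Counter


def call_variants(reference, reads, min_depth=4, min_alt_fraction=0.5):
    """Toy caller: pile up aligned reads and report positions that disagree
    with the reference.

    reads is a list of (start, sequence) pairs, 0-based and ungapped.
    The thresholds are arbitrary stand-ins for a real significance test.
    """
    # One base counter per reference position (a very simple "pileup").
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few overlapping reads to say anything
        ref_base = reference[pos]
        alt_counts = {b: n for b, n in counts.items() if b != ref_base}
        if not alt_counts:
            continue  # every read agrees with the reference
        alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
        if alt_n / depth >= min_alt_fraction:
            # VCF positions are 1-based, so shift by one.
            variants.append((pos + 1, ref_base, alt_base, depth, alt_n))
    return variants


def to_vcf_line(chrom, variant):
    """Format one call as a tab-separated line with the 8 fixed VCF columns:
    CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO ("." means missing)."""
    pos, ref_base, alt_base, depth, _alt_n = variant
    fields = [chrom, str(pos), ".", ref_base, alt_base, ".", ".", f"DP={depth}"]
    return "\t".join(fields)


reference = "ACGTACGTAC"
reads = [
    (0, "ACGTAC"),  # matches the reference
    (1, "CGTTCG"),  # T instead of A at 0-based position 4
    (2, "GTTCGT"),  # same mismatch
    (3, "TTCGTA"),  # same mismatch
    (4, "TCGTAC"),  # same mismatch
]

for variant in call_variants(reference, reads):
    print(to_vcf_line("chr1", variant))
```

Here four of the five reads covering one position carry T where the reference has A, so that single position is reported; the read-level detail (which read, which strand) is what a SAM file keeps and a VCF record drops.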
For JRuby, I'd recommend using Picard. No need to reinvent the wheel. Plus, you might also want to support BCF, the binary counterpart of VCF.
--
Artem
Yep, if you're planning on going through the JVM then Picard is nice and supports VCF (and BCF, it seems). No CRAM support, but there is this:
https://github.com/enasequence/cramtools
(section on picard integration)
chris
--
Thanks
Ujjwal