[Biopython-dev] Mentor for GSoC

Adam Kurkiewicz adam at kurkiewicz.pl
Mon Feb 13 15:37:20 UTC 2017


Hi Everybody,

I'm looking for a mentor for GSoC. I already have one small contribution
to Biopython:

https://github.com/biopython/biopython/commit/6c14a26dda32ad6d3147036a01f3d0c4d306c647

and some other programming/scientific experience: 

http://cv.adam.kurkiewicz.pl/

I attach my project proposal.

If you are interested in mentoring me, please let me know. I should be
quite self-sufficient, but I am always open to remarks about my code and
blog posts.

Cheers,

Adam

<PROJECT PROPOSAL>

### Microarray Analysis in Biopython

#### Rationale

The DNA microarray (also referred to as a DNA chip) is a commonly used
device in genomics and transcriptomics research, with applications in
medical diagnosis, drug testing and discovery, detection of doping, and
others. Currently Biopython lacks support for most tasks required for
microarray data processing, and support for some tasks is incomplete.

Biopython should support the following data analysis tasks for some of
the file formats from at least two manufacturers of microarray
technology (Illumina and Affymetrix).

The tasks that need to be supported are:

1. File format-specific parsing and data extraction.
2. Quality Assurance and Data Normalization.
3. Gene/exon aggregation and annotation.
4. Statistical inference of differential expression, alternative
splicing, or other biologically significant effects.
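As an illustration of what task 2 could involve, here is a minimal sketch of quantile normalization, one commonly used normalization method for microarray intensities. The proposal does not commit to this method; the function name `quantile_normalize` and the list-of-lists input are assumptions made for the example, and ties are simply broken by probe position.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length intensity lists (one list per array).

    Hypothetical helper for illustration only; not part of Biopython.
    """
    if not samples:
        return []
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Sort each sample, then average across samples rank by rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(col) / len(samples) for col in zip(*sorted_samples)]

    normalized = []
    for s in samples:
        # order[k] is the index of the k-th smallest value in this sample
        # (ties are broken by position because sorted() is stable).
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After normalization every array shares the same set of values, so differences between arrays reflect probe ranks rather than overall intensity scale.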

#### Approach

A pilot effort to implement task 1 (file format-specific parsing and
data extraction) for Affymetrix CEL file version 4 showed that such a
task can be accomplished within a week of development. Tasks 2, 3 and 4
are well studied and understood academically, with algorithm
descriptions and code often available from various research groups. The
focus of the project will be on choosing the most tractable and easily
implementable of these algorithms and implementing them in Biopython.
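To give a flavour of task 1, the sketch below reads only the first fields of a CEL version 4 file. It is not the pilot implementation mentioned above. It assumes the commonly documented layout in which the file starts with five little-endian 32-bit integers (magic number 64, version 4, columns, rows, cell count); the function name `read_cel_v4_preamble` is hypothetical.

```python
import struct

CEL_V4_MAGIC = 64  # assumed magic number for CEL version 4 files


def read_cel_v4_preamble(data):
    """Parse the leading integers of a CEL v4 file held in a bytes object.

    Hypothetical helper for illustration only; the field layout is an
    assumption based on the commonly documented CEL v4 format.
    """
    # Five little-endian signed 32-bit integers at offset 0.
    magic, version, cols, rows, n_cells = struct.unpack_from("<5i", data, 0)
    if magic != CEL_V4_MAGIC or version != 4:
        raise ValueError("not a CEL version 4 file")
    if n_cells != cols * rows:
        raise ValueError("cell count does not match columns * rows")
    return {"columns": cols, "rows": rows, "cells": n_cells}
```

A real parser would go on to read the header strings and the per-cell intensity records that follow these fields.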

#### Languages and skills

* good knowledge of Python
* good software engineering & software design skills
* basic understanding of biology/bioinformatics

#### Project links
https://github.com/biopython/biopython

#### Difficulty
medium


#### Mentors
<TODO> Hopefully one mentor could be one of my supervisors. I'm not sure
about the Biopython mentor </TODO>

</PROJECT PROPOSAL>

Adam

