[Biopython] Google Summer of Code Project: SearchIO in Biopython

Wibowo Arindrarto w.arindrarto at gmail.com
Sat Apr 28 08:08:35 EDT 2012


Hello everyone,

This is Wibowo Arindrarto (or Bow, for short), one of the Google Summer of
Code students who will work on Biopython over this summer.

I will be working with Peter to add support for parsing search outputs from
programs like BLAST and HMMER to Biopython, so that it's easier to extract
information from their outputs. Having used some of these programs quite a
lot myself, I'm really looking forward to implementing the feature.
However, I do understand that it won't be just me who will use the module,
but also many other Biopython users. So if you are interested in offering
input or critiques along the way, feel free to do so :).

The official coding period starts in about a month from now. Until then, I
will be doing all the preparatory work required so that coding will proceed
as smoothly as possible. This will include preparing the test cases and
the SearchIO attribute / object naming convention, as well as
discussing anything related to the proposed implementation.

Finally, here are some links related to the project that might interest you.

1. My main Biopython branch for development:
https://github.com/bow/biopython/tree/searchio. Since I will be building on
top of Peter's SearchIO branch (
https://github.com/peterjc/biopython/tree/search-io-test), right now it
only contains Peter's branch rebased against the latest master.

2. My GSoC proposal, which outlines my plans and timeline for the project:
http://bit.ly/searchio-proposal

3. The proposed SearchIO naming convention (not 100% complete as of now,
but will be filled in along the way): http://bit.ly/searchio-terms. One of the
main goals of the project is to implement a common interface for BLAST et
al., which requires SearchIO to have common attribute names that refer to
the different search output attributes. The link contains my proposed naming
convention, which is still very open to change and discussion. Feel free to
comment on the document and add your own ideas.

4. My blog, in which I will write weekly posts about the project's
progress: http://bow.web.id/blog

5. An extra repo for all other auxiliary files and scripts that don't go
into Biopython's code: https://github.com/bow/gsoc.
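
To make the idea in item 3 a bit more concrete, here is a rough Python
sketch of what a common interface could look like. All names in it (Hit,
hit_from_blast, hit_from_hmmer, best_hit) are made up for illustration and
are not the proposed SearchIO API; the dictionary keys are only loosely
modelled on BLAST XML tags and HMMER tabular column headers.

```python
from dataclasses import dataclass


# Hypothetical common object: every program's hit is mapped onto
# the same attribute names, whatever the original output called them.
@dataclass
class Hit:
    id: str
    evalue: float
    bitscore: float


def hit_from_blast(record):
    """Map a toy BLAST-style record (a plain dict) onto the common Hit."""
    return Hit(
        id=record["Hit_id"],
        evalue=float(record["Hsp_evalue"]),
        bitscore=float(record["Hsp_bit-score"]),
    )


def hit_from_hmmer(record):
    """Map a toy HMMER-style record (a plain dict) onto the common Hit."""
    return Hit(
        id=record["target name"],
        evalue=float(record["E-value"]),
        bitscore=float(record["score"]),
    )


def best_hit(hits):
    """Return the hit with the lowest E-value, regardless of its source."""
    return min(hits, key=lambda hit: hit.evalue)
```

With something like this, downstream code such as best_hit() only needs to
know the common names (id, evalue, bitscore) and never has to care which
program produced the hit.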

That's it for now. Thanks for taking the time to read this :). I'm looking
forward to a productive summer with Biopython.

Have a nice weekend,
Bow

