[Biopython] Google Summer of Code (GSoC)

Zhigang Wu zhigang.wu at email.ucr.edu
Fri Mar 23 15:00:12 EDT 2012


Hi Biopython community,
I am Zhigang Wu, a third-year graduate student at UC Riverside with a
research focus on miRNA evolution. I am interested in implementing the
Biopython SearchIO module, which would parse the reports produced by
currently popular sequence search tools such as NCBI BLAST+, FASTA, and
HMMER3.

I was a BioPerl user until one year ago, and I have been a Biopython user
since then. I have used BioPerl's SearchIO extensively in my research
project. It provides a common API capable of handling all the popular
formats, and I'd like to write an equivalent in Python. As for my
background, I have approximately one year of Perl programming experience
and one year of Python programming experience. I also occasionally write
C++ programs and have some experience with R.

Right now I am preparing my proposal, which is due by April 6. Listed
below are the core methods that the Biopythonic SearchIO module is going to
support. For the sake of consistency, the methods are very similar to those
of the existing SeqIO <http://biopython.org/wiki/SeqIO> and
AlignIO <http://biopython.org/wiki/AlignIO> modules.

   1. SearchIO.parse(handle, format): a generator function.
   2. SearchIO.to_dict(iterator): takes an iterator argument, as produced
   by the SearchIO.parse(...) function.
   3. SearchIO.read(handle, format): provides fast access to a report
   that has only one record.
   4. SearchIO.write(....): writes output in the specified format.
   5. SearchIO.convert(...): provides conversion between different
   formats.
   6. ...
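
As a rough illustration of how items 1 to 3 could fit together, here is a
minimal sketch in Python. It is not the proposed implementation: the
"toy-tab" format (tab-separated query ID, hit ID and E-value, with rows for
the same query adjacent) and the QueryResult record type are assumptions
made up for this example, and only the function names and arguments come
from the list above.

```python
import io
from collections import namedtuple
from itertools import groupby

# Hypothetical record type: one query plus its (hit_id, evalue) pairs.
QueryResult = namedtuple("QueryResult", ["id", "hits"])


def parse(handle, format):
    """Generator function: lazily yield one QueryResult per query."""
    if format != "toy-tab":  # "toy-tab" is an invented format name
        raise ValueError("Unknown format: %r" % format)
    rows = (line.rstrip("\n").split("\t") for line in handle if line.strip())
    # Assumes all rows belonging to one query are adjacent in the file.
    for query_id, group in groupby(rows, key=lambda row: row[0]):
        hits = [(row[1], float(row[2])) for row in group]
        yield QueryResult(query_id, hits)


def read(handle, format):
    """Return the single record in the handle, or raise ValueError."""
    iterator = parse(handle, format)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found in handle")
    try:
        next(iterator)
    except StopIteration:
        return first
    raise ValueError("More than one record found in handle")


def to_dict(iterator):
    """Collect records from parse(...) into a dict keyed by query ID."""
    results = {}
    for record in iterator:
        if record.id in results:
            raise ValueError("Duplicate key %r" % record.id)
        results[record.id] = record
    return results


# Usage on an in-memory report with two queries.
report = "q1\thitA\t1e-30\nq1\thitB\t2e-5\nq2\thitC\t0.001\n"
by_id = to_dict(parse(io.StringIO(report), "toy-tab"))
```

Having read(...) and to_dict(...) build on parse(...) mirrors the way the
SeqIO functions relate to each other, so each new format would only need
its own parser.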

I'd like to hear any feedback or suggestions you have on these methods, or
on any format that is popular in your research field and that you would
like the Biopythonic SearchIO module to support.

Regards,

Zhigang Wu

