[Biopython-dev] Biopython 1.60 plans and beyond

Peter Cock p.j.a.cock at googlemail.com
Sat Feb 18 18:14:34 UTC 2012


On Sat, Feb 18, 2012 at 11:52 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> --- On Sat, 2/18/12, Tiago Antão <tiagoantao at gmail.com> wrote:
>> Another side issue that I would like to discuss (maybe in a
>> different thread) is how people are coping with large
>> amounts of data using Python (or Perl/Ruby for that
>> matter), specifically the problem of performance. As
>> I see it, it is more and more the case that we depend
>> on external (fast) programs, C extensions or Java
>> extensions to do the bulk of the work. Inner loops in
>> Python simply do not cut it for speed.
>
> C extensions to Python such as pysam, together with
> outer-loops in Python/Biopython have been working
> very well for me. Perhaps at some point pysam can
> be included into Biopython, but as samtools is still
> evolving it makes sense for it to be a separate package
> so that it can be updated more quickly.

I've got some partial SAM/BAM code in pure Python,
partly as a learning exercise for the format itself and
issues around that.

> I am more concerned about relying on external programs,
> in particular R. Notwithstanding the usefulness of rpy and
> rpy2, I would prefer to have a pure-Python or
> Python-with-C-extension solution, ideally as part of
> Biopython.

Python with C extensions (e.g. via Cython?) certainly
has its role to play - and the payback should be that it
is much faster than calling separate binaries and
parsing their output.
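
As a rough illustration of the inner-loop point, here is a toy
sketch (not Biopython code; the function names and the test
sequence are made up for this example). It counts G/C bases
twice: once with an explicit Python loop, and once by handing
the per-base work to str.count, which runs as C code inside
CPython. Timings will vary by machine and interpreter, so none
are quoted here.

```python
import timeit


def gc_count_loop(seq):
    """Count G/C bases with an explicit Python-level inner loop."""
    total = 0
    for base in seq:
        if base in "GCgc":
            total += 1
    return total


def gc_count_builtin(seq):
    """Same count, but the per-base loop runs inside str.count."""
    upper = seq.upper()
    return upper.count("G") + upper.count("C")


if __name__ == "__main__":
    # Hypothetical test data: a short motif repeated many times.
    seq = "ACGTTGCAAGGC" * 10000
    # Both approaches must agree before timing means anything.
    assert gc_count_loop(seq) == gc_count_builtin(seq)
    for func in (gc_count_loop, gc_count_builtin):
        elapsed = timeit.timeit(lambda: func(seq), number=5)
        print(func.__name__, round(elapsed, 4), "seconds for 5 runs")
```

The same pattern (outer loop in Python, inner loop in compiled
code) is what a C extension gives you for work that the
built-in types do not already cover.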

However, pure Python is also getting a lot more
interesting with PyPy getting better and better.

Peter