[Biojava-dev] Parallel Programming

Daniel Asarnow dasarnow at gmail.com
Fri May 30 19:08:56 UTC 2014


Thoughts below.

On Fri, May 30, 2014 at 12:25 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
>
>
> I am all for good support of parallelism. Many (all?) of the algorithms
> that are in BioJava are building blocks that already now can easily run in
> multi threaded environments.
>

This was my experience writing a batch structure alignment utility, with
local parallelism provided by Java's concurrency API.

Maybe there is a testing approach that could identify any remaining
components that aren't thread-safe (if any)?
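
One possible shape for such a test, as a rough sketch only: run the same
function serially to get a baseline, then run it many times from a thread
pool and count results that differ. The class and method names here
(ThreadSafetyProbe, countMismatches) are hypothetical and not part of
BioJava. A clean run does not prove thread safety; it can only expose some
races.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not a BioJava class.
class ThreadSafetyProbe {

    // Returns how many concurrent results differ from the serial baseline.
    static <T, R> int countMismatches(Function<T, R> fn, List<T> inputs,
                                      int nThreads, int repeats)
            throws InterruptedException, ExecutionException {
        // Baseline computed on the calling thread only.
        List<R> expected = new ArrayList<>();
        for (T in : inputs) {
            expected.add(fn.apply(in));
        }

        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // Submit every input 'repeats' times so calls overlap in time.
            List<Callable<R>> tasks = new ArrayList<>();
            for (int r = 0; r < repeats; r++) {
                for (T in : inputs) {
                    tasks.add(() -> fn.apply(in));
                }
            }
            // invokeAll returns futures in the same order as the tasks.
            List<Future<R>> futures = pool.invokeAll(tasks);
            int mismatches = 0;
            for (int i = 0; i < futures.size(); i++) {
                R got = futures.get(i).get();
                if (!Objects.equals(expected.get(i % inputs.size()), got)) {
                    mismatches++;
                }
            }
            return mismatches;
        } finally {
            pool.shutdown();
        }
    }
}
```

An exception thrown inside a task surfaces as an ExecutionException from
get(), which is also a useful signal of shared mutable state.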


> In your experience, did you have any problems running BioJava in a multi
> CPU context? What is needed to improve support? Documentation?
>

I didn't have any problems. I think the question is whether local
parallelism, or distributed processing broader than what's in
bio.structure.align, should be supported by the library itself rather
than left to BioJava users.

There may also be some code in BioJava which can be parallelized directly
as an optimization, but the JVM is good at auto-vectorizing and it's
probably best not to create built-in expectations of hardware capability.
For example, I learned the hard way that a cluster computing environment
I was using did not support multithreading. That led me to a hacky
workaround: compiling BioJava with GCJ to avoid the JVM startup penalty.

In terms of documentation for local parallelism, I could write a cookbook
entry based on the utility I mentioned (i.e. how to do parallel alignments
using a thread pool).
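
As a minimal sketch of what such an entry might look like, the pattern
with Java's concurrency API is roughly the following. This is not the
utility itself: alignPair is a hypothetical stand-in that just counts
matching positions so the example runs without BioJava, and the class and
method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of BioJava.
class ParallelAlignSketch {

    // Stand-in for a real pairwise alignment call: counts the positions
    // at which the two strings carry the same character.
    static int alignPair(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int score = 0;
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) == b.charAt(i)) {
                score++;
            }
        }
        return score;
    }

    // Aligns every input against the reference, one task per input.
    // Scores are returned in the same order as the inputs.
    static List<Integer> alignAll(String reference, List<String> inputs,
                                  int nThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String s : inputs) {
                tasks.add(() -> alignPair(reference, s));
            }
            List<Integer> scores = new ArrayList<>();
            // invokeAll blocks until all tasks have finished.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                scores.add(f.get());
            }
            return scores;
        } finally {
            // Always release the worker threads.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> inputs = Arrays.asList("ACGT", "ACGA", "TTTT");
        int nThreads = Runtime.getRuntime().availableProcessors();
        System.out.println(alignAll("ACGT", inputs, nThreads));
    }
}
```

Sizing the pool from availableProcessors() rather than a fixed number
keeps the code from assuming a particular hardware capability; on a
single-threaded node it simply degrades to one worker.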

Best,
-da


