[Biopython] multiprocessing problem with pysam

Michal mictadlo at gmail.com
Mon Jun 6 21:59:39 UTC 2011


On 05/16/2011 01:53 AM, Brad Chapman wrote:
> Michal;
>
> [multiprocessing]
> multiprocessing is sensitive to passing or calling complex class
> objects. My suggestion is to use functions without associated state
> attributes and pass in your information as standard python objects
> (strings, lists, dicts). I use a little decorator to make writing
> the functions passed easier:
>
> import functools
> def map_wrap(f):
>      @functools.wraps(f)
>      def wrapper(*args, **kwargs):
>          return apply(f, *args, **kwargs)
>      return wrapper
>
> Then you would write your function as:
>
> @map_wrap
> def run_test(bam_filename, cultivars, ref_name):
>      bam_fh = pysam.Samfile(bam_filename, "rb")
>      print os.getpid(), ref_name, cultivars
>      return (os.getpid(), ref_name)
>
> and call it with:
>
> cultivars = 'Ja,Ea,As'.replace(' ', '').split(',')
> bam_filename = "/media/usb/tests/test.bam"
> bamfile = pysam.Samfile(bam_filename, "rb")
> ref_names = bamfile.references
> bamfile.close()
>
> pool = Pool()
> results = dict(pool.imap(run_test, ((bam_filename, cultivars, ref)
>                                      for ref in ref_names)))
> pool.close()
>
> Hope this helps,
> Brad
Thank you, Brad, it works. I also found the following solution, which
uses functools.partial instead of a decorator:

import os
from multiprocessing import Pool
from pprint import pprint
import functools


def calc_p(fname, start_pos, end_pos, reference_name):
    # Runs in a worker process; takes only plain Python objects.
    print os.getpid()
    print "fname", fname
    print "reference_name", reference_name
    print "start_pos", start_pos
    print "end_pos", end_pos
    print
    return (reference_name, [os.getpid(), 'x1', 'x2'])


if __name__ == '__main__':
    pool = Pool()

    fname = "ex1.txt"
    references = ['Test1', 'Test2', 'Test3', 'Test4']

    # Fix the first three arguments; the pool supplies reference_name.
    run_test = functools.partial(calc_p, fname, 100, 120)
    result = dict(pool.imap_unordered(run_test, references))
    pool.close()
    pool.join()

    pprint(result)
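
For anyone on Python 3, here is a rough sketch of the same
functools.partial idea. It is not tested against pysam. The names
run_all and processes are made up for illustration, and the body of
calc_p only echoes its arguments as a placeholder for real work.

```python
import functools
import os
from multiprocessing import Pool


def calc_p(fname, start_pos, end_pos, reference_name):
    # Module-level function taking only plain Python objects, so it
    # can be pickled and sent to a worker process.
    # Placeholder body: a real version would open fname here.
    return (reference_name, [os.getpid(), fname, start_pos, end_pos])


def run_all(fname, start_pos, end_pos, references, processes=2):
    # Hypothetical helper: freeze the shared arguments so the pool
    # only has to supply reference_name for each task.
    worker = functools.partial(calc_p, fname, start_pos, end_pos)
    with Pool(processes) as pool:
        # imap_unordered yields results in completion order; keying
        # the dict by reference name makes that order irrelevant.
        return dict(pool.imap_unordered(worker, references))


if __name__ == "__main__":
    refs = ["Test1", "Test2", "Test3", "Test4"]
    for name, value in sorted(run_all("ex1.txt", 100, 120, refs).items()):
        print(name, value)
```

Each worker should open the BAM file itself from the filename, as in
Brad's run_test, rather than receive an open file handle from the
parent process.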


Michal


