[Biopython] multiprocessing and SeqIO.index_db()

Chris Friedline cfriedline at vcu.edu
Tue Sep 18 21:34:11 UTC 2012


Hi,

I ran into this today and am wondering if there is a workaround.  If I index multiple files in parallel with multiprocessing and SeqIO.index_db(), the databases are created, but I can't use the index objects once they come back from the async worker processes.  Instead, I get this traceback when trying to (say) print the dictionary:

  File "/Users/chris/.virtualenvs/default/lib/python2.7/site-packages/Bio/SeqIO/_index.py", line 112, in __str__
    return "{%s : SeqRecord(...), ...}" % repr(self.keys()[0])
  File "/Users/chris/.virtualenvs/default/lib/python2.7/site-packages/Bio/SeqIO/_index.py", line 416, in keys
    self._con.execute("SELECT key FROM offset_data;").fetchall()]
sqlite3.ProgrammingError: Base Connection.__init__ not called.

As a workaround, I'm just calling index_db again.  From the source code, it appears that the index is not rebuilt by doing this, and it seems to work OK.  Is this just a multiprocessing/pickling issue?
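
A minimal sketch of the pattern behind that workaround, using only the standard library as a stand-in for SeqIO.index_db(). It rests on an assumption, not a confirmed diagnosis: that the returned index object holds an open sqlite3 connection, which does not survive being pickled back from a worker. So each worker returns only the path of the database it built, and the parent opens its own connection. The names build_index, load_keys and index_all and the ".idx" suffix are hypothetical. The offset_data table and key column are borrowed from the traceback above, and the rest of the schema is invented for illustration.

```python
import multiprocessing
import os
import sqlite3


def build_index(fasta_path):
    """Worker: build an SQLite index of record offsets and return its path.

    Hypothetical stand-in for the indexing step. It returns a plain string,
    which pickles cleanly, rather than an object holding an open connection.
    Assumes every FASTA header line carries an identifier after the '>'.
    """
    db_path = fasta_path + ".idx"
    if os.path.exists(db_path):
        os.remove(db_path)  # start from scratch in this sketch
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE offset_data (key TEXT PRIMARY KEY, offset INTEGER);"
    )
    with open(fasta_path, "rb") as handle:
        while True:
            offset = handle.tell()  # byte offset where this line starts
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                key = line[1:].split(None, 1)[0].decode()
                con.execute(
                    "INSERT INTO offset_data VALUES (?, ?);", (key, offset)
                )
    con.commit()
    con.close()
    return db_path


def load_keys(db_path):
    """Parent: open a fresh connection in this process and read the keys."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("SELECT key FROM offset_data ORDER BY offset;")
        return [row[0] for row in rows]
    finally:
        con.close()


def index_all(fasta_paths, processes=2):
    """Build the indexes in worker processes, then reopen them in the parent."""
    with multiprocessing.Pool(processes) as pool:
        db_paths = pool.map(build_index, fasta_paths)
    return {db_path: load_keys(db_path) for db_path in db_paths}
```

index_all() should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. With Biopython itself, the corresponding step would be to have the worker return the index filename and call SeqIO.index_db() on that filename in the parent, which is what the workaround above already does.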

Thanks,
Chris


