[Biopython-dev] [Biopython - Bug #3395] Biopython trie implementation can't load large data sets
redmine at redmine.open-bio.org
Tue Nov 20 12:02:48 EST 2012
Issue #3395 has been updated by Peter Cock.
Well, that is progress: it means the problem does not come from reading a compressed file on disk, and you've made the test case simpler. Can you actually share a self-contained example script? If not, I suggest you try halving the dataset (only record the first half of the entries) and retest, then repeat. This should tell you whether the problem is, as you suspect, the size of the dataset, or something specific about a particular value.
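The halving procedure can be automated. The following is only a sketch: `find_smallest_failing_prefix` and the `round_trips` callback are hypothetical names, not part of Biopython. It assumes you supply a function that builds a trie from a list of items, saves it, reloads it, and returns True on success. It also assumes that once a prefix of the data fails, every longer prefix fails too.

```python
def find_smallest_failing_prefix(items, round_trips):
    """Bisect for the smallest n such that items[:n] fails to round-trip.

    `round_trips` is a user-supplied callable (hypothetical here) that
    takes a list of items and returns True if saving and reloading a
    trie built from them succeeds. Returns None if the full list passes.
    """
    if round_trips(items):
        return None
    # Invariant: items[:lo] passes (the empty prefix is assumed to pass),
    # items[:hi] fails.
    lo, hi = 0, len(items)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if round_trips(items[:mid]):
            lo = mid
        else:
            hi = mid
    return hi
```

If the result is n, then items[n - 1] is the first entry whose addition breaks loading. Retesting with only that entry, or with the same number of different entries, should help separate a size limit from a problem with one particular value.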
Alternatively can you share the (compressed) file? I could at least check if it fails the same way here, and perhaps add some debugging code to get more information.
The error message itself is coming from some C code, which hasn't changed for some time:
https://github.com/biopython/biopython/blob/master/Bio/triemodule.c
The error itself is likely triggered in function _deserialize_transition in trie.c:
https://github.com/biopython/biopython/blob/master/Bio/triemodule.c
You still haven't told us the important details: which OS, which version of Python, and which version of Biopython. Given that it is C code, I'd also like to know how Biopython was installed (e.g. did you compile it from source yourself?).
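A short script along these lines would collect most of the details asked for here. It is a sketch only: `environment_report` is a made-up helper name, and it cannot tell how Biopython was installed, although the reported module path may give a hint.

```python
import platform


def environment_report():
    """Collect OS, Python and Biopython details for a bug report."""
    try:
        import Bio
        biopython_version = str(getattr(Bio, "__version__", "unknown"))
        # The module path hints at the install location (site-packages,
        # a source checkout, etc.), not at the install method itself.
        biopython_path = str(getattr(Bio, "__file__", None) or "unknown")
    except ImportError:
        biopython_version = "not installed"
        biopython_path = "not installed"
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "biopython": biopython_version,
        "biopython_path": biopython_path,
    }


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```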
----------------------------------------
Bug #3395: Biopython trie implementation can't load large data sets
https://redmine.open-bio.org/issues/3395
Author: Michał Nowotka
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version:
URL:
Imagine I have a Biopython trie:
from Bio import trie
import gzip
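f = gzip.open('/tmp/trie.dat.gz', 'w')
tr = trie.trie()
# fill in the trie
trie.save(f, tr)  # pass the trie instance, not the module
f.close()  # close so the gzip stream is flushed and complete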
Now /tmp/trie.dat.gz is about 50MB. Let's try to read it:
from Bio import trie
import gzip
f = gzip.open('/tmp/trie.dat.gz', 'r')
tr = trie.load(f)
Unfortunately, I'm getting an uninformative error saying:
"loading failed for some reason"
Any hints?