[Biopython-dev] [Biopython - Bug #3395] Biopython trie implementation can't load large data sets
redmine at redmine.open-bio.org
Thu Nov 29 17:12:31 UTC 2012
Issue #3395 has been updated by Peter Cock.
File trie_debug.patch added
I can reproduce the problem with your saved file under Mac OS X, using the latest Biopython from github, e.g.
$ python
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import trie
>>> import gzip
>>> with gzip.open("trie.4.dat.gz") as handle:
...     t = trie.load(handle)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
RuntimeError: loading failed for some reason
Adding a little debugging to the C code tells us where this fails (see attachment), line 669:
668     if(has_value) {
669         if(!(trie->value = (*read_value)(data)))
670             goto _deserialize_trie_error;
671     }
What kind of CPU does your machine have? i.e. is it a normal Intel or AMD CPU, or something unusual like a PowerPC where we have to worry about the byte order (endianness) interpretation?
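As a small illustration of why the CPU question matters, here is a sketch using only the standard library. It assumes the saved trie stores raw native integers, which has not been confirmed against the file format here; the value 1 is just an example.

```python
import struct
import sys

value = 1
little = struct.pack("<i", value)  # least significant byte first (Intel/AMD)
big = struct.pack(">i", value)     # most significant byte first (classic PowerPC)

# Reading little-endian bytes as if they were big-endian gives a different number.
misread = struct.unpack(">i", little)[0]

print(sys.byteorder, little, big, misread)
```

If a file written on one kind of machine were read on the other without conversion, every length or count field would be misread in this way.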
We may need a complete example creating the trie as well - the problem could be in the trie itself, the serialisation (writing to disk), or de-serialisation (loading from disk).
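One way to narrow this down is a save/load round trip at increasing sizes. The sketch below is only a generic harness: round_trip and pickle_save are hypothetical helper names, and a plain dict with pickle stands in for the trie so the example runs without Bio.trie. With Biopython one would presumably pass trie.save and trie.load instead and compare the keys; that has not been tested here.

```python
import gzip
import io
import pickle


def round_trip(obj, save, load):
    """Save obj to an in-memory gzip stream, load it back, return the copy."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb") as handle:
        save(handle, obj)  # same argument order as trie.save(handle, trie)
    buffer.seek(0)
    with gzip.GzipFile(fileobj=buffer, mode="rb") as handle:
        return load(handle)


def pickle_save(handle, obj):
    # Stand-in serialiser; pickle.dump takes (obj, handle), so swap the order.
    pickle.dump(obj, handle)


results = {}
for size in (10, 1000, 100000):
    # Stand-in for a filled trie: string keys mapped to integer values.
    data = {"key%d" % i: i for i in range(size)}
    results[size] = round_trip(data, pickle_save, pickle.load) == data
print(results)
```

If the round trip only fails above some size, that points at the serialisation or de-serialisation code rather than at the trie contents.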
----------------------------------------
Bug #3395: Biopython trie implementation can't load large data sets
https://redmine.open-bio.org/issues/3395
Author: Michał Nowotka
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version:
URL:
Imagine I have a Biopython trie:
from Bio import trie
import gzip
f = gzip.open('/tmp/trie.dat.gz', 'w')
tr = trie.trie()
#fill in the trie
trie.save(f, tr)
f.close()
Now /tmp/trie.dat.gz is about 50MB. Let's try to read it:
from Bio import trie
import gzip
f = gzip.open('/tmp/trie.dat.gz', 'r')
tr = trie.load(f)
Unfortunately I'm getting an unhelpful error saying:
"loading failed for some reason"
Any hints?