[Biopython] Retrieving fasta seqs
Brad Chapman
chapmanb at 50mail.com
Tue Feb 2 15:54:35 UTC 2010
Kevin and Peter,
> On Tue, Feb 2, 2010 at 3:30 PM, Kevin Lam <aboulia at gmail.com> wrote:
> > Traceback (most recent call last):
> > File "test.py", line 22, in ?
> > ids.add(recordf3)
> > # Then add each line to .ids.
> > MemoryError
>
> OK, so it fails way before you do anything with Biopython - the
> problem is simply building a very large set of strings in memory.
> You could try using a list instead of a set (trivial code change),
> which I would expect to use less memory but run slower.
There is a nice discussion on Stack Overflow of the trade-off between
lookup time and memory use for lists versus sets/dictionaries:
http://stackoverflow.com/questions/513882/python-list-vs-dict-for-look-up-table
My guess is that building the hash table for the string IDs is what
gets expensive in memory.
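As a rough illustration of that guess, here is a minimal sketch that
compares the size of a list and a set holding the same strings. The
"read_%d" IDs and the count of 100,000 are made up for the example and
are not taken from Kevin's script. sys.getsizeof reports only the
container itself, not the strings it refers to, and the exact numbers
are specific to CPython.

```python
import sys


def container_sizes(n=100000):
    """Return (list_bytes, set_bytes) for containers holding the same n IDs.

    Only the containers are measured; the string objects are shared
    between the two and are not counted.
    """
    # Hypothetical IDs, standing in for the real sequence identifiers.
    ids = ["read_%d" % i for i in range(n)]
    as_list = list(ids)  # one pointer per element
    as_set = set(ids)    # sparse hash table: stored hash plus pointer per slot
    return sys.getsizeof(as_list), sys.getsizeof(as_set)


list_bytes, set_bytes = container_sizes()
print("list: %d bytes" % list_bytes)
print("set:  %d bytes" % set_bytes)
```

The set should come out several times larger than the list, since its
hash table is kept partly empty to make lookups fast. The cost of the
list is that each membership test scans the whole list, which fits
Peter's expectation of less memory but a slower run.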
Brad