[Biopython] Retrieving fasta seqs

Tue Feb 2 16:44:29 UTC 2010

My apologies! I didn't realize it's an off-topic problem. Thanks for
the link; it is quite informative!

So can I presume that the index method also failed because of memory
issues?

Cheers
Kevin


On 02-Feb-2010, at 11:54 PM, Brad Chapman <chapmanb at 50mail.com> wrote:

> Kevin and Peter;
>
>> On Tue, Feb 2, 2010 at 3:30 PM, Kevin Lam <aboulia at gmail.com> wrote:
>>> Traceback (most recent call last):
>>>  File "test.py", line 22, in ?
>>>    ids.add(recordf3)
>>> # Then add each line to .ids.
>>> MemoryError
>>
>> OK, so it fails way before you do anything with Biopython - the
>> problem is simply building a very large set of strings in memory.
>> You could try using a list instead of a set (trivial code change),
>> which I would expect to use less memory but run slower.
>
> This is a nice discussion on Stack Overflow of the lookup/run time
> versus memory trade-off of lists versus sets/dictionaries:
>
> http://stackoverflow.com/questions/513882/python-list-vs-dict-for-look-up-table
>
> My guess is building the hash table for the string IDs gets memory
> expensive.
>
> Brad
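
To make the list-versus-set memory point above concrete, here is a rough
sketch that compares the size of the two containers when they hold the
same ID strings. The function name container_overhead and the "read_N"
IDs are made up for illustration; they are not from the original test.py.
Note that sys.getsizeof reports only the container itself, not the
strings it refers to, so this shows the hash table overhead rather than
total memory use.

```python
import sys


def container_overhead(n):
    """Return (list_bytes, set_bytes) for containers holding the same n IDs."""
    # Hypothetical record IDs, standing in for the ones read from the file.
    ids = ["read_%d" % i for i in range(n)]
    as_list = list(ids)
    as_set = set(ids)
    # getsizeof counts the container only, not the strings it references.
    return sys.getsizeof(as_list), sys.getsizeof(as_set)


if __name__ == "__main__":
    list_bytes, set_bytes = container_overhead(100000)
    print("list: %d bytes, set: %d bytes" % (list_bytes, set_bytes))
```

The strings are shared by both containers and are not counted here, so
whether switching to a list is enough depends on how much of the total
the ID strings themselves take up.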