[Biopython] SeqIO.index for csfasta files memory issues

Peter biopython at maubp.freeserve.co.uk
Thu Jan 21 13:03:33 UTC 2010


On Thu, Jan 21, 2010 at 11:58 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Jan 21, 2010 at 11:31 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>> ... or look for a big memory Linux box to try.
>
> This may be easier for me!

That worked :)

This was a faked color space FASTA file of about 3GB with 48 million
entries. Indexing it took about 10 minutes and about 7GB of memory (I
missed the final memory usage figure as I was only watching top), using
Biopython 1.53 on a 64-bit installation of Python 2.4.3:

$ python
Python 2.4.3 (#1, Jan 21 2009, 01:11:33)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.architecture()
('64bit', 'ELF')
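
For a sense of scale, the rough figures quoted above can be turned into a
per-entry cost. This is only a back-of-envelope sketch: it assumes "GB"
means GiB, and it attributes all of the roughly 7GB seen in top to the
index, which overstates it a little since the interpreter itself uses
some memory. The variable names are made up for illustration.

```python
# Back-of-envelope estimate from the approximate figures in this message.
# Assumptions: "GB" is read as GiB, and the whole ~7GB is charged to the index.
ENTRIES = 48 * 1000 * 1000
INDEX_MEMORY_BYTES = 7 * 1024 ** 3
FILE_SIZE_BYTES = 3 * 1024 ** 3

# Average memory held per indexed record, and average record size on disk.
bytes_per_entry_in_memory = INDEX_MEMORY_BYTES / float(ENTRIES)
bytes_per_entry_on_disk = FILE_SIZE_BYTES / float(ENTRIES)

print("memory per indexed entry: ~%.0f bytes" % bytes_per_entry_in_memory)
print("file size per entry:      ~%.0f bytes" % bytes_per_entry_on_disk)
```

Under those assumptions the index costs on the order of 150 bytes per
entry, which is more than the average record occupies on disk. That is why
a file this size needs more memory than a 32-bit process can address.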

Could you double check the version of Python on the nodes of your
cluster (just in case the head node is using something different, or
some of the nodes are 32bit and others are 64bit)?
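
One way to check this is to run a small script on each node and compare
the output. The following is a minimal sketch using only the standard
library; the function name describe_python is hypothetical, and how you
launch it on every node depends on your cluster's scheduler.

```python
import platform
import socket
import struct
import sys


def describe_python():
    """Collect the host name, Python version and word size of this interpreter."""
    return {
        "host": socket.gethostname(),
        "python": platform.python_version(),
        # Same call as in the interactive session above; reports e.g. '64bit'.
        "architecture": platform.architecture()[0],
        # Size of a C pointer in bits: 32 on a 32-bit build, 64 on a 64-bit build.
        "pointer_bits": struct.calcsize("P") * 8,
        "executable": sys.executable,
    }


if __name__ == "__main__":
    info = describe_python()
    print("%(host)s: Python %(python)s, %(architecture)s, "
          "%(pointer_bits)d-bit pointers, %(executable)s" % info)
```

If any node reports 32-bit pointers, or a different Python version or
executable path from the head node, that would be the first thing to look
at.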

Peter


