[BioPython] calculate F-Statistics from SNP data
Peter
biopython at maubp.freeserve.co.uk
Wed Oct 22 06:34:21 EDT 2008
On Wed, Oct 22, 2008 at 11:25 AM, Giovanni Marco Dall'Olio
<dalloliogm at gmail.com> wrote:
> Maybe the biggest issue is that I will have to use this library to parse
> very big files, so there are a few things we could change in the
> implementation of the parser.
> Is there any way in python to force the interpreter to store variables in
> temporary files instead of RAM memory?
> I was thinking about modules like shelve, cPickle, but I am not sure they
> work in this way.
I have not looked at the specifics here, but adopting an iterator
approach might make sense - returning the entries one by one as parsed
from the file. This is the idea for the Bio.SeqIO and Bio.AlignIO
parsers. The user can then turn the entries into a list (if they have
enough memory), filter them as they arrive, etc. For example, you
could build a list of only the entries from the populations you
want, discarding the others on the fly.
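
A rough sketch of what I mean, using a Python generator. Note the
input format here (one whitespace-separated line per individual:
population, individual, genotypes) and the function names
iter_records and filter_populations are made up for illustration;
they are not the real file format or an existing Biopython API.

```python
import io


def iter_records(handle):
    """Yield one (population, individual, genotypes) tuple per data line.

    Only the current line is held in memory, so the file can be
    arbitrarily large. The line layout is a hypothetical one.
    """
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        population, individual, *genotypes = line.split()
        yield population, individual, genotypes


def filter_populations(records, wanted):
    """Pass through only the records whose population is in `wanted`."""
    for record in records:
        if record[0] in wanted:
            yield record


# Stand-in for a big file on disk; any open file handle works the same way.
example = io.StringIO(
    "# population individual genotypes\n"
    "pop1 ind1 AA AG\n"
    "pop2 ind2 GG AG\n"
    "pop1 ind3 AG AG\n"
)

# Records from other populations are discarded as they are parsed,
# so only the selected entries end up in the list.
selected = list(filter_populations(iter_records(example), {"pop1"}))
```

Calling list() is the only step that keeps more than one record in
memory at a time, and by then the unwanted entries are already gone.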
Peter