[Biopython-dev] SeqFeature and FeatureLocation objects (was Bio.GFF)

Brad Chapman chapmanb at 50mail.com
Fri Apr 24 12:45:15 UTC 2009


Hi Peter;

> > I took a look at the resource usage of these objects versus
> > a lightweight implementation. For a GFF file with 70k features, the
> > maximum memory usage is 128M versus 111M for the lightweight
> > version. So the improvement is rather modest, ~15%.
> 
> How did you measure these memory figures?

With the Unix 'time' command; those are the values reported by %M,
which is the maximum resident set size of the process over its lifetime.
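
For anyone who wants a similar number from inside Python rather than
from the shell, here is a minimal sketch using the standard library
resource module (Unix only). It is not what produced the figures above.
The function name peak_memory_kb and the stand-in workload are
assumptions made for illustration.

```python
import resource
import sys


def peak_memory_kb():
    """Return the peak resident set size of this process in kilobytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes; Mac OS X reports bytes.
    if sys.platform == "darwin":
        peak //= 1024
    return peak


if __name__ == "__main__":
    before = peak_memory_kb()
    # Stand-in workload (hypothetical); replace with the GFF parsing
    # that is actually being measured.
    data = [(i, i + 100) for i in range(70000)]
    print("peak before: %d KB, after: %d KB" % (before, peak_memory_kb()))
```

Like %M, this is a high-water mark, so it never goes down once memory
has been allocated and released.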

> And was your 15% comparison between the current "heavy" SeqFeature +
> FeatureLocation system as in CVS, and my lightweight alternative
> described earlier?

This was with an even lighter version. I just added start/end as
attributes to the SeqFeatures, so there was no FeatureLocation and no
individual position objects. This was a hack to look at the best-case
scenario for saving memory. The baseline was the default SeqFeatures
before we started thinking about changing them.
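
To make the idea concrete, a rough sketch of a feature that stores
start/end directly could look like the following. This is not the code
used for the test, and it is not the Bio.SeqFeature API: the class name
LightFeature, its attribute list and the use of __slots__ are all
assumptions made for illustration.

```python
class LightFeature(object):
    """Hypothetical feature with plain integer coordinates.

    No FeatureLocation or position objects are created; start and end
    are stored directly on the feature.
    """

    # __slots__ (an extra assumption, not part of the hack described
    # above) avoids allocating a per-instance __dict__.
    __slots__ = ("type", "start", "end", "strand", "qualifiers")

    def __init__(self, type, start, end, strand=None, qualifiers=None):
        self.type = type
        self.start = int(start)
        self.end = int(end)
        self.strand = strand
        # Give every feature its own dict rather than a shared default.
        self.qualifiers = qualifiers if qualifiers is not None else {}

    def __len__(self):
        # Assumes Python-style half-open coordinates: start <= x < end.
        return self.end - self.start

    def __repr__(self):
        return "LightFeature(%r, %d, %d, strand=%r)" % (
            self.type, self.start, self.end, self.strand)
```

The trade-off is that fuzzy positions have nowhere to live in this
layout, which is part of what the FeatureLocation question below is
about.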

> How does this version look? It should save more memory than the
> version I sent you three days ago, and again aims for 100% backwards
> compatibility - all the unit tests pass.

That is nice. Do we still want to keep a FeatureLocation, or
condense this all onto the SeqFeature itself?

Brad


