[Biopython-dev] fast N-mer search?

Thomas Sicheritz-Ponten thomas at cbs.dtu.dk
Fri Nov 8 06:22:21 EST 2002


What is the current best/fastest way to count all oligos of a given size
in a sequence? Is this something one could use mx.TextTools for?
Or is it still faster to use the naive way: loop over i, extract each
seq[i:i+size], and increment its count in a dictionary?
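
For concreteness, the naive way might look roughly like the sketch below.
The function name count_kmers is made up for illustration and is not part
of Biopython or mx.TextTools; overlapping windows are assumed.

```python
from collections import defaultdict


def count_kmers(seq, size):
    """Count every overlapping substring of length `size` in `seq`.

    Hypothetical helper for illustration only (not a Biopython function).
    """
    counts = defaultdict(int)
    # Slide a window of width `size` over the sequence, one position at a time.
    for i in range(len(seq) - size + 1):
        counts[seq[i:i + size]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_kmers("ACGTACGT", 2))
```

This makes one dictionary update per position, so the cost grows with the
sequence length for a fixed size.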

What if I'd like to count all word/oligo frequencies up to size N?
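
One straightforward, probably not optimal, way to extend the naive loop to
every size from 1 to N is to repeat it once per size. The name
count_kmers_up_to is hypothetical, and collections.Counter is assumed to be
available (a modern Python standard library).

```python
from collections import Counter


def count_kmers_up_to(seq, max_size):
    """Return {size: Counter of overlapping substrings of that size}.

    Hypothetical helper for illustration; it simply reruns the naive
    sliding-window count once for each size from 1 to `max_size`.
    """
    counts = {}
    for size in range(1, max_size + 1):
        counts[size] = Counter(
            seq[i:i + size] for i in range(len(seq) - size + 1)
        )
    return counts


if __name__ == "__main__":
    for size, table in count_kmers_up_to("AAAC", 3).items():
        print(size, dict(table))
```

The work here is roughly N passes over the sequence, which is why a faster
approach would be interesting.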

stepping-into-the-optimization-swamp'ly y'rs

Sicheritz-Ponten Thomas, Ph.D, thomas at biopython.org      (
Center for Biological Sequence Analysis                   \
BioCentrum-DTU, Technical University of Denmark            )
CBS: +45 45 252485      Building 208, DK-2800 Lyngby  ##----->
Fax: +45 45 931585      http://www.cbs.dtu.dk/thomas       )
     ... damn arrow eating trees ...                     (
