[Biopython] how to find closest genes for a given location

Brad Chapman chapmanb at 50mail.com
Thu Feb 25 13:34:31 UTC 2010

Hi Sameet;

> I have multiple locations from human genomes.  I want to determine
> what are the closest genes on either side of the location, and if it
> is in the location how far from the TSS the given location is.  I was
> thinking of using the CCDS database, because it contains information
> for the genes that have been verified.  Is there any other
> better/smarter way of doing it.

I don't know of a ready-to-go library in Python that does this, but
you could put something together using the interval intersection
library in bx-python:


You would build up an interval tree of gene features from a source
like CCDS, then loop through the locations in your BED file and
intersect each one with the tree. For finding the closest
non-overlapping genes on either side, look at upstream_of_interval
and downstream_of_interval.
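
To make the idea concrete, here is a minimal sketch of the same lookup
using only the standard library (sorted lists plus bisect) rather than
bx-python's interval tree. It is a stand-in for illustration, not
bx-python's API: the Gene tuple, build_index, annotate and the example
coordinates are all made up, and it assumes 0-based half-open
coordinates on a single chromosome. "left" and "right" are by
coordinate, not by strand.

```python
import bisect
from collections import namedtuple

# Hypothetical gene record: 0-based, half-open [start, end), one chromosome.
Gene = namedtuple("Gene", "name start end strand")

def build_index(genes):
    """Pre-sort genes by start and by end so lookups can use bisect."""
    by_start = sorted(genes, key=lambda g: g.start)
    by_end = sorted(genes, key=lambda g: g.end)
    return {
        "by_start": by_start,
        "starts": [g.start for g in by_start],
        "by_end": by_end,
        "ends": [g.end for g in by_end],
    }

def tss_distance(gene, pos):
    """Distance from the TSS to pos, positive going into the gene body."""
    if gene.strand == "+":
        return pos - gene.start
    return (gene.end - 1) - pos

def annotate(index, pos):
    """Return nearest flanking genes and TSS distances for containing genes."""
    # Genes that start at or before pos are the only candidates to contain it.
    hi = bisect.bisect_right(index["starts"], pos)
    # Simple linear scan; an interval tree avoids this for large gene sets.
    inside = [g for g in index["by_start"][:hi] if g.end > pos]
    # Nearest gene on the left: the largest end that is <= pos.
    i = bisect.bisect_right(index["ends"], pos)
    left = index["by_end"][i - 1] if i > 0 else None
    # Nearest gene on the right: the smallest start that is > pos.
    right = index["by_start"][hi] if hi < len(index["by_start"]) else None
    return {
        "left": left.name if left else None,
        "right": right.name if right else None,
        "tss_distance": {g.name: tss_distance(g, pos) for g in inside},
    }

# Made-up example data, not real CCDS coordinates.
genes = [
    Gene("A", 100, 200, "+"),
    Gene("B", 300, 400, "-"),
    Gene("C", 500, 600, "+"),
]
index = build_index(genes)
between = annotate(index, 250)   # falls between A and B
within = annotate(index, 350)    # falls inside B (minus strand)
```

With real data you would keep one index per chromosome and fill it
from the CCDS coordinates, then call annotate for each BED line.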

For a non-Python approach, the ChIPpeakAnno R package in Bioconductor
does what you are looking for:


rpy2 is an excellent gateway to R from Python:


Hope this helps,
