[Biopython] how to find closest genes for a given location
Peter
biopython at maubp.freeserve.co.uk
Thu Feb 25 13:37:40 UTC 2010
On Thu, Feb 25, 2010 at 1:34 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Hi Sameet;
>
>> I have multiple locations from the human genome. I want to determine
>> the closest genes on either side of each location and, if the
>> location falls within a gene, how far it is from the TSS. I was
>> thinking of using the CCDS database, because it contains information
>> for genes that have been verified. Is there any other
>> better/smarter way of doing it?
>
> I don't know of a ready-to-go Python library that does this, but
> you could put something together using the interval intersection
> library in bx-python:
>
> http://bitbucket.org/james_taylor/bx-python/src/tip/lib/bx/intervals/intersection.pyx
>
> You would build up an interval tree of gene features from someplace
> like CCDS, and then loop through your BED file and intersect with
> the tree. For finding closest non-overlapping genes, look at
> upstream_of_interval and downstream_of_interval.
Or, if you don't have too many locations to deal with, a simple
brute-force approach that loops over all the features to find the
closest ones would work just fine. How many is "multiple locations"?
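
For illustration, here is a rough sketch of what the brute-force loop
might look like. It is only a sketch under some assumptions: the Gene
record, the closest_genes function and the toy coordinates are all made
up for this example (nothing here comes from Biopython or CCDS), genes
are assumed to be already parsed into memory, and coordinates are
treated as 1-based and inclusive. The TSS is taken to be the start of a
"+" strand gene and the end of a "-" strand gene.

```python
from collections import namedtuple

# Hypothetical gene record; real data would be parsed from e.g. a CCDS file.
Gene = namedtuple("Gene", "name chrom start end strand")


def closest_genes(chrom, pos, genes):
    """Brute-force search for the genes around a single position.

    Returns (containing, left, right):
      containing - list of (gene, distance from TSS) for genes spanning pos,
                   where the distance is measured in the direction of
                   transcription
      left/right - nearest non-overlapping gene on each side, or None
    """
    containing, left, right = [], None, None
    for g in genes:
        if g.chrom != chrom:
            continue
        if g.start <= pos <= g.end:
            # Position is inside the gene: report its offset from the TSS.
            if g.strand == "+":
                offset = pos - g.start
            else:
                offset = g.end - pos
            containing.append((g, offset))
        elif g.end < pos:
            # Gene lies entirely to the left; keep the one ending nearest.
            if left is None or g.end > left.end:
                left = g
        else:
            # Gene lies entirely to the right; keep the one starting nearest.
            if right is None or g.start < right.start:
                right = g
    return containing, left, right


# Made-up toy coordinates, purely to exercise the function.
genes = [
    Gene("A", "chr1", 100, 200, "+"),
    Gene("B", "chr1", 500, 900, "-"),
    Gene("C", "chr1", 1500, 1800, "+"),
    Gene("D", "chr2", 50, 80, "+"),
]

for chrom, pos in [("chr1", 600), ("chr1", 1000), ("chr2", 10)]:
    inside, left, right = closest_genes(chrom, pos, genes)
    print(chrom, pos, inside, left, right)
```

Each lookup scans every gene, so the cost grows as (number of
locations) x (number of genes). That is usually fine for a few thousand
locations; beyond that the interval tree Brad suggested is the better
fit.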
Peter