[Biopython] Working with genomic intervals
Aaron Quinlan
aaronquinlan at gmail.com
Mon Aug 15 23:54:31 UTC 2011
Dear Peter, Sean, and Laurent,
Thanks so much for the useful suggestions.
Best,
Aaron
On Aug 15, 2011, at 2:17 AM, Laurent Gautier wrote:
> On 2011-08-14 18:00, biopython-request at lists.open-bio.org wrote:
>> On Sun, Aug 14, 2011 at 7:11 AM, Peter Cock<p.j.a.cock at googlemail.com> wrote:
>>> > On Friday, August 12, 2011, Aaron Quinlan<aaronquinlan at gmail.com> wrote:
>>>> >> All,
>>>> >>
>>>> >> I apologize in advance if this is a naive question.
>>>> >> I am wondering if BioPython provides libraries for
>>>> >> working with genomic intervals in BED, GFF, or
>>>> >> any other like format? I am looking for libraries
>>>> >> that handle the parsing of files in these formats
>>>> >> into Python objects, as well as libraries for
>>>> >> manipulating (intersection, merging, counting,
>>>> >> etc.) intervals. I know this exists in Galaxy's
>>>> >> bx-python, but am wondering if there are similar
>>>> >> libraries in BioPython?
>>>> >>
>>>> >> Gratefully,
>>>> >> Aaron
>>> >
>>> > Hi Aaron,
>>> >
>>> > Have a look at http://biopython.org/wiki/GFF_Parsing
>>> > where Brad is working on this. He's also spoken
>>> > highly of bx-python as I recall.
>> I would second the bx-python vote. Not only are the "normal" interval
>> classes covered, but there are also some variants (clustering is one
>> that comes to mind).
>>
>> Sean
>
> One can also access Bioconductor's range utilities from Python, for
> example through the Bioconductor extension to rpy2 or through rpy2
> directly (perhaps using its dynamic class mapping features, as shown below):
>
> from rpy2.robjects.packages import importr
> iranges = importr("IRanges")
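> # Python class IRanges as an API to Bioconductor's IRanges::IRanges
> from rpy2.robjects.methods import RS4, RS4Auto_Type
> class IRanges(RS4):
>     # Python 2 metaclass syntax; under Python 3 this would be written
>     # as: class IRanges(RS4, metaclass=RS4Auto_Type)
>     __metaclass__ = RS4Auto_Type
>     __rpackagename__ = "IRanges"
>     __rname__ = "IRanges"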
>
> # now in action
>
> >>> from rpy2.robjects.vectors import IntVector
> >>> ir = IRanges(iranges.IRanges(start = IntVector(range(10)), width = 11))
> >>> print(ir)
> IRanges of length 10
>      start end width
> [1]      0  10    11
> [2]      1  11    11
> [3]      2  12    11
> [4]      3  13    11
> [5]      4  14    11
> [6]      5  15    11
> [7]      6  16    11
> [8]      7  17    11
> [9]      8  18    11
> [10]     9  19    11
> >>> print(IRanges(ir.reduce__IRanges(ir)))
> IRanges of length 1
>     start end width
> [1]     0  19    20
>
>
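For anyone following along without an R installation, the merge that reduce performs in Laurent's example, and the intersection asked about in the original question, can be roughly sketched in plain Python. This is only a minimal sketch and not the bx-python or IRanges API: the names merge_intervals and intersect are hypothetical, intervals are assumed to be closed integer (start, end) pairs as in the IRanges printout, and directly adjacent intervals are assumed to be merged, which I believe matches the default behaviour of reduce.

```python
from typing import Iterable, List, Optional, Tuple

# Closed integer interval (start, end), as in the IRanges printout above.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent closed intervals (hypothetical helper)."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or abuts the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap before this interval: start a new merged interval.
            merged.append((start, end))
    return merged


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlap of two closed intervals, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Same ten ranges as in the IRanges example: start 0..9, width 11.
ranges = [(i, i + 10) for i in range(10)]
print(merge_intervals(ranges))       # [(0, 19)]
print(intersect((0, 10), (5, 15)))   # (5, 10)
```

Sorting first keeps the merge to a single pass over the intervals. For real BED or GFF data, the libraries suggested above are likely a better choice than this sketch.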