[BioRuby] A new plugin: bio-genomic-interval

MISHIMA, Hiroyuki missy at be.to
Sat Apr 23 14:12:11 UTC 2011


Hi Jan and BioRuby-ML,

Thank you for your comment on the bio-genomic-interval library.

After I released bio-genomic-interval, I found Jan's ruby-ucsc-api
and the Slice class. The Slice class also plays an important role in the
Ruby Ensembl API.

In conclusion, I think it would be good to merge my bio-genomic-interval
into the Slice class and make a BioRuby plugin such as "bio-slice" that
libraries can share. To do this, we would have to split
ensembl/core/slice.rb into common methods to share and methods specific
to the Ensembl API.

The strong points of the Slice class include the following:
* Using BioRuby objects such as Bio::Sequence::NA.
* Slice#excise, #sub_slice, and #split
* A short simple class name (Matz says "Name is important")

So far, I use bio-genomic-interval in bio-ucsc-api because
bio-genomic-interval has methods to convert between the common "1-based
full-closed" and UCSC-internal "0-based half-closed" intervals. This
conversion is trivial arithmetic but prevents bugs.
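
As a minimal sketch of that arithmetic (the module and method names here
are hypothetical, not the bio-genomic-interval API): going from the
1-based full-closed system to the 0-based "half-closed" (often called
half-open) system subtracts 1 from the start and leaves the end unchanged.

```ruby
# Hypothetical helper for illustration only; not the bio-genomic-interval API.
module IntervalCoords
  # 1-based closed [start, stop] -> 0-based half-open [start - 1, stop)
  def self.to_zero_based(start, stop)
    [start - 1, stop]
  end

  # 0-based half-open [start, stop) -> 1-based closed [start + 1, stop]
  def self.to_one_based(start, stop)
    [start + 1, stop]
  end
end

zero = IntervalCoords.to_zero_based(1234, 3456)
one  = IntervalCoords.to_one_based(*zero)
puts zero.inspect
puts one.inspect
```

Keeping both directions in one place means the off-by-one is written once
rather than repeated at every call site.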

Other methods found only in GenomicInterval are the following:
* #parse (parses UCSC style "chr1:1,234-3,456")
* #comparison (returns :left_adjacent [i.e. distance <20bp], :left_off,
  :contains, :left_overlapped etc.)
* #overlap (returns a distance or an overlap-length between two intervals)
* #expand (returns a minimum interval containing two intervals)
* #center (returns a center position)
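
To make the list above concrete, here is a rough sketch of what
parse, overlap, expand, and center could look like on 1-based closed
intervals. The Interval struct is a hypothetical stand-in, not the real
GenomicInterval class, and the sign convention of overlap (positive for
shared bases, negative for a gap) is an assumption of this sketch.

```ruby
# Illustrative stand-in; NOT the actual GenomicInterval implementation.
Interval = Struct.new(:chrom, :start, :stop) do
  # Parse a UCSC-style string such as "chr1:1,234-3,456" (1-based, closed).
  def self.parse(str)
    m = /\A(\w+):([\d,]+)-([\d,]+)\z/.match(str.strip)
    raise ArgumentError, "cannot parse #{str.inspect}" unless m
    new(m[1], m[2].delete(",").to_i, m[3].delete(",").to_i)
  end

  # Assumed convention: positive = number of shared bases,
  # negative = number of bases in the gap, zero = directly adjacent.
  # Returns nil when the intervals are on different chromosomes.
  def overlap(other)
    return nil unless chrom == other.chrom
    [stop, other.stop].min - [start, other.start].max + 1
  end

  # Smallest interval that contains both intervals.
  def expand(other)
    raise ArgumentError, "different chromosomes" unless chrom == other.chrom
    self.class.new(chrom, [start, other.start].min, [stop, other.stop].max)
  end

  # Midpoint, rounded down to an integer position.
  def center
    (start + stop) / 2
  end
end

a = Interval.parse("chr1:1,234-3,456")
b = Interval.new("chr1", 3000, 4000)
puts a.overlap(b)
puts a.expand(b).to_a.inspect
puts a.center
```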

Sincerely yours,
Hiro.

Jan Aerts wrote (2011/04/23 18:31):
> Hi Hiro,
>
> You might take a look at the Slice class in the ruby-ensembl-api at
> https://github.com/jandot/ruby-ensembl-api/blob/master/lib/ensembl/core/slice.rb.
> This has very similar functionality, and might give some additional
> ideas as well, for example:
>
> - Slice#overlaps?(other_slice) => checks whether 2 slices overlap
> - Slice#within?(other_slice) => checks whether this slice is contained
> within another slice
> - Slice#contains?(other_slice) => checks whether this slice contains
> another slice
> - Slice#excise(array_of_regions) => takes a slice and removes certain
> regions, returning an array of smaller slices. For example:
>
> # original_slice = Slice.new('chrX',1,10000)
> # new_slices = original_slice.excise([500..750, 1050..1075])
> # new_slices.each do |s|
> #   puts s.display_name
> # end
> #
> # # result:
> # # chromosome:X:1:499:1
> # # chromosome:X:751:1049:1
> # # chromosome:X:1076:10000:1
>
> - Slice#sub_slice(start, stop) => creates a truncated version of this slice
> - Slice#split(length, overlap) => splits a slice into smaller bits (e.g.
> creates an array of slices of 100bp long), optionally with an overlap
> between the resultant slices.
>
> Just my 2c. (Also: if you have feature requests, ideas or other comments
> for the ruby-ensembl-api, see rubyensemblapi.userecho.com
> <http://rubyensemblapi.userecho.com>)
>
> jan.

-- 
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences


