[Biopython-dev] [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond

Mic mictadlo at gmail.com
Fri May 25 02:49:13 EDT 2012


I think Picard tools does parallel compression/decompression of BGZF.

Cheers,
Mic

On Thu, May 24, 2012 at 7:18 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov
> <lomereiter at googlemail.com> wrote:
> > Hi all,
> >
> > it's a good point that many line-based formats need some sort of
> > compression with indexing, and BGZF is good enough in that sense.
>
> BGZF doesn't have to be used with line-based formats, anything
> with sequential records would work (like BAM files of course). I've not
> tried it to see how well it compressed, but SFF files in BGZF should
> work too as another example.
>
> >> So far, I think Artem's BGZF implementation is entirely in D; I may just
> >> add Ruby support for BGZF separately.
> >
> > The only problem I see with that approach is that it's hardly possible to
> > get parallel compression with MRI. But overall I tend to agree with
> > Clayton. Firstly, it's hard to abstract away some common interface right
> > now, not writing any code and looking at it. Secondly, there're still
> > problems with D shared library support. We were assured by GDC developer
> > that they'll get solved soon, but at the moment the situation is far from
> > perfect.
>
> My BGZF code is pure Python (using C zlib via Python's zlib library),
> and does not currently tackle parallel compression or decompression.
> There has been recent work in samtools for this.
>
> We don't need parallel compression/decompression of BGZF for it to
> be useful.
>
> Peter
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>
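
To make the point about sequential records and Python's zlib concrete, here
is a minimal sketch of writing BGZF blocks and fetching one record back by
virtual offset. It is not the Biopython code discussed above; the function
names (bgzf_block, write_bgzf, read_at) are hypothetical, and it assumes
every record fits inside a single block, which real BGZF does not require.

```python
import gzip
import struct
import zlib

MAX_BLOCK_DATA = 65280  # assumed cap on uncompressed bytes per block


def bgzf_block(data, level=6):
    """Wrap `data` in one BGZF block (a gzip member with a 'BC' extra field)."""
    if len(data) > MAX_BLOCK_DATA:
        raise ValueError("too much data for one block in this sketch")
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # raw deflate stream
    cdata = comp.compress(data) + comp.flush()
    bsize = len(cdata) + 25  # total block length minus one
    if bsize > 0xFFFF:
        raise ValueError("compressed block too large")
    # magic, CM=8, FLG=4 (extra field), MTIME, XFL, OS, XLEN=6, 'B', 'C', SLEN=2
    header = struct.pack("<BBBBIBBHBBHH",
                         31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize)
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF, len(data))
    return header + cdata + trailer


def write_bgzf(records, block_size=MAX_BLOCK_DATA):
    """Return (bgzf_bytes, virtual_offsets) for an iterable of bytes records."""
    out = bytearray()
    buf = bytearray()
    offsets = []
    for rec in records:
        # Start a new block rather than splitting a record (a simplification).
        if buf and len(buf) + len(rec) > block_size:
            out += bgzf_block(bytes(buf))
            buf.clear()
        # Virtual offset: block start in the file, shifted left 16 bits,
        # combined with the record's position inside the uncompressed block.
        offsets.append((len(out) << 16) | len(buf))
        buf += rec
    if buf:
        out += bgzf_block(bytes(buf))
    out += bgzf_block(b"")  # trailing empty block as the end-of-file marker
    return bytes(out), offsets


def read_at(bgzf_bytes, voffset, length):
    """Decompress only the block named by `voffset` and slice out one record."""
    block_start, within = voffset >> 16, voffset & 0xFFFF
    bsize = struct.unpack_from("<H", bgzf_bytes, block_start + 16)[0]
    block = bgzf_bytes[block_start:block_start + bsize + 1]
    data = zlib.decompress(block[18:-8], -15)  # strip header and trailer
    return data[within:within + length]


records = [b"read%d\n" % i for i in range(5)]
data, offsets = write_bgzf(records, block_size=16)  # tiny blocks for the demo
print(read_at(data, offsets[3], len(records[3])))
print(gzip.decompress(data) == b"".join(records))
```

Nothing here depends on the records being lines: any byte strings work. Each
block is also compressed independently of the others, which is what makes
parallel compression possible, although this sketch stays single-threaded.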

