[Biopython] pd.df.to_csv(..., compression="bgzip")?

Dan Bolser dan.bolser at outsee.co.uk
Wed Mar 13 07:22:57 EDT 2024


Nice idea, I would never have thought of that.

Thanks Peter!

On Wed, Mar 13, 2024, 11:18 AM Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Ah. I would give it a file handle then:
>
> from Bio import bgzf
>
> with bgzf.open("example.txt.bgz", "w") as bgzf_handle:
>     my_data_frame.to_csv(bgzf_handle, ...)
>
> I would expect that to work according to
>
> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
> - possibly with an explicit compression=None added?
>
> Peter
>
>
> On Wed, Mar 13, 2024 at 11:08 AM Dan Bolser <dan.bolser at outsee.co.uk>
> wrote:
> >
> > pandas.DataFrame.to_csv is the function that writes data. pandas.read_csv
> > silently handles decompression as needed.
> >
> >
> >
> > On Wed, Mar 13, 2024, 10:49 AM Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >>
> >> Yes. BGZF is just a special kind of GZIP file; if all you are doing is
> >> decompressing it for reading, then the standard gzip.open(...)
> >> is fine.
> >>
> >> Peter
> >>
> >>
> >> On Wed, Mar 13, 2024 at 10:03 AM Dan Bolser <dan.bolser at outsee.co.uk> wrote:
> >>>
> >>> bgzip is a 'bio' thing, so I thought I'd ask here. It's perhaps not
> >>> 'biopython', but it's bio/python.
> >>>
> >>> On Tue, 12 Mar 2024 at 19:11, Sean Brimer <skbrimer at gmail.com> wrote:
> >>>>
> >>>> Hi Dan,
> >>>>
> >>>> This feels more like a pandas issue than a Biopython issue. That said,
> >>>> I think you could just use gzip. bgzip for samtools was built on top of
> >>>> gzip, so it probably decompresses in a similar way.
> >>>>
> >>>> On Tue, Mar 12, 2024 at 12:52 PM Dan Bolser <dan.bolser at outsee.co.uk> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I can pass `compression="gzip"` to pandas.DataFrame.to_csv, but not
> >>>>> bgzip... How could pandas be updated to support bgzip?
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>> _______________________________________________
> >>>>> Biopython mailing list  -  Biopython at biopython.org
> >>>>> https://mailman.open-bio.org/mailman/listinfo/biopython
> >>>
>
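
To illustrate Peter's point that BGZF is "just a special kind of GZIP file": as far as I understand the format, a BGZF file is a series of small gzip members written back to back, and Python's standard gzip module reads straight through concatenated members. The sketch below uses only the standard library to show that property. It is an assumption-laden illustration, not real BGZF: it writes plain gzip members without the BGZF-specific extra header field, and the file name and the two helper functions are made up for the example.

```python
import gzip
import os
import tempfile


def write_concatenated_gzip(path, chunks):
    """Write each chunk as its own gzip member, back to back (hypothetical helper).

    This only mimics the multi-member layout of BGZF; it does not add the
    BGZF-specific extra field, so the result is not a valid BGZF file.
    """
    with open(path, "wb") as raw:
        for chunk in chunks:
            raw.write(gzip.compress(chunk.encode("utf-8")))


def read_all(path):
    """Read the whole file back as text using the standard gzip module."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Illustrative data only: a header row and two records, one gzip member each.
chunks = ["id,value\n", "a,1\n", "b,2\n"]
path = os.path.join(tempfile.mkdtemp(), "example.csv.gz")
write_concatenated_gzip(path, chunks)

# gzip.open decompresses all members in sequence, not just the first one.
text = read_all(path)
print(text)
```

That would explain why reading needs nothing special, while writing block-structured output is the part that needs Bio.bgzf (or the bgzip command line tool).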