[Biojava-dev] Fwd: Bug in org/biojava/utils/io/UncompressInputStream.java

Mark Schreiber markjschreiber at gmail.com
Tue Apr 10 11:29:03 UTC 2007


Without looking at the code, I would guess that the 88 dropped bytes
could be due to a buffered reader or writer not being flushed before
it is closed?
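
A minimal sketch of that failure mode, for illustration only. The class and
method names are made up, and nothing here is taken from
UncompressInputStream itself. It writes bytes through a
java.io.BufferedOutputStream and reports how many reached the underlying
stream, with and without an explicit flush(). Note that
BufferedOutputStream.close() flushes on its own, so the tail is only lost
when the wrapper is abandoned or the underlying stream is closed directly.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of BioJava.
final class FlushDemo {

    /**
     * Writes {@code total} bytes one at a time through a BufferedOutputStream
     * and returns how many of them reached the underlying sink.
     */
    static int bytesReachingSink(int total, int bufferSize, boolean flush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = new BufferedOutputStream(sink, bufferSize);
        try {
            for (int i = 0; i < total; i++) {
                out.write(i & 0xFF);
            }
            if (flush) {
                out.flush(); // pushes the buffered tail down to the sink
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Deliberately no close(): the point is to see what is lost when
        // the buffering wrapper is simply abandoned.
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + bytesReachingSink(600, 512, false));
        System.out.println("with flush:    " + bytesReachingSink(600, 512, true));
    }
}
```

If the unflushed count comes out short, the missing bytes are whatever was
still sitting in the buffer, which is always the tail of the output.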

- Mark

On 4/10/07, Andy Yates <ayates at ebi.ac.uk> wrote:
> Okay a quick run of uncompress on the mac with the files in question
> does produce a file which is equivalent to the file produced by gzip but
> not to the one produced by UncompressInputStream.
>
> For a pass, the md5sum should be:
>
> 9f0924237d20288793172091d61f85b8  uncompressed_by_gzip
>
> But we get:
>
> 17447efd34a245e430f20bc8d9b28a7b  uncompressed_by_uncompressInputStream
>
> Okay, so it looks like something is wrong: it seems to drop 88
> bytes from the decompressed output.
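
The comparison quoted above could be scripted along these lines. This is a
sketch with hypothetical names (OutputCompare, md5Hex, describe); it assumes
both outputs fit in memory. Besides the digests, it reports whether the
second file is a strict prefix of the first, which would confirm that bytes
are missing from the end rather than corrupted in the middle.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper; names are illustrative only.
final class OutputCompare {

    /** Returns the MD5 digest of {@code data} as lowercase hex, as md5sum prints it. */
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Describes how {@code actual} differs from {@code expected}. */
    static String describe(byte[] expected, byte[] actual) {
        int at = Arrays.mismatch(expected, actual); // -1 when identical (Java 9+)
        if (at < 0) {
            return "identical";
        }
        if (at == actual.length) { // actual is a strict prefix of expected
            return "truncated: " + (expected.length - actual.length) + " bytes missing at the end";
        }
        if (at == expected.length) { // expected is a strict prefix of actual
            return "extra: " + (actual.length - expected.length) + " bytes appended at the end";
        }
        return "first difference at offset " + at;
    }

    public static void main(String[] args) throws IOException {
        byte[] expected = Files.readAllBytes(Paths.get(args[0])); // e.g. uncompressed_by_gzip
        byte[] actual = Files.readAllBytes(Paths.get(args[1]));   // e.g. uncompressed_by_uncompressInputStream
        System.out.println(md5Hex(expected) + "  " + args[0]);
        System.out.println(md5Hex(actual) + "  " + args[1]);
        System.out.println(describe(expected, actual));
    }
}
```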
>
> Wonder what happens if we pass this file type through the
> GZIPInputStream from the JDK?
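
On the GZIPInputStream question: as far as I know, java.util.zip.GZIPInputStream
checks the two-byte gzip magic number (0x1f 0x8b) when it is constructed and
rejects anything else, whereas files written by compress start with
0x1f 0x9d. So a .Z file should fail up front rather than decode wrongly. The
sketch below is one way to check that; the class and method names are
hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Hypothetical helper; names are illustrative only.
final class FormatSniffer {

    /** Guesses the container format from the first two bytes of a file. */
    static String sniff(byte[] header) {
        if (header.length >= 2 && (header[0] & 0xFF) == 0x1F) {
            int second = header[1] & 0xFF;
            if (second == 0x8B) {
                return "gzip";          // DEFLATE inside a gzip wrapper
            }
            if (second == 0x9D) {
                return "compress (.Z)"; // LZW, as written by compress
            }
        }
        return "unknown";
    }

    /** Returns true if GZIPInputStream accepts the start of {@code data}. */
    static boolean gzipStreamAccepts(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return true;
        } catch (IOException e) {
            // A wrong magic number surfaces as a ZipException,
            // which is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] zHeader = {0x1F, (byte) 0x9D, (byte) 0x90, 0, 0, 0, 0, 0, 0, 0};
        System.out.println(sniff(zHeader));
        System.out.println("accepted by GZIPInputStream: " + gzipStreamAccepts(zHeader));
    }
}
```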
>
> Andy Yates wrote:
> > I don't think there are standard classes for this compression format in
> > the SDK. There are ones for GZIP & ZIP but not for LZW which this one is
> > dealing with. Also I'm not sure about using GZIP to unzip a file
> > compressed with LZW since GZIP uses DEFLATE.
> >
> > We need to decompress the file using uncompress (which is missing from
> > my Linux box but is on the mac ... go figure) and then match that up to
> > the output from UncompressInputStream & see if they agree or not.
> >
> > Andy
> >
> > Richard Holland wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> I have no idea what it is for. There are generic Java classes provided
> >> with the SDK that do the same job. I think we should probably drop it.
> >> Let's wait to see if anyone shouts first.
> >>
> >> mark.schreiber at novartis.com wrote:
> >>> Does anyone maintain this class??
> >>>
> >>> More to the point, does anyone know what it is for??? If I look at the
> >>> Uses link in javadoc there are apparently none at the public or package
> >>> level. Additionally why does biojava need one, are there not java.io
> >>> classes that can handle compressed streams??
> >>>
> >>> Is there a good reason why we cannot just clean it out?
> >>>
> >>> - Mark
> >>>
> >>> Mark Schreiber
> >>> Research Investigator (Bioinformatics)
> >>>
> >>> Novartis Institute for Tropical Diseases (NITD)
> >>> 10 Biopolis Road
> >>> #05-01 Chromos
> >>> Singapore 138670
> >>> www.nitd.novartis.com
> >>>
> >>> phone +65 6722 2973
> >>> fax  +65 6722 2910
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Chris Dagdigian <dag at sonsorol.org>
> >>> Sent by: biojava-dev-bounces at lists.open-bio.org
> >>> 04/07/2007 09:52 AM
> >>>
> >>>
> >>>         To:     biojava-dev at biojava.org
> >>>         cc:     (bcc: Mark Schreiber/GP/Novartis)
> >>>         Subject:        [Biojava-dev] Fwd: Bug in org/biojava/utils/io/UncompressInputStream.java
> >>>
> >>>
> >>>
> >>> Passing on this email that came to me ...
> >>>
> >>> Regards,
> >>> Chris Dagdigian
> >>> OBF
> >>>
> >>>
> >>> Begin forwarded message:
> >>>
> >>>> From: "Miguel Duarte" <malduarte at gmail.com>
> >>>> Date: April 6, 2007 2:16:52 PM EDT
> >>>> To: dag at sonsorol.org
> >>>> Subject: Bug in org/biojava/utils/io/UncompressInputStream.java
> >>>>
> >>>> Hi Chris,
> >>>>
> >>>> From http://sourceforge.net/project/shownotes.php?release_id=314770&group_id=18598,
> >>>> I've learned that you're maintaining the class
> >>>> org/biojava/utils/io/UncompressInputStream.java. If that's not the
> >>>> case please forward this mail to the maintainer.
> >>>>
> >>>> I've discovered a nasty bug: with some read block sizes the algorithm
> >>>> truncates a few bytes from the end of the stream. I've verified this
> >>>> by comparing the gzip/uncompress output for some files with what
> >>>> org/biojava/utils/io/UncompressInputStream.java generates.
> >>>>
> >>>> Unfortunately I have not found the cause yet, but I can contribute
> >>>> the attached test case. To verify the bug, uncompress BH_03834.MCR.Z
> >>>> with gzip and with UncompressInputStream and compare the results.
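
Since the report says the truncation depends on the read block size, one way
to narrow it down is to drain the same input repeatedly with every block size
in a range and note the first size that yields the wrong length. The sketch
below is generic and uses hypothetical names (BlockSizeSweep, drain,
firstFailingBlockSize); the demo in main runs against a plain
ByteArrayInputStream so that it stands alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical test harness; names are illustrative only.
final class BlockSizeSweep {

    /** Drains and closes a stream, reading {@code blockSize} bytes at a time. */
    static byte[] drain(InputStream in, int blockSize) {
        try (InputStream src = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] block = new byte[blockSize];
            int n;
            while ((n = src.read(block, 0, block.length)) != -1) {
                out.write(block, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Tries block sizes 1..maxBlockSize against a fresh stream each time and
     * returns the first size whose output length is not {@code expectedLength},
     * or -1 if every size produced the expected length.
     */
    static int firstFailingBlockSize(Supplier<InputStream> source, int expectedLength, int maxBlockSize) {
        for (int size = 1; size <= maxBlockSize; size++) {
            if (drain(source.get(), size).length != expectedLength) {
                return size;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        int failing = firstFailingBlockSize(() -> new ByteArrayInputStream(data), data.length, 4096);
        System.out.println("first failing block size: " + failing);
    }
}
```

To aim this at the reported bug, the supplier would open BH_03834.MCR.Z and
wrap it in UncompressInputStream (assuming its constructor takes an
InputStream), and expectedLength would be the size of the gzip output.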
> >>>>
> >>>> Thanks,
> >>>> Miguel Duarte
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> biojava-dev mailing list
> >>> biojava-dev at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >>>
> >>> [ Attachment ''BH_03834.MCR.Z'' removed by Mark Schreiber ]
> >>> [ Attachment ''UNCOMPRESSED_BY_GZIP'' removed by Mark Schreiber ]
> >>> [ Attachment ''UNCOMPRESSED_BY_UNCOMPRESSINPUTSTREAM'' removed by Mark
> >>> Schreiber ]
> >>>
> >>>
> >> -----BEGIN PGP SIGNATURE-----
> >> Version: GnuPG v1.4.2.2 (GNU/Linux)
> >> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> >>
> >> iD8DBQFGG1Hz4C5LeMEKA/QRAvTuAJ9F1AClFCV4WwBNP170mbC2+6JVDgCfVB17
> >> HoCuWrx5k2ONg/9oxIfVVPI=
> >> =cGTy
> >> -----END PGP SIGNATURE-----
>