gzip vs. zlib speed

Post by Mikhail Te » Fri, 24 Dec 2004 09:27:14


I have a utility that can use either fgetc() or gzgetc() to read its data.
It auto-detects the input format.

For some reason, using `gzip -dc input.gz | utility /dev/stdin' is MUCH
faster than `utility input.gz'. In fact, the former is almost as fast as
reading uncompressed data (`utility input'), but the latter is about 11
times slower on FreeBSD (zlib 1.2.1, P4 @ 3 GHz) and almost 14 times slower
on Solaris 9 (zlib 1.1.4, SPARC @ 1.2 GHz).

Is this a known issue? Do I need to either keep using gzip or redesign the
utility to read bigger buffers instead of one character at a time?

Is there any work ongoing to bring libz's "stdio emulation" up to speed?



gzip vs. zlib speed

Post by madle » Fri, 24 Dec 2004 10:23:42

> Is this a known issue, and I need to either keep using gzip or redesign
> the utility to read bigger buffers instead of one character at a time?

That's exactly what you need to do. gzgetc() is a function that calls
gzread() to get one byte, and so is much, much slower than using
gzread() on large chunks of data.


> Is there any work ongoing to bring libz's "stdio emulation" up to speed?

Not currently. It should be quite easy for the user, you, to provide
that speed using gzread() with large buffers and your own macro to
access what was read a byte at a time, if that's what you need. (I
recommend 128K or 256K output buffers for the best speed.) To provide
that efficiency in zlib, I would need to provide internal buffering for
the gz* functions, which is not currently there, and the guts of the
gzFile structure would need to be exposed to the application, which has
other issues in terms of maintaining interface compatibility across
versions of zlib. I may tackle this someday, but it's not high on my
priority list.



gzip vs. zlib speed

Post by madle » Fri, 24 Dec 2004 10:31:32

I don't know what's with the question marks and funny line breaks in my
last post. I want the old google groups back!