mirror of https://github.com/nodejs/node.git synced 2024-11-29 15:06:33 +01:00
nodejs/doc/api/zlib.md
Myles Borins 241eb6122e
zlib: gracefully set windowBits from 8 to 9
On 4 April 2017, Node.js versions v4.8.2 and v6.10.2 were
released. These versions bumped the vendored zlib library from
v1.2.8 to v1.2.11 in response to what it describes as low-severity
CVEs. In zlib v1.2.9, a change was made that causes an error to be
raised when a raw deflate stream is initialised with windowBits set
to 8.

In zlib v1.2.9, 8 became an invalid value for this parameter, and Node's zlib
module will crash if you call this:

```
zlib.createDeflateRaw({windowBits: 8})
```

On some versions this crashes Node and you cannot recover from it, while on some
versions it throws an exception. The permessage-deflate library up to
version v0.1.5 does make such a call with no try/catch.

This commit reverts to the original behavior of zlib by gracefully changing
windowBits: 8 to windowBits: 9 for raw deflate streams.

Original-PR-URL: https://github.com/nodejs-private/node-private/pull/95
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Evan Lucas <evanlucas@me.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>

PR-URL: https://github.com/nodejs/node/pull/16511
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
2017-10-29 20:14:01 +01:00

Zlib

Stability: 2 - Stable

The zlib module provides compression functionality implemented using Gzip and Deflate/Inflate. It can be accessed using:

const zlib = require('zlib');

Compressing or decompressing a stream (such as a file) can be accomplished by piping the source stream data through a zlib stream into a destination stream:

const gzip = zlib.createGzip();
const fs = require('fs');
const inp = fs.createReadStream('input.txt');
const out = fs.createWriteStream('input.txt.gz');

inp.pipe(gzip).pipe(out);

It is also possible to compress or decompress data in a single step:

const input = '.................................';
zlib.deflate(input, (err, buffer) => {
  if (!err) {
    console.log(buffer.toString('base64'));
  } else {
    // handle error
  }
});

const buffer = Buffer.from('eJzT0yMAAGTvBe8=', 'base64');
zlib.unzip(buffer, (err, result) => {
  if (!err) {
    console.log(result.toString());
  } else {
    // handle error
  }
});
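
The same round trip can also be written with the synchronous variants, which return the result directly and throw on failure. The following is a minimal sketch; the input string is an arbitrary example chosen for illustration:

```javascript
const zlib = require('zlib');

// Arbitrary sample input, chosen only for illustration.
const input = Buffer.from('hello hello hello hello');

// Compress in a single step, then decompress the result again.
const compressed = zlib.deflateSync(input);
const restored = zlib.unzipSync(compressed);

console.log(compressed.toString('base64'));
console.log(restored.toString());
```

Because the synchronous calls block the event loop, they are usually a better fit for scripts and small inputs than for request handlers.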

Threadpool Usage

Note that all zlib APIs, except those that are explicitly synchronous, use libuv's threadpool. This can have surprising and negative performance implications for some applications; see the UV_THREADPOOL_SIZE documentation for more information.

Compressing HTTP requests and responses

The zlib module can be used to implement support for the gzip and deflate content-encoding mechanisms defined by HTTP.

The HTTP Accept-Encoding header is used within an HTTP request to identify the compression encodings accepted by the client. The Content-Encoding header is used to identify the compression encodings actually applied to a message.

Note: the examples given below are drastically simplified to show the basic concept. Using zlib encoding can be expensive, and the results ought to be cached. See Memory Usage Tuning for more information on the speed/memory/compression tradeoffs involved in zlib usage.

// client request example
const zlib = require('zlib');
const http = require('http');
const fs = require('fs');
const request = http.get({ host: 'example.com',
                           path: '/',
                           port: 80,
                           headers: { 'Accept-Encoding': 'gzip,deflate' } });
request.on('response', (response) => {
  const output = fs.createWriteStream('example.com_index.html');

  switch (response.headers['content-encoding']) {
    // or, just use zlib.createUnzip() to handle both cases
    case 'gzip':
      response.pipe(zlib.createGunzip()).pipe(output);
      break;
    case 'deflate':
      response.pipe(zlib.createInflate()).pipe(output);
      break;
    default:
      response.pipe(output);
      break;
  }
});

// server example
// Running a gzip operation on every request is quite expensive.
// It would be much more efficient to cache the compressed buffer.
const zlib = require('zlib');
const http = require('http');
const fs = require('fs');
http.createServer((request, response) => {
  const raw = fs.createReadStream('index.html');
  let acceptEncoding = request.headers['accept-encoding'];
  if (!acceptEncoding) {
    acceptEncoding = '';
  }

  // Note: This is not a conformant accept-encoding parser.
  // See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3
  if (/\bdeflate\b/.test(acceptEncoding)) {
    response.writeHead(200, { 'Content-Encoding': 'deflate' });
    raw.pipe(zlib.createDeflate()).pipe(response);
  } else if (/\bgzip\b/.test(acceptEncoding)) {
    response.writeHead(200, { 'Content-Encoding': 'gzip' });
    raw.pipe(zlib.createGzip()).pipe(response);
  } else {
    response.writeHead(200, {});
    raw.pipe(response);
  }
}).listen(1337);
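
The server example above tests the Accept-Encoding header with simple regular expressions and notes that this is not a conformant parser. As a rough sketch of one step closer, the hypothetical helper below (pickEncoding is not part of zlib or http) honours q-values when choosing between gzip and deflate. It is still an assumption-laden simplification: it ignores the `*` wildcard and the identity coding.

```javascript
// Hypothetical helper, not part of Node.js: choose 'gzip' or 'deflate' from
// an Accept-Encoding header value, preferring the higher q-value.
// Returns null when neither coding is acceptable.
function pickEncoding(header) {
  const supported = ['gzip', 'deflate'];
  let best = null;
  let bestQ = 0;

  for (const part of String(header || '').split(',')) {
    const [name, ...params] = part.trim().toLowerCase().split(';');
    const coding = name.trim();
    if (!supported.includes(coding)) continue;

    // A coding without an explicit q parameter has a weight of 1.
    let q = 1;
    for (const param of params) {
      const match = /^\s*q\s*=\s*([0-9.]+)\s*$/.exec(param);
      if (match) q = parseFloat(match[1]);
    }

    // q=0 means "not acceptable"; ties keep the first coding listed.
    if (q > bestQ) {
      best = coding;
      bestQ = q;
    }
  }
  return best;
}

console.log(pickEncoding('gzip;q=0, deflate'));
console.log(pickEncoding('deflate;q=0.5, gzip;q=0.8'));
```

The returned value could then drive the choice between zlib.createGzip() and zlib.createDeflate() in the request handler.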

By default, the zlib methods will throw an error when decompressing truncated data. However, if it is known that the data is incomplete, or the desire is to inspect only the beginning of a compressed file, it is possible to suppress the default error handling by changing the flushing method that is used to decompress the last chunk of input data:

// This is a truncated version of the buffer from the above examples
const buffer = Buffer.from('eJzT0yMA', 'base64');

zlib.unzip(
  buffer,
  { finishFlush: zlib.constants.Z_SYNC_FLUSH },
  (err, result) => {
    if (!err) {
      console.log(result.toString());
    } else {
      // handle error
    }
  });

This will not change the behavior in other error-throwing situations, e.g. when the input data has an invalid format. Using this method, it will not be possible to determine whether the input ended prematurely or lacks the integrity checks, making it necessary to manually check that the decompressed result is valid.
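
To make the difference concrete, the sketch below decompresses the same truncated buffer twice with the synchronous API: once with the default finishFlush, which throws, and once with Z_SYNC_FLUSH, which returns whatever could be decoded. It uses inflateSync rather than unzipSync on the assumption that the sample data is a zlib (deflate) stream, as in the earlier examples.

```javascript
const zlib = require('zlib');

const full = Buffer.from('eJzT0yMAAGTvBe8=', 'base64');
const truncated = Buffer.from('eJzT0yMA', 'base64');

// Default behaviour: truncated input is reported as an error.
let strictError = null;
try {
  zlib.inflateSync(truncated);
} catch (err) {
  strictError = err;
}
console.log(strictError && strictError.message);

// With Z_SYNC_FLUSH as the final flush, the decodable prefix is returned.
const partial = zlib.inflateSync(
  truncated,
  { finishFlush: zlib.constants.Z_SYNC_FLUSH });
const complete = zlib.inflateSync(full);

// The partial result has to be validated by the caller; here it is simply
// compared against the output of the complete stream.
console.log(complete.toString().startsWith(partial.toString()));
```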

Memory Usage Tuning

From zlib/zconf.h, modified for Node.js's usage:

The memory requirements for deflate are (in bytes):

(1 << (windowBits + 2)) + (1 << (memLevel + 9))

That is: 128K for windowBits = 15 + 128K for memLevel = 8 (default values) plus a few kilobytes for small objects.

For example, to reduce the default memory requirements from 256K to 128K, the options should be set to:

const options = { windowBits: 14, memLevel: 7 };

This will, however, generally degrade compression.

The memory requirements for inflate are (in bytes) 1 << windowBits. That is, 32K for windowBits=15 (default value) plus a few kilobytes for small objects.

This is in addition to a single internal output slab buffer of size chunkSize, which defaults to 16K.
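
The formulas above can be checked with a few lines of arithmetic. The helper names deflateMemory and inflateMemory are made up for this sketch, and the results exclude the "few kilobytes for small objects" and the chunkSize output buffer:

```javascript
// Hypothetical helpers that evaluate the formulas from zconf.h.
function deflateMemory(windowBits = 15, memLevel = 8) {
  return (1 << (windowBits + 2)) + (1 << (memLevel + 9));
}

function inflateMemory(windowBits = 15) {
  return 1 << windowBits;
}

console.log(deflateMemory() / 1024);      // 256 (defaults)
console.log(deflateMemory(14, 7) / 1024); // 128 (reduced options)
console.log(inflateMemory() / 1024);      // 32 (default)
```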

The speed of zlib compression is affected most dramatically by the level setting. A higher level will result in better compression, but will take longer to complete. A lower level will result in less compression, but will be much faster.
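
As a rough illustration of the level setting, the sketch below compresses the same repetitive input at three levels and prints the resulting sizes. The sample text is an arbitrary assumption; real data will show different ratios, and timing is deliberately not measured here.

```javascript
const zlib = require('zlib');

// Arbitrary, highly repetitive sample input.
const input = Buffer.from(
  'The quick brown fox jumps over the lazy dog. '.repeat(200));

const stored = zlib.deflateSync(
  input, { level: zlib.constants.Z_NO_COMPRESSION });
const fast = zlib.deflateSync(
  input, { level: zlib.constants.Z_BEST_SPEED });
const best = zlib.deflateSync(
  input, { level: zlib.constants.Z_BEST_COMPRESSION });

// Z_NO_COMPRESSION only wraps the data, so its output is slightly larger
// than the input; the other levels shrink this input considerably.
console.log(input.length, stored.length, fast.length, best.length);
```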

In general, greater memory usage options will mean that Node.js has to make fewer calls to zlib because it will be able to process more data on each write operation. So, this is another factor that affects the speed, at the cost of memory usage.

Flushing

Calling .flush() on a compression stream will make zlib return as much output as currently possible. This may come at the cost of degraded compression quality, but can be useful when data needs to be available as soon as possible.

In the following example, flush() is used to write a compressed partial HTTP response to the client:

const zlib = require('zlib');
const http = require('http');

http.createServer((request, response) => {
  // For the sake of simplicity, the Accept-Encoding checks are omitted.
  response.writeHead(200, { 'content-encoding': 'gzip' });
  const output = zlib.createGzip();
  output.pipe(response);

  setInterval(() => {
    output.write(`The current time is ${Date()}\n`, () => {
      // The data has been passed to zlib, but the compression algorithm may
      // have decided to buffer the data for more efficient compression.
      // Calling .flush() will make the data available as soon as the client
      // is ready to receive it.
      output.flush();
    });
  }, 1000);
}).listen(1337);

Constants

All of the constants defined in zlib.h are also defined on require('zlib').constants. In the normal course of operations, it will not be necessary to use these constants. They are documented so that their presence is not surprising. This section is taken almost directly from the zlib documentation. See http://zlib.net/manual.html#Constants for more details.

Note: Previously, the constants were available directly from require('zlib'), for instance zlib.Z_NO_FLUSH. Accessing the constants directly from the module is currently still possible but should be considered deprecated.

Allowed flush values.

  • zlib.constants.Z_NO_FLUSH
  • zlib.constants.Z_PARTIAL_FLUSH
  • zlib.constants.Z_SYNC_FLUSH
  • zlib.constants.Z_FULL_FLUSH
  • zlib.constants.Z_FINISH
  • zlib.constants.Z_BLOCK
  • zlib.constants.Z_TREES

Return codes for the compression/decompression functions. Negative values are errors, positive values are used for special but normal events.

  • zlib.constants.Z_OK
  • zlib.constants.Z_STREAM_END
  • zlib.constants.Z_NEED_DICT
  • zlib.constants.Z_ERRNO
  • zlib.constants.Z_STREAM_ERROR
  • zlib.constants.Z_DATA_ERROR
  • zlib.constants.Z_MEM_ERROR
  • zlib.constants.Z_BUF_ERROR
  • zlib.constants.Z_VERSION_ERROR

Compression levels.

  • zlib.constants.Z_NO_COMPRESSION
  • zlib.constants.Z_BEST_SPEED
  • zlib.constants.Z_BEST_COMPRESSION
  • zlib.constants.Z_DEFAULT_COMPRESSION

Compression strategy.

  • zlib.constants.Z_FILTERED
  • zlib.constants.Z_HUFFMAN_ONLY
  • zlib.constants.Z_RLE
  • zlib.constants.Z_FIXED
  • zlib.constants.Z_DEFAULT_STRATEGY
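
As a small sketch of how these constants are typically used, the following reads a few of them from zlib.constants and passes a level and a strategy as options. The numeric values shown in the comments are the ones defined in zlib.h.

```javascript
const zlib = require('zlib');

const {
  Z_NO_FLUSH,            // 0
  Z_FINISH,              // 4
  Z_OK,                  // 0
  Z_DATA_ERROR,          // -3 (negative values are errors)
  Z_BEST_COMPRESSION,    // 9
  Z_DEFAULT_COMPRESSION, // -1
  Z_RLE
} = zlib.constants;

// Constants are plain integers and can be passed directly as options.
const compressed = zlib.deflateSync(
  Buffer.from('aaaaaaaaaaaaaaaaaaaaaaaa'),
  { level: Z_BEST_COMPRESSION, strategy: Z_RLE });

console.log(zlib.inflateSync(compressed).toString());
```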

Class Options

Each class takes an options object. All options are optional.

Note that some options are only relevant when compressing, and are ignored by the decompression classes.

  • flush {integer} (default: zlib.constants.Z_NO_FLUSH)
  • finishFlush {integer} (default: zlib.constants.Z_FINISH)
  • chunkSize {integer} (default: 16*1024)
  • windowBits {integer}
  • level {integer} (compression only)
  • memLevel {integer} (compression only)
  • strategy {integer} (compression only)
  • dictionary {Buffer|TypedArray|DataView} (deflate/inflate only, empty dictionary by default)
  • info {boolean} (If true, returns an object with buffer and engine)

See the description of deflateInit2 and inflateInit2 at http://zlib.net/manual.html#Advanced for more information on these.
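
The sketch below exercises two of the options listed above, dictionary and info, through the synchronous convenience methods. The dictionary contents and the input are invented for illustration; in practice a dictionary would be built from byte sequences that are common in the data being compressed.

```javascript
const zlib = require('zlib');

// Invented sample data: the dictionary holds a prefix the input shares.
const dictionary = Buffer.from('{"type":"event","name":"');
const input = Buffer.from('{"type":"event","name":"login"}');

// The same dictionary must be supplied for compression and decompression.
const compressed = zlib.deflateSync(input, { dictionary });
const restored = zlib.inflateSync(compressed, { dictionary });
console.log(restored.toString());

// Decompressing without the dictionary fails.
let missingDictionaryError = null;
try {
  zlib.inflateSync(compressed);
} catch (err) {
  missingDictionaryError = err;
}
console.log(missingDictionaryError && missingDictionaryError.message);

// With info: true, the result is an object with buffer and engine.
const result = zlib.deflateSync(input, { info: true });
console.log(Buffer.isBuffer(result.buffer));
```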

Class: zlib.Deflate

Compress data using deflate.

Class: zlib.DeflateRaw

Compress data using deflate, and do not append a zlib header.

Class: zlib.Gunzip

Decompress a gzip stream.

Class: zlib.Gzip

Compress data using gzip.

Class: zlib.Inflate

Decompress a deflate stream.

Class: zlib.InflateRaw

Decompress a raw deflate stream.

Class: zlib.Unzip

Decompress either a Gzip- or Deflate-compressed stream by auto-detecting the header.
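
A minimal sketch of the auto-detection, using the synchronous unzipSync convenience method (which uses Unzip) on data produced by both compressors. The input string is arbitrary.

```javascript
const zlib = require('zlib');

const input = Buffer.from('auto-detected');

// The same call handles both a gzip stream and a deflate (zlib) stream.
const fromGzip = zlib.unzipSync(zlib.gzipSync(input));
const fromDeflate = zlib.unzipSync(zlib.deflateSync(input));

console.log(fromGzip.toString());
console.log(fromDeflate.toString());
```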

Class: zlib.Zlib

Not exported by the zlib module. It is documented here because it is the base class of the compressor/decompressor classes.

zlib.bytesRead

  • {number}

The zlib.bytesRead property specifies the number of bytes read by the engine before the bytes are processed (compressed or decompressed, as appropriate for the derived class).

zlib.flush([kind], callback)

kind defaults to zlib.constants.Z_FULL_FLUSH.

Flush pending data. Don't call this frivolously; premature flushes negatively impact the effectiveness of the compression algorithm.

Calling this only flushes data from the internal zlib state, and does not perform flushing of any kind on the streams level. Rather, it behaves like a normal call to .write(), i.e. it will be queued up behind other pending writes and will only produce output when data is being read from the stream.

zlib.params(level, strategy, callback)

Dynamically update the compression level and compression strategy. Only applicable to the deflate algorithm.

zlib.reset()

Reset the compressor/decompressor to factory defaults. Only applicable to the inflate and deflate algorithms.

zlib.constants

Provides an object enumerating Zlib-related constants.

zlib.createDeflate(options)

Creates and returns a new Deflate object with the given options.

zlib.createDeflateRaw(options)

Creates and returns a new DeflateRaw object with the given options.

Note: An upgrade of zlib from 1.2.8 to 1.2.11 changed behavior when windowBits is set to 8 for raw deflate streams. zlib does not have a working implementation of an 8-bit window for raw deflate streams and would automatically set windowBits to 9 if initially set to 8. Newer versions of zlib will throw an exception. This creates a potential DoS vector, and as such the behavior has been reverted in Node.js 8, 6, and 4. Node.js version 9 and higher will throw when windowBits is set to 8.
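
Code that must behave the same way across Node.js releases can avoid depending on this behavior by normalizing the option before creating the stream. The wrapper below is a hypothetical sketch (createRawDeflate is not a zlib API); it applies the same 8-to-9 upgrade described above so that the underlying call never sees windowBits set to 8.

```javascript
const zlib = require('zlib');

// Hypothetical wrapper: upgrade windowBits 8 to 9 for raw deflate streams
// before handing the options to zlib.
function createRawDeflate(options = {}) {
  const safeOptions = Object.assign({}, options);
  if (safeOptions.windowBits === 8) {
    safeOptions.windowBits = 9;
  }
  return zlib.createDeflateRaw(safeOptions);
}

const stream = createRawDeflate({ windowBits: 8 });
console.log(stream instanceof zlib.DeflateRaw);
```

Normalizing up front, rather than wrapping the call in try/catch, also covers releases where the invalid value is not reported as a catchable exception.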

zlib.createGunzip(options)

Creates and returns a new Gunzip object with the given options.

zlib.createGzip(options)

Creates and returns a new Gzip object with the given options.

zlib.createInflate(options)

Creates and returns a new Inflate object with the given options.

zlib.createInflateRaw(options)

Creates and returns a new InflateRaw object with the given options.

zlib.createUnzip(options)

Creates and returns a new Unzip object with the given options.

Convenience Methods

All of these take a Buffer, TypedArray, DataView, or string as the first argument and an optional second argument to supply options to the zlib classes, and will call the supplied callback with callback(error, result).

Every method has a *Sync counterpart, which accepts the same arguments, but without a callback.
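
As a brief sketch of both calling styles, the following compresses the same bytes supplied once as a string and once as a TypedArray, first with the *Sync variant and then with the callback variant. The sample data is arbitrary.

```javascript
const zlib = require('zlib');

// The bytes of the string 'hello', supplied in two accepted input types.
const asString = 'hello';
const asTypedArray = new Uint8Array([104, 101, 108, 108, 111]);

// Synchronous form: returns the result and throws on error.
const a = zlib.gunzipSync(zlib.gzipSync(asString)).toString();
const b = zlib.gunzipSync(zlib.gzipSync(asTypedArray)).toString();
console.log(a, b);

// Callback form: the result is delivered as callback(error, result).
zlib.gzip(asString, (err, compressed) => {
  if (err) {
    // handle error
    return;
  }
  console.log(zlib.gunzipSync(compressed).toString());
});
```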

zlib.deflate(buffer[, options], callback)

zlib.deflateSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Compress a chunk of data with Deflate.

zlib.deflateRaw(buffer[, options], callback)

zlib.deflateRawSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Compress a chunk of data with DeflateRaw.

zlib.gunzip(buffer[, options], callback)

zlib.gunzipSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Decompress a chunk of data with Gunzip.

zlib.gzip(buffer[, options], callback)

zlib.gzipSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Compress a chunk of data with Gzip.

zlib.inflate(buffer[, options], callback)

zlib.inflateSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Decompress a chunk of data with Inflate.

zlib.inflateRaw(buffer[, options], callback)

zlib.inflateRawSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Decompress a chunk of data with InflateRaw.

zlib.unzip(buffer[, options], callback)

zlib.unzipSync(buffer[, options])

  • buffer {Buffer|TypedArray|DataView|string}

Decompress a chunk of data with Unzip.