V8 5.5 changed how invalid characters are handled and it now appears
to follow the WHATWG Encoding standard, where all of an invalid
character's bytes are replaced by a single replacement character
(\ufffd) instead of replacing each invalid byte with separate
replacement characters.
Example: the byte sequence 0xF0,0xB8,0x41 is decoded as '\ufffdA' in
V8 5.5, but is decoded as '\ufffd\ufffdA' in previous versions of V8.
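To check which behavior a given build exhibits, the example bytes can be
decoded and the resulting code units printed. This is a minimal sketch; the
variable names are illustrative, and the output depends on the bundled V8
version, so none is shown here.

```javascript
// Decode the byte sequence from the example above.
const invalidBytes = Buffer.from([0xF0, 0xB8, 0x41]);
const decoded = invalidBytes.toString('utf8');

// Print each UTF-16 code unit in hex so replacement characters are visible.
const codeUnits = Array.from(decoded, (ch) => ch.charCodeAt(0).toString(16));
console.log(codeUnits.join(' '));
```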
PR-URL: https://github.com/nodejs/node/pull/9618
Reviewed-By: Ali Ijaz Sheikh <ofrobots@google.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
This commit fixes an issue where extra utf8 continuation bytes appear
at the end of a chunk of data, causing miscalculations to be made
when checking how many bytes are needed to decode a complete
character.
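The kind of input involved can be sketched as follows. The byte values are
chosen for illustration only and are not taken from the issue report: a
complete two-byte character is followed by a stray continuation byte at the
end of a chunk. The sketch compares chunked decoding against decoding the
same bytes in one call.

```javascript
const { StringDecoder } = require('string_decoder');

// 0xC9 0xB5 is a complete two-byte character (U+0275); the trailing 0xA9 is
// a stray continuation byte at the end of the chunk.
const firstChunk = Buffer.from([0xC9, 0xB5, 0xA9]);
const secondChunk = Buffer.from([0x41]); // 'A'

const decoder = new StringDecoder('utf8');
const streamed =
  decoder.write(firstChunk) + decoder.write(secondChunk) + decoder.end();

// Reference: decode all bytes in a single call.
const whole = Buffer.concat([firstChunk, secondChunk]).toString('utf8');

console.log(streamed === whole);
```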
Fixes: https://github.com/nodejs/node/issues/7308
PR-URL: https://github.com/nodejs/node/pull/7310
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com>
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This commit moves the primary logic into
internal/util. Userland modules that monkey-patch Buffer.isEncoding
should still work.
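The idea of one shared helper can be sketched as below. This is a
hypothetical illustration, not the code in internal/util: the function
names, the accepted spellings, and the return values are all assumptions.

```javascript
// Hypothetical shared helper: maps accepted spellings to a canonical name
// and returns undefined for anything it does not recognize.
function normalizeEncoding(enc) {
  if (enc === undefined || enc === null || enc === '') return 'utf8';
  const lowered = String(enc).toLowerCase();
  switch (lowered) {
    case 'utf8':
    case 'utf-8':
      return 'utf8';
    case 'ucs2':
    case 'ucs-2':
    case 'utf16le':
    case 'utf-16le':
      return 'utf16le';
    case 'latin1':
    case 'binary':
      return 'latin1';
    case 'base64':
    case 'ascii':
    case 'hex':
      return lowered;
    default:
      return undefined; // unknown encoding
  }
}

// An isEncoding-style check can then be a thin wrapper over the helper.
function isEncoding(enc) {
  return typeof enc === 'string' && enc !== '' &&
         normalizeEncoding(enc) !== undefined;
}

console.log(normalizeEncoding('UTF-8'), isEncoding('nope'));
```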
PR-URL: https://github.com/nodejs/node/pull/7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
When node began using the OneByte API (f150d56) it also switched to
officially supporting ISO-8859-1, though at the time no new encoding
string was introduced.
Introduce the new encoding string 'latin1' to be more explicit. The
previous 'binary' string is kept and documented as an alias for
'latin1'. While many
tests have switched to use 'latin1', there are still plenty that do both
'binary' and 'latin1' checks side-by-side to ensure there is no
regression.
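A side-by-side check of the two encoding strings might look like the
following sketch. The sample bytes and variable names are illustrative.

```javascript
// "café" encoded as ISO-8859-1: one byte per character.
const latin1Bytes = Buffer.from([0x63, 0x61, 0x66, 0xE9]);

const viaLatin1 = latin1Bytes.toString('latin1');
const viaBinary = latin1Bytes.toString('binary');

// Both strings name the same encoding, so the results should match.
console.log(viaLatin1, viaLatin1 === viaBinary);
console.log(Buffer.isEncoding('latin1'), Buffer.isEncoding('binary'));
```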
PR-URL: https://github.com/nodejs/node/pull/7111
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
This commit provides a rewrite of StringDecoder that improves both
performance (for non-single-byte encodings) and understandability.
Additionally, StringDecoder instantiation performance has increased
considerably due to inlinability and more efficient encoding name
checking.
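For context, the behavior StringDecoder provides for multi-byte encodings
can be sketched with a three-byte UTF-8 character arriving one byte at a
time. The example shows the public API only and says nothing about the
internals of the rewrite; the variable names are illustrative.

```javascript
const { StringDecoder } = require('string_decoder');

const euroDecoder = new StringDecoder('utf8');

// U+20AC (euro sign) is 0xE2 0x82 0xAC in UTF-8. The decoder holds back the
// incomplete bytes and emits the character once the last byte arrives.
const parts = [
  euroDecoder.write(Buffer.from([0xE2])),
  euroDecoder.write(Buffer.from([0x82])),
  euroDecoder.write(Buffer.from([0xAC])),
];

console.log(parts);
```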
PR-URL: https://github.com/nodejs/node/pull/6777
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Show developers the new Buffer API in most of the documentation.
PR-URL: https://github.com/nodejs/node/pull/6367
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com>
Several changes:
* Soft-Deprecate Buffer() constructors
* Add `Buffer.from()`, `Buffer.alloc()`, and `Buffer.allocUnsafe()`
* Add `--zero-fill-buffers` command line option
* Add byteOffset and length to `new Buffer(arrayBuffer)` constructor
* buffer.fill('') previously had no effect, now zero-fills
* Update the docs
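The new factory methods listed above can be exercised as in the following
sketch. Variable names are illustrative, and the ArrayBuffer case uses
`Buffer.from()` with a byte offset and length rather than the constructor.

```javascript
// Zero-filled allocation.
const zeroed = Buffer.alloc(4);

// Allocation from existing data.
const fromString = Buffer.from('abc', 'utf8');

// Uninitialized allocation: contents are arbitrary until overwritten,
// so fill it before reading.
const scratch = Buffer.allocUnsafe(4).fill(0);

// A view over part of an ArrayBuffer: byteOffset 2, length 3.
const backing = new Uint8Array([1, 2, 3, 4, 5, 6]).buffer;
const view = Buffer.from(backing, 2, 3);

console.log(zeroed, fromString, scratch, view);
```

Running node with `--zero-fill-buffers` is meant to make the unsafe
allocations zero-filled as well, at some cost in performance.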
PR-URL: https://github.com/nodejs/node/pull/4682
Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com>
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
This commit reverts the const usage introduced by 68a6abc
because v8 currently cannot optimize functions that contain
these uses of const (unsupported phi use of const variable).
The performance difference in this case can be up to ~130%
for non-ascii/binary string encodings.
PR-URL: https://github.com/nodejs/node/pull/5134
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
By limiting property getting/setting to only where they are
absolutely necessary, we can achieve greater performance
especially with small utf8 inputs and any size base64 inputs.
PR-URL: https://github.com/iojs/io.js/pull/1209
Reviewed-By: Rod Vagg <rod@vagg.org>
Reviewed-By: Nicu Micleușanu <micnic90@gmail.com>
Reviewed-By: Chris Dickinson <christopher.s.dickinson@gmail.com>
This commit replaces a number of var statements throughout
the lib code with const statements.
PR-URL: https://github.com/iojs/io.js/pull/541
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
The copyright and license notice is already in the LICENSE file. There
is no justifiable reason to also require that it be included in every
file, since the individual files are not individually distributed except
as part of the entire package.
Turn on strict mode for the files in the lib/ directory. It helps
catch bugs and can have a positive effect on performance.
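One class of bug that strict mode catches is an assignment to an undeclared
name. The sketch below is illustrative only; the function and variable
names are hypothetical and do not come from lib/.

```javascript
function assignUndeclaredStrict() {
  'use strict';
  try {
    // In strict mode this throws instead of silently creating a global.
    undeclaredCounter = 1;
    return 'no error';
  } catch (err) {
    return err.name;
  }
}

const strictResult = assignUndeclaredStrict();
console.log(strictResult);
```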
PR-URL: https://github.com/node-forward/node/pull/64
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
This patch simplifies the implementation of StringDecoder, fixes the
failures from the new test cases, and also no longer relies on v8's
WriteUtf8 function to encode individual surrogates.
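The surrogate case can be illustrated with a four-byte UTF-8 character,
which becomes a surrogate pair in a JavaScript string. The sketch below
uses the current public API and illustrative names; it does not reproduce
the test cases this patch refers to.

```javascript
const { StringDecoder } = require('string_decoder');

const pairDecoder = new StringDecoder('utf8');

// U+1F44D is 0xF0 0x9F 0x91 0x8D in UTF-8, split here across two chunks.
const head = pairDecoder.write(Buffer.from([0xF0, 0x9F]));
const tail = pairDecoder.write(Buffer.from([0x91, 0x8D]));

// The first write yields nothing; the second yields both surrogates at once.
console.log(JSON.stringify(head), tail.length,
            tail.charCodeAt(0).toString(16), tail.charCodeAt(1).toString(16));
```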