0
0
mirror of https://github.com/nodejs/node.git synced 2024-11-30 07:27:22 +01:00
Commit Graph

30 Commits

Author SHA1 Message Date
Brian White
24ef1e6775 string_decoder: align UTF-8 handling with V8
V8 5.5 changed how invalid characters are handled and it now appears
to follow the WHATWG Encoding standard, where all of an invalid
character's bytes are replaced by a single replacement character
(\ufffd) instead of replacing each invalid byte with separate
replacement characters.

Example: the byte sequence 0xF0,0xB8,0x41 is decoded as '\ufffdA' in
V8 5.5, but is decoded as '\ufffd\ufffdA' in previous versions of V8.

PR-URL: https://github.com/nodejs/node/pull/9618
Reviewed-By: Ali Ijaz Sheikh <ofrobots@google.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2017-01-26 22:46:18 +01:00
Brian White
79ef3b6b5d
string_decoder: fix bad utf8 character handling
This commit fixes an issue when extra utf8 continuation bytes appear
at the end of a chunk of data, causing miscalculations to be made
when checking how many bytes are needed to decode a complete
character.

Fixes: https://github.com/nodejs/node/issues/7308
PR-URL: https://github.com/nodejs/node/pull/7310
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Fedor Indutny <fedor.indutny@gmail.com>
2016-06-23 23:18:10 -04:00
James M Snell
6dd093da26 buffer,string_decoder: consolidate encoding validation logic
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: https://github.com/nodejs/node/pull/7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
2016-06-21 09:28:38 -07:00
Trevor Norris
54cc7212df buffer: introduce latin1 encoding term
When node began using the OneByte API (f150d56) it also switched to
officially supporting ISO-8859-1. Though at the time no new encoding
string was introduced.

Introduce the new encoding string 'latin1' to be more explicit. The
previous 'binary' and documented as an alias to 'latin1'.  While many
tests have switched to use 'latin1', there are still plenty that do both
'binary' and 'latin1' checks side-by-side to ensure there is no
regression.

PR-URL: https://github.com/nodejs/node/pull/7111
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
2016-06-07 13:51:14 -06:00
Brian White
d23b7d2656
string_decoder: rewrite implementation
This commit provides a rewrite of StringDecoder that both improves
performance (for non-single-byte encodings) and understandability.

Additionally, StringDecoder instantiation performance has increased
considerably due to inlinability and more efficient encoding name
checking.

PR-URL: https://github.com/nodejs/node/pull/6777
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2016-05-29 14:48:11 -04:00
Jackson Tian
e556dd3856 doc: use Buffer.from() instead of new Buffer()
Use new API of Buffer to developers in most documents.

PR-URL: https://github.com/nodejs/node/pull/6367
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com>
2016-04-27 13:23:41 +08:00
James M Snell
85ab4a5f12 buffer: add .from(), .alloc() and .allocUnsafe()
Several changes:

* Soft-Deprecate Buffer() constructors
* Add `Buffer.from()`, `Buffer.alloc()`, and `Buffer.allocUnsafe()`
* Add `--zero-fill-buffers` command line option
* Add byteOffset and length to `new Buffer(arrayBuffer)` constructor
* buffer.fill('') previously had no effect, now zero-fills
* Update the docs

PR-URL: https://github.com/nodejs/node/pull/4682
Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com>
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
2016-03-16 08:34:02 -07:00
Brian White
ae244a26c1 string_decoder: fix performance regression
This commit reverts the const usage introduced by 68a6abc
because v8 currently cannot optimize functions that contain
these uses of const (unsupported phi use of const variable).
The performance difference in this case can be up to ~130%
for non-ascii/binary string encodings.

PR-URL: https://github.com/nodejs/node/pull/5134
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
2016-02-11 10:41:49 -05:00
Rich Trott
68a6abc806 lib: remove string_decoder.js var redeclarations
PR-URL: https://github.com/nodejs/node/pull/4978
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
2016-02-01 08:43:40 -08:00
Roman Reiss
b5b8ff117c lib: don't use global Buffer
Port of https://github.com/joyent/node/pull/8603

The race condition present in the original PR didn't occur, so no
workaround was needed.

PR-URL: https://github.com/nodejs/io.js/pull/1794
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
2015-06-11 20:24:44 +02:00
Brian White
0fa6c4a6fc string_decoder: don't cache Buffer.isEncoding
Some modules are monkey-patching Buffer.isEncoding, so without this
they cannot do that.

Fixes: https://github.com/iojs/io.js/issues/1547
PR-URL: https://github.com/iojs/io.js/pull/1548
Reviewed-By: Evan Lucas <evanlucas@me.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2015-04-28 11:13:39 -04:00
Brian White
8a945814dd string_decoder: optimize write()
By limiting property getting/setting to only where they are
absolutely necessary, we can achieve greater performance
especially with small utf8 inputs and any size base64 inputs.

PR-URL: https://github.com/iojs/io.js/pull/1209
Reviewed-By: Rod Vagg <rod@vagg.org>
Reviewed-By: Nicu Micleușanu <micnic90@gmail.com>
Reviewed-By: Chris Dickinson <christopher.s.dickinson@gmail.com>
2015-03-25 00:34:34 -04:00
cjihrig
804e7aa9ab lib: use const to define constants
This commit replaces a number of var statements throughout
the lib code with const statements.

PR-URL: https://github.com/iojs/io.js/pull/541
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2015-01-21 16:21:31 -05:00
isaacs
3e1b1dd4a9 Remove excessive copyright/license boilerplate
The copyright and license notice is already in the LICENSE file.  There
is no justifiable reason to also require that it be included in every
file, since the individual files are not individually distributed except
as part of the entire package.
2015-01-12 15:30:28 -08:00
Ben Noordhuis
21130c7d6f lib: turn on strict mode
Turn on strict mode for the files in the lib/ directory.  It helps
catch bugs and can have a positive effect on performance.

PR-URL: https://github.com/node-forward/node/pull/64
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-11-22 17:23:30 +01:00
Fedor Indutny
9d9fc3fa30 lib: jslint string_decoder.js 2014-07-15 12:43:59 +04:00
Felix Geisendörfer
80eff96829 string_decoder: Add more comments 2014-06-06 15:07:29 -07:00
Felix Geisendörfer
9fbd0f0f7d string_decoder: Fix failures from new test cases
This patch simplifies the implementation of StringDecoder, fixes the
failures from the new test cases, and also no longer relies on v8's
WriteUtf8 function to encode individual surrogates.
2014-06-06 15:07:29 -07:00
isaacs
314c6b3060 Don't allow invalid encodings in StringDecoder class 2012-12-13 17:00:22 -08:00
isaacs
061f2075cf string_decoder: Add 'end' method, do base64 properly 2012-10-11 16:46:18 -07:00
koichik
40c4beeb57 string_decoder: added support for UTF-16LE
Fixes #3223.
2012-05-05 22:47:24 +09:00
koichik
ceb51ddaa1 string_decoder: add support for CESU-8
Fixes #3217.
2012-05-05 12:24:01 +09:00
Ryan Dahl
55048cdf79 Update copyright headers 2011-03-14 17:37:05 -07:00
Ryan Dahl
a0159b4b29 Fix global leaks 2010-12-04 15:58:50 -08:00
Ryan Dahl
dd53ceebe4 lint 2010-12-01 20:59:06 -08:00
Ryan Dahl
069d973d74 Remove require('buffer') in built-in libraries. 2010-09-28 02:31:31 -07:00
Blake Mizerany
8c8534046c fix whitespace errors 2010-06-29 23:59:24 -07:00
Ryan Dahl
5e86d01385 Revert "Buffer.copy should copy through sourceEnd, as specified."
This reverts commit a2f70da4c9.

Keep tests modifies a few edge checks on Copy()
2010-06-29 19:40:20 -07:00
Matt Ranney
a2f70da4c9 Buffer.copy should copy through sourceEnd, as specified.
Improve test-buffer.js to cover all copy error cases.

Fix off by one error in string_decoder.
2010-06-25 09:10:49 -07:00
Ryan Dahl
6bed15e074 Refactor: Utf8Decoder -> StringDecoder
Instead of just decoding Utf8, this will proxy requests to buffer.toString()
for other encodings. This makes for a simpler interface.
2010-06-15 18:19:27 -07:00