0
0
mirror of https://github.com/nodejs/node.git synced 2024-12-01 16:10:02 +01:00
Commit Graph

162 Commits

Author SHA1 Message Date
Jonathan Johnson
c90eac7e0e url: change hostname regex to negate invalid chars
Regarding joyent/node#8520

This changes hostname validation from a whitelist regex approach
to a blacklist regex approach as described in https://url.spec.whatwg.org/#host-parsing.

url.parse misinterpreted `https://good.com+.evil.org/`
as `https://good.com/+.evil.org/`.  If we use url.parse to check the
validity of the hostname, the test passes, but in the browser the
user is redirected to the evil.org website.
2014-12-09 17:57:10 +01:00
Jonathan Johnson
6120472036 url: change hostname regex to negate invalid chars
Regarding joyent/node#8520

This changes hostname validation from a whitelist regex approach
to a blacklist regex approach as described in https://url.spec.whatwg.org/#host-parsing.

url.parse misinterpreted `https://good.com+.evil.org/`
as `https://good.com/+.evil.org/`.  If we use url.parse to check the
validity of the hostname, the test passes, but in the browser the
user is redirected to the evil.org website.
2014-12-02 17:24:18 -08:00
Yazhong Liu
ba687f6eb4 url: support path for url.format
this adds support for a "path" field that overrides
"query", "search", and "pathname" if given.

Fixes: https://github.com/joyent/node/issues/8722
PR-URL: https://github.com/joyent/node/pull/8755
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-12-02 11:32:08 -08:00
Yazhong Liu
d312b6d15c url: support path for url.format
this adds support for a "path" field that overrides
"query", "search", and "pathname" if given.

Fixes: https://github.com/joyent/node/issues/8722
PR-URL: https://github.com/joyent/node/pull/8755
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-12-02 10:55:22 -08:00
Ben Noordhuis
21130c7d6f lib: turn on strict mode
Turn on strict mode for the files in the lib/ directory.  It helps
catch bugs and can have a positive effect on performance.

PR-URL: https://github.com/node-forward/node/pull/64
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-11-22 17:23:30 +01:00
Trevor Norris
7b4a540422 src: fix jslint warning
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-10-08 01:13:43 -07:00
Evan Rutledge Borden
640ad632e3 url: fixed encoding for slash switching emulation.
Fixes: https://github.com/joyent/node/issues/8458
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-10-06 19:25:25 -05:00
Gabriel Wicke
b705b73e46 url: make query() consistent
Match the behavior of the slow path by setting url.query to an empty
object when the url contains no query, but query parsing is requested.

Also add a test for this case, and update the documents to clearly
reflect this behavior.

Fixes: https://github.com/joyent/node/issues/8332
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-10-01 12:23:01 -07:00
Timothy J Fontaine
7ca5af87a0 Merge remote-tracking branch 'upstream/v0.10' into v0.12
Conflicts:
	ChangeLog
	deps/v8/src/hydrogen.cc
	lib/http.js
	lib/querystring.js
	src/node_crypto.cc
	src/node_version.h
	test/simple/test-querystring.js
2014-09-16 17:48:09 -07:00
Majid Arif Siddiqui
176f0bd3df lib: improved forEach object performance
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-09-05 09:19:32 -07:00
Mathias Bynens
b869797a2d url: Add support for RFC 3490 separators
There is no need to split the host by hand in `url.js` – Punycode.js
takes care of it anyway. This not only simplifies the code, but also
adds support for RFC 3490 separators (i.e. not just U+002E, but U+3002,
U+FF0E, and U+FF61 as well).

Closes #6055.

Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-08-27 14:36:04 +04:00
Ezequiel Rabinovich
678ead2608 querystring: remove prepended ? from query field
Fixes an issue that caused the first querystring to be parsed prepending
a "?" in the first variable name on relative urls with no #fragment

Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-08-12 20:38:30 -07:00
Gabriel Wicke
4b59db008c Add fast path for simple URL parsing
This patch adds a fast path for parsing of simple path-only URLs, as commonly
found in HTTP requests received by a server.

Benchmark results [ms], before / after patch:
/foo/bar              0.008956   0.000418 (fast path used)
http://example.com/   0.011426   0.011437 (normal slow path, no change)

In a simple 'ab' benchmark of a single-threaded web server, this patch
increases the request rate from around 6400 to 7400 req/s.

Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-07-31 22:56:46 +04:00
isaacs
f7ede33f09 url: treat \ the same as /
See https://code.google.com/p/chromium/issues/detail?id=25916

Parse URLs with backslashes the same as web browsers, by replacing all
backslashes with forward slashes, except those that occur after the
first # character.

Manual rebase of 9520ade

Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-05-07 14:07:00 -07:00
isaacs
9520adeb37 url: treat \ the same as /
See https://code.google.com/p/chromium/issues/detail?id=25916

Parse URLs with backslashes the same as web browsers, by replacing all
backslashes with forward slashes, except those that occur after the
first # character.
2014-04-15 15:30:43 -07:00
Yuki KAN
006d42786e lib: use triple equals
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-04-02 02:12:18 -07:00
isaacs
22c68fdc1d src: Replace macros with util functions 2013-08-01 15:08:01 -07:00
Ben Noordhuis
0330bdf519 lib: macro-ify type checks
Increases the grep factor. Makes it easier to harmonize type checks
across the code base.
2013-07-24 21:49:35 +02:00
isaacs
e71d9fd834 Merge remote-tracking branch 'ry/v0.10'
Conflicts:
	doc/api/stream.markdown
	lib/tls.js
2013-07-17 18:32:23 -07:00
Shuan Wang
48a4600c56 url: Fix edge-case when protocol is non-lowercase
When using url.parse(), path and pathname usually return '/' when there
is no path available. However when you have a protocol that contains
non-lowercase letters and the input string does not have a trailing
slash, both path and pathname will be undefined.
2013-07-17 15:59:28 -07:00
isaacs
0882a75063 Merge remote-tracking branch 'ry/v0.10'
Conflicts:
	ChangeLog
	deps/uv/AUTHORS
	deps/uv/ChangeLog
	deps/uv/src/unix/linux-core.c
	deps/uv/src/version.c
	deps/uv/src/win/timer.c
	lib/url.js
	src/node_version.h
	test/simple/test-url.js
2013-06-05 13:38:38 -07:00
Ben Noordhuis
414a909d01 url: remove unused global variable 2013-06-04 11:43:42 -07:00
isaacs
5dd91b0147 url: Set href to null by default 2013-06-03 16:02:51 -07:00
isaacs
5dc51d4e21 url: Properly parse certain oddly formed urls
In cases where there are multiple @-chars in a url, Node currently
parses the hostname and auth sections differently than web browsers.

This part of the bug is serious, and should be landed in v0.10, and
also ported to v0.8, and releases made as soon as possible.

The less serious issue is that there are many other sorts of malformed
urls which Node either accepts when it should reject, or interprets
differently than web browsers.  For example, `http://a.com*foo` is
interpreted by Node like `http://a.com/*foo` when web browsers treat
this as `http://a.com%3Bfoo/`.

In general, *only* the `hostEndingChars` should be the characters that
delimit the host portion of the URL.  Most of the current `nonHostChars`
that appear in the hostname should be escaped, but some of them (such as
`;` and `%` when it does not introduce a hex pair) should raise an
error.

We need to have a broader discussion about whether it's best to throw in
these cases, and potentially break extant programs, or return an object
that has every field set to `null` so that any attempt to read the
hostname/auth/etc. will appear to be empty.
2013-06-03 15:56:16 -07:00
isaacs
881ef7cc5f url: ~ is not actually an unwise char 2013-04-12 16:27:49 -07:00
isaacs
17a379ec39 url: Escape all unwise characters
This makes node's http URL handling logic identical to Chrome's

Re #5284
2013-04-12 11:39:28 -07:00
J. Lee Coltrane
54d293da56 url: make url.format escape delimiters in path and query
`url.format` should escape ? and # chars in pathname, and # chars in
search, because they change the semantics of the operation otherwise.
Don't escape % chars, or anything else. (see: #4082)
2012-10-30 09:16:13 -07:00
Ben Noordhuis
9b61f570d8 Merge remote-tracking branch 'origin/v0.8'
Conflicts:
	configure
	deps/v8/build/common.gypi
2012-10-25 16:08:58 +02:00
Ben Noordhuis
de0303d3ad url: parse hostnames that start with - or _
Allow hostnames like '-lovemonsterz.tumblr.com' and '_jabber._tcp.google.com'.

Fixes #4177.
2012-10-25 01:06:00 +02:00
isaacs
7144be70db url: Go much faster by using Url class
V8 loves it when JavaScript pretends to be a Classic inheritance
type of language.

Before:

$ ./node benchmark/url.js
benchmarking parse() ... 1.868 sec
benchmarking format() ... 1.906 sec
benchmarking resolve("../foo/bar?baz=boom") ... 7.800 sec
benchmarking resolve("foo/bar") ... 7.099 sec
benchmarking resolve("http://nodejs.org") ... 8.403 sec
benchmarking resolve("./foo/bar?baz") ... 7.974 sec

After:

$ ./node benchmark/url.js
benchmarking parse() ... 1.769 sec
benchmarking format() ... 1.793 sec
benchmarking resolve("../foo/bar?baz=boom") ... 4.254 sec
benchmarking resolve("foo/bar") ... 3.932 sec
benchmarking resolve("http://nodejs.org") ... 4.382 sec
benchmarking resolve("./foo/bar?baz") ... 4.293 sec
2012-09-17 10:44:23 -07:00
isaacs
9fc7283a40 Fix #3270 Escape url.parse delims
Rather than omitting them.
2012-05-16 15:41:28 -07:00
Łukasz Walukiewicz
677c2c112c Ignore an empty port component when parsing URLs. 2012-03-12 12:46:56 -07:00
Ben Noordhuis
86f4846c21 url: decode url entities in auth section
Fixes #2736.
2012-02-20 17:11:21 +01:00
Łukasz Walukiewicz
a94ffdaec5 url: Support for IPv6 addresses in URLs.
Fixes #1138, #2610.
2012-01-31 16:58:41 +09:00
Jordan Sissel
358f0ce96b url: add '.' '+' and '-' in url protocol
- Based on BNF from RFC 1738 section 5.
- Added tests to cover svn+ssh and some other examples
2011-11-04 13:36:06 +01:00
seebees
216570b5e1 Lint 2011-10-22 14:14:40 +09:00
seebees
1ead20f274 remove auth from host
Fixes #1626
2011-10-22 14:14:40 +09:00
seebees
be4576de7a url.resolveObject(url.parse(x), y) == url.parse(url.resolve(x, y));
added a .path property = .pathname + .search for use with http.request

And tests to verify everything.
With the tests, I changed over to deepEqual, and I would note the comment on the test
['.//g', 'f:/a', 'f://g'], which I think is a fundamental problem

This supersedes pull 1596
2011-10-22 14:14:39 +09:00
Maciej Małecki
d0552949b9 url: add plus sign to protocol pattern 2011-09-06 17:03:52 +02:00
Robert Mustacchi
de0b8d601c jslint cleanup: path.js, readline.js, repl.js, tls.js, tty_win32.js, url.js 2011-07-29 11:58:02 -07:00
Ben Noordhuis
1b0e054737 url: throw descriptive error if url argument to parse() is not a string
Fixes #568.
2011-07-21 00:51:48 +02:00
isaacs
dcecfc5f1b Close #1360 url: Allow _ in hostnames. 2011-07-19 09:55:01 -07:00
isaacs
87900b14da url: Don't swallow punycode errors 2011-07-06 13:17:50 -07:00
Jeremy Selier
2a848fa727 Close #1149 IDNA and Punycode support in url.parse
Using @bnoordhuis's punycode lib.

Close #1174 also
2011-07-06 13:17:50 -07:00
Ryan Petrello
58a1d7ec30 Close #562 Close #1078 Parse file:// urls properly
The file:// protocol *always* has a hostname; it's frequently
abbreviated as an empty string, which represents 'localhost'
implicitly.

According to RFC 1738 (http://tools.ietf.org/html/rfc1738):

A file URL takes the form:

   file://<host>/<path>

where <host> is the fully qualified domain name of the system on
which the <path> is accessible...

As a special case, <host> can be the string "localhost" or the empty
string; this is interpreted as 'the machine from which the URL is
being interpreted'.
2011-05-27 10:47:34 -07:00
isaacs
307f39ce9e Fix a url regression
The change for #954 introduced a regression that would cause
the url parser to fail on special chars found in the auth
segment.  Fix that, and also don't create invalid urls when
format() is called on an object containing an auth member
containing '@' characters or delimiters.
2011-05-10 13:57:25 -07:00
isaacs
90802d628d Close #954 URL parsing/formatting corrections
1. Allow single-quotes in urls, but escape them.
2. Add comments about which RFCs we're following for guidance.
3. Handle any invalid character in the hostname portion.
4. lcase protocol and hostname portions, since they are
case-insensitive.
2011-04-20 15:44:34 -07:00
Ryan Dahl
55048cdf79 Update copyright headers 2011-03-14 17:37:05 -07:00
isaacs
d664bf376d Closes GH-711 URL parse more safely
This does 3 things:

1. Delimiters and "unwise" characters are never included in the
   hostname or path.
2. url.format will sanitize string URLs that are passed to it.
3. The parsed url's 'href' member will be the sanitized url, which may
   not match the argument to url.parse.
2011-02-27 17:17:40 -08:00
Ryan Dahl
5a05992155 Lint 2011-01-06 16:06:27 -08:00
Bert Belder
af15b4e45a Remove path module dependency from url module
Now the path module can be adapted to support windows paths without breaking
the url module.  It also allows the undocumented keepBlanks flag to be
removed from path.join and path.normalizeArray.
2011-01-05 11:27:22 -08:00
Jeremy Martin
6f726cf8c7 url.parse(url, true) defaults query field to {} 2010-12-20 13:48:44 -08:00
isaacs
7c57eb2aec lint url.js 2010-12-02 11:46:32 -08:00
isaacs
9996b459e1 Implement new path.join behavior
1. Express desired path.join behavior in tests.
2. Update fs.realpath to reflect new path.join behavior
3. Update url.resolve() to use new path.join behavior.
2010-11-14 22:49:26 -08:00
Joshaven Potter
3d4e4d8909 syntax fixes to pass jslint 2010-10-06 20:40:57 -07:00
isaacs
0e311717b5 Treat "//some_path" as pathname rather than hostname by default.
Note that "//" is still a special indicator for the hostname, and this does
not change the parsing of mailto: and other "slashless" url schemes.  It
does however remove some oddness in url.parse(req.url) which is the most
common use-case for the url.parse function.
2010-09-02 09:24:21 -07:00
Tim Caswell
62d9852c3d Replace slow and broken for..in loops with faster for loops over the keys. 2010-04-12 10:34:35 -07:00
isaacs
57fbb627ca trailing whitespace fixes 2010-04-11 14:48:23 -07:00
Ryan Dahl
b8dee2eb20 camel case variables in url module 2010-02-22 06:49:14 -08:00
cloudhead
3669c75f4d removed inline require call for querystring 2010-01-24 14:25:31 -08:00
Benjamin Thomas
947c577c0d Fix bug in the url module's url_parse method if 'parseQueryString' is true 2010-01-06 02:12:11 -08:00
isaacs
7ff04c1f86 Add URL and QueryString modules, and tests for each.
Also, make a slight change from original on url-module to put the
spacePattern into the function.  On closer inspection, it turns out that the
nonlocal-var cost is higher than the compiling-a-regexp cost.

Also, documentation.
2010-01-04 21:03:54 -08:00