fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
// Submit a mix of short and long jobs to the threadpool.
|
|
|
|
// Report total job throughput.
|
|
|
|
// If we partition the long job, overall job throughput goes up significantly.
|
|
|
|
// However, this comes at the cost of the long job throughput.
|
|
|
|
//
|
|
|
|
// Short jobs: small zip jobs.
|
|
|
|
// Long jobs: fs.readFile on a large file.
|
|
|
|
|
|
|
|
'use strict';
|
|
|
|
|
|
|
|
const path = require('path');
|
|
|
|
const common = require('../common.js');
|
|
|
|
const filename = path.resolve(__dirname,
|
|
|
|
`.removeme-benchmark-garbage-${process.pid}`);
|
|
|
|
const fs = require('fs');
|
|
|
|
const zlib = require('zlib');
|
|
|
|
|
|
|
|
const bench = common.createBenchmark(main, {
|
2022-08-21 16:28:04 +02:00
|
|
|
duration: [5],
|
|
|
|
encoding: ['', 'utf-8'],
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
len: [1024, 16 * 1024 * 1024],
|
2023-02-01 20:16:18 +01:00
|
|
|
concurrent: [1, 10],
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
});
|
|
|
|
|
2022-08-21 16:28:04 +02:00
|
|
|
function main({ len, duration, concurrent, encoding }) {
|
2022-02-03 06:28:06 +01:00
|
|
|
try {
|
|
|
|
fs.unlinkSync(filename);
|
|
|
|
} catch {
|
|
|
|
// Continue regardless of error.
|
|
|
|
}
|
2020-01-24 18:12:49 +01:00
|
|
|
let data = Buffer.alloc(len, 'x');
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
fs.writeFileSync(filename, data);
|
|
|
|
data = null;
|
|
|
|
|
2019-03-26 05:21:27 +01:00
|
|
|
const zipData = Buffer.alloc(1024, 'a');
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
|
2023-10-15 12:55:27 +02:00
|
|
|
let waitConcurrent = 0;
|
|
|
|
|
|
|
|
// Plus one because of zip
|
|
|
|
const targetConcurrency = concurrent + 1;
|
|
|
|
const startedAt = Date.now();
|
|
|
|
const endAt = startedAt + (duration * 1000);
|
|
|
|
|
2020-01-24 18:12:49 +01:00
|
|
|
let reads = 0;
|
|
|
|
let zips = 0;
|
2023-10-15 12:55:27 +02:00
|
|
|
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
bench.start();
|
2023-10-15 12:55:27 +02:00
|
|
|
|
|
|
|
function stop() {
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
const totalOps = reads + zips;
|
|
|
|
bench.end(totalOps);
|
2023-10-15 12:55:27 +02:00
|
|
|
|
2022-02-03 06:28:06 +01:00
|
|
|
try {
|
|
|
|
fs.unlinkSync(filename);
|
|
|
|
} catch {
|
|
|
|
// Continue regardless of error.
|
|
|
|
}
|
2023-10-15 12:55:27 +02:00
|
|
|
}
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
|
|
|
|
function read() {
|
2022-08-21 16:28:04 +02:00
|
|
|
fs.readFile(filename, encoding, afterRead);
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
function afterRead(er, data) {
|
|
|
|
if (er) {
|
|
|
|
throw er;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (data.length !== len)
|
|
|
|
throw new Error('wrong number of bytes returned');
|
|
|
|
|
|
|
|
reads++;
|
2023-10-15 12:55:27 +02:00
|
|
|
const benchEnded = Date.now() >= endAt;
|
|
|
|
|
|
|
|
if (benchEnded && (++waitConcurrent) === targetConcurrency) {
|
|
|
|
stop();
|
|
|
|
} else if (!benchEnded) {
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
read();
|
2023-10-15 12:55:27 +02:00
|
|
|
}
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
function zip() {
|
|
|
|
zlib.deflate(zipData, afterZip);
|
|
|
|
}
|
|
|
|
|
|
|
|
function afterZip(er, data) {
|
|
|
|
if (er)
|
|
|
|
throw er;
|
|
|
|
|
|
|
|
zips++;
|
2023-10-15 12:55:27 +02:00
|
|
|
const benchEnded = Date.now() >= endAt;
|
|
|
|
|
|
|
|
if (benchEnded && (++waitConcurrent) === targetConcurrency) {
|
|
|
|
stop();
|
|
|
|
} else if (!benchEnded) {
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
zip();
|
2023-10-15 12:55:27 +02:00
|
|
|
}
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
// Start reads
|
2023-10-15 12:55:27 +02:00
|
|
|
for (let i = 0; i < concurrent; i++) read();
|
fs: partition readFile against pool exhaustion
Problem:
Node implements fs.readFile as:
- a call to stat, then
- a C++ -> libuv request to read the entire file using the stat size
Why is this bad?
The effect is to place on the libuv threadpool a potentially-large
read request, occupying the libuv thread until it completes.
While readFile certainly requires buffering the entire file contents,
it can partition the read into smaller buffers
(as is done on other read paths)
along the way to avoid threadpool exhaustion.
If the file is relatively large or stored on a slow medium, reading
the entire file in one shot seems particularly harmful,
and presents a possible DoS vector.
Solution:
Partition the read into multiple smaller requests.
Considerations:
1. Correctness
I don't think partitioning the read like this raises
any additional risk of read-write races on the FS.
If the application is concurrently readFile'ing and modifying the file,
it will already see funny behavior. Though libuv uses preadv where
available, this doesn't guarantee read atomicity in the presence of
concurrent writes.
2. Performance
Downside: Partitioning means that a single large readFile will
require into many "out and back" requests to libuv,
introducing overhead.
Upside: In between each "out and back", other work pending on the
threadpool can take a turn.
In short, although partitioning will slow down a large request,
it will lead to better throughput if the threadpool is handling
more than one type of request.
Fixes: https://github.com/nodejs/node/issues/17047
PR-URL: https://github.com/nodejs/node/pull/17054
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
2017-11-15 18:28:04 +01:00
|
|
|
|
|
|
|
// Start a competing zip
|
|
|
|
zip();
|
|
|
|
}
|