# FSM-based Concurrency Testing Framework

## Overview

The FSM tests are meant to exercise concurrency within MongoDB. The suite
consists of workloads, which define discrete units of work as states in an FSM,
and runners, which define which tests to run and how they should be run. Each
workload defines states, which are JS functions that perform some meaningful
series of tasks and assertions, and transitions, which define how to move
between those states. A single workload begins by executing its setup function,
which is called once during the runner's thread of execution. Next, the runner
spawns the number of threads specified by the workload, and each spawned
thread executes the start state (typically named "init") defined by the
workload. From this point on, each worker thread executes its own independent
copy of the FSM, and will randomly move between states (after executing the
current state's function) based on the probabilities defined in the workload's
transition table. Each worker thread continues doing so until the number of
transitions it makes has reached the number of iterations defined by the
workload. Once all the worker threads have finished, the runner executes the
workload's teardown function.

![fsm.png](../images/testing/fsm.png)

The runner provides two modes of execution for workloads: serial and parallel.
Serial mode runs the provided workloads one after the other,
waiting for all threads of a workload to complete before moving on to the next
workload. Parallel mode runs subsets of the provided workloads in separate
threads simultaneously.

New assertion methods were added to allow for finer-grained assertions under
different situations. For example, a test that inserts a document into a
collection and then asserts its existence will fail if another test removes
that document. One option would have been to disable all assertions when
running a mixture of different workloads together, but doing so would make the
system incapable of detecting anything other than server crashes. Another
option would have been to design the workloads to be conflict-free (e.g.
writing to separate collections, using commutative operators), but this would
leave large gaps in the achievable test coverage. Neither of those options was
found to be very appealing. Instead, we chose to introduce the concept of an
"assertion level" that acts as a precondition for when an assertion is
evaluated. This allows us to still make some assertions, even when running a
mixture of different workloads together. There are three assertion levels:
`ALWAYS`, `OWN_COLL`, and `OWN_DB`. They can be thought of as follows:

- `ALWAYS`: A statement that remains unequivocally true, regardless of what
  another workload might be doing to the collection I was given (hint: think
  defensively). Examples include "1 = 1" or inserting a document into a
  collection (disregarding any unique indices).

- `OWN_COLL`: A statement that is true only if I am the only workload operating
  on the collection I was given. Examples include counting the number of
  documents in a collection or updating a previously inserted document.

- `OWN_DB`: A statement that is true only if I am the only workload operating on
  the database I was given. Examples include renaming a collection or verifying
  that a collection is capped. The workload typically relies on the use of
  another collection aside from the one given.

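To make the idea of a level acting as a precondition more concrete, here is a
minimal sketch. It is not the framework's implementation: `AssertLevel`,
`makeLeveledAssert`, and the `always`, `whenOwnColl`, and `whenOwnDB` functions
are hypothetical names chosen for illustration, and the numeric ordering of the
levels is an assumption.

```javascript
// Hypothetical sketch: an assertion level as a precondition for evaluation.
// A higher number means the assertion needs more exclusive ownership.
const AssertLevel = Object.freeze({ALWAYS: 0, OWN_COLL: 1, OWN_DB: 2});

// Returns an assert function that evaluates its condition only when the level
// granted to the workload is at least the level the assertion requires.
function makeLeveledAssert(requiredLevel, grantedLevel) {
    return function leveledAssert(condition, msg) {
        if (grantedLevel < requiredLevel) {
            return false; // precondition not met: the assertion is skipped
        }
        if (!condition) {
            throw new Error("assertion failed: " + msg);
        }
        return true; // the assertion was evaluated and held
    };
}

// Suppose workloads share a database but each has its own collection.
const granted = AssertLevel.OWN_COLL;
const always = makeLeveledAssert(AssertLevel.ALWAYS, granted);
const whenOwnColl = makeLeveledAssert(AssertLevel.OWN_COLL, granted);
const whenOwnDB = makeLeveledAssert(AssertLevel.OWN_DB, granted);

always(1 === 1, "always true");              // evaluated
whenOwnColl(true, "count matches inserts");  // evaluated
whenOwnDB(false, "collection is capped");    // skipped, so it cannot fail
```

In this sketch a skipped assertion never throws, which is how some checks can
stay enabled even when several workloads run together.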
## Creating your own workload

All workloads are stored in `jstests/concurrency/fsm_workloads`. For specific
examples to follow when writing new workloads, refer to:

1. `jstests/concurrency/fsm_example.js`
1. `jstests/concurrency/fsm_example_inheritance.js`

Every workload is loaded in as inline JavaScript
using the "load" function, which is a lot more like a `#include` than
`require.js`. This means that whatever variables are declared in the global
scope of the file will become part of the scope where load is called. The runner
will be looking for a variable called `$config` which will store the
configuration of your workload.

### The $config object

There should be exactly one `$config` per workload. For style consistency as
well as safety, be sure to wrap the value of `$config` in an anonymous function.
This will create a JS closure and a new scope:

```javascript
$config = (function() {
    /* ... */
    return {
        threadCount: "<number of threads>",
        iterations: "<number of steps>",
        startState: "<start state for this workload>",
        states: "<state functions>",
        transitions: "<transition probability map>",
        setup: "<function to initialize workload>",
        teardown: "<function to cleanup workload if necessary>",
        data: "<'this' property available to each state function>",
    };
})();
```

When the function finishes executing, `$config` must hold an object containing
the properties above (some of which are optional, see below).

### Defining states

It's best to also declare states within their own closure so as not to interfere
with the scope of `$config`. Each state takes two arguments, the db object and
the collection name. For later, note that this db and collection are the only
ones that you can be guaranteed to "own" when asserting. Try to make each state a
discrete unit of work that can stand alone without the other states.
Additionally, try to define each function that makes up a state
with a name as opposed to anonymously - this makes backtraces easier to read
when things go wrong.

```javascript
$config = (function () {
    /* ... */
    var states = (function () {
        function getRand() {
            return Random.randInt(10);
        }

        function init(db, collName) {
            this.start = getRand() * this.tid;
        }

        function scanGT(db, collName) {
            db[collName].find({_id: {$gt: this.start}}).itcount();
        }

        function scanLTE(db, collName) {
            db[collName].find({_id: {$lte: this.start}}).itcount();
        }

        return {
            init: init,
            scanGT: scanGT,
            scanLTE: scanLTE,
        };
    })();

    /* ... */

    return {
        /* ... */
        states: states,
        /* ... */
    };
})();
```

### Defining transitions

The transitions object defines the probabilities of moving from one state to
another (possibly the same) state. When a state's function is finished
executing, the FSM randomly chooses the next state using the probabilities
provided in the transitions object. The probabilities for a given state do not
necessarily need to sum to 1.0, since the mechanism for choosing the next state
uses normalized random values. Here it is not necessary to use a separate
closure. In the example below, we're denoting an equal probability of moving to
either of the scan states from the init state:

```javascript
$config = (function () {
    /* ... */
    var transitions = {
        init: {scanGT: 0.5, scanLTE: 0.5},
        scanGT: {scanGT: 0.8, scanLTE: 0.2},
        scanLTE: {scanGT: 0.2, scanLTE: 0.8},
    };
    /* ... */
    return {
        /* ... */
        transitions: transitions,
        /* ... */
    };
})();
```

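The following sketch shows one way a next state could be chosen from weights
that do not sum to 1.0. It is an illustration under assumptions, not the
framework's code: `pickNextState` is a hypothetical helper, and the random
number is passed in as a parameter so the choice can be reproduced.

```javascript
// Hypothetical helper: pick the next state from one row of a transition table.
// `rand` is a number in [0, 1), for example the result of Math.random().
function pickNextState(row, rand) {
    const names = Object.keys(row);
    const total = names.reduce((sum, name) => sum + row[name], 0);
    // Scaling the random value by the total is what normalizes the weights.
    let remaining = rand * total;
    for (const name of names) {
        remaining -= row[name];
        if (remaining < 0) {
            return name;
        }
    }
    return names[names.length - 1]; // guard against floating-point rounding
}

const exampleTransitions = {
    init: {scanGT: 0.5, scanLTE: 0.5},
    scanGT: {scanGT: 8, scanLTE: 2}, // same 80/20 split, weights sum to 10
};

console.log(pickNextState(exampleTransitions.init, 0.25));   // scanGT
console.log(pickNextState(exampleTransitions.scanGT, 0.81)); // scanLTE
```

Because only the ratio between weights matters in this scheme, `{scanGT: 8,
scanLTE: 2}` behaves the same as `{scanGT: 0.8, scanLTE: 0.2}`.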
### Setup and teardown functions

The setup and teardown functions are special in that they'll only be executed in
one thread. See the Test runners section for more information about when they're
run relative to other workloads in various modes. The setup and teardown
functions take three arguments: `db`, `collName`, and `cluster`. The setup
function (and corresponding teardown) should perform most of the initialization
your workload needs, for example setting parameters on the server, adding seed
data, or setting up indexes. Note that rather than executing adminCommands (and
others) against the provided `db` you should use the provided
`cluster.executeOnMongodNodes` and `cluster.executeOnMongosNodes` functionality.

```javascript
$config = (function () {
    /* ... */
    function setup(db, collName, cluster) {
        // Workloads should NOT drop the collection db[collName], as doing so
        // is handled by jstests/concurrency/fsm_libs/runner.js before 'setup' is called.
        for (var i = 0; i < 1000; ++i) {
            db[collName].insert({_id: i});
        }
        cluster.executeOnMongodNodes(function (db) {
            db.adminCommand({
                setParameter: 1,
                internalQueryExecYieldIterations: 5,
            });
        });
        cluster.executeOnMongosNodes(function (db) {
            printjson(db.serverCmdLineOpts());
        });
    }

    function teardown(db, collName, cluster) {
        cluster.executeOnMongodNodes(function (db) {
            db.adminCommand({
                setParameter: 1,
                internalQueryExecYieldIterations: 128,
            });
        });
    }
    /* ... */
    return {
        /* ... */
        setup: setup,
        teardown: teardown,
        /* ... */
    };
})();
```

### The `data` object

The `data` object preserves information between different states of an FSM within
an individual thread. Within a single state, the data object becomes the 'this'
context in which the state executes. Additionally, a `tid` attribute is added to
data by the runner to allow each thread to access a unique ID. Data is usually
defined above states inside the config, but left below it in the returned
object. Data is also available as the 'this' context in setup and teardown
functions. Note that once the FSM begins, the context data that was passed to
the setup function is copied into each thread. Each thread therefore has its own
copy of the data, and modifications made by the threads are not passed back to
the teardown function, which only sees what was changed in setup. Additionally,
in composition, each workload has its own data, meaning you don't have to worry
about properties being overridden by workloads other than the current one.

```javascript
$config = (function () {
    var data = {
        start: 0,
    };
    /* ... */
    return {
        /* ... */
        data: data,
        /* ... */
    };
})();
```

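The copy semantics described above can be sketched as follows. This is a
simplified illustration, not the runner's code: the shallow copy via
`Object.assign`, the `seededDocs` property, and the simplified argument-free
`setup`, `init`, and `teardown` functions are all assumptions made for the
example.

```javascript
// Hypothetical sketch of how data reaches setup, the threads, and teardown.
const data = {start: 0, seededDocs: 0};

function setup() {
    this.seededDocs = 1000; // changes made in setup are visible everywhere
}

function init() {
    this.start = 10 * this.tid; // changes made in a state stay in that thread
}

function teardown() {
    return {start: this.start, seededDocs: this.seededDocs};
}

// Setup runs once with the workload's data as its 'this' context.
setup.call(data);

// Each thread gets its own copy of the post-setup data plus a unique tid.
const threadData = [0, 1, 2].map((tid) => Object.assign({}, data, {tid: tid}));
threadData.forEach((copy) => init.call(copy));

// Teardown sees the setup changes, but not what the threads did to their copies.
const seenByTeardown = teardown.call(data);
console.log(threadData[2].start, seenByTeardown.start, seenByTeardown.seededDocs);
// prints: 20 0 1000
```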
### Other properties of $config

#### `threadCount`

`threadCount` is the number of threads that will be used to run your workload in
Serial and Parallel modes. In both modes, the number of threads you provide will
execute the FSM simultaneously, cycling through different states of the
workload. Note that in serial mode, no other threads will be running outside of
those pertaining to this workload, and in parallel mode, other workloads will
also be given threads to execute their FSM. In some cases in parallel mode, this
number will be scaled down to make sure that all workloads can fit within the
number of threads available due to system or performance constraints.

#### `iterations`

This is just the number of states the FSM will go through before exiting. NOTE:
it is _not_ the number of times each state will be executed.

#### `startState` (optional)

Default value is 'init'. If your workload does not have an init state then you
must specify in which state to begin.

### Workload helpers

`jstests/concurrency/fsm_workload_helpers` contains a few files that you can
include using 'load' at the top of a workload. These provide auxiliary
functionality that might be necessary for some workloads. The most important of
these is probably server_types.js.

#### server_types.js

This helper file contains four functions: isMongos, isMongod, isMMAPv1, and
isWiredTiger. These can be used to restrict operations to the functionality
available in sharded environments or with a particular storage engine, and
they work as you would expect. One thing to note is that before calling
either isMMAPv1 or isWiredTiger, you should first verify isMongod. When special
casing functionality for sharded environments or storage engines, try to special
case only the exceptional behavior while still leaving in place assertions for
either case.

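As a rough illustration of the ordering advice above, the sketch below checks
isMongod before isWiredTiger and keeps an assertion path for both cases. The
two helpers are replaced by local stubs so the example runs on its own; the
assumption that the real helpers take the `db` object as their argument, the
shape of the fake `db` objects, and the `chooseAssertions` function are all
hypothetical.

```javascript
// Stubs standing in for the helpers from server_types.js. The real helpers
// inspect the server; these only read fields from a plain object.
function isMongod(db) {
    return db.kind === "mongod";
}

function isWiredTiger(db) {
    return db.storageEngine === "wiredTiger";
}

// Hypothetical state helper: decide which set of assertions to run.
function chooseAssertions(db) {
    // Check isMongod first; a storage engine check is only meaningful there.
    if (isMongod(db) && isWiredTiger(db)) {
        return "storage-engine-specific assertions";
    }
    // Keep assertions in place for every other case as well.
    return "general assertions";
}

console.log(chooseAssertions({kind: "mongod", storageEngine: "wiredTiger"}));
console.log(chooseAssertions({kind: "mongos"}));
```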
#### indexed_noindex.js

This helper can be used along with inheritance to create a workload that is
exactly the same as an existing workload, but with the index created during
setup removed. To use it, replace the function you provide to the
extendWorkload function with indexedNoindex. Additionally, ensure that the
workload you are extending has a function in its data object called
"getIndexSpec" that returns the spec for the index to be removed.

```javascript
import {extendWorkload} from "jstests/concurrency/fsm_libs/extend_workload.js";
load("jstests/concurrency/fsm_workload_modifiers/indexed_noindex.js"); // for indexedNoindex
import {$config as $baseConfig} from "jstests/concurrency/fsm_workloads/workload_with_index.js";

export const $config = extendWorkload($baseConfig, indexedNoindex);
```

#### drop_utils.js

These helpers provide safe methods for dropping collections, databases, roles,
and users created during a workload's execution. The methods take a regular
expression that the collection, database, role, or user name must match for it
to be dropped. It is a good idea to give everything you create in any of these
categories a prefix derived from your workload name: the workload file name
can be assumed unique, so the prefix ensures that you only affect your own
workload in these cases.

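The prefix convention can be sketched like this. The example only demonstrates
the name matching; `selectNamesToDrop`, the `my_workload_` prefix, and the list
of names are hypothetical and are not part of drop_utils.js.

```javascript
// Hypothetical prefix, for example derived from the workload's file name.
const prefix = "my_workload_";

// Only names made of the prefix followed by digits are eligible for dropping.
const dropPattern = new RegExp("^" + prefix + "\\d+$");

// Hypothetical helper: return the names that the pattern allows to be dropped.
function selectNamesToDrop(allNames, pattern) {
    if (!(pattern instanceof RegExp)) {
        throw new Error("expected a regular expression");
    }
    return allNames.filter((name) => pattern.test(name));
}

const existingNames = [
    "my_workload_0",
    "my_workload_1",
    "other_workload_0",  // belongs to another workload, must be left alone
    "my_workload_extra", // does not match the digits-only suffix
];

const toDrop = selectNamesToDrop(existingNames, dropPattern);
console.log(toDrop); // [ 'my_workload_0', 'my_workload_1' ]
```

Anchoring the pattern with `^` and `$` keeps it from matching names that merely
contain the prefix somewhere in the middle.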
## Test runners

By default, all runners below are allowed to open a maximum of
`maxAllowedConnections` (= 100 by default) explicit connections. In replicated
and sharded environments, implicit connections are created to the original
mongod provided to the mongo shell executing the runner (one for each thread).
This behavior cannot be controlled, but it highlights the importance of always
using the db object provided in the FSM states rather than the global db, which
will always correspond to the mongod the mongo shell initially connected to.

### Execution modes

#### Serial

Serial is the simpler of the two modes and basically works as explained
above. Setup is run single threaded, data is copied into multiple threads where
the states are executed, and once all the threads have finished, the teardown
function is run and the runner moves on to the next workload.

![fsm_serial_example.png](../images/testing/fsm_serial_example.png)

#### Parallel (Simultaneous)

In parallel or simultaneous mode (the naming convention has been slightly
inconsistent), the ordering becomes a little different. All workloads have their
setup functions run, then threads are spawned for each workload, and once they
all complete, all workloads have their teardown functions run.

![fsm_simultaneous_example.png](../images/testing/fsm_simultaneous_example.png)

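The difference in ordering between the two modes can be sketched as below. This
is a simplification that only models the order of the phases: `runSerial` and
`runParallel` are hypothetical names, and the concurrent execution of threads
is reduced to a single log entry per workload.

```javascript
// Hypothetical sketch: each workload's phases are recorded in a log so the
// ordering can be compared between modes.
function runSerial(workloads) {
    const log = [];
    for (const w of workloads) {
        // One workload runs from setup to teardown before the next starts.
        log.push("setup " + w, "threads " + w, "teardown " + w);
    }
    return log;
}

function runParallel(workloads) {
    const log = [];
    workloads.forEach((w) => log.push("setup " + w));
    // Threads for every workload run before any teardown happens.
    workloads.forEach((w) => log.push("threads " + w));
    workloads.forEach((w) => log.push("teardown " + w));
    return log;
}

console.log(runSerial(["a", "b"]).join(", "));
// setup a, threads a, teardown a, setup b, threads b, teardown b
console.log(runParallel(["a", "b"]).join(", "));
// setup a, setup b, threads a, threads b, teardown a, teardown b
```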
### Existing runners

The existing runners all use `jstests/concurrency/fsm_libs/runner.js` to
actually execute the workloads. Most information about arguments and available
runWorkloads methods can be found by inspecting the source. Below you can find
the existing runners explained. The first argument to the runWorkloads
methods (each corresponding to a different run mode) is an array of workload
files to run. clusterOptions, the second argument to the runWorkloads functions,
is explained in the "Other components of the FSM library" section below.
Execution options for runWorkloads functions, the third argument, can contain
the following options (some depend on the run mode):

- `numSubsets` - Not available in serial mode, determines how many subsets of
  workloads to execute in parallel mode
- `subsetSize` - Not available in serial mode, determines how large each subset of
  workloads executed is

#### fsm_all.js

Runs all workloads serially. For each workload, `$config.threadCount` threads
are spawned and each thread runs for exactly `$config.iterations` steps starting
at `$config.startState` and transitioning to other states based on the
transition probabilities defined in `$config.transitions`.

#### fsm_all_simultaneous.js

options: `numSubsets`, `subsetSize`

Runs `numSubsets` subsets, each of size `subsetSize`, drawn from all workloads.
The workloads in each subset are started in parallel and each workload is run
according to the settings in its `$config`.

#### fsm_all_replication.js

Sets up a replica set (with 3 mongods by default) and runs workloads serially or
in parallel. For example,

`runWorkloadsSerially([<workload1>, <workload2>, ...], { replication: true } )`

creates a replica set with 3 members and runs some workloads serially on the
primary.

#### fsm_all_sharded.js

Sets up a sharded cluster (with 2 shards and 1 mongos by default) and runs
workloads serially or in parallel. For example,

`runWorkloadsInParallel([<workload1>, <workload2>, ...], { sharded: true } )`

creates a sharded cluster and runs workloads in parallel.

#### fsm_all_sharded_replication.js

Sets up a sharded cluster (with 2 shards, each having 3 replica set members, and
1 mongos by default) and runs workloads serially or in parallel.

### Excluding a workload

If any workloads fail because of known bugs in MongoDB, persistent MCI failures,
or timeouts, the troublesome workload can be excluded from running by placing it
in the exclusion array in the corresponding runner. Please remember to place a
comment next to the excluded workload name identifying the reason the workload
is being excluded. For example,

`'agg_sort_external.js', // SERVER-16700 Deadlock on WiredTiger LSM`

Each file should also have two predefined sections - one for known bugs and one
for restrictions. The one above would be considered a known bug. However,
excluding a compact workload from sharded runners would be a restriction because
compact can only be run against individual mongods.

## Other components of the FSM library

Most of these components live in `jstests/concurrency/fsm_libs` and provide the
functionality used by the runner.

### ThreadManager

Responsible for spawning and joining worker threads. Each spawned thread is
wrapped in a try/finally block to ensure that the database connection implicitly
created during the thread's execution is eventually closed explicitly. The
ThreadManager sets a random seed (`[0, randInt(1e13))`, which is the range of
`new Date().getTime()`) before executing each workload.

### Worker Thread

Thread spawned by ThreadManager and used to run a Finite State Machine.

### Cluster

cluster.js is responsible for providing the cluster object that is passed to
setup and teardown functions, and the initial connection to a db that the
runner passes to the workloads. For anything except standalone, it makes
use of the shell's built-in cluster test helpers like `ShardingTest` and
`ReplSetTest`. clusterOptions are passed to cluster.js for initialization.
clusterOptions include:

- `replication`: boolean, whether or not to use replication in the cluster
- `sameCollection`: boolean, whether or not all workloads are passed the same
  collection
- `sameDB`: boolean, whether or not all workloads are passed the same DB
- `setupFunctions`: object, containing at most two functions under the keys
  'mongod' and 'mongos'. This allows you to run a function against all mongod or
  mongos nodes in the cluster as part of the cluster initialization. Each
  function takes a single argument, the db object against which configuration
  can be run (will be set for each mongod/mongos)
- `sharded`: boolean, whether or not to use sharding in the cluster

Note that `sameCollection` and `sameDB` can increase contention for a resource,
but will also decrease the strength of the assertions by ruling out the use of
`OWN_DB` and `OWN_COLL` assertions.

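For illustration, a clusterOptions object built from the keys above might look
like the following sketch. The values are arbitrary examples. The
`strongestAssertLevel` helper is hypothetical, and the mapping it encodes
(`sameCollection` leaves only `ALWAYS`, `sameDB` leaves `OWN_COLL`) is an
inference from the note above rather than something this document states.

```javascript
// Illustrative values only; the keys are the clusterOptions listed above.
const exampleClusterOptions = {
    replication: true,
    sharded: false,
    sameCollection: false,
    sameDB: true,
    setupFunctions: {
        // Run against every mongod as part of cluster initialization.
        mongod: function (db) {
            db.adminCommand({setParameter: 1, internalQueryExecYieldIterations: 5});
        },
    },
};

// Hypothetical helper: the strongest assertion level that is still meaningful
// for the given options (an inferred mapping, see the lead-in).
function strongestAssertLevel(options) {
    if (options.sameCollection) {
        return "ALWAYS";
    }
    if (options.sameDB) {
        return "OWN_COLL";
    }
    return "OWN_DB";
}

console.log(strongestAssertLevel(exampleClusterOptions)); // OWN_COLL
```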
### Miscellaneous Execution Notes

- A `CountDownLatch` (exposed through the v8-based mongo shell, as of MongoDB 3.0)
  is used as a synchronization primitive by the ThreadManager to wait until all
  spawned threads have finished being spawned before starting workload
  execution.
- If more than 20% of the threads fail while spawning, we abort the test. If
  fewer than 20% of the threads fail while spawning we allow the non-failed
  threads to continue with the test. The 20% threshold is somewhat arbitrary;
  the goal is to abort if "mostly all" of the threads failed but to tolerate "a
  few" threads failing.
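The spawn-failure rule can be written as a one-line check. The sketch below is
hypothetical (`shouldAbort` is not a function from the library), and because
the text does not say what happens at exactly 20%, it assumes the test
continues in that case.

```javascript
// Hypothetical helper mirroring the 20% rule described above.
// Integer arithmetic (failed * 5 > total) avoids floating-point comparisons.
function shouldAbort(failedThreads, totalThreads) {
    return failedThreads * 5 > totalThreads;
}

console.log(shouldAbort(3, 10)); // true: 30% of the threads failed to spawn
console.log(shouldAbort(1, 10)); // false: 10% failed, the rest continue
console.log(shouldAbort(2, 10)); // false: exactly 20%, assumed to continue
```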