GitOrigin-RevId: 35db3811d8f749edd5b79ba910adcbc1ceb54cc4
22 KiB
MongoDB Build System Reference
MongoDB Build System Requirements
Recommended minimum requirements
Python modules
External libraries
Enterprise module requirements
Testing requirements
MongoDB customizations
SCons modules
Development tools
Compilation database generator
Build tools
IDL Compiler
Auxiliary tools
Ninja generator
Icecream tool
ccache tool
LIBDEPS
Libdeps is a subsystem within the build, which is centered around the LIBrary DEPendency graph. It tracks and maintains the dependency graph as well as lints, analyzes and provides useful metrics about the graph.
Design
The libdeps subsystem is divided into several stages, described in order of use as follows.
SConscript LIBDEPS
definitions and built time linting
During the build, the SConscripts are read and all the library relationships are setup via the LIBDEPS
variables. Some issues can be identified early during processing of the SConscripts via the build-time linter. Most of these will be style and usage issues which can be realized without needing the full graph. This component lives within the build and is executed through the SCons emitters added via the libdeps subsystem.
Libdeps graph generation for post-build analysis
For more advanced analysis and linting, a full graph is necessary. The build target generate-libdeps-graph
builds all libdeps and things which use libdeps, and generates the graph to a file in graphml format.
The libdeps analyzer python module
The libdeps analyzer module is a python library which provides and Application Programming Interface (API) to analyze and lint the graph. The library internally leverages the networkx python module for the generic graph interfaces.
The CLI and Visualizer tools
The libdeps analyzer module is used in the libdeps Graph Analysis Command Line Interface (gacli) tool and the libdeps Graph Visualizer web service. Both tools read in the graph file generated from the build and provide the Human Machine Interface (HMI) for analysis and linting.
The LIBDEPS
variables
The variables include several types of lists to be added to libraries per a SCons builder instance:
Variable | Use |
---|---|
LIBDEPS |
transitive dependencies |
LIBDEPS_PRIVATE |
local dependencies |
LIBDEPS_INTERFACE |
transitive dependencies excluding self |
LIBDEPS_DEPENDENTS |
reverse dependencies |
PROGDEPS_DEPENDENTS |
reverse dependencies for Programs |
LIBDEPS
is the 'public' type, such that libraries that are added to this list become a dependency of the current library, and also become dependencies of libraries which may depend on the current library. This propagation also includes not just the libraries in the LIBDEPS
list, but all LIBDEPS
of those LIBDEPS
recursively, meaning that all dependencies of the LIBDEPS
libraries, also become dependencies of the current library and libraries which depend on it.
LIBDEPS_PRIVATE
should be a list of libraries which creates dependencies only between the current library and the libraries in the list. However, in static linking builds, this will behave the same as LIBDEPS
due to the nature of static linking.
LIBDEPS_INTERFACE
is very similar to LIBDEPS
, however it does not create propagating dependencies for the libraries themselves in the LIBDEPS_INTERFACE
list. Only the dependencies of those LIBDEPS_INTERFACE
libraries are propagated forward.
LIBDEPS_DEPENDENTS
are added to libraries which will force themselves as dependencies of the libraries in the supplied list. This is conceptually a reverse dependency, where the library which is a dependency is the one declaring itself as the dependency of some other library. By default this creates a LIBDEPS_PRIVATE
like relationship, but a tuple can be used to force it to a LIBDEPS
like or other relationship.
PROGDEPS_DEPENDENTS
are the same as LIBDEPS_DEPENDENTS
, but intended for use only with Program builders.
LIBDEPS_TAGS
The LIBDEPS_TAGS
variable is used to mark certain libdeps for various reasons. Some LIBDEPS_TAGS
are used to mark certain libraries for LIBDEPS_TAG_EXPANSIONS
variable which is used to create a function which can expand to a string on the command line. Below is a table of available LIBDEPS
tags:
Tag | Description | ||
---|---|---|---|
illegal_cyclic_or_unresolved_dependencies_allowlisted |
SCons subst expansion tag to handle dependency cycles | ||
init-no-global-side-effects |
SCons subst expansion tag for causing linkers to avoid pulling in all symbols | ||
lint-public-dep-allowed |
Linting exemption tag exempting the lint-no-public-deps tag |
||
lint-no-public-deps |
Linting inclusion tag ensuring a libdep has no LIBDEPS declared |
||
lint-allow-non-alphabetic |
Linting exemption tag allowing LIBDEPS variable lists to be non-alphabetic |
||
lint-leaf-node-allowed-dep |
Linting exemption tag exempting the lint-leaf-node-no-deps tag |
||
lint-leaf-node-no-deps |
Linting inclusion tag ensuring a libdep has no libdeps and is a leaf node | ||
lint-allow-nonlist-libdeps |
Linting exemption tag allowing a LIBDEPS variable to not be a list |
lint-allow-bidirectional-edges |
Linting exemption tag allowing reverse dependencies to also be a forward dependencies |
lint-allow-nonprivate-on-deps-dependents |
Linting exemption tag allowing reverse dependencies to be transitive | ||
lint-allow-dup-libdeps |
Linting exemption tag allowing LIBDEPS variables to contain duplicate libdeps on a given library |
||
lint-allow-program-links-private |
Linting exemption tag allowing Program s to have PRIVATE_LIBDEPS |
The illegal_cyclic_or_unresolved_dependencies_allowlisted
tag
This tag should not be used anymore because the library dependency graph has been successfully converted to a Directed Acyclic Graph (DAG). Prior to this accomplishment, it was necessary to handle cycles specifically with platform specific options on the command line.
The init-no-global-side-effects
tag
Adding this flag to a library turns on platform specific compiler flags which will cause the linker to pull in just the symbols it needs. Note that by default, the build is configured to pull in all symbols from libraries because of the use of static initializers, however if a library is known to not have any of these initializers, then this flag can be added for some performance improvement.
Linting and linter tags
The libdeps linter features automatically detect certain classes of LIBDEPS usage errors. The libdeps linters are implemented as build-time linting and post-build linting procedures to maintain order in usage of the libdeps tool and the build’s library dependency graph. You will need to comply with the rules enforced by the libdeps linter, and fix issues that it raises when modifying the build scripts. There are exemption tags to prevent the linter from blocking things, however these exemption tags should only be used in extraordinary cases, and with good reason. A goal of the libdeps linter is to drive and maintain the number of exemption tags in use to zero.
Exemption Tags
There are a number of existing issues that need to be addressed, but they will be addressed in future tickets. In the meantime, the use of specific strings in the LIBDEPS_TAGS variable can allow the libdeps linter to skip certain issues on given libraries. For example, to have the linter skip enforcement of the lint rule against bidirectional edges for "some_library":
env.Library(
target=’some_library’
...
LIBDEPS_TAGS=[‘lint-allow-bidirectional-edges’]
)
build-time Libdeps Linter
If there is a build-time issue, the build will fail until it is addressed. This linting feature will be on by default and takes about half a second to complete in a full enterprise build (at the time of writing this), but can be turned off by using the --libdeps-linting=off option on your SCons invocation.
The current rules and there exemptions are listed below:
-
A 'Program' can not link a non-public dependency, it can only have LIBDEPS links.
Example
env.Program( target=’some_program’, ... LIBDEPS=[‘lib1’], # OK LIBDEPS_PRIVATE=[‘lib2’], # This is a Program, BAD )
Rationale
A Program can not be linked into anything else, and there for the transitiveness does not apply. A default value of LIBDEPS was selected for consistency since most Program's were already doing this at the time the rule was created.
Exemption
'lint-allow-program-links-private' on the target node
-
A 'Node' can only directly link a given library once.
Example
env.Library( target=’some_library’, ... LIBDEPS=[‘lib1’], # Linked once, OK LIBDEPS_PRIVATE=[‘lib1’], # Also linked in LIBDEPS, BAD LIBDEPS_INTERFACE=[‘lib2’, 'lib2'], # Linked twice, BAD )
Rationale
Libdeps will ignore duplicate links, so this rule is mostly for consistency and neatness in the build scripts.
Exemption
'lint-allow-dup-libdeps' on the target node
-
A 'Node' which uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS can only have LIBDEPS_PRIVATE links.
Example
env.Library( target=’some_library’, ... LIBDEPS_DEPENDENTS=['lib3'], LIBDEPS=[‘lib1’], # LIBDEPS_DEPENDENTS is in use, BAD LIBDEPS_PRIVATE=[‘lib2’], # OK )
Rationale
The node that the library is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT to inject its dependency onto should be conditional, therefore there should not be transitiveness for that dependency since it cannot be the source of any resolved symbols.
Exemption
'lint-allow-nonprivate-on-deps-dependents' on the target node
-
A 'Node' can not link directly to a library that uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS.
Example
env.Library( target='other_library', ... LIBDEPS=['lib1'], # BAD, 'lib1' has LIBDEPS_DEPENDENTS env.Library( target=’lib1’, ... LIBDEPS_DEPENDENTS=['lib3'], )
Rationale
A library that is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT should only be used for reverse dependency edges. If a node does need to link directly to a library that does have reverse dependency edges, that indicates the library should be split into two separate libraries, containing its direct dependency content and its conditional reverse dependency content.
Exemption
'lint-allow-bidirectional-edges' on the target node
-
All libdeps environment vars must be assigned as lists.
Example
env.Library( target='some_library', ... LIBDEPS='lib1', # not a list, BAD LIBDEPS_PRIVATE=['lib2'], # OK )
Rationale
Libdeps will handle non-list environment variables, so this is more for consistency and neatness in the build scripts.
Exemption
'lint-allow-nonlist-libdeps' on the target node
-
Libdeps with the tag 'lint-leaf-node-no-deps' shall not link any libdeps.
Example
env.Library( target='lib2', ... LIBDEPS_TAGS=[ 'lint-leaf-node-allowed-dep' ] ) env.Library( target='some_library', ... LIBDEPS=['lib1'], # BAD, should have no LIBDEPS LIBDEPS_PRIVATE=['lib2'], # OK, has exemption tag LIBDEPS_TAGS=[ 'lint-leaf-node-no-deps' ] )
Rationale
The special tag allows certain nodes to be marked and programmatically checked that they remain lead nodes. An example use-case is when we want to make sure certain nodes never link mongodb code.
Exemption
'lint-leaf-node-allowed-dep' on the exempted libdep
Inclusion
'lint-leaf-node-no-deps' on the target node
-
Libdeps with the tag 'lint-no-public-deps' shall not link any libdeps.
Example
env.Library( target='lib2', ... LIBDEPS_TAGS=[ 'lint-public-dep-allowed' ] ) env.Library( target='some_library', ... LIBDEPS=[ 'lib1' # BAD 'lib2' # OK, has exemption tag ], LIBDEPS_TAGS=[ 'lint-no-public-deps' ] )
Rationale
The special tag allows certain nodes to be marked and programmatically checked that they do not link publicly. Some nodes such as mongod_main have special requirements that this programmatically checks.
Exemption
'lint-public-dep-allowed' on the exempted libdep
Inclusion
'lint-no-public-deps' on the target node
-
Libdeps shall be sorted alphabetically in LIBDEPS lists in the SCons files.
Example
env.Library( target='lib2', ... LIBDEPS=[ '$BUILD/mongo/db/d', # OK, $ comes before c 'c', # OK, c comes before s 'src/a', # BAD, s should be after b 'b', # BAD, b should be before c ] )
Rationale
Keeping the SCons files neat and ordered allows for easier Code Review diffs and generally better maintainability.
Exemption
'lint-allow-non-alphabetic' on the exempted libdep
The build-time print Option
The libdeps linter also has the --libdeps-linting=print
option which will perform linting, and instead of failing the build on an issue, just print and continue on. It will also ignore exemption tags, and still print the issue because it will not fail the build. This is a good way to see the entirety of existing issues that are exempted by tags, as well as printing other metrics such as time spent linting.
post-build linting and analysis
The dependency graph can be analyzed post-build by leveraging the completeness of the graph to perform more extensive analysis. You will need to install the libdeps requirements file to python when attempting to use the post-build analysis tools:
python3 -m poetry install --no-root --sync -E libdeps
The command line interface tool (gacli) has a comprehensive help text which will describe the available analysis options and interface. The visualizer tool includes a GUI which displays the available analysis options graphically. These tools will be briefly covered in the following sections.
Generating the graph file
To generate the full graph, build the target generate-libdeps-graph
. This will build all things involving libdeps and construct a graphml file representing the library dependency graph. The graph can be used in the command line interface tool or the visualizer web service tool. The minimal set of required SCons arguments to build the graph file is shown below:
python3 buildscripts/scons.py --link-model=dynamic --build-tools=next generate-libdeps-graph --linker=gold --modules=
The graph file by default will be generate to build/opt/libdeps/libdeps.graphml
(where build/opt
is the $BUILD_DIR
).
General libdeps analyzer API usage
Below is a basic example of usage of the libdeps analyzer API:
import libdeps
libdeps_graph = libdeps.graph.load_libdeps_graph('path/to/libdeps.graphml')
list_of_analysis_to_run = [
libdeps.analyzer.NodeCounter(libdeps_graph),
libdeps.analyzer.DirectDependencies(libdeps_graph, node='path/to/library'),
]
analysis_results = libdeps.graph.LibdepsGraphAnalysis(list_of_analysis_to_run)
libdeps.analyzer.GaPrettyPrinter(analysis_after_run).print()
Walking through this example, first the graph is loaded from file. Then a list of desired Analyzer instances is created. Some example analyzer classes are instantiated in the example above, but there are many others to choose from. Specific Analyzers have different interfaces and should be supplied an argument list corresponding to that analyzer.
Note: The graph file will contain the build dir that the graph data was created with and it expects all node arguments to be relative to the build dir. If you are using the libdeps module generically in some app, you can extract the build dir from the libdeps graph and append it to any generic library path.
Once the list of analyzers is created, they can be used to create a LibdepsGraphAnalysis instance, which will upon instantiation, run the analysis list provided. Once the instance is created, it contains the results, and optionally can be fed into different printer classes. In this case, a human readable format printer called GaPrettyPrinter is used to print to the console.
Using the gacli tool
The command line interface tool can be used from the command line to run analysis on a given graph. The only required argument is the graph file. The default with no args will run all the counters and linters on the graph. Here is an example output:
(venv) Apr.20 02:46 ubuntu[mongo]: python buildscripts/libdeps/gacli.py --graph-file build/cached/libdeps/libdeps.graphml
Loading graph data...Loaded!
Graph built from git hash:
1358cdc6ff0e53e4f4c01ea0e6fcf544fa7e1672
Graph Schema version:
2
Build invocation:
"/home/ubuntu/venv/bin/python" "buildscripts/scons.py" "--variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars" "--cache=all" "--cache-dir=/home/ubuntu/scons-cache" "--link-model=dynamic" "--build-tools=next" "ICECC=icecc" "CCACHE=ccache" "-j200" "--cache-signature-mode=validate" "--cache-debug=-" "generate-libdeps-graph"
Nodes in Graph: 867
Edges in Graph: 90706
Direct Edges in Graph: 5948
Transitive Edges in Graph: 84758
Direct Public Edges in Graph: 3483
Public Edges in Graph: 88241
Private Edges in Graph: 2440
Interface Edges in Graph: 25
Shim Nodes in Graph: 20
Program Nodes in Graph: 136
Library Nodes in Graph: 731
LibdepsLinter: PUBLIC libdeps that could be PRIVATE: 0
Use the --help
option to see detailing information about all the available options.
Using the graph visualizer Tool
The graph visualizer tools starts up a web service to provide a frontend GUI for navigating and examining the graph files. The Visualizer uses a Python Flask backend and React/Redux Javascript frontend.
For installing the dependencies for the frontend, you will need node >= 12.0.0 and npm installed and in the PATH. To install the dependencies navigate to directory where package.json lives, and run:
cd buildscripts/libdeps/graph_visualizer_web_stack && npm install
Alternatively if you are on linux, you can use the setup_node_env.sh script to automatically download node 12 and npm, setup the local environment and install the dependencies. Run the command:
source buildscripts/libdeps/graph_visualizer_web_stack/setup_node_env.sh install
Assuming you are on a remote workstation and using defaults, you will need to make ssh tunnels to the web service to access the service in your local browser. The frontend and backend both use a port (this case 3000 is the frontend and 5000 is the backend), and the default host is localhost, so you will need to open two tunnels so the frontend running in your local web browser can communicate with the backend. If you are using the default host and port the tunnel command will look like this:
ssh -L 3000:localhost:3000 -L 5000:localhost:5000 ubuntu@workstation.hostname
Next we need to start the web service. It will require you to pass a directory where it will search for .graphml
files which contain the graph data for various commits:
python3 buildscripts/libdeps/graph_visualizer.py --graphml-dir build/opt/libdeps
The script will launch the backend and then build the optimized production frontend and launch it. You can supply the --debug
argument to work in development load which starts up much faster and allows real time updates as files are modified, with a small cost to performance on the frontend. Other options allow more configuration and can be viewed in the --help
text.
After the server has started up, it should notify you via the terminal that you can access it at http://localhost:3000 locally in your browser.
Build system configuration
SCons configuration
Frequently used flags and variables
MongoDB build configuration
Frequently used flags and variables
MONGO_GIT_HASH
The MONGO_GIT_HASH
SCons variable controls the value of the git hash
which will be interpolated into the build to identify the commit
currently being built. If not overridden, this defaults to the git
hash of the current commit.
MONGO_VERSION
The MONGO_VERSION
SCons variable controls the value which will be
interpolated into the build to identify the version of the software
currently being built. If not overridden, this defaults to the result
of git describe
, which will use the local tags to derive a version.