0
0
mirror of https://github.com/mongodb/mongo.git synced 2024-11-24 08:30:56 +01:00
mongodb/docs/build_system_reference.md
Alex Neben b665258d9d SERVER-88970 Added yaml formatting to server repo
GitOrigin-RevId: 35db3811d8f749edd5b79ba910adcbc1ceb54cc4
2024-04-06 05:23:20 +00:00

22 KiB
Raw Blame History

MongoDB Build System Reference

MongoDB Build System Requirements

Python modules

External libraries

Enterprise module requirements

Testing requirements

MongoDB customizations

SCons modules

Development tools

Compilation database generator

Build tools

IDL Compiler

Auxiliary tools

Ninja generator

Icecream tool

ccache tool

LIBDEPS

Libdeps is a subsystem within the build, which is centered around the LIBrary DEPendency graph. It tracks and maintains the dependency graph as well as lints, analyzes and provides useful metrics about the graph.

Design

The libdeps subsystem is divided into several stages, described in order of use as follows.

SConscript LIBDEPS definitions and built time linting

During the build, the SConscripts are read and all the library relationships are setup via the LIBDEPS variables. Some issues can be identified early during processing of the SConscripts via the build-time linter. Most of these will be style and usage issues which can be realized without needing the full graph. This component lives within the build and is executed through the SCons emitters added via the libdeps subsystem.

Libdeps graph generation for post-build analysis

For more advanced analysis and linting, a full graph is necessary. The build target generate-libdeps-graph builds all libdeps and things which use libdeps, and generates the graph to a file in graphml format.

The libdeps analyzer python module

The libdeps analyzer module is a python library which provides and Application Programming Interface (API) to analyze and lint the graph. The library internally leverages the networkx python module for the generic graph interfaces.

The CLI and Visualizer tools

The libdeps analyzer module is used in the libdeps Graph Analysis Command Line Interface (gacli) tool and the libdeps Graph Visualizer web service. Both tools read in the graph file generated from the build and provide the Human Machine Interface (HMI) for analysis and linting.

The LIBDEPS variables

The variables include several types of lists to be added to libraries per a SCons builder instance:

Variable Use
LIBDEPS transitive dependencies
LIBDEPS_PRIVATE local dependencies
LIBDEPS_INTERFACE transitive dependencies excluding self
LIBDEPS_DEPENDENTS reverse dependencies
PROGDEPS_DEPENDENTS reverse dependencies for Programs

LIBDEPS is the 'public' type, such that libraries that are added to this list become a dependency of the current library, and also become dependencies of libraries which may depend on the current library. This propagation also includes not just the libraries in the LIBDEPS list, but all LIBDEPS of those LIBDEPS recursively, meaning that all dependencies of the LIBDEPS libraries, also become dependencies of the current library and libraries which depend on it.

LIBDEPS_PRIVATE should be a list of libraries which creates dependencies only between the current library and the libraries in the list. However, in static linking builds, this will behave the same as LIBDEPS due to the nature of static linking.

LIBDEPS_INTERFACE is very similar to LIBDEPS, however it does not create propagating dependencies for the libraries themselves in the LIBDEPS_INTERFACE list. Only the dependencies of those LIBDEPS_INTERFACE libraries are propagated forward.

LIBDEPS_DEPENDENTS are added to libraries which will force themselves as dependencies of the libraries in the supplied list. This is conceptually a reverse dependency, where the library which is a dependency is the one declaring itself as the dependency of some other library. By default this creates a LIBDEPS_PRIVATE like relationship, but a tuple can be used to force it to a LIBDEPS like or other relationship.

PROGDEPS_DEPENDENTS are the same as LIBDEPS_DEPENDENTS, but intended for use only with Program builders.

LIBDEPS_TAGS

The LIBDEPS_TAGS variable is used to mark certain libdeps for various reasons. Some LIBDEPS_TAGS are used to mark certain libraries for LIBDEPS_TAG_EXPANSIONS variable which is used to create a function which can expand to a string on the command line. Below is a table of available LIBDEPS tags:

Tag Description
illegal_cyclic_or_unresolved_dependencies_allowlisted SCons subst expansion tag to handle dependency cycles
init-no-global-side-effects SCons subst expansion tag for causing linkers to avoid pulling in all symbols
lint-public-dep-allowed Linting exemption tag exempting the lint-no-public-deps tag
lint-no-public-deps Linting inclusion tag ensuring a libdep has no LIBDEPS declared
lint-allow-non-alphabetic Linting exemption tag allowing LIBDEPS variable lists to be non-alphabetic
lint-leaf-node-allowed-dep Linting exemption tag exempting the lint-leaf-node-no-deps tag
lint-leaf-node-no-deps Linting inclusion tag ensuring a libdep has no libdeps and is a leaf node
lint-allow-nonlist-libdeps Linting exemption tag allowing a LIBDEPS variable to not be a list lint-allow-bidirectional-edges Linting exemption tag allowing reverse dependencies to also be a forward dependencies
lint-allow-nonprivate-on-deps-dependents Linting exemption tag allowing reverse dependencies to be transitive
lint-allow-dup-libdeps Linting exemption tag allowing LIBDEPS variables to contain duplicate libdeps on a given library
lint-allow-program-links-private Linting exemption tag allowing Programs to have PRIVATE_LIBDEPS
The illegal_cyclic_or_unresolved_dependencies_allowlisted tag

This tag should not be used anymore because the library dependency graph has been successfully converted to a Directed Acyclic Graph (DAG). Prior to this accomplishment, it was necessary to handle cycles specifically with platform specific options on the command line.

The init-no-global-side-effects tag

Adding this flag to a library turns on platform specific compiler flags which will cause the linker to pull in just the symbols it needs. Note that by default, the build is configured to pull in all symbols from libraries because of the use of static initializers, however if a library is known to not have any of these initializers, then this flag can be added for some performance improvement.

Linting and linter tags

The libdeps linter features automatically detect certain classes of LIBDEPS usage errors. The libdeps linters are implemented as build-time linting and post-build linting procedures to maintain order in usage of the libdeps tool and the builds library dependency graph. You will need to comply with the rules enforced by the libdeps linter, and fix issues that it raises when modifying the build scripts. There are exemption tags to prevent the linter from blocking things, however these exemption tags should only be used in extraordinary cases, and with good reason. A goal of the libdeps linter is to drive and maintain the number of exemption tags in use to zero.

Exemption Tags

There are a number of existing issues that need to be addressed, but they will be addressed in future tickets. In the meantime, the use of specific strings in the LIBDEPS_TAGS variable can allow the libdeps linter to skip certain issues on given libraries. For example, to have the linter skip enforcement of the lint rule against bidirectional edges for "some_library":

env.Library(
    target=some_library
    ...
    LIBDEPS_TAGS=[lint-allow-bidirectional-edges]
)

build-time Libdeps Linter

If there is a build-time issue, the build will fail until it is addressed. This linting feature will be on by default and takes about half a second to complete in a full enterprise build (at the time of writing this), but can be turned off by using the --libdeps-linting=off option on your SCons invocation.

The current rules and there exemptions are listed below:

  1. A 'Program' can not link a non-public dependency, it can only have LIBDEPS links.

    Example
    env.Program(
        target=some_program,
        ...
        LIBDEPS=[lib1], # OK
        LIBDEPS_PRIVATE=[lib2], # This is a Program, BAD
    )
    
    Rationale

    A Program can not be linked into anything else, and there for the transitiveness does not apply. A default value of LIBDEPS was selected for consistency since most Program's were already doing this at the time the rule was created.

    Exemption

    'lint-allow-program-links-private' on the target node

  2. A 'Node' can only directly link a given library once.

    Example
    env.Library(
        target=some_library,
        ...
        LIBDEPS=[lib1], # Linked once, OK
        LIBDEPS_PRIVATE=[lib1], # Also linked in LIBDEPS, BAD
        LIBDEPS_INTERFACE=[lib2, 'lib2'], # Linked twice, BAD
    )
    
    Rationale

    Libdeps will ignore duplicate links, so this rule is mostly for consistency and neatness in the build scripts.

    Exemption

    'lint-allow-dup-libdeps' on the target node

  3. A 'Node' which uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS can only have LIBDEPS_PRIVATE links.

    Example
    env.Library(
        target=some_library,
        ...
        LIBDEPS_DEPENDENTS=['lib3'],
        LIBDEPS=[lib1], # LIBDEPS_DEPENDENTS is in use, BAD
        LIBDEPS_PRIVATE=[lib2], # OK
    )
    
    Rationale

    The node that the library is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT to inject its dependency onto should be conditional, therefore there should not be transitiveness for that dependency since it cannot be the source of any resolved symbols.

    Exemption

    'lint-allow-nonprivate-on-deps-dependents' on the target node

  4. A 'Node' can not link directly to a library that uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS.

    Example
    env.Library(
        target='other_library',
        ...
        LIBDEPS=['lib1'], # BAD, 'lib1' has LIBDEPS_DEPENDENTS
    
    env.Library(
        target=lib1,
        ...
        LIBDEPS_DEPENDENTS=['lib3'],
    )
    
    Rationale

    A library that is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT should only be used for reverse dependency edges. If a node does need to link directly to a library that does have reverse dependency edges, that indicates the library should be split into two separate libraries, containing its direct dependency content and its conditional reverse dependency content.

    Exemption

    'lint-allow-bidirectional-edges' on the target node

  5. All libdeps environment vars must be assigned as lists.

    Example
    env.Library(
        target='some_library',
        ...
        LIBDEPS='lib1', # not a list, BAD
        LIBDEPS_PRIVATE=['lib2'], # OK
    )
    
    Rationale

    Libdeps will handle non-list environment variables, so this is more for consistency and neatness in the build scripts.

    Exemption

    'lint-allow-nonlist-libdeps' on the target node

  6. Libdeps with the tag 'lint-leaf-node-no-deps' shall not link any libdeps.

    Example
    env.Library(
        target='lib2',
        ...
        LIBDEPS_TAGS=[
            'lint-leaf-node-allowed-dep'
        ]
    )
    
    env.Library(
        target='some_library',
        ...
        LIBDEPS=['lib1'], # BAD, should have no LIBDEPS
        LIBDEPS_PRIVATE=['lib2'], # OK, has exemption tag
        LIBDEPS_TAGS=[
            'lint-leaf-node-no-deps'
        ]
    )
    
    Rationale

    The special tag allows certain nodes to be marked and programmatically checked that they remain lead nodes. An example use-case is when we want to make sure certain nodes never link mongodb code.

    Exemption

    'lint-leaf-node-allowed-dep' on the exempted libdep

    Inclusion

    'lint-leaf-node-no-deps' on the target node

  7. Libdeps with the tag 'lint-no-public-deps' shall not link any libdeps.

    Example
    env.Library(
        target='lib2',
        ...
        LIBDEPS_TAGS=[
            'lint-public-dep-allowed'
        ]
    )
    
    env.Library(
        target='some_library',
        ...
        LIBDEPS=[
            'lib1' # BAD
            'lib2' # OK, has exemption tag
        ],
        LIBDEPS_TAGS=[
            'lint-no-public-deps'
        ]
    )
    
    Rationale

    The special tag allows certain nodes to be marked and programmatically checked that they do not link publicly. Some nodes such as mongod_main have special requirements that this programmatically checks.

    Exemption

    'lint-public-dep-allowed' on the exempted libdep

    Inclusion

    'lint-no-public-deps' on the target node

  8. Libdeps shall be sorted alphabetically in LIBDEPS lists in the SCons files.

    Example
    env.Library(
        target='lib2',
        ...
        LIBDEPS=[
            '$BUILD/mongo/db/d', # OK, $ comes before c
            'c', # OK, c comes before s
            'src/a', # BAD, s should be after b
            'b', # BAD, b should be before c
        ]
    )
    
    Rationale

    Keeping the SCons files neat and ordered allows for easier Code Review diffs and generally better maintainability.

    Exemption

    'lint-allow-non-alphabetic' on the exempted libdep

The build-time print Option

The libdeps linter also has the --libdeps-linting=print option which will perform linting, and instead of failing the build on an issue, just print and continue on. It will also ignore exemption tags, and still print the issue because it will not fail the build. This is a good way to see the entirety of existing issues that are exempted by tags, as well as printing other metrics such as time spent linting.

post-build linting and analysis

The dependency graph can be analyzed post-build by leveraging the completeness of the graph to perform more extensive analysis. You will need to install the libdeps requirements file to python when attempting to use the post-build analysis tools:

python3 -m poetry install --no-root --sync -E libdeps

The command line interface tool (gacli) has a comprehensive help text which will describe the available analysis options and interface. The visualizer tool includes a GUI which displays the available analysis options graphically. These tools will be briefly covered in the following sections.

Generating the graph file

To generate the full graph, build the target generate-libdeps-graph. This will build all things involving libdeps and construct a graphml file representing the library dependency graph. The graph can be used in the command line interface tool or the visualizer web service tool. The minimal set of required SCons arguments to build the graph file is shown below:

python3 buildscripts/scons.py --link-model=dynamic --build-tools=next generate-libdeps-graph --linker=gold --modules=

The graph file by default will be generate to build/opt/libdeps/libdeps.graphml (where build/opt is the $BUILD_DIR).

General libdeps analyzer API usage

Below is a basic example of usage of the libdeps analyzer API:

import libdeps

libdeps_graph = libdeps.graph.load_libdeps_graph('path/to/libdeps.graphml')

list_of_analysis_to_run = [
    libdeps.analyzer.NodeCounter(libdeps_graph),
    libdeps.analyzer.DirectDependencies(libdeps_graph, node='path/to/library'),
]

analysis_results = libdeps.graph.LibdepsGraphAnalysis(list_of_analysis_to_run)
libdeps.analyzer.GaPrettyPrinter(analysis_after_run).print()

Walking through this example, first the graph is loaded from file. Then a list of desired Analyzer instances is created. Some example analyzer classes are instantiated in the example above, but there are many others to choose from. Specific Analyzers have different interfaces and should be supplied an argument list corresponding to that analyzer.

Note: The graph file will contain the build dir that the graph data was created with and it expects all node arguments to be relative to the build dir. If you are using the libdeps module generically in some app, you can extract the build dir from the libdeps graph and append it to any generic library path.

Once the list of analyzers is created, they can be used to create a LibdepsGraphAnalysis instance, which will upon instantiation, run the analysis list provided. Once the instance is created, it contains the results, and optionally can be fed into different printer classes. In this case, a human readable format printer called GaPrettyPrinter is used to print to the console.

Using the gacli tool

The command line interface tool can be used from the command line to run analysis on a given graph. The only required argument is the graph file. The default with no args will run all the counters and linters on the graph. Here is an example output:

(venv) Apr.20 02:46 ubuntu[mongo]: python buildscripts/libdeps/gacli.py --graph-file build/cached/libdeps/libdeps.graphml
Loading graph data...Loaded!


Graph built from git hash:
1358cdc6ff0e53e4f4c01ea0e6fcf544fa7e1672

Graph Schema version:
2

Build invocation:
"/home/ubuntu/venv/bin/python" "buildscripts/scons.py" "--variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars" "--cache=all" "--cache-dir=/home/ubuntu/scons-cache" "--link-model=dynamic" "--build-tools=next" "ICECC=icecc" "CCACHE=ccache" "-j200" "--cache-signature-mode=validate" "--cache-debug=-" "generate-libdeps-graph"

Nodes in Graph: 867
Edges in Graph: 90706
Direct Edges in Graph: 5948
Transitive Edges in Graph: 84758
Direct Public Edges in Graph: 3483
Public Edges in Graph: 88241
Private Edges in Graph: 2440
Interface Edges in Graph: 25
Shim Nodes in Graph: 20
Program Nodes in Graph: 136
Library Nodes in Graph: 731

LibdepsLinter: PUBLIC libdeps that could be PRIVATE: 0

Use the --help option to see detailing information about all the available options.

Using the graph visualizer Tool

The graph visualizer tools starts up a web service to provide a frontend GUI for navigating and examining the graph files. The Visualizer uses a Python Flask backend and React/Redux Javascript frontend.

For installing the dependencies for the frontend, you will need node >= 12.0.0 and npm installed and in the PATH. To install the dependencies navigate to directory where package.json lives, and run:

cd buildscripts/libdeps/graph_visualizer_web_stack && npm install

Alternatively if you are on linux, you can use the setup_node_env.sh script to automatically download node 12 and npm, setup the local environment and install the dependencies. Run the command:

source buildscripts/libdeps/graph_visualizer_web_stack/setup_node_env.sh install

Assuming you are on a remote workstation and using defaults, you will need to make ssh tunnels to the web service to access the service in your local browser. The frontend and backend both use a port (this case 3000 is the frontend and 5000 is the backend), and the default host is localhost, so you will need to open two tunnels so the frontend running in your local web browser can communicate with the backend. If you are using the default host and port the tunnel command will look like this:

ssh -L 3000:localhost:3000 -L 5000:localhost:5000 ubuntu@workstation.hostname

Next we need to start the web service. It will require you to pass a directory where it will search for .graphml files which contain the graph data for various commits:

python3 buildscripts/libdeps/graph_visualizer.py --graphml-dir build/opt/libdeps

The script will launch the backend and then build the optimized production frontend and launch it. You can supply the --debug argument to work in development load which starts up much faster and allows real time updates as files are modified, with a small cost to performance on the frontend. Other options allow more configuration and can be viewed in the --help text.

After the server has started up, it should notify you via the terminal that you can access it at http://localhost:3000 locally in your browser.

Build system configuration

SCons configuration

Frequently used flags and variables

MongoDB build configuration

Frequently used flags and variables

MONGO_GIT_HASH

The MONGO_GIT_HASH SCons variable controls the value of the git hash which will be interpolated into the build to identify the commit currently being built. If not overridden, this defaults to the git hash of the current commit.

MONGO_VERSION

The MONGO_VERSION SCons variable controls the value which will be interpolated into the build to identify the version of the software currently being built. If not overridden, this defaults to the result of git describe, which will use the local tags to derive a version.

Targets and Aliases

Build artifacts and installation

Hygienic builds

AutoInstall

AutoArchive

MongoDB SCons style guide

Sconscript Formatting Guidelines

Vertical list style

Alphabetize everything

Environment Isolation

Declaring Targets (Program, Library, and CppUnitTest)

Invoking external tools correctly with Commands

Customizing an Environment for a target

Invoking subordinate SConscripts

Imports and Exports

A Model SConscript with Comments