mirror of https://github.com/mongodb/mongo.git synced 2024-11-30 17:10:48 +01:00
mongodb/docs/libfuzzer.md
Alex Neben b665258d9d SERVER-88970 Added yaml formatting to server repo
GitOrigin-RevId: 35db3811d8f749edd5b79ba910adcbc1ceb54cc4
2024-04-06 05:23:20 +00:00

LibFuzzer

LibFuzzer is a tool for performing coverage-guided fuzzing of C/C++ code. LibFuzzer will try to trigger AUBSAN (address and undefined behavior sanitizer) failures in a function you provide, by repeatedly calling it with a carefully crafted byte array as input. Each input is assigned a "score": byte arrays which exercise new or more regions of code score better. LibFuzzer merges and mutates high-scoring inputs in order to gradually cover more and more possible behavior.
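As a rough illustration of the feedback loop described above, here is a toy sketch. It is not how LibFuzzer is implemented: the names `toyTarget` and `toyFuzz` are made up for this example, "coverage" is recorded by hand instead of by compiler instrumentation, and only single-byte mutation is shown (no merging of inputs).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

using Input = std::vector<uint8_t>;

// Hypothetical function under test. It records which branches ran into
// `hit`, standing in for compiler-inserted coverage counters.
void toyTarget(const Input& in, std::set<int>& hit) {
    hit.insert(0);
    if (in.size() > 0 && in[0] == 'F') {
        hit.insert(1);
        if (in.size() > 1 && in[1] == 'U') {
            hit.insert(2);
        }
    }
}

// Runs `iterations` rounds of mutate-and-keep and returns the final corpus.
std::vector<Input> toyFuzz(int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<Input> corpus{Input{0, 0}};  // one uninteresting seed input
    std::set<int> seen;                      // all branches reached so far
    toyTarget(corpus[0], seen);

    for (int i = 0; i < iterations; ++i) {
        // Pick an existing corpus entry and overwrite one random byte.
        Input candidate = corpus[rng() % corpus.size()];
        assert(!candidate.empty());
        candidate[rng() % candidate.size()] = static_cast<uint8_t>(rng() % 256);

        std::set<int> hit;
        toyTarget(candidate, hit);

        // Keep the candidate only if it reached a branch not seen before.
        bool isNew = false;
        for (int branch : hit) {
            isNew |= seen.insert(branch).second;
        }
        if (isNew) {
            corpus.push_back(candidate);
        }
    }
    return corpus;
}
```

Because an input starting with `F` is kept once it is found, later mutations can build on it to reach the nested `U` branch. That is the sense in which high-scoring inputs gradually widen coverage.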

When to use LibFuzzer

LibFuzzer is great for testing functions which accept an opaque blob of untrusted user-provided data.

How to use LibFuzzer

LibFuzzer implements int main, and expects to be linked with an object file which provides the function under test. You will achieve this by writing a cpp file which implements:

#include <cstddef>
#include <cstdint>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Your code here
    return 0;
}

LLVMFuzzerTestOneInput will be called repeatedly, with fuzzer generated bytes in Data. Size will always truthfully tell your implementation how many bytes are in Data. If your function crashes or induces an AUBSAN fault, LibFuzzer will consider that to be a finding worth reporting.

Keep in mind that your function will often "just" be adapting Data to whatever format our internal C++ functions require. However, you have a lot of freedom in exactly what you choose to do. Just make sure your function crashes or trips an invariant when something interesting happens! As just a few ideas:

  • You might choose to call multiple implementations of a single operation, and validate that they produce the same output when presented the same input.
  • You could tease out individual bytes from Data and provide them as different arguments to the function under test.
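Both ideas can be combined in one target. The following is a minimal sketch, not code from the server repo: `countByteManual` and `countByteReference` are hypothetical stand-ins for two implementations of the same operation. The first input byte is carved off as an argument, and the rest is passed as the payload.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical hand-written implementation under test.
size_t countByteManual(const uint8_t* data, size_t size, uint8_t needle) {
    assert(data != nullptr || size == 0);
    size_t n = 0;
    for (size_t i = 0; i < size; ++i) {
        if (data[i] == needle) {
            ++n;
        }
    }
    return n;
}

// Hypothetical reference implementation built on the standard library.
size_t countByteReference(const uint8_t* data, size_t size, uint8_t needle) {
    return static_cast<size_t>(std::count(data, data + size, needle));
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    if (Size < 1) {
        return 0;  // not enough bytes to carve out an argument
    }
    // Tease out the first byte as one argument; the rest is the payload.
    const uint8_t needle = Data[0];
    const uint8_t* haystack = Data + 1;
    const size_t haystackSize = Size - 1;

    // Differential check: any disagreement is a finding, so crash loudly.
    if (countByteManual(haystack, haystackSize, needle) !=
        countByteReference(haystack, haystackSize, needle)) {
        std::abort();
    }
    return 0;
}
```

Returning early on inputs that are too short keeps the target from reading past the end of Data, which would otherwise show up as a sanitizer failure in the harness itself and not in the code under test.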

Finally, your cpp file will need a SCons target. There is a method which defines fuzzer targets, much like how we define unittests. For example:

env.CppLibfuzzerTest(
    target='op_msg_fuzzer',
    source=[
        'op_msg_fuzzer.cpp',
    ],
    LIBDEPS=[
        '$BUILD_DIR/mongo/base',
        'op_msg_fuzzer_fixture',
    ],
)

Running LibFuzzer

Your test's object file and all of its dependencies must be compiled with the "fuzzer" sanitizer, plus a set of sanitizers, such as AUBSAN, which might produce interesting runtime errors. Evergreen has a build variant, whose name will include the string "FUZZER", which will compile and run all of the fuzzer tests.

The fuzzers can be built locally, for development and debugging. Check our Evergreen configuration for the current SCons arguments.

LibFuzzer binaries will accept a path to a directory containing their "corpus". A corpus is a collection of example inputs known to produce interesting behavior. LibFuzzer will start producing interesting results more quickly if it starts off with a set of inputs which it can begin mutating. When it's done, it will write any new inputs it discovered into its corpus. Re-using a corpus across executions is a good way to make LibFuzzer return more results in less time. Our Evergreen tasks will try to acquire and re-use a corpus from an earlier commit, if they can.
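On disk, a corpus is simply a directory in which each file holds one input. For local debugging it can help to replay those files through the fuzz target without LibFuzzer's mutation loop. The sketch below assumes C++17 (`std::filesystem`). `replayCorpus` is a hypothetical helper, not part of LibFuzzer or our build, and the target shown is a placeholder so the example stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

// Placeholder fuzz target so the sketch links on its own; a real one would
// call the code under test.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    (void)Data;
    (void)Size;
    return 0;
}

// Feeds every regular file in `corpusDir` to the fuzz target once and
// returns how many inputs were replayed.
size_t replayCorpus(const std::filesystem::path& corpusDir) {
    assert(std::filesystem::is_directory(corpusDir));
    size_t replayed = 0;
    for (const auto& entry : std::filesystem::directory_iterator(corpusDir)) {
        if (!entry.is_regular_file()) {
            continue;
        }
        // Read the whole file as raw bytes.
        std::ifstream file(entry.path(), std::ios::binary);
        std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(file)),
                                   std::istreambuf_iterator<char>());
        LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
        ++replayed;
    }
    return replayed;
}
```

A crash while replaying a particular file points directly at the input that reproduces the finding.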

References