---
title: LibFuzzer
---

LibFuzzer is a tool for coverage-guided fuzzing of C/C++ code. It tries
to trigger AUBSAN (AddressSanitizer and UndefinedBehaviorSanitizer)
failures in a function you provide by repeatedly calling it with a
carefully crafted byte array as input. Each input is assigned a "score":
byte arrays which exercise new or more regions of code score better.
LibFuzzer merges and mutates high-scoring inputs in order to gradually
cover more and more possible behavior.

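To make the scoring idea concrete, here is a toy sketch of a
coverage-guided loop. It is not LibFuzzer's actual algorithm: the names
`toyCoverage` and `toyFuzz`, the "FUZZ" magic prefix, and the
single-byte mutation are all assumptions made for illustration. The
point is only that a mutated input is kept when it reaches code no
earlier input reached.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <string>
#include <vector>

// Hypothetical target: reports which "branches" an input reached.
// Branch i is reached when the first i + 1 bytes match the magic prefix.
std::set<int> toyCoverage(const std::vector<uint8_t>& input) {
    static const std::string kMagic = "FUZZ";
    std::set<int> branches;
    for (size_t i = 0; i < kMagic.size() && i < input.size(); ++i) {
        if (input[i] != static_cast<uint8_t>(kMagic[i])) break;
        branches.insert(static_cast<int>(i));
    }
    return branches;
}

// Toy loop: mutate one byte of a corpus entry and keep the mutant only if
// it reaches a branch that no earlier input reached.
std::vector<std::vector<uint8_t>> toyFuzz(unsigned seed, int iterations) {
    std::mt19937 rng(seed);
    std::vector<std::vector<uint8_t>> corpus;
    corpus.push_back(std::vector<uint8_t>(4, 0));  // Uninteresting seed input.
    std::set<int> seen = toyCoverage(corpus[0]);
    for (int i = 0; i < iterations; ++i) {
        std::vector<uint8_t> candidate = corpus[rng() % corpus.size()];
        candidate[rng() % candidate.size()] = static_cast<uint8_t>(rng() % 256);
        bool reachedNewBranch = false;
        for (int branch : toyCoverage(candidate)) {
            if (seen.insert(branch).second) reachedNewBranch = true;
        }
        if (reachedNewBranch) corpus.push_back(candidate);
    }
    return corpus;
}
```

Because each kept input becomes a starting point for later mutations,
the toy corpus can only grow toward inputs that match more of the
prefix, which mirrors the "gradually cover more behavior" description.
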
# When to use LibFuzzer

LibFuzzer is great for testing functions which accept an opaque blob of
untrusted user-provided data.

# How to use LibFuzzer

LibFuzzer implements `int main` and expects to be linked with an object
file which provides the function under test. To do this, write a cpp
file which implements:

```cpp
#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Your code here
    return 0;  // Values other than 0 and -1 are reserved by LibFuzzer.
}
```

`LLVMFuzzerTestOneInput` will be called repeatedly, with
fuzzer-generated bytes in `Data`. `Size` will always truthfully tell
your implementation how many bytes are in `Data`. If your function
crashes or induces an AUBSAN fault, LibFuzzer will consider that to be a
finding worth reporting.

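As a minimal sketch of such a function, the harness below wraps `Data`
and `Size` in a `std::string_view` and hands it to a function under
test. `countSeparators` is a hypothetical stand-in, not a real MongoDB
function, and the property being checked is chosen only for
illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string_view>

// Hypothetical function under test: counts the '=' separators in a buffer.
// It stands in for whatever internal C++ function you want to fuzz.
size_t countSeparators(std::string_view text) {
    size_t count = 0;
    for (char c : text) {
        if (c == '=') ++count;
    }
    return count;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    // Data is not NUL-terminated, so always carry Size along with the pointer.
    std::string_view text(reinterpret_cast<const char*>(Data), Size);
    // There cannot be more separators than bytes; abort so that a violation
    // is reported as a finding.
    if (countSeparators(text) > Size) std::abort();
    return 0;
}
```
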
Keep in mind that your function will often "just" be adapting `Data` to
whatever format our internal C++ functions require. However, you have a
lot of freedom in exactly what you choose to do. Just make sure your
function crashes or fails an invariant when something interesting
happens! A few ideas:

- You might choose to call multiple implementations of a single
  operation, and validate that they produce the same output when
  presented the same input.
- You could tease out individual bytes from `Data` and provide them as
  different arguments to the function under test.

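Both ideas can be combined in one harness. The sketch below is an
assumption-laden illustration: `checksumSimple` and `checksumDeferred`
are hypothetical implementations of the same operation, the first byte
of `Data` is carved off as a separate argument, and any disagreement
between the two aborts so that LibFuzzer reports it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical implementation 1: reduce modulo `modulus` after every byte.
uint32_t checksumSimple(const uint8_t* bytes, size_t size, uint32_t modulus) {
    uint32_t sum = 0;
    for (size_t i = 0; i < size; ++i) sum = (sum + bytes[i]) % modulus;
    return sum;
}

// Hypothetical implementation 2: sum everything first, reduce once at the end.
uint32_t checksumDeferred(const uint8_t* bytes, size_t size, uint32_t modulus) {
    uint64_t sum = 0;
    for (size_t i = 0; i < size; ++i) sum += bytes[i];
    return static_cast<uint32_t>(sum % modulus);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    if (Size < 1) return 0;  // Not enough bytes to carve out an argument.
    // The first byte becomes the modulus argument (1..256); the rest is payload.
    uint32_t modulus = static_cast<uint32_t>(Data[0]) + 1;
    const uint8_t* payload = Data + 1;
    size_t payloadSize = Size - 1;
    if (checksumSimple(payload, payloadSize, modulus) !=
        checksumDeferred(payload, payloadSize, modulus)) {
        std::abort();  // The implementations disagree: report a finding.
    }
    return 0;
}
```

Returning early on inputs that are too short keeps the harness from
reading past the end of `Data`.
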
Finally, your cpp file will need a SCons target. There is a method which
defines fuzzer targets, much like the one we use to define unit tests.
For example:

```python
env.CppLibfuzzerTest(
    target='op_msg_fuzzer',
    source=[
        'op_msg_fuzzer.cpp',
    ],
    LIBDEPS=[
        '$BUILD_DIR/mongo/base',
        'op_msg_fuzzer_fixture',
    ],
)
```

# Running LibFuzzer

Your test's object file and **all** of its dependencies must be compiled
with the "fuzzer" sanitizer, plus a set of sanitizers which might
produce interesting runtime errors, such as AUBSAN. Evergreen has a
build variant, whose name includes the string "FUZZER", which compiles
and runs all of the fuzzer tests.

The fuzzers can be built locally for development and debugging. Check
our Evergreen configuration for the current SCons arguments.

A LibFuzzer binary accepts a path to a directory containing its
"corpus". A corpus is a collection of example inputs known to produce
interesting behavior. LibFuzzer will start producing interesting results
more quickly if it starts off with a set of inputs which it can begin
mutating. When it's done, it writes any new inputs it discovered into
its corpus. Re-using a corpus across executions is a good way to make
LibFuzzer return more results in less time. Our Evergreen tasks will try
to acquire and re-use a corpus from an earlier commit, if they can.

# References

- [LibFuzzer's official
  documentation](https://llvm.org/docs/LibFuzzer.html)