================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after ``-ignore_remaining_args=1``.
For example, the following command will fuzz clang with a higher optimization
level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
---------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
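
To illustrate the naming scheme, here is a minimal sketch of how options could
be recovered from an executable name. The helper ``splitNameEncodedOptions`` is
hypothetical and is not the function LLVM uses. It only shows the split on the
first ``--`` and then on each ``-``. Mapping the resulting strings to actual
flags is left out.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical helper, for illustration only: returns the options encoded
   // after the first "--" in an executable name, split on '-'.
   std::vector<std::string> splitNameEncodedOptions(const std::string &ExecName) {
     std::vector<std::string> Opts;
     std::string::size_type Pos = ExecName.find("--");
     if (Pos == std::string::npos)
       return Opts; // No encoded options in the name.
     std::string Rest = ExecName.substr(Pos + 2);
     std::string::size_type Start = 0;
     while (Start <= Rest.size()) {
       std::string::size_type End = Rest.find('-', Start);
       if (End == std::string::npos)
         End = Rest.size();
       if (End > Start) // Skip empty pieces between consecutive dashes.
         Opts.push_back(Rest.substr(Start, End - Start));
       Start = End + 1;
     }
     return Opts;
   }

With this sketch, ``llvm-isel-fuzzer--aarch64-O0-gisel`` yields the three
strings ``aarch64``, ``O0``, and ``gisel``.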

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors that of
``llvm-isel-fuzzer``. Both the ``mtriple`` and ``passes`` arguments are
required. Passes are specified in a format suitable for the new pass manager.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, arguments for some predefined configurations can
be embedded directly into the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

  llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is well suited to stressing the surface layers
of a program, such as lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
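
As a rough illustration of what such a fuzz target looks like, the sketch below
implements libFuzzer's ``LLVMFuzzerTestOneInput`` entry point over a toy check.
``parensBalanced`` is a hypothetical stand-in for the code under test and is
not part of LLVM. The in-tree fuzzers call into the relevant LLVM or clang
library instead.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Hypothetical code under test: checks that parentheses are balanced.
   static bool parensBalanced(const uint8_t *Data, size_t Size) {
     long Depth = 0;
     for (size_t I = 0; I < Size; ++I) {
       if (Data[I] == '(')
         ++Depth;
       else if (Data[I] == ')' && --Depth < 0)
         return false; // Closing parenthesis without a matching opener.
     }
     return Depth == 0;
   }

   // libFuzzer calls this entry point once per mutated input.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     // The result is ignored: crashes and sanitizer reports are the signal.
     (void)parensBalanced(Data, Size);
     return 0;
   }

The target treats its input purely as bytes, which is why the built-in mutators
need no knowledge of the input format.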

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.
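
The idea behind a standalone main function can be sketched as follows. This is
not the code in ``FuzzerCLI.h``: ``runOnFiles`` and ``FuzzTarget`` are
hypothetical names, and the sketch only assumes that a standalone driver reads
each input file and passes its bytes to the fuzz target once.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Same signature as libFuzzer's LLVMFuzzerTestOneInput entry point.
   using FuzzTarget = int (*)(const uint8_t *Data, size_t Size);

   // Hypothetical driver: feeds each readable file to the target once and
   // returns the number of files that were run.
   size_t runOnFiles(const std::vector<std::string> &Paths, FuzzTarget Target) {
     size_t NumRun = 0;
     for (const std::string &Path : Paths) {
       std::ifstream In(Path, std::ios::binary);
       if (!In)
         continue; // Skip files that cannot be opened.
       std::vector<char> Buf((std::istreambuf_iterator<char>(In)),
                             std::istreambuf_iterator<char>());
       Target(reinterpret_cast<const uint8_t *>(Buf.data()), Buf.size());
       ++NumRun;
     }
     return NumRun;
   }

A ``main`` built on this sketch would pass its ``argv`` entries as paths, which
lets a crashing input be replayed without linking against libFuzzer.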

There is also some handling of the CMake config for fuzzers: use the
``add_llvm_fuzzer`` function to set up fuzzer targets. It works similarly to
functions such as ``add_llvm_tool``, but it takes care of linking to LibFuzzer
when appropriate and can be passed the ``DUMMY_MAIN`` argument to enable
standalone testing.