============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors, *not*
   anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.


Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the cmake-language
manpage and in the `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but that will be
discussed later.

Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined functions.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of foreach loops
and if blocks. To make the example above more complicated you could add an if
block that adds the compile definition ``APPLE`` when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()
   
Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable name in ``${}`` dereferences
it and results in a literal substitution of the value for the name. CMake refers
to this as "variable evaluation" in its documentation. Dereferences are
performed *before* the command being called receives the arguments. This means
dereferencing a list outside of quotes results in multiple separate arguments
being passed to the command.
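
As a small illustration of that last point (a sketch; ``my_list`` is just an
illustrative name), quoting a dereference keeps the list as a single argument,
while leaving it unquoted splits it into one argument per element:

.. code-block:: cmake

   set(my_list a b c)

   # Unquoted: message() receives three arguments, "a", "b" and "c", and
   # concatenates them.
   message(${my_list})   # prints: abc

   # Quoted: message() receives the single argument "a;b;c".
   message("${my_list}") # prints: a;b;c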

Variable dereferences can be nested and be used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"
   
Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set a variable only under certain conditions and then use it
unconditionally, relying on the empty expansion in code paths where the variable
was never set. There are examples of this throughout the LLVM CMake build
system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})
   
In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` is
evaluated as empty before ``add_executable`` is given its arguments.

Lists
-----

In CMake, lists are semicolon-delimited strings, so it is strongly advised that
you avoid using semicolons inside list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")
   
   # Creates a string "a b c d"
   set(my_string "a b c d")
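
Once a list exists, the built-in ``list`` command is the usual way to inspect
or modify it. A brief sketch, using illustrative variable names:

.. code-block:: cmake

   set(my_list a b c)
   list(APPEND my_list d)       # my_list is now "a;b;c;d"
   list(LENGTH my_list count)   # count is 4
   list(GET my_list 0 first)    # first is "a"
   message("${count} elements, starting with ${first}")
   # prints: 4 elements, starting with a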

Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
element cannot contain a semicolon, you construct a list of lists by making a
list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)
   
With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()
   
You'll notice that the inner foreach loop's list is doubly dereferenced. This is
because the first dereference, ``${list_name}``, turns ``list_name`` into the
name of the sub-list (a, b, or c in the example), then the second dereference,
performed by ``IN LISTS``, gets the value of that list.

This pattern is used throughout CMake. The most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
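
A minimal sketch of that kind of computed variable name, assuming a project
that has enabled the C and C++ languages (the loop variable ``lang`` is
illustrative):

.. code-block:: cmake

   foreach(lang C CXX)
     # The inner dereference builds the variable name (for example
     # CMAKE_CXX_FLAGS); the outer one reads its value.
     message("CMAKE_${lang}_FLAGS = '${CMAKE_${lang}_FLAGS}'")
   endforeach()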

Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. A variable's type generally doesn't impact evaluation;
however, CMake does have special handling for some types such as ``PATH``.
You can read more about the special handling in `CMake's set documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
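
For instance, a typed cache entry can be declared as in the sketch below. The
names ``MY_TOOL_DIR`` and ``MY_ENABLE_EXTRAS`` are hypothetical and do not
refer to real LLVM options:

.. code-block:: cmake

   # A PATH entry; cmake-gui can offer a file dialog for it.
   set(MY_TOOL_DIR "" CACHE PATH "Directory containing an external tool")

   # A BOOL entry; option() is shorthand for declaring one.
   option(MY_ENABLE_EXTRAS "Build the optional extras" OFF)

A user could then override either value on the command line, for example with
``-DMY_TOOL_DIR=/opt/tool``.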

Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory, it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable in the parent scope, and not in the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.
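
The sketch below illustrates the ``CACHE`` and ``FORCE`` behavior. It assumes a
fresh build directory and that no normal variable named ``MY_CACHED_VALUE``
(a hypothetical name) is set:

.. code-block:: cmake

   set(MY_CACHED_VALUE "first" CACHE STRING "An illustrative cache entry")

   # The entry already exists, so this call leaves it alone.
   set(MY_CACHED_VALUE "second" CACHE STRING "An illustrative cache entry")
   message("${MY_CACHED_VALUE}") # prints: first

   # FORCE overwrites the existing entry.
   set(MY_CACHED_VALUE "third" CACHE STRING "An illustrative cache entry" FORCE)
   message("${MY_CACHED_VALUE}") # prints: third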

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.
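
A common consequence is that a function has to use ``PARENT_SCOPE`` to hand a
result back to its caller. A minimal sketch, with hypothetical function and
variable names, assuming ``local_only`` is not set anywhere else:

.. code-block:: cmake

   function(compute_greeting out_var)
     set(local_only "not visible to the caller")
     # ${out_var} expands to the variable name the caller passed in.
     set(${out_var} "hello" PARENT_SCOPE)
   endfunction()

   compute_greeting(greeting)
   message("greeting: '${greeting}'")     # prints: greeting: 'hello'
   message("local_only: '${local_only}'") # prints: local_only: ''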

.. note::
  Unlike C-based languages, CMake's loop and control flow blocks do not have
  their own scopes.

Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
  For the full documentation on the CMake if command go
  `here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
  far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

  if(<condition>)
    message("do stuff")
  elseif(<condition>)
    message("do other stuff")
  else()
    message("do other other stuff")
  endif()

If you are coming from a C background, the single most important thing to know
about CMake's if blocks is that they do not have their own scope. Variables set
inside conditional blocks persist after the ``endif()``.
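
For example (a sketch with an illustrative variable name):

.. code-block:: cmake

   if(TRUE)
     set(set_inside_if "still set")
   endif()
   message("${set_inside_if}") # prints: still set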

Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

  foreach(var ...)
    message("do stuff")
  endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

  foreach(var foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var ${my_list})
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var ${my_list} out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

  foreach(var IN ITEMS foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var IN LISTS my_list)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var IN LISTS my_list ITEMS out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.

Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.
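
A module is pulled in with the ``include`` command. In the sketch below, the
directory ``cmake/modules`` and the module ``MyHelpers`` are hypothetical:

.. code-block:: cmake

   # Let include() find modules kept in this project's cmake/modules directory.
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules")

   # Runs cmake/modules/MyHelpers.cmake, making any commands it defines
   # available to the rest of this CMakeLists file.
   include(MyHelpers)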

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments that are required at every call site. In
addition, all commands will implicitly accept a variable number of extra
arguments (in C parlance, all commands are varargs functions). When a command is
invoked with extra arguments (beyond the named ones) CMake will store the full
list of arguments (both named and unnamed) in a list named ``ARGV``, and the
sublist of unnamed arguments in ``ARGN``. Below is a trivial example of
providing a wrapper function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command, passing through the first
argument and all trailing arguments.
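
To make the difference between ``ARGV`` and ``ARGN`` concrete, here is a small
sketch (``show_args`` is a hypothetical function):

.. code-block:: cmake

   function(show_args first)
     message("first: ${first}")
     message("ARGN:  ${ARGN}")
     message("ARGV:  ${ARGV}")
   endfunction()

   show_args(a b c)
   # prints:
   #  first: a
   #  ARGN:  b;c
   #  ARGV:  a;b;c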

CMake provides a module, ``CMakeParseArguments``, which implements advanced
argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
  As of CMake 3.5 the ``cmake_parse_arguments`` command has become a native
  command and the ``CMakeParseArguments`` module is empty and only left around
  for compatibility.
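
A sketch of how ``cmake_parse_arguments`` is typically used follows. The
function ``add_my_tool`` and its keywords are hypothetical and exist only for
illustration:

.. code-block:: cmake

   include(CMakeParseArguments) # only needed before CMake 3.5

   function(add_my_tool name)
     cmake_parse_arguments(ARG
       "VERBOSE"        # options (boolean flags)
       "OUTPUT_NAME"    # keywords taking one value
       "SOURCES"        # keywords taking multiple values
       ${ARGN})
     if(ARG_VERBOSE)
       message("Adding tool ${name}")
     endif()
     add_executable(${name} ${ARG_SOURCES})
     if(ARG_OUTPUT_NAME)
       set_target_properties(${name} PROPERTIES OUTPUT_NAME ${ARG_OUTPUT_NAME})
     endif()
   endfunction()

   add_my_tool(tool VERBOSE OUTPUT_NAME my-tool SOURCES main.cpp util.cpp)

Each parsed keyword ends up in a variable named with the chosen prefix, here
``ARG_VERBOSE``, ``ARG_OUTPUT_NAME`` and ``ARG_SOURCES``.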

Functions Vs Macros
-------------------

Functions and macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.
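
The sketch below shows the difference, assuming neither variable is set
beforehand (all names are illustrative):

.. code-block:: cmake

   macro(set_in_macro)
     set(from_macro "leaked")
   endmacro()

   function(set_in_function)
     set(from_function "contained")
   endfunction()

   set_in_macro()
   set_in_function()
   message("from_macro: '${from_macro}'")       # prints: from_macro: 'leaked'
   message("from_function: '${from_function}'") # prints: from_function: ''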

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables. Instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior if using non-dereferenced variables. For
example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()
   
   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Generally speaking this issue is uncommon because it requires using
non-dereferenced variables with names that overlap in the parent scope, but it
is important to be aware of because it can lead to subtle bugs.
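
One way to avoid the problem is to write the command as a function and
dereference the parameter, so that ``IN LISTS`` receives the name the caller
passed in. A sketch (``print_named_list`` is a hypothetical name):

.. code-block:: cmake

   function(print_named_list list_name)
     # ${list_name} expands to "my_list_of_numbers" in the call below.
     foreach(var IN LISTS ${list_name})
       message("${var}")
     endforeach()
   endfunction()

   set(my_list_of_numbers 1 2 3 4)
   print_named_list(my_list_of_numbers)
   # prints:
   # 1
   # 2
   # 3
   # 4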

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable``
is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.
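
As a rough sketch, an out-of-tree project that has located an installed LLVM
might use one of these wrappers as shown below. The target ``my-tool`` and the
source ``MyTool.cpp`` are hypothetical, and a real project will usually need
further setup such as include directories and link libraries:

.. code-block:: cmake

   find_package(LLVM REQUIRED CONFIG)

   # Make the modules installed with LLVM visible to include().
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   add_llvm_executable(my-tool MyTool.cpp)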

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.

Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. A few particularly useful commands are:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and is available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
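
As a brief taste of a few of these commands (a sketch with illustrative
variable names):

.. code-block:: cmake

   string(TOUPPER "hello" shout)         # shout is "HELLO"
   math(EXPR total "2 + 3")              # total is 5
   set(words b c a)
   list(SORT words)                      # words is "a;b;c"
   message("${shout} ${total} ${words}") # prints: HELLO 5 a;b;c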