===========================
LLVM Branch Weight Metadata
===========================

.. contents::
   :local:

Introduction
============

Branch Weight Metadata represents the likelihood of each branch being taken
(see :doc:`BlockFrequencyTerminology`). The metadata is attached to a terminator
``Instruction`` as an ``MDNode`` of the ``MD_prof`` kind. The first operand is
always an ``MDString`` node with the string "branch_weights". The number of
operands depends on the terminator type.

Branch weights may be fetched from a profiling file or generated based on the
`__builtin_expect`_ instruction.

All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of the branch being taken.
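
Because weights are relative, one way to read them is as a share of the total
weight on the terminator. The sketch below illustrates that reading in
standalone C++. The helper ``weightsToProbabilities`` is hypothetical and is not
an LLVM API, and dividing by the sum is an assumption about how the weights are
interpreted rather than something this document specifies.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <numeric>
  #include <vector>

  // Hypothetical helper, not an LLVM API: converts raw branch weights into
  // per-successor probabilities by dividing each weight by the total.
  std::vector<double> weightsToProbabilities(const std::vector<uint32_t> &weights) {
    // Sum in 64 bits so that several large 32-bit weights cannot overflow.
    uint64_t total = std::accumulate(weights.begin(), weights.end(), uint64_t{0});
    assert(total != 0 && "at least one weight must be non-zero");
    std::vector<double> probs;
    probs.reserve(weights.size());
    for (uint32_t w : weights)
      probs.push_back(static_cast<double>(w) / static_cast<double>(total));
    return probs;
  }

Under this reading, the weights ``{3, 1}`` correspond to probabilities of 0.75
and 0.25.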

Supported Instructions
======================

``BranchInst``
^^^^^^^^^^^^^^

Metadata is only assigned to conditional branches. There are two extra
operands: one for the true branch and one for the false branch.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <TRUE_BRANCH_WEIGHT>,
    i32 <FALSE_BRANCH_WEIGHT>
  }

``SwitchInst``
^^^^^^^^^^^^^^

Branch weights are assigned to every case (including the ``default`` case which
is always case #0).

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <DEFAULT_BRANCH_WEIGHT>
    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
  }

``IndirectBrInst``
^^^^^^^^^^^^^^^^^^

Branch weights are assigned to every destination.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <LABEL_BRANCH_WEIGHT>
    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
  }

``CallInst``
^^^^^^^^^^^^

Calls may have branch weight metadata, containing the execution count of
the call. It is currently used in SamplePGO mode only, to augment the
block and entry counts which may not be accurate with sampling.

.. code-block:: none

  !0 = !{
    !"branch_weights",
    i32 <CALL_BRANCH_WEIGHT>
  }

Other
^^^^^

Other terminator instructions are not allowed to contain Branch Weight Metadata.

.. _\__builtin_expect:

Built-in ``expect`` Instructions
================================

The ``__builtin_expect(long exp, long c)`` instruction provides branch
prediction information. The return value is the value of ``exp``.
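
The sketch below shows one common way to use the builtin from C++. The macro
names ``LIKELY`` and ``UNLIKELY`` and the functions ``checkedDivide`` and
``passThrough`` are illustrative assumptions, not names defined by Clang. The
example relies only on the behavior stated above: the builtin returns ``exp``
unchanged.

.. code-block:: c++

  // Conventional wrapper macros (the names are an assumption, not part of
  // Clang). The double negation maps any non-zero value to 1 so that it can
  // be compared against the expected value.
  #define LIKELY(x) __builtin_expect(!!(x), 1)
  #define UNLIKELY(x) __builtin_expect(!!(x), 0)

  // Hypothetical example: the division-by-zero path is marked as unlikely.
  int checkedDivide(int num, int den) {
    if (UNLIKELY(den == 0))
      return 0; // Rare path.
    return num / den;
  }

  // Hypothetical helper: the builtin returns its first argument unchanged,
  // whatever expected value is supplied.
  long passThrough(long value) {
    return __builtin_expect(value, 42);
  }

The hint affects only the branch prediction information, not the computed
result: ``checkedDivide(6, 3)`` still returns 2 and ``passThrough(7)`` still
returns 7.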

It is especially useful in conditional statements. Clang currently supports it
in two kinds of conditional statement:

``if`` statement
^^^^^^^^^^^^^^^^

The ``exp`` parameter is the condition. The ``c`` parameter is the expected
comparison value. If it is equal to 1 (true), the condition is likely to be
true; otherwise the condition is likely to be false. For example:

.. code-block:: c++

  if (__builtin_expect(x > 0, 1)) {
    // This block is likely to be taken.
  }

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

The ``exp`` parameter is the value. The ``c`` parameter is the expected
value. If the expected value does not appear in the list of cases, the
``default`` case is assumed to be the likely one.

.. code-block:: c++

  switch (__builtin_expect(x, 5)) {
  default: break;
  case 0:  // ...
  case 3:  // ...
  case 5:  // This case is likely to be taken.
    break;
  }
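
The rule above, combined with the ``SwitchInst`` operand order (default case
first), can be sketched as a small standalone function. Everything here is
hypothetical: ``expectSwitchWeights`` is not an LLVM API, and the two weight
constants are illustrative values chosen for the example, not the weights LLVM
assigns.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Illustrative values only; this document does not specify real weights.
  constexpr uint32_t kLikelyWeight = 2000;
  constexpr uint32_t kUnlikelyWeight = 1;

  // Hypothetical helper, not an LLVM API. Returns one weight per successor in
  // "branch_weights" operand order: index 0 is the default case and
  // index i + 1 belongs to caseValues[i].
  std::vector<uint32_t> expectSwitchWeights(const std::vector<long> &caseValues,
                                            long expected) {
    std::vector<uint32_t> weights(caseValues.size() + 1, kUnlikelyWeight);
    std::size_t likely = 0; // Fall back to the default case.
    for (std::size_t i = 0; i < caseValues.size(); ++i) {
      if (caseValues[i] == expected) {
        likely = i + 1;
        break;
      }
    }
    weights[likely] = kLikelyWeight;
    return weights;
  }

For the cases ``{0, 3, 5}``, an expected value of 5 marks the last entry as
likely, while an expected value of 7 matches no case and marks the default
entry at index 0 instead.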

CFG Modifications
=================

Branch Weight Metadata is not robust against CFG changes. If a terminator's
operands are changed, its branch weights should be updated to match (or
removed). Otherwise, misoptimizations may occur due to incorrect branch
prediction information.
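
As an illustration of keeping weights in sync, consider swapping the two
targets of a conditional branch, for example after inverting its condition.
The sketch below uses a simplified, hypothetical ``CondBranch`` struct rather
than LLVM's own classes. It shows only the principle that each weight has to
move together with the successor it describes.

.. code-block:: c++

  #include <array>
  #include <cstdint>
  #include <utility>

  // Hypothetical, simplified model of a conditional branch; not an LLVM class.
  struct CondBranch {
    std::array<int, 2> successors;   // Block ids: {true target, false target}.
    std::array<uint32_t, 2> weights; // {TRUE_BRANCH_WEIGHT, FALSE_BRANCH_WEIGHT}.
  };

  // Swaps the two targets and swaps the weights with them, so that each
  // weight still describes the block it was originally attached to.
  void swapSuccessors(CondBranch &br) {
    std::swap(br.successors[0], br.successors[1]);
    std::swap(br.weights[0], br.weights[1]);
  }

Swapping only the successors would leave the larger weight on the wrong
target, which is the kind of stale information this section warns about.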

Function Entry Counts
=====================

To allow comparing different functions during inter-procedural analysis and
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
The first operand is a string indicating the name of the associated counter.

Currently, one counter is supported: "function_entry_count". The second operand
is a 64-bit counter that indicates the number of times that this function was
invoked (in the case of instrumentation-based profiles). In the case of
sampling-based profiles, this operand is an approximation of how many times
the function was invoked.

For example, in the code below, the instrumentation for function foo()
indicates that it was called 2,590 times at runtime.

.. code-block:: llvm

  define i32 @foo() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590}

If "function_entry_count" has more than 2 operands, the later operands are
the GUIDs of the functions that need to be imported by ThinLTO. These operands
are only set by sampling-based profiles. They are needed because the
sampling-based profile was collected on a binary that had already imported and
inlined these functions, and the IR must match in the ThinLTO backends for
profile annotation. This information cannot be annotated on the call site
because a call-site annotation can only go down one level in the call chain.
For a chain such as ``foo_in_a_cc()`` -> ``bar_in_b_cc()`` -> ``baz_in_c_cc()``,
two levels are needed to import both ``bar_in_b_cc`` and ``baz_in_c_cc``.
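
The operand layout described above can be sketched as follows. The
``EntryCountInfo`` struct and ``splitEntryCount`` function are hypothetical and
model only the ordering of operands (count first, then any import GUIDs). They
are not LLVM data structures, and the leading name string is assumed to have
been stripped already.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <vector>

  // Hypothetical, simplified view of a "function_entry_count" node.
  struct EntryCountInfo {
    uint64_t count = 0;                // Second operand: the entry count.
    std::vector<uint64_t> importGUIDs; // Remaining operands, if any.
  };

  // `operands` holds the integer operands that follow the name string.
  EntryCountInfo splitEntryCount(const std::vector<uint64_t> &operands) {
    assert(!operands.empty() && "the entry count operand is required");
    EntryCountInfo info;
    info.count = operands.front();
    info.importGUIDs.assign(operands.begin() + 1, operands.end());
    return info;
  }

A node that carries only the count yields an empty ``importGUIDs`` list, which
matches the instrumentation-based case described earlier.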