reference, declarationdefinition
definition → references, declarations, derived classes, virtual overrides
reference to multiple definitions → definitions
unreferenced
    1
    2
    3
    4
    5
    6
    7
    8
    9
   10
   11
   12
   13
   14
   15
   16
   17
   18
   19
   20
   21
   22
   23
   24
   25
   26
   27
   28
   29
   30
   31
   32
   33
   34
   35
   36
   37
   38
   39
   40
   41
   42
   43
   44
   45
   46
   47
   48
   49
   50
   51
   52
   53
   54
   55
   56
   57
   58
   59
   60
   61
   62
   63
   64
   65
   66
   67
   68
   69
   70
   71
   72
   73
   74
   75
   76
   77
   78
   79
   80
   81
   82
   83
   84
   85
   86
   87
   88
   89
   90
   91
   92
   93
   94
   95
   96
   97
   98
   99
  100
  101
  102
  103
  104
  105
  106
  107
  108
  109
  110
  111
  112
  113
  114
  115
  116
  117
  118
  119
  120
  121
  122
  123
  124
  125
  126
  127
  128
  129
  130
  131
  132
  133
  134
  135
  136
  137
  138
  139
  140
  141
  142
  143
  144
  145
  146
  147
  148
  149
  150
  151
  152
  153
  154
  155
  156
  157
  158
  159
  160
  161
  162
  163
  164
  165
  166
  167
  168
  169
  170
  171
  172
  173
  174
  175
  176
  177
  178
  179
  180
  181
  182
  183
  184
  185
  186
  187
  188
  189
  190
  191
  192
  193
  194
  195
  196
  197
  198
  199
  200
  201
  202
  203
  204
  205
  206
  207
  208
  209
  210
  211
  212
  213
  214
  215
  216
  217
  218
  219
  220
  221
  222
  223
  224
  225
  226
  227
  228
  229
  230
  231
  232
  233
  234
  235
  236
  237
  238
  239
  240
  241
  242
  243
  244
  245
  246
  247
  248
  249
  250
  251
  252
  253
  254
  255
  256
  257
  258
  259
  260
  261
  262
  263
  264
  265
  266
  267
  268
  269
  270
  271
  272
  273
  274
  275
  276
  277
  278
  279
  280
  281
  282
  283
  284
  285
  286
  287
  288
  289
  290
  291
  292
  293
  294
  295
  296
  297
  298
  299
  300
  301
  302
  303
  304
  305
  306
  307
  308
  309
  310
  311
  312
  313
  314
  315
  316
  317
  318
  319
  320
  321
  322
  323
  324
  325
  326
  327
  328
  329
  330
  331
  332
  333
  334
  335
  336
  337
  338
  339
  340
  341
  342
  343
  344
  345
  346
  347
  348
  349
  350
  351
  352
  353
  354
  355
  356
  357
  358
  359
  360
  361
  362
  363
  364
  365
  366
  367
  368
  369
  370
  371
  372
  373
  374
  375
  376
  377
  378
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  <title>Clang - Features and Goals</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<!--*************************************************************************-->
<h1>Clang - Features and Goals</h1>
<!--*************************************************************************-->

<p>
This page describes the <a href="index.html#goals">features and goals</a> of
Clang in more detail and gives a more broad explanation about what we mean.
These features are:
</p>

<p>End-User Features:</p>

<ul>
<li><a href="#performance">Fast compiles and low memory use</a></li>
<li><a href="#expressivediags">Expressive diagnostics</a></li>
<li><a href="#gcccompat">GCC compatibility</a></li>
</ul>

<p>Utility and Applications:</p>

<ul>
<li><a href="#libraryarch">Library based architecture</a></li>
<li><a href="#diverseclients">Support diverse clients</a></li>
<li><a href="#ideintegration">Integration with IDEs</a></li>
<li><a href="#license">Use the LLVM 'BSD' License</a></li>
</ul>

<p>Internal Design and Implementation:</p>

<ul>
<li><a href="#real">A real-world, production quality compiler</a></li>
<li><a href="#simplecode">A simple and hackable code base</a></li>
<li><a href="#unifiedparser">A single unified parser for C, Objective C, C++,
    and Objective C++</a></li>
<li><a href="#conformance">Conformance with C/C++/ObjC and their
    variants</a></li>
</ul>

<!--*************************************************************************-->
<h2><a name="enduser">End-User Features</a></h2>
<!--*************************************************************************-->


<!--=======================================================================-->
<h3><a name="performance">Fast compiles and Low Memory Use</a></h3>
<!--=======================================================================-->

<p>A major focus of our work on clang is to make it fast, light and scalable.
The library-based architecture of clang makes it straight-forward to time and
profile the cost of each layer of the stack, and the driver has a number of
options for performance analysis. Many detailed benchmarks can be found online.</p>

<p>Compile time performance is important, but when using clang as an API, often
memory use is even more so: the less memory the code takes the more code you can
fit into memory at a time (useful for whole program analysis tools, for
example).</p>

<p>In addition to being efficient when pitted head-to-head against GCC in batch
mode, clang is built with a <a href="#libraryarch">library based
architecture</a> that makes it relatively easy to adapt it and build new tools
with it.  This means that it is often possible to apply out-of-the-box thinking
and novel techniques to improve compilation in various ways.</p>


<!--=======================================================================-->
<h3><a name="expressivediags">Expressive Diagnostics</a></h3>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly.  As far as a command-line compiler goes, this basically boils down to
making the diagnostics (error and warning messages) generated by the compiler
be as useful as possible.  There are several ways that we do this, but the
most important are pinpointing exactly what is wrong in the program,
highlighting related information so that it is easy to understand at a glance,
and making the wording as clear as possible.</p>

<p>Here is one simple example that illustrates the difference between a typical
GCC and Clang diagnostic:</p>

<pre>
  $ <b>gcc-4.9 -fsyntax-only t.c</b>
  t.c: In function 'int f(int, int)':
  t.c:7:39: error: invalid operands to binary + (have 'int' and 'struct A')
     return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);
                                         ^
  $ <b>clang -fsyntax-only t.c</b>
  t.c:7:39: error: invalid operands to binary expression ('int' and 'struct A')
  <span style="color:darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</span>
  <span style="color:blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</span>
</pre>

<p>Here you can see that you don't even need to see the original source code to
understand what is wrong based on the Clang error: Because Clang prints a
caret, you know exactly <em>which</em> plus it is complaining about.  The range
information highlights the left and right side of the plus which makes it
immediately obvious what the compiler is talking about, which is very useful for
cases involving precedence issues and many other situations.</p>

<p>Clang diagnostics are very polished and have many features.  For more
information and examples, please see the <a href="diagnostics.html">Expressive
Diagnostics</a> page.</p>

<!--=======================================================================-->
<h3><a name="gcccompat">GCC Compatibility</a></h3>
<!--=======================================================================-->

<p>GCC is currently the defacto-standard open source compiler today, and it
routinely compiles a huge volume of code.  GCC supports a huge number of
extensions and features (many of which are undocumented) and a lot of
code and header files depend on these features in order to build.</p>

<p>While it would be nice to be able to ignore these extensions and focus on
implementing the language standards to the letter, pragmatics force us to
support the GCC extensions that see the most use.  Many users just want their
code to compile, they don't care to argue about whether it is pedantically C99
or not.</p>

<p>As mentioned above, all
extensions are explicitly recognized as such and marked with extension
diagnostics, which can be mapped to warnings, errors, or just ignored.
</p>


<!--*************************************************************************-->
<h2><a name="applications">Utility and Applications</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="libraryarch">Library Based Architecture</a></h3>
<!--=======================================================================-->

<p>A major design concept for clang is its use of a library-based
architecture.  In this design, various parts of the front-end can be cleanly
divided into separate libraries which can then be mixed up for different needs
and uses.  In addition, the library-based approach encourages good interfaces
and makes it easier for new developers to get involved (because they only need
to understand small pieces of the big picture).</p>

<blockquote><p>
"The world needs better compiler tools, tools which are built as libraries.
This design point allows reuse of the tools in new and novel ways. However,
building the tools as libraries isn't enough: they must have clean APIs, be as
decoupled from each other as possible, and be easy to modify/extend. This
requires clean layering, decent design, and keeping the libraries independent of
any specific client."</p></blockquote>

<p>
Currently, clang is divided into the following libraries and tool:
</p>

<ul>
<li><b>libsupport</b> - Basic support library, from LLVM.</li>
<li><b>libsystem</b> - System abstraction library, from LLVM.</li>
<li><b>libbasic</b> - Diagnostics, SourceLocations, SourceBuffer abstraction,
    file system caching for input source files.</li>
<li><b>libast</b> - Provides classes to represent the C AST, the C type system,
    builtin functions, and various helpers for analyzing and manipulating the
    AST (visitors, pretty printers, etc).</li>
<li><b>liblex</b> - Lexing and preprocessing, identifier hash table, pragma
    handling, tokens, and macro expansion.</li>
<li><b>libparse</b> - Parsing. This library invokes coarse-grained 'Actions'
    provided by the client (e.g. libsema builds ASTs) but knows nothing about
    ASTs or other client-specific data structures.</li>
<li><b>libsema</b> - Semantic Analysis.  This provides a set of parser actions
    to build a standardized AST for programs.</li>
<li><b>libcodegen</b> - Lower the AST to LLVM IR for optimization &amp; code
    generation.</li>
<li><b>librewrite</b> - Editing of text buffers (important for code rewriting
    transformation, like refactoring).</li>
<li><b>libanalysis</b> - Static analysis support.</li>
<li><b>clang</b> - A driver program, client of the libraries at various
    levels.</li>
</ul>

<p>As an example of the power of this library based design....  If you wanted to
build a preprocessor, you would take the Basic and Lexer libraries. If you want
an indexer, you would take the previous two and add the Parser library and
some actions for indexing. If you want a refactoring, static analysis, or
source-to-source compiler tool, you would then add the AST building and
semantic analyzer libraries.</p>

<p>For more information about the low-level implementation details of the
various clang libraries, please see the <a href="docs/InternalsManual.html">
clang Internals Manual</a>.</p>

<!--=======================================================================-->
<h3><a name="diverseclients">Support Diverse Clients</a></h3>
<!--=======================================================================-->

<p>Clang is designed and built with many grand plans for how we can use it.  The
driving force is the fact that we use C and C++ daily, and have to suffer due to
a lack of good tools available for it.  We believe that the C and C++ tools
ecosystem has been significantly limited by how difficult it is to parse and
represent the source code for these languages, and we aim to rectify this
problem in clang.</p>

<p>The problem with this goal is that different clients have very different
requirements.  Consider code generation, for example: a simple front-end that
parses for code generation must analyze the code for validity and emit code
in some intermediate form to pass off to a optimizer or backend.  Because
validity analysis and code generation can largely be done on the fly, there is
not hard requirement that the front-end actually build up a full AST for all
the expressions and statements in the code.  TCC and GCC are examples of
compilers that either build no real AST (in the former case) or build a stripped
down and simplified AST (in the later case) because they focus primarily on
codegen.</p>

<p>On the opposite side of the spectrum, some clients (like refactoring) want
highly detailed information about the original source code and want a complete
AST to describe it with.  Refactoring wants to have information about macro
expansions, the location of every paren expression '(((x)))' vs 'x', full
position information, and much more.  Further, refactoring wants to look
<em>across the whole program</em> to ensure that it is making transformations
that are safe.  Making this efficient and getting this right requires a
significant amount of engineering and algorithmic work that simply are
unnecessary for a simple static compiler.</p>

<p>The beauty of the clang approach is that it does not restrict how you use it.
In particular, it is possible to use the clang preprocessor and parser to build
an extremely quick and light-weight on-the-fly code generator (similar to TCC)
that does not build an AST at all.   As an intermediate step, clang supports
using the current AST generation and semantic analysis code and having a code
generation client free the AST for each function after code generation. Finally,
clang provides support for building and retaining fully-fledged ASTs, and even
supports writing them out to disk.</p>

<p>Designing the libraries with clean and simple APIs allows these high-level
policy decisions to be determined in the client, instead of forcing "one true
way" in the implementation of any of these libraries.  Getting this right is
hard, and we don't always get it right the first time, but we fix any problems
when we realize we made a mistake.</p>

<!--=======================================================================-->
<h3 id="ideintegration">Integration with IDEs</h3>
<!--=======================================================================-->

<p>
We believe that Integrated Development Environments (IDE's) are a great way
to pull together various pieces of the development puzzle, and aim to make clang
work well in such an environment.  The chief advantage of an IDE is that they
typically have visibility across your entire project and are long-lived
processes, whereas stand-alone compiler tools are typically invoked on each
individual file in the project, and thus have limited scope.</p>

<p>There are many implications of this difference, but a significant one has to
do with efficiency and caching: sharing an address space across different files
in a project, means that you can use intelligent caching and other techniques to
dramatically reduce analysis/compilation time.</p>

<p>A further difference between IDEs and batch compiler is that they often
impose very different requirements on the front-end: they depend on high
performance in order to provide a "snappy" experience, and thus really want
techniques like "incremental compilation", "fuzzy parsing", etc.  Finally, IDEs
often have very different requirements than code generation, often requiring
information that a codegen-only frontend can throw away.  Clang is
specifically designed and built to capture this information.
</p>


<!--=======================================================================-->
<h3><a name="license">Use the LLVM 'BSD' License</a></h3>
<!--=======================================================================-->

<p>We actively intend for clang (and LLVM as a whole) to be used for
commercial projects, not only as a stand-alone compiler but also as a library
embedded inside a proprietary application.  The BSD license is the simplest way
to allow this.  We feel that the license encourages contributors to pick up the
source and work with it, and believe that those individuals and organizations
will contribute back their work if they do not want to have to maintain a fork
forever (which is time consuming and expensive when merges are involved).
Further, nobody makes money on compilers these days, but many people need them
to get bigger goals accomplished: it makes sense for everyone to work
together.</p>

<p>For more information about the LLVM/clang license, please see the <a
href="https://llvm.org/docs/DeveloperPolicy.html#copyright-license-and-patents">LLVM License
Description</a> for more information.</p>



<!--*************************************************************************-->
<h2><a name="design">Internal Design and Implementation</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="real">A real-world, production quality compiler</a></h3>
<!--=======================================================================-->

<p>
Clang is designed and built by experienced compiler developers who
are increasingly frustrated with the problems that <a
href="comparison.html">existing open source compilers</a> have.  Clang is
carefully and thoughtfully designed and built to provide the foundation of a
whole new generation of C/C++/Objective C development tools, and we intend for
it to be production quality.</p>

<p>Being a production quality compiler means many things: it means being high
performance, being solid and (relatively) bug free, and it means eventually
being used and depended on by a broad range of people.  While we are still in
the early development stages, we strongly believe that this will become a
reality.</p>

<!--=======================================================================-->
<h3><a name="simplecode">A simple and hackable code base</a></h3>
<!--=======================================================================-->

<p>Our goal is to make it possible for anyone with a basic understanding
of compilers and working knowledge of the C/C++/ObjC languages to understand and
extend the clang source base.  A large part of this falls out of our decision to
make the AST mirror the languages as closely as possible: you have your friendly
if statement, for statement, parenthesis expression, structs, unions, etc, all
represented in a simple and explicit way.</p>

<p>In addition to a simple design, we work to make the source base approachable
by commenting it well, including citations of the language standards where
appropriate, and designing the code for simplicity.  Beyond that, clang offers
a set of AST dumpers, printers, and visualizers that make it easy to put code in
and see how it is represented.</p>

<!--=======================================================================-->
<h3><a name="unifiedparser">A single unified parser for C, Objective C, C++,
and Objective C++</a></h3>
<!--=======================================================================-->

<p>Clang is the "C Language Family Front-end", which means we intend to support
the most popular members of the C family.  We are convinced that the right
parsing technology for this class of languages is a hand-built recursive-descent
parser.  Because it is plain C++ code, recursive descent makes it very easy for
new developers to understand the code, it easily supports ad-hoc rules and other
strange hacks required by C/C++, and makes it straight-forward to implement
excellent diagnostics and error recovery.</p>

<p>We believe that implementing C/C++/ObjC in a single unified parser makes the
end result easier to maintain and evolve than maintaining a separate C and C++
parser which must be bugfixed and maintained independently of each other.</p>

<!--=======================================================================-->
<h3><a name="conformance">Conformance with C/C++/ObjC and their
 variants</a></h3>
<!--=======================================================================-->

<p>When you start work on implementing a language, you find out that there is a
huge gap between how the language works and how most people understand it to
work.  This gap is the difference between a normal programmer and a (scary?
super-natural?) "language lawyer", who knows the ins and outs of the language
and can grok standardese with ease.</p>

<p>In practice, being conformant with the languages means that we aim to support
the full language, including the dark and dusty corners (like trigraphs,
preprocessor arcana, C99 VLAs, etc).  Where we support extensions above and
beyond what the standard officially allows, we make an effort to explicitly call
this out in the code and emit warnings about it (which are disabled by default,
but can optionally be mapped to either warnings or errors), allowing you to use
clang in "strict" mode if you desire.</p>

<p>We also intend to support "dialects" of these languages, such as C89, K&amp;R
C, C++'03, Objective-C 2, etc.</p>

</div>
</body>
</html>