Tooling Primer

v. 2024-09-05 21:58

Compiler Flags

Our compilers have a variety of flags that enable checks and diagnostics that help us catch bugs early. We should enable them wherever practical.

-Wall: enable "all" warnings. Despite the name, this does not actually enable every warning; see -Wextra and -Wconversion below.

-Wextra: enable extra warnings, warnings that are outside the set of -Wall.

-Werror: treat warnings as errors (compilation fails if warnings are raised). Note that -Werror can be annoying if multiple people are developing a project and using different compiler versions. Some compilers may issue warnings when others do not, so the build may fail for one person but not another. Use with discretion.

-Wconversion: warn for any implicit conversion that may change a value, for example assigning a float to an int without an explicit cast.

-g3: emit debugging symbols, in particular more debugging information than -g.

If you want to turn a specific warning off, you can pass -Wno-<option-name>.
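To make the effect of -Wconversion concrete, here is a minimal sketch. The function names are hypothetical and chosen only for illustration. With GCC or Clang, compiling this with -Wconversion should produce a warning for the implicit conversion in the first function and none for the explicit cast in the second.

```c
/* Implicit conversion: the fractional part of `ratio` is silently dropped.
 * Compiling with -Wconversion is expected to warn about the assignment. */
int truncate_implicit(double ratio)
{
    int whole = ratio; /* double -> int without a cast */
    return whole;
}

/* Explicit conversion: the cast documents that truncation is intended,
 * so -Wconversion stays quiet. */
int truncate_explicit(double ratio)
{
    return (int)ratio;
}
```

Both functions behave identically at runtime; the only difference is whether the compiler is told that the truncation is deliberate.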

Build Systems

We saw in the C Techniques Primer how to split our code across multiple files. We also saw that the compiler invocation can become complicated when we have to explicitly mention every file and include path that participates in the build. Additionally, we saw that large projects will likely feature additional executables in the form of tests.

Build systems help manage this complexity. CMake and Bazel are among the most popular choices at the time of this writing, but there are many more build systems. We'll cover Bazel.

Bazel is a complex build system, and doing complicated things with Bazel has a relatively steep learning curve. We will focus on the basics that allow us to build the DSM.

Assume the following source tree:

bin/
  |_ main.c
lib/
  |_ foo/
    |_ foo.c
    |_ foo.h
    |_ tests/
        |_ test_foo.c

This project will have two executables: one built from the source in bin/ and one built from the source in lib/foo/tests/. It also has one library, built from lib/foo/foo.c and lib/foo/foo.h.

To use Bazel, we first mark the root of the repository by creating an empty file called WORKSPACE in the root.

Now, we will create a build target for our foo library. Start by creating a file called BUILD.bazel in lib/foo/. Add the below contents to the new BUILD.bazel file:

cc_library(
  name = "libfoo",
  srcs = ["foo.c"],
  hdrs = ["foo.h"],
  visibility = ["//visibility:public"],
)

This creates a new build target specifically to compile our library. A library is composed of source (implementation) files and headers. We explicitly list all source files and header files composing our library. We also set the visibility, which is a Bazel property configuring which other Bazel targets are allowed to depend on this library. We want to share this library across our project, so we set the visibility to be "public."
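For reference, the two files named in srcs and hdrs could look like the sketch below. The function foo_add is a hypothetical placeholder, not something the project is known to define, and both files are shown in a single block separated by comments.

```c
/* ---- lib/foo/foo.h: the public interface listed in hdrs ---- */
#ifndef LIB_FOO_FOO_H
#define LIB_FOO_FOO_H

/* Returns the sum of a and b (hypothetical example function). */
int foo_add(int a, int b);

#endif /* LIB_FOO_FOO_H */

/* ---- lib/foo/foo.c: the implementation listed in srcs ---- */
/* In the real file this would begin by including the header above. */
int foo_add(int a, int b)
{
    return a + b;
}
```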

Our library also has a test that we will create a build target for. In the same BUILD.bazel file, add the following:

cc_test(
  name = "test_libfoo",
  srcs = ["tests/test_foo.c"],
  deps = [":libfoo"],
  timeout = "short",
)

A cc_test is an executable, and executables don't expose header files for anyone else to include, so there is no hdrs property. Similarly, no one else is going to depend on our test target, so we don't need to specify the visibility. This test does need to exercise our libfoo, though, so we declare a dependency on :libfoo. The : prefix is Bazel syntax indicating a target in this same build file. libfoo is the name we gave to our foo library target above.
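Bazel considers a test to have passed when the test executable exits with status 0, and to have failed otherwise. A minimal sketch of the logic that tests/test_foo.c could contain follows. Here foo_add is a hypothetical stand-in for whatever libfoo really exports (defined inline so the block is self-contained), and run_foo_tests is likewise an illustrative name.

```c
#include <stdio.h>

/* Hypothetical stand-in for a function exported by libfoo. In the real
 * test file this would come from the library's header instead. */
static int foo_add(int a, int b)
{
    return a + b;
}

/* Runs every check and returns the number of failures. */
int run_foo_tests(void)
{
    int failures = 0;

    if (foo_add(2, 3) != 5) {
        fprintf(stderr, "FAIL: foo_add(2, 3) != 5\n");
        failures++;
    }
    if (foo_add(-1, 1) != 0) {
        fprintf(stderr, "FAIL: foo_add(-1, 1) != 0\n");
        failures++;
    }
    return failures;
}

/* The test's main would then report the result through its exit status:
 *
 *     int main(void) { return run_foo_tests() == 0 ? 0 : 1; }
 */
```

Printing a message for each failed check is a design choice: those messages are what appear in the test log when the test fails.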

Lastly we need to create a build target for our main executable. First, create a new BUILD.bazel file in bin/. Then, add the following contents:

cc_binary(
  name = "main",
  srcs = ["main.c"],
  deps = ["//lib/foo:libfoo"],
)

We already understand all the properties, but the value for the dependency has a form we haven't seen before. Our main binary uses our foo library, but that build target is defined in another build file, so we need to reference the target using its qualified name. We first indicate the location of the build file via //lib/foo. // indicates the root of the project. Once we indicate where the build file lives, we select the target we want to depend on in the build file via :libfoo.

We can now build our entire project by invoking the command bazel build //... (the three dots are part of the command). The //... portion says to build everything (...) starting from the root (//) of the workspace.

We can run our main binary using bazel run //bin:main. The form of the pathing is similar to how the dependency in bin/BUILD.bazel was qualified.

We can run all of our tests (currently just one) by invoking the command bazel test //... (again, the dots are part of the command). We can run a specific test by using the qualified name of the test target: bazel test //lib/foo:test_libfoo.

If a test failure occurs, Bazel may hide the failure details. To see the failure details, pass the following to bazel test: --test_output=errors.

If you need to build binaries with debug symbols, pass the following to bazel build: --strip=never -c dbg.

Lastly, if you need to fix the language (in this case, C) version being used at compile time, create a .bazelrc file in the same directory as WORKSPACE and add the following content:

build --copt='-std=c23'

That configures the compiler to use the C23 language version. Make sure your compiler supports such a version, or else use a different version. You can put other command-line options for your compiler here as well. For each option, you need to provide another --copt='' (on the same or on another line). Note that Bazel suggests that we use --conlyopt for any command-line flags that are C only (not shared with C++).

The more scalable way to fix the language version used is to define a formal Bazel toolchain, but that is outside the scope of this primer. Search online for official documentation and tutorials.

Pay attention to the structure of the example project, namely that header files are located right next to implementation files. When a project is installed on a machine, header files are typically copied into one directory on the user's system and compiled artifacts into another, so the files are reorganized at install time. At the time of this writing, Bazel does not perform this reorganization for you (because it is a build system, not a deployment system). Keep this installation overhead in mind as a downside of this source layout. (Build systems like CMake currently offer more built-in features around installation.)

See the official Bazel documentation on building a C++ project (also applicable to C) and associated documentation.

Documentation Generators

The de facto standard documentation generator for C projects is Doxygen. We won't cover its usage here, but it's useful to know that it exists. It's also useful to know that there are alternatives, such as adapters that feed Doxygen output into Sphinx, should you want to use a different documentation generator.

Debuggers

printf-based debugging works fine for many problems, but complicated problems often need more support. A debugger is frequently the most efficient way to track down a difficult issue. Two of the more popular debuggers are gdb (the GNU debugger) and lldb (the Low-Level Debugger, part of the LLVM suite). Both debuggers have very similar capabilities and similar user interfaces.

We will not cover how to use either one in depth here. There are plenty of suitable tutorials online for that. We do recommend you familiarize yourself with at least the following debugger concepts:

breakpoints
next (go to the next line in the source file, stepping over function calls)
step (go to the next line in the source file, stepping into function calls)
printing values
printing the stack trace
moving up and down the stack
printing the lines of source code around the line you're currently stopped on
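
A small recursive function is a convenient target for practicing these concepts. The sketch below is only an assumption about what such a practice target might look like, not code from the project. Stopping at a breakpoint on the base case and printing the stack trace should show one frame per recursive call, and moving up and down the stack lets you print n in each frame.

```c
/* Computes n! recursively. Each call adds a stack frame, which makes the
 * call stack easy to inspect from a debugger. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1) {
        return 1; /* base case: a good place for a breakpoint */
    }
    /* `step` enters the recursive call; `next` steps over it. */
    unsigned long rest = factorial(n - 1);
    return n * rest;
}
```

In gdb, for instance, the corresponding commands are break, next, step, print, backtrace, up, down, and list. Build with -g3 so that the debugger can map the machine code back to source lines and variable names.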

Static Analyzers

Static analyzers are programs that examine your source code without running it to detect coding style violations and potential bugs in your implementation. Among the most popular is clang-tidy. We won't cover its usage here, but we recommend reading its documentation and using it to analyze your source files.

Sanitizers

Sanitizers are compiler features that instrument your programs with special information to catch particular issues at runtime. While powerful, because they operate at runtime, they introduce an overhead (in time and memory).

Two popular sanitizers relevant to our project are ubsan (undefined behavior sanitizer) and asan (address sanitizer). These sanitizers are available on at least GCC and Clang, but you may need to specially configure your runtime environment or even rebuild your toolchain to access certain features. There are plenty of instructions online on how to set these tools up.

The command-line flags to enable these sanitizers are:

-fsanitize=address -fsanitize=undefined
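
As an illustration of the kinds of bugs these two sanitizers target, consider the sketch below. The function names are hypothetical. Each function is harmless for valid arguments, and the comments name an argument combination that a sanitizer-instrumented build would be expected to report at runtime.

```c
#include <stddef.h>
#include <stdlib.h>

/* Signed overflow is undefined behavior. Calling increment(INT_MAX) is the
 * kind of error -fsanitize=undefined is designed to report. */
int increment(int value)
{
    return value + 1;
}

/* Allocates `count` zeroed ints and reads the one at `index`.
 * Nothing checks that index < count, so read_at(4, 4) reads past the end
 * of the heap buffer: the kind of error -fsanitize=address is designed
 * to report. */
int read_at(size_t count, size_t index)
{
    int *values = calloc(count, sizeof *values);
    if (values == NULL) {
        return -1;
    }
    int result = values[index]; /* out of bounds when index >= count */
    free(values);
    return result;
}
```

Without the sanitizers, both faulty calls may appear to work, which is exactly why these bugs are hard to find by inspection alone.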

Valgrind is another popular runtime checker. Its Memcheck tool is particularly well known and widely used. Valgrind can detect memory leaks and other memory usage errors in your program. For more on Valgrind, see its official documentation.
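
A memory leak of the kind Memcheck reports can be as simple as the sketch below. Both function names are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns a newly allocated string such as "item-7".
 * The caller owns the result and must free() it. */
char *make_label(int id)
{
    size_t size = 32; /* large enough for "item-" plus any int */
    char *label = malloc(size);
    if (label == NULL) {
        return NULL;
    }
    snprintf(label, size, "item-%d", id);
    return label;
}

/* Leaks on purpose: the allocation from make_label is never freed. */
void print_label_leaky(int id)
{
    char *label = make_label(id);
    if (label != NULL) {
        printf("%s\n", label);
    }
    /* missing: free(label); */
}
```

Running a program that calls print_label_leaky under valgrind with --leak-check=full should list the allocation made in make_label as lost, along with the call stack that made it. Building with debug symbols makes that call stack show file names and line numbers.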