C Techniques Primer
v. 2024-09-05 21:56
Introduction
This page covers some software engineering techniques used when developing software in C. It assumes that you have at least read the C Language Primer.
Pragmas
In the C Language Primer, we learned about preprocessor directives such as #include. Another directive is #pragma, which stands for "pragmatic information." Pragmas are another way to pass information to the compiler during preprocessing. #pragma once is a particularly popular pragma that tells the preprocessor to include the contents of a specific file only once. It is used as a shorthand replacement for the typical include guard:
#ifndef MY_MODULE_FILE_H_
#define MY_MODULE_FILE_H_
// ...
#endif // MY_MODULE_FILE_H_
Note that while most popular compilers support #pragma once, it is not part of the C standard and so is not considered portable code. The longer-form #ifndef ... guard it replaces is the portable form.
Data-Oriented Programming
Unlike object-oriented languages that feature classes and methods, C has only structures and functions. One way of associating functions with data in C is to prefix function names with the data type they operate on and have the functions take as their first argument an instance (or pointer to an instance) of the data the function is written for. For example:
#pragma once
#include <stdbool.h>
#include <stddef.h>
struct Buffer {
char* _buffer; // A place to store the data
size_t _element_size; // The byte size of an individual element in @c _buffer
size_t _capacity; // The maximum number of elements @c _buffer can hold
size_t _size; // The current number of elements in @c _buffer
};
struct Buffer buffer_new(size_t num_elements, size_t sizeof_element);
bool buffer_push(struct Buffer* b, char* element);
char* buffer_at(struct Buffer* b, size_t index);
void buffer_free(struct Buffer* b);
(Aside: We would typically use void* where char* is used above, but we have not introduced void* for the Deepcode Stack Machine. If you want to learn more about void*, look up "type erasure in C".)
C also does not have access control, a way to indicate that certain struct fields are "private" implementation details that should not be accessed or modified by users. Such fields are idiomatically prefixed with an underscore as done above, but it is up to the programmers to respect this convention.
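The interface above could be implemented along the following lines. This is only a minimal sketch: the fixed capacity, the choice to return false from buffer_push when the buffer is full, and the choice to return NULL from buffer_at for an out-of-range index are all assumptions, since the header does not specify them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Buffer {
  char* _buffer;        // A place to store the data
  size_t _element_size; // The byte size of an individual element
  size_t _capacity;     // The maximum number of elements
  size_t _size;         // The current number of elements
};

// Allocate room for num_elements elements of sizeof_element bytes each.
// Assumption: on allocation failure or size overflow, capacity is left at 0.
struct Buffer buffer_new(size_t num_elements, size_t sizeof_element) {
  struct Buffer b = {NULL, sizeof_element, 0, 0};
  if (sizeof_element != 0 && num_elements <= SIZE_MAX / sizeof_element) {
    b._buffer = malloc(num_elements * sizeof_element);
    if (b._buffer != NULL) {
      b._capacity = num_elements;
    }
  }
  return b;
}

// Copy one element into the next free slot.
// Assumption: returns false (and copies nothing) when the buffer is full.
bool buffer_push(struct Buffer* b, char* element) {
  if (b->_size == b->_capacity) {
    return false;
  }
  memcpy(b->_buffer + b->_size * b->_element_size, element, b->_element_size);
  b->_size++;
  return true;
}

// Return a pointer to the element at index.
// Assumption: returns NULL when index is out of range.
char* buffer_at(struct Buffer* b, size_t index) {
  if (index >= b->_size) {
    return NULL;
  }
  return b->_buffer + index * b->_element_size;
}

// Release the storage and reset the bookkeeping fields.
void buffer_free(struct Buffer* b) {
  free(b->_buffer);
  b->_buffer = NULL;
  b->_capacity = 0;
  b->_size = 0;
}
```

A caller storing int values would pass (char*)&value to buffer_push and copy the bytes back out of the pointer returned by buffer_at. Note that every function reaches the underscore-prefixed fields only through the struct Buffer* it receives as its first argument, which is the convention described above.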
Design by Contract
Design by contract is a software construction technique that formalizes software interfaces with so-called contracts. Contracts are expressed in a combination of the interface's signature and documentation. The documentation outlines preconditions, postconditions, and invariants. Preconditions are things that must have happened before the interface is used. Invariants are things that must be true about the data coming into the interface. Postconditions are promises that the interface makes to the end user, for example about data coming out of the interface or the state of the program after the function is called.
This design practice is very useful in C, where you have to manage the runtime memory on your own, and where there is no language support to express memory ownership (for example, Rust's references or C++'s std::unique_ptr). Design by contract helps clarify interface expectations and memory management requirements.
Consider the following example.
/** A struct containing something stateful. */
struct StateThing {
// ...
};
/**
* Initialize the @c StateThing library.
*
* Preconditions:
* - The runtime must not have called @c statething_library_init already.
*
* Postconditions:
* - It is invalid to call @c statething_library_init again in the same runtime.
*/
void statething_library_init(void);
/**
* Perform foo with a @c StateThing instance.
*
* Preconditions:
* - @c statething_library_init has been called in the runtime.
*
* @param st The @c StateThing instance to operate on. Cannot be @c NULL .
* @return Some data resulting from doing foo. The caller owns the memory.
*/
char* statething_foo(struct StateThing* st);
Above we see that preconditions and postconditions are called out in their own sections in the signature documentation. The invariants are mentioned in the @param sections. Memory ownership is called out in the @return section of statething_foo, but it could have also been stated as a postcondition.
This is one way of organizing the information; you are free to organize it
however you would like.
If an interface is used but its contract is not fulfilled, the contract is said to be broken or violated. In this case either the program enters an invalid state (relative to the interface expectations, not necessarily "undefined behavior" as outlined in the C standard) or (preferably) the interface raises an error. If the interface exposes error information, the errors may be raised in this manner, or the implementation may simply assert and stop the program.
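As a rough sketch of the assert approach, the StateThing contract from the example above might be enforced as follows. The value field, the g_library_initialized flag, and the formatted string that statething_foo returns are hypothetical details invented for illustration; the original example leaves them unspecified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct StateThing {
  int value; // Hypothetical field, for illustration only.
};

// Hypothetical flag tracking whether the library has been initialized.
static bool g_library_initialized = false;

void statething_library_init(void) {
  // Precondition: init must not have been called already.
  assert(!g_library_initialized && "statething_library_init called twice");
  g_library_initialized = true;
}

char* statething_foo(struct StateThing* st) {
  // Precondition: the library has been initialized.
  assert(g_library_initialized && "statething_library_init not called");
  // Invariant on the incoming data: st cannot be NULL.
  assert(st != NULL && "st cannot be NULL");

  // The caller owns the returned memory and must free() it.
  // Assumption: NULL is returned if the allocation fails.
  char* result = malloc(32);
  if (result != NULL) {
    snprintf(result, 32, "value=%d", st->value);
  }
  return result;
}
```

Keep in mind that assert is compiled out when NDEBUG is defined, so these checks typically vanish from release builds. An interface that must report violations in every build would need to expose error information instead.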
Implementations using design by contract may also choose to only test their interfaces within the bounds of their contract to make sure that the claimed behavior works as advertised. Sometimes this is enough; other times it is prudent to test outside the bounds of the contract, for example to ensure graceful failure modes. The necessary level of testing depends on the context you deploy your software in.
Testing
Tests are written as additional small programs that exercise your code. It is therefore helpful to write as much of your code as possible as libraries. This way you can easily link and run your code in test programs. Most modern build systems have first-class support for registering binaries as tests. For example, Bazel provides cc_test and bazel test //..., and CMake provides add_test and make test (if you have CMake generate Makefiles).
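A test program in this style can be very small. The sketch below is one possible shape, not a prescribed framework: sum_ints stands in for a function from your library, and the CHECK macro and run_tests function are hypothetical helpers made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical library function under test. In a real project this would
// live in a library that the test program links against.
int sum_ints(const int* values, size_t count) {
  int total = 0;
  for (size_t i = 0; i < count; ++i) {
    total += values[i];
  }
  return total;
}

// Hypothetical helper: report a failed check and keep going, so that one
// run of the test program reports every failure.
#define CHECK(cond)                                             \
  do {                                                          \
    if (!(cond)) {                                              \
      fprintf(stderr, "%s:%d: CHECK failed: %s\n", __FILE__,    \
              __LINE__, #cond);                                 \
      ++failures;                                               \
    }                                                           \
  } while (0)

// Run every check and return the number of failures.
int run_tests(void) {
  int failures = 0;
  const int values[] = {1, 2, 3};
  CHECK(sum_ints(values, 3) == 6);
  CHECK(sum_ints(values, 0) == 0);
  CHECK(sum_ints(NULL, 0) == 0);
  return failures;
}

// The test program's main would then be a single line such as:
//   int main(void) { return run_tests() == 0 ? 0 : 1; }
```

Both Bazel and CMake treat a test binary that exits with a nonzero status as a failed test, which is why main reports the outcome through its return value.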