Resource | Progress | Division | Year
Learn C The Hard Way | 6/55 | Exercises | 2011
Wikibooks: C Programming | 5/33 | Sections |
C Development on Linux | 3.3/7 | Sections | 2012
The C Programming Language | 1.5/8 | Chapters | 1990
Cprogramming.com Tutorial | -/16 | Sections |
The C Puzzle Book | -/29 | Problems | 1998
Harvard Extension School Unix Systems Programming | 0/14 | Class | 2012

There are countless online resources for learning the C programming language. Learn C The Hard Way is an incomplete web text by Zed Shaw that attempts to teach “good modern” C programming practices; it follows his popular text, Learn Python The Hard Way. Wikibooks: C Programming is a featured book consisting of thousands of edits made by members of the Wikibooks community. C Development on Linux is a relatively short series of articles introducing C development on Linux/Unix systems. It requires a basic understanding of programming. Cprogramming.com Tutorial is another online resource.

There are numerous resources for C in print, although many are somewhat outdated. The C Programming Language is the authoritative reference on ANSI C, written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language. Zed Shaw, author of Learn Python The Hard Way, discusses this text from a modern perspective in a chapter entitled Deconstructing “K&R C”. The C Puzzle Book by Alan R. Feuer is an excellent resource for extending a basic knowledge of ANSI C.

Pointers and addresses

A pointer is a variable that contains the address of another variable. Pointers usually lead to more compact and efficient code. To create a pointer, declare it with a type as you would any variable, and place the unary indirection (dereferencing) operator * in front of its name.

int *x_p;

As with any variable, a pointer should be initialized once it is declared. Instead of an integer, we assign it the address of a variable using the unary & operator.

int x = 1;
x_p = &x;

In this example, the address of x (e.g. -1081367532 when printed as a signed integer) is assigned to x_p, so evaluating x_p directly yields that address. To get the value 1 later, x_p must be dereferenced with the * operator (i.e. *x_p).

int y = *x_p;

In this case, the variable y is initialized with the value 1. That means that the integer 1 exists twice in memory, as the value of x and of y. The compiler generates machine code that follows the address provided by x_p, which leads it to the variable x. Then it copies the value of x and assigns it to y.
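
The steps above can be collected into a small sketch. The helper names copy_pointee, set_pointee and pointer_demo are hypothetical and exist only for illustration; the point is that y holds a copy, so writing through x_p afterwards changes x but not y.

```c
#include <stdio.h>

/* Hypothetical helper: returns a copy of the int that p points to. */
int copy_pointee(const int *p)
{
    return *p;
}

/* Hypothetical helper: overwrites the int that p points to. */
void set_pointee(int *p, int value)
{
    *p = value;
}

/* Walks through the steps from the text and prints the result. */
void pointer_demo(void)
{
    int x = 1;
    int *x_p = &x;              /* x_p holds the address of x */
    int y = copy_pointee(x_p);  /* y receives a copy of x's value */

    set_pointee(x_p, 2);        /* changes x, but not the copy in y */
    printf("x_p = %p, x = %d, y = %d\n", (void *)x_p, x, y);
}
```

Calling pointer_demo() prints the address stored in x_p (which varies from run to run) followed by the two values.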

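Also note that it is possible to declare a pointer and a function on one line.

double *dp, *atof();

Here dp is a pointer to double, and the function named atof is declared as returning a pointer, which contains the address of a variable of type double. Note that the standard library's atof returns a double by value, so this declaration only illustrates the syntax.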

Length of things

Java and Python store the length of objects so that it is easily retrievable. In C the length of an array needs to be calculated. Dividing the number of bytes that make up the array by the number of bytes that make up one element gives its length. This can be simplified using a macro as follows:

#define NELEMS(x)  (sizeof(x) / sizeof((x)[0]))

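This solution only works where the array itself is in scope. Once an array is passed to a function it decays to a pointer, and memory obtained from malloc is only ever reachable through a pointer, so in both cases sizeof returns the number of bytes used to store the pointer rather than the array.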
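
A minimal sketch of where the macro can and cannot be used. The function names count_in_scope and size_seen_by_callee are hypothetical; the second one simply reports what sizeof sees once only a pointer is available.

```c
#include <stddef.h>

#define NELEMS(x)  (sizeof(x) / sizeof((x)[0]))

/* The array is declared in this scope, so sizeof sees the whole array. */
size_t count_in_scope(void)
{
    int values[5] = {1, 2, 3, 4, 5};
    return NELEMS(values);      /* 5 elements */
}

/* Here only a pointer is visible: sizeof yields the pointer's size,
 * regardless of how many elements the caller's array holds. */
size_t size_seen_by_callee(const int *values)
{
    return sizeof(values);
}
```

When the callee needs the length, the usual approach is to pass it as a separate size_t argument alongside the pointer.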

Found in the wild

The following was uncovered in a proprietary Linux device driver:

char (*param)[66] = kmalloc(sizeof(*param) * 65, GFP_KERNEL);

Ternary operator

The ternary operator ?: evaluates to one of two expressions depending on a condition. For example, the variable c will be assigned whichever of a or b contains the greater value.

c = (a>b) ? a : b;
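
For comparison, here is the same selection written both ways. The function names max_ternary and max_if_else are hypothetical; both return the same result for any pair of ints.

```c
/* Returns the greater of a and b using the ternary operator. */
int max_ternary(int a, int b)
{
    return (a > b) ? a : b;
}

/* The equivalent if/else form. */
int max_if_else(int a, int b)
{
    if (a > b)
        return a;
    return b;
}
```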

memset

The C standard library function memset is often associated with bugs because a few errors are commonly made when calling it.

The following code does not dereference the pointer, foo, so the length passed to memset is computed from the size of the pointer rather than the size of the struct. Whenever foo_t is larger than a pointer, too few bytes are zeroed, leaving the remainder to hold whatever happens to be in memory.

static inline void foo_init(foo_t *foo) {
    /* BUG: sizeof(foo) is the size of the pointer, not of foo_t */
    memset(foo, 0, sizeof(foo) * (FOO_MAX + 1));
}
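
A corrected version might look like the sketch below. It assumes that foo points to FOO_MAX + 1 contiguous foo_t elements, which the multiplication in the original suggests. The struct layout and the value of FOO_MAX are hypothetical, since the source does not show them.

```c
#include <string.h>

#define FOO_MAX 3                   /* hypothetical value */

typedef struct {                    /* hypothetical layout */
    int  id;
    char name[32];
} foo_t;

static inline void foo_init(foo_t *foo)
{
    /* sizeof(*foo) is the size of one foo_t, not of the pointer */
    memset(foo, 0, sizeof(*foo) * (FOO_MAX + 1));
}
```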

memset may be optimized away by the compiler if the memory it modifies is not used again. In that case it is often recommended to use memset_s (an optional part of C11, Annex K), which prohibits such optimizations and guarantees that the memory write is performed. This is particularly important in security-critical contexts, where sensitive data is being removed from memory.
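
Because memset_s is optional and missing from many C libraries, a commonly used fallback is to write through a volatile-qualified pointer. The sketch below shows that substitute technique, not memset_s itself; secure_zero is a hypothetical name, and the approach relies on compilers not eliding volatile stores, which holds in practice but is weaker than a library guarantee.

```c
#include <stddef.h>

/* Hypothetical helper: zeroes len bytes at buf using volatile stores,
 * which compilers do not remove as dead writes in practice. */
void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

Where the platform offers a dedicated function for this purpose, that is usually the better choice.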

Linux system programming

Robert Love’s Linux System Programming is available in its entirety on Safari Books Online, Scribd and Google Books.

On January 24, 2012 I began the Unix Systems Programming course at the Harvard Extension School in Cambridge, MA. The course consists of a weekly two-hour lecture and an additional weekly one-hour section meeting.

Embedded programming

Global variables that are modified in an interrupt handler should be marked as volatile, which tells the compiler that the value can change unexpectedly and that accesses to it must therefore not be optimized out. Failing to mark a variable appropriately may lead to issues when compiler optimizations or interrupts are enabled. Additionally, drivers or RTOS tasks that run in parallel may behave unreliably (see How to Use C’s volatile Keyword).

Note that globals make for a clear example, but their use should be limited and they should be protected against race conditions (e.g. by using a mutex).
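
On a hosted system, a signal handler can stand in for an interrupt handler to illustrate the pattern. In the sketch below the names event_pending, on_event and wait_for_event are hypothetical, and SIGINT is raised by the program itself purely for demonstration. Without volatile, an optimizing compiler would be free to read the flag once and spin forever.

```c
#include <signal.h>

/* Set asynchronously by the handler; volatile forces a fresh read
 * on every loop iteration. sig_atomic_t can be written atomically. */
static volatile sig_atomic_t event_pending = 0;

/* Stands in for an interrupt service routine. */
static void on_event(int signo)
{
    (void)signo;
    event_pending = 1;
}

/* Installs the handler, triggers it, and waits for the flag.
 * Returns 1 on success, -1 if the handler could not be installed. */
int wait_for_event(void)
{
    if (signal(SIGINT, on_event) == SIG_ERR)
        return -1;

    raise(SIGINT);              /* handler runs before raise() returns */

    while (!event_pending) {
        /* busy-wait; real firmware would sleep or do other work */
    }
    return 1;
}
```

Note that volatile only prevents the compiler from caching the value; it does not make multi-step updates atomic, which is why shared data still needs the protection mentioned above.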

Overflow checking

Checking is required to prevent integer overflows, which are particularly dangerous when allocating memory because they can lead to leaking information (see Integer Overflow into Information Disclosure).

if (nmemb && size > -1UL / nmemb)
    return 0;
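
The check works because -1UL is the largest unsigned long value: if size exceeds that maximum divided by nmemb, the product nmemb * size cannot be represented and would wrap around. A minimal sketch of the check in context follows, assuming a hypothetical wrapper named alloc_array and using SIZE_MAX, which matches size_t on every platform.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocates room for nmemb elements of size bytes,
 * or returns NULL if the total byte count would overflow size_t. */
void *alloc_array(size_t nmemb, size_t size)
{
    if (nmemb && size > SIZE_MAX / nmemb)
        return NULL;            /* nmemb * size would wrap around */

    return malloc(nmemb * size);
}
```

The nmemb test comes first so that the division is never performed with a zero divisor.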

Debugging

Additional resources

Data structure alignment

Projects